Weekly GitHub Report for Tensorflow: January 01, 2025 - January 08, 2025
Thank you for subscribing to our weekly newsletter! Each week, we deliver a comprehensive summary of your GitHub project's latest activity right to your inbox, including an overview of your project's issues, pull requests, contributors, and commit activity.
I. News
1.1 Recent Version Releases:
The current version of this repository is v2.18.0
1.2 Version Information:
The TensorFlow 2.18.0 release, created on October 21, 2024, introduces several key updates, including the addition of a fourth parameter to the `TfLiteOperatorCreate` function for a cleaner API, the disabling of TensorRT support in CUDA builds, and the introduction of hermetic CUDA support for more reproducible builds. Notably, TensorFlow now supports NumPy 2.0 by default, with changes in type promotion rules, and continues to support NumPy 1.26 until 2025. Additionally, `tf.lite` enhancements include support for `TensorType_INT4` and `TensorType_INT16` in various operations, and the LiteRT repository is now live, signaling upcoming changes in the TFLite development experience.
II. Issues
2.1 Top 5 Active Issues:
We consider active issues to be the issues that have been commented on most frequently within the last week. Bot comments are omitted.
- Tensorflow not supported on Windows + ARM CPUs: This issue highlights a problem with TensorFlow not being supported on Windows systems with ARM CPUs, specifically when attempting to import TensorFlow on a Windows 11 machine with a Snapdragon processor. The user reports successful installation but encounters an ImportError related to DLL load failure, indicating a lack of support for the ARM architecture on Windows.
  - The comments discuss the issue being initially marked as a duplicate of an older problem related to outdated Intel CPUs, but it is later clarified that the problem is different due to the ARM architecture. The conversation reveals that TensorFlow does not provide support for Windows on ARM CPUs, and suggestions are made to try using the Linux wheel via WSL or to use Google Colab as alternatives.
  - Number of comments this week: 13
- It doesn't support on python3.13: This issue highlights the lack of support for Python 3.13 in TensorFlow version 2.17, as users encounter errors when attempting to install it on macOS Sequoia ARM. The problem arises because TensorFlow's release cycle does not align with Python's, leading to a delay in support for new Python versions, which has been a recurring issue since Python 3.8.
  - The comments discuss the historical delay in TensorFlow's support for new Python versions, with users expressing frustration over the lack of support for Python 3.13, especially since it is the default version in major distributions like Fedora 41. Some users suggest workarounds, such as downgrading Python, while others criticize the TensorFlow team's release process, calling for better synchronization with Python's release schedule. The discussion also touches on the complexity of TensorFlow's build system and dependencies, which contribute to the delay in supporting new Python versions.
  - Number of comments this week: 7
- Failing to convert MobileNetV3Large to TFLite w/ Integer q: This issue involves the failure to convert a MobileNetV3Large model to TensorFlow Lite (TFLite) with integer quantization, resulting in incorrect predictions on Windows 10 and conversion errors on Windows Subsystem for Linux (WSL). The user reports that the model produces unrelated outputs after conversion on Windows 10, and fails to convert entirely on WSL, with an LLVM error indicating a failure to infer result types.
  - The comments discuss potential solutions, including downgrading TensorFlow to version 2.14.1, which reportedly resolves the issue. Another user suggests using the latest Keras version, which works but introduces a new problem where using a representative dataset worsens results. The discussion also touches on TensorFlow and Keras compatibility issues, with suggestions to try alternative approaches like using PyTorch, while acknowledging the need for further investigation into the representative dataset issue.
  - Number of comments this week: 6
- Mixing Keras Layers and TF modules.: This issue involves a user experiencing difficulties when mixing Keras layers with TensorFlow modules, specifically noting that `tf.Module` can trace `tf.Variable` but not variables from `tf.keras` or `tf.keras.Variable`. The user is seeking guidance on how to resolve tracing issues when using Keras layers within `tf.Module`, as they are interested in using certain features not available in Keras alone, such as composite tensors.
  - The comments discuss the change in Keras 3.0, which no longer extends `tf.Module` due to its support for multiple backends, leading to issues with variable tracking. Users are advised to raise the issue in the Keras repository, and there is a discussion about the lack of common layer implementations in TensorFlow following this change, with a suggestion that users may need to create their own implementations using TensorFlow primitives.
  - Number of comments this week: 4
- Tensorflow BackupAndRestore method does not work: This issue is about a bug in TensorFlow's `BackupAndRestore` callback, which fails to work because the model is not built before calling the `fit()` function, resulting in a `ValueError`. The problem arises when using the `BackupAndRestore` callback, which requires the model to be explicitly built beforehand, either by defining the input shape or by calling the model on a batch of data.
  - The comments discuss the requirement for the model to be built before using the `BackupAndRestore` callback, offering solutions such as defining the input shape or using `model.build()`. A pull request addressing the issue has been merged, and users are asked to verify if the problem persists. Additional questions are raised about the compatibility of the solution with normalization layers, and a user reports encountering a `ValueError` when attempting one of the solutions.
  - Number of comments this week: 4
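The first two issues in this list both come down to no matching TensorFlow build being available for the user's environment. As a minimal sketch, and not an official compatibility check, a script could inspect the platform before attempting the import. The helper name `likely_unsupported_reasons` is hypothetical, and the supported Python range of 3.9 to 3.12 is an assumption for illustration; the release notes remain the authoritative source.

```python
import platform
import sys

# Assumption for illustration: TensorFlow 2.18 wheels target Python 3.9-3.12.
ASSUMED_MIN_PY = (3, 9)
ASSUMED_MAX_PY = (3, 12)


def likely_unsupported_reasons(system, machine, py_version):
    """Return reasons the environment probably lacks a TensorFlow wheel.

    Hypothetical helper: it only encodes the two situations described in the
    issues above and is not an exhaustive compatibility check.
    """
    reasons = []
    # Windows on ARM (e.g. Snapdragon): the issue reports a DLL load failure.
    if system == "Windows" and machine.upper() in ("ARM64", "AARCH64"):
        reasons.append("Windows on ARM is not supported; try WSL or Colab")
    # Python version outside the assumed supported range.
    if not (ASSUMED_MIN_PY <= tuple(py_version[:2]) <= ASSUMED_MAX_PY):
        reasons.append(
            "Python %d.%d is outside the assumed supported range"
            % (py_version[0], py_version[1])
        )
    return reasons


if __name__ == "__main__":
    # Check the interpreter this script is running under.
    for reason in likely_unsupported_reasons(
        platform.system(), platform.machine(), sys.version_info
    ):
        print("warning:", reason)
```

Note that an x64 Python interpreter running under emulation on an ARM machine may report an x64 machine string, so a check like this can miss that case.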
2.2 Top 5 Stale Issues:
We consider stale issues to be issues that have had no activity within the last 30 days. The team should work together to get these issues resolved and closed as soon as possible.
As of our latest update, there are no stale issues for the project this week.
2.3 Open Issues
This section lists, groups, and then summarizes issues that were created within the last week in the repository.
Issues Opened This Week: 15
Summarized Issues:
- TensorFlow and Keras Integration Issues: The transition to Keras 3.0 has introduced challenges in integrating Keras layers within TensorFlow modules, as Keras no longer extends `tf.Module`. This change has led to difficulties in tracing variables from `tf.keras` layers, prompting users to seek alternative solutions. The community is encouraged to raise these issues with the Keras team to find a resolution.
- TensorFlow Backup and Restore Bug: A bug in TensorFlow's `BackupAndRestore` callback causes a `ValueError` when the model is not built before calling `fit()`. Users must explicitly define the input shape or use the `model.build()` method to ensure the callback functions correctly. This issue highlights the need for clear documentation on model preparation steps.
- TFLITE NMS Kernel Inconsistency: The TFLITE NMS kernel produces outputs inconsistent with TensorFlow NMS, appending zeros to the "selected_indices" output. This discrepancy leads to inefficient computation and potential out-of-memory errors on Android devices. A fix is requested to ensure identical outputs between TFLITE and TensorFlow NMS.
- TensorFlow Source Code Compatibility Issues: Users face compatibility issues with TensorFlow 2.4.1, where the `label_image.py` test case fails due to missing attributes and files. Importing from `compat.v1` is necessary for backward compatibility, but additional errors persist. These issues highlight the challenges of maintaining compatibility across TensorFlow versions.
- TensorFlow C++ Interface Compilation Errors: Compiling the TensorFlow C++ interface using Bazel on Linux Ubuntu 22.04 presents challenges, particularly with downloading necessary Python repository files. Connection timeouts occur despite following version matching and configuration procedures. This issue underscores the complexities of setting up TensorFlow's C++ interface.
- TensorFlow Lite GPU Delegation Challenges: Implementing GPU delegation for TensorFlow Lite on Android results in significant lag and non-functionality. Users seek guidance on optimizing code to effectively utilize GPU resources without causing slowdowns or crashes. This issue highlights the need for better support and documentation for GPU delegation in TensorFlow Lite.
- TensorFlow ARM Architecture Support Issues: TensorFlow version 2.18 cannot be imported on Windows 11 with an ARM-based Snapdragon X Plus CPU due to lack of support for Windows + ARM architecture. This results in a DLL load failure error, prompting discussions on alternative solutions like using WSL or Colab. The issue emphasizes the need for broader architecture support in TensorFlow.
- TensorFlow Local Execution Errors: A beginner encounters a KeyError while running TensorFlow locally on Anaconda in VS Code due to a missing file during dataset download. The same code runs successfully on Google Colab, indicating potential local environment setup issues. This highlights the importance of ensuring consistent environments for TensorFlow execution.
- TensorFlow CUDA Configuration Requests: Users seek guidance on configuring TensorFlow 2.18 to build using local CUDA libraries instead of the hermetic CUDA. The current setup introduces unwanted dependencies, and users have previously built TensorFlow versions with local CUDA setups. This issue reflects the need for flexible configuration options in TensorFlow builds.
- TensorFlow Metal Compatibility Errors: Upgrading to TensorFlow version 2.18 on macOS 15.2 with an Apple M2 Max GPU results in a "Symbol not found" error related to libmetal_plugin.dylib. This error does not occur with TensorFlow version 2.17, indicating a compatibility issue with the newer version. The issue highlights the challenges of maintaining compatibility with Apple's hardware.
- TensorFlow Lite MFCC Model Conversion Errors: A model for calculating MFCC converted from TensorFlow to TensorFlow Lite fails with an "IsPowerOfTwo-RuntimeError" during the rfft2d operation. The user believes all STFT function arguments are powers of two, raising questions about potential user errors or bugs. This issue underscores the complexities of model conversion in TensorFlow Lite.
- TensorFlow Lite Custom Operation Errors: Users encounter errors related to unresolved custom operations, such as XlaDynamicSlice, when converting models from Huggingface to TensorFlow Lite for Android. System information, logs, and a Colab link are provided to reproduce the problem. This issue highlights the need for better support for custom operations in TensorFlow Lite.
- TensorFlow XLA Compiler Bug: A bug in TensorFlow's XLA compiler prevents the compilation of the `tf.keras.layers.Conv2D` layer with `padding='valid'`. The operation succeeds in eager execution mode but fails with a negative dimension size error during compilation. This issue highlights the need for robust compiler support in TensorFlow.
- TensorFlow Model Saving Bug: In TensorFlow version 2.19.0-dev20250105, saving a Keras model with `include_optimizer=False` does not work as expected. The optimizer is still included in the saved model, indicating a bug in the model saving process. This issue emphasizes the importance of reliable model saving functionality in TensorFlow.
- TensorFlow Model Training Data Ordering Bug: A bug in the TensorFlow model's `fit` method causes incorrect data ordering when using dictionaries to load data. This results in model malfunction when trained with multiple inputs and outputs. The issue is demonstrated in provided Google Colab and GitHub Gist links, highlighting the need for accurate data handling in TensorFlow.
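For the MFCC conversion report in the list above, one quick sanity check is whether the FFT length that reaches `rfft2d` is really a power of two. The sketch below uses plain Python and hypothetical helper names. It does not reproduce the TensorFlow Lite kernel's own check, and the issue summary does not say which argument triggers the error in the reported model.

```python
def is_power_of_two(n):
    """True when n is a positive integer with exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0


def next_power_of_two(n):
    """Smallest power of two that is greater than or equal to n (n >= 1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 << (n - 1).bit_length()


# Example values chosen for illustration only; they are not from the issue.
for fft_length in (400, 512):
    print(fft_length, is_power_of_two(fft_length), next_power_of_two(fft_length))
```

A frame length that is not a power of two can be rounded up with `next_power_of_two` and passed explicitly as the FFT length, which makes the value that the converted model will see easier to audit.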
2.4 Closed Issues
This section lists, groups, and then summarizes issues that were closed within the last week in the repository. This section also links the associated pull requests if applicable.
Issues Closed This Week: 19
Summarized Issues:
- Integer Overflow in TensorFlow Operations: An integer overflow error occurs in the `tf.raw_ops.DenseBincount` operation of TensorFlow 2.15 on Linux Ubuntu 20.04. This issue arises when the `size` parameter is set near the maximum value for the data type, leading to incorrect multiplication results. Consequently, the operation is aborted, causing disruptions in the workflow.
- TensorFlow Model Saving and Execution Errors: Users encounter errors when saving and executing TensorFlow models across different versions and environments. In TensorFlow 2.17.0, saving a `tf.Module` with a Keras model results in a `FAILED_PRECONDITION` error, unlike in version 2.15.0. Additionally, discrepancies in function recognition occur in PyCharm with TensorFlow 2.18.0, suggesting direct Keras imports as a workaround.
- Convergence and Training Issues in TensorFlow Models: Users report problems with model convergence and unexpected training behavior in TensorFlow. An Actor-Critic algorithm fails to converge in TensorFlow 1.x, prompting a migration to 2.x. A custom Siamese Network outputs the same class for every instance when using `tf.data.Dataset`, raising questions about implementation or internal issues.
- Import and Runtime Errors on Windows Systems: Import and runtime errors are prevalent on Windows systems due to DLL load failures. These issues are often linked to outdated CPU architecture or missing MSVC 2019 redistributable, affecting TensorFlow versions 2.18 and earlier. Users face similar problems across different Windows versions, indicating a need for specific software installations.
- TensorFlow Docker and GPU Support Issues: The Docker image `tensorflow:2.18-gpu-jupyter` lacks GPU support due to missing CUDA libraries. Users are required to manually install `tensorflow[and-cuda]` to enable GPU functionality. This contradicts the expectations set by the image's documentation, causing inconvenience for users relying on GPU capabilities.
- Gradient and Interpretability Method Anomalies: Users experience unexpected positive gradients for negative responses using gradient-based interpretability methods with pre-trained models. The gradients consistently highlight the same filter across different images, sometimes indicating relevance to filters with null outputs. This raises confusion about potential errors in code or gradient computations.
- TensorFlow on Apple M3 Max GPU Crashes: TensorFlow 2.16.2 crashes when attempting to use the GPU on an Apple M3 Max. The issue is related to memory allocation problems with TensorFlow's Metal backend. The error message "pointer being freed was not allocated" suggests a need for further investigation into the Metal backend's compatibility.
- TensorFlow Lite Model Conversion and Execution Discrepancies: Users face issues with TensorFlow Lite model conversion and execution across platforms. A discrepancy in named entity recognition results between Python and Android is resolved by correcting input tensor order. Additionally, an error occurs during model conversion, indicating a type mismatch in the conversion process.
- Miscellaneous TensorFlow Issues and Proposals: Various issues and proposals are raised, including renaming "TensorFlow" for clarity, updating the CODE_OF_CONDUCT, and addressing deprecated C++ features. These issues reflect ongoing efforts to improve TensorFlow's usability and maintainability.
- Annoying Warning Messages in TensorFlow: Users report an annoying warning message, "Ignoring Assert operator," during `model.fit()` operations in TensorFlow 2.17+ with Keras 3.x. Attempts to suppress the warning using environment variables are unsuccessful. Switching to tf-nightly resolves the issue, but users seek clarification on the warning's origin and significance.
- General Frustration with TensorFlow Design: A user expresses frustration with TensorFlow 2.16.1 on Linux Ubuntu 22.04, describing the framework as poorly designed. The report provides minimal information or context about the specific problem encountered. This highlights the need for improved user experience and documentation.
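The `DenseBincount` report in the list above involves a size computation that overflows a fixed-width integer. As an illustrative sketch in plain Python, whose integers do not overflow, the following shows how a product wraps around at a given bit width and how a guard can reject it beforehand. The function names and example values are assumptions for illustration; the summary does not describe the actual fix in TensorFlow.

```python
def wrap_signed(value, bits=64):
    """Simulate two's-complement wraparound of a signed integer of `bits` width."""
    mask = (1 << bits) - 1
    value &= mask
    # Values with the top bit set represent negative numbers.
    return value - (1 << bits) if value >= (1 << (bits - 1)) else value


def checked_multiply(a, b, bits=64):
    """Multiply two non-negative sizes, raising if the result would not fit."""
    limit = (1 << (bits - 1)) - 1
    product = a * b  # exact, because Python integers are arbitrary precision
    if product > limit:
        raise OverflowError("%d * %d exceeds signed %d-bit range" % (a, b, bits))
    return product


# Illustrative values only: a size near the 32-bit maximum times a row count.
num_rows, size = 4, 2**31 - 1
print(wrap_signed(num_rows * size, bits=32))  # wrapped value, not the true product
```

Validating the product before allocating, as `checked_multiply` does, turns a silent wraparound into an explicit error that can be reported to the caller.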
2.5 Issue Discussion Insights
This section will analyze the tone and sentiment of discussions within this project's open and closed issues that occurred within the past week. It aims to identify potentially heated exchanges and to maintain a constructive project environment.
- Is this a human designed framework?
- Toxicity Score: 0.55 (Negative language, Criticism, Potential for escalation.)
- This GitHub conversation begins with a user expressing dissatisfaction with a software framework, using strong negative language. Another user, DeepLearningfeng, responds by requesting more detailed information about the issue and provides resources for troubleshooting, maintaining a neutral and helpful tone. A third user interjects with a brief comment advising against spamming, which could be interpreted as a warning or criticism. The initial negative sentiment and the subsequent admonition suggest underlying tension.
III. Pull Requests
3.1 Open Pull Requests
This section lists and summarizes pull requests that were created within the last week in the repository.
Pull Requests Opened This Week: 0
As of our latest update, there are no open pull requests for the project this week.
3.2 Closed Pull Requests
This section lists and summarizes pull requests that were closed within the last week in the repository. Similar pull requests are grouped, and associated commits are linked if applicable.
Pull Requests Closed This Week: 10
Key Closed Pull Requests
1. Fix checkfail in ResourceSparseApplyKerasMomentum: This pull request addresses a check failure in the `ResourceSparseApplyKerasMomentum` operation by proposing a validation check to ensure that the `var` or `accum` arguments are of dtype float32, potentially resolving issue #63720.
- Merged: No
- Associated Commits: 79c14a428d8255733699b423dd58f3fcceb46ca4
2. Unregister complex dtypes for Round OP: This pull request addresses an issue with the Round operation in TensorFlow by unregistering support for complex data types and certain integer types (int8 and int16), aligning the operation's registration to only support float, int32, and int64 types as per the source code, and potentially resolving issue #65317.
- Merged: No
- Associated Commits: 822db8ac37439d258ef4e0ad3364667e96f97f44, b8b56d2422cded257d37e86b7242534fd94f6468
3. [NFC][ROCM] Replaced DoMatmul with ExecuteOnStream call for gpu_blas_lt: This pull request refactors the ROCM platform's gpu_blas_lt interface in TensorFlow by replacing the DoMatmul function with the more robust ExecuteOnStream call, which automatically handles valid data type combinations, thereby simplifying the interface and improving support for hipblas-lt.
- Merged: Yes
- Associated Commits: e20ec8d81ee4084cd562e4df35c0cfccfaf8417e, fe9524daec1ab517e6730bf755ff6173a4da1dc9
Other Closed Pull Requests
- Real-Time Optimization in TensorFlow Grappler: The pull request introduces a Real-Time Optimization feature to TensorFlow's Grappler optimizers, enhancing model performance by dynamically adjusting graph optimizations based on runtime metrics. It integrates a new `RealTimeOptimizer` class with the existing meta optimizer framework, adding strategies triggered by real-time performance metrics like memory usage and computation time. Comprehensive unit tests ensure the new optimizer's functionality and stability within the TensorFlow framework.
- Symbol Export Fix for Windows: This pull request resolves an issue with compiling TensorFlow Lite on Windows platforms by correcting symbol exports in shared libraries. The problem was due to the `PUBLIC` keyword in the `CMakeLists.txt` file, affecting compiler definitions for both static and shared libraries. The solution involves modifying the keyword to prevent inheritance of definitions, ensuring proper symbol exportation in `tensorflowlite_c.dll`.
- Documentation Improvements for TensorFlow Eager Monitoring: Multiple pull requests focus on enhancing the documentation for the TensorFlow Eager Monitoring Counter API bindings. These updates provide detailed information on functions for creating a new counter reader and reading counter values in Python, both with and without labels. The improved documentation aims to aid in TensorFlow performance analysis and issue diagnostics, supporting developers in effectively utilizing the API.
- Fixing Broken Links in Documentation: This pull request addresses the issue of eight broken links in the `best_practices.md` file within the TensorFlow Lite performance documentation. The author updated these links to ensure they are functional, improving the accessibility and usability of the documentation. The changes have been reviewed and successfully merged into the main codebase.
- Update to .bazelignore File: A pull request was made to update the `.bazelignore` file in the TensorFlow repository. Although the commit message is brief, it indicates changes were made to this file, which is used to specify files and directories that Bazel should ignore. The pull request is currently open and has not yet been merged.
3.3 Pull Request Discussion Insights
This section will analyze the tone and sentiment of discussions within this project's open and closed pull requests that occurred within the past week. It aims to identify potentially heated exchanges and to maintain a constructive project environment.
Based on our analysis, there are no instances of toxic discussions in the project's open pull requests from the past week.
IV. Contributors
4.1 Contributors
Active Contributors:
We consider an active contributor in this project to be any contributor who has made at least 1 commit, opened at least 1 issue, created at least 1 pull request, or made more than 2 comments in the last month.
If there are more than 10 active contributors, the list is truncated to the top 10 based on contribution metrics for better clarity.
Contributor | Commits | Pull Requests | Issues | Comments |
---|---|---|---|---|
mihaimaruseac | 1 | 0 | 0 | 63 |
Venkat6871 | 1 | 1 | 0 | 47 |
gaikwadrahul8 | 5 | 5 | 0 | 33 |
tilakrayal | 0 | 0 | 0 | 26 |
alekstheod | 13 | 1 | 0 | 0 |
dnmaster1 | 0 | 0 | 2 | 9 |
pkgoogle | 0 | 0 | 0 | 10 |
mraunak | 0 | 0 | 0 | 9 |
muayyad-alsadi | 0 | 0 | 0 | 9 |
NexusHex | 0 | 0 | 0 | 8 |
Access last week's newsletter: https://buttondown.com/weekly-project-news/archive/weekly-github-report-for-tensorflow-january-01/