Weekly GitHub Report for Tensorflow - 2024-11-25 12:00:52
Thank you for subscribing to our weekly newsletter! Each week, we deliver a comprehensive summary of your GitHub project's latest activity right to your inbox, including an overview of your project's issues, pull requests, contributors, and commit activity.
I. News
1.1 Recent Version Releases:
The current version of this repository is v2.18.0
1.2 Other Noteworthy Updates:
II. Issues
2.1 Top 5 Active Issues:
We consider active issues to be issues that have been commented on most frequently within the last week.
- Tensorflow Import Not working: This issue involves a user experiencing difficulties importing TensorFlow on a Windows 11 system, despite following various troubleshooting steps such as reinstalling Python and TensorFlow, and trying different Python versions. The user consistently encounters a DLL load failure error, indicating a problem with the dynamic link library initialization when attempting to import TensorFlow.
- The comment section reveals a series of troubleshooting attempts, including suggestions to use different TensorFlow versions, create new virtual environments, and reinstall packages. Despite these efforts, the user continues to face the same import error, leading to discussions about potential machine-specific issues, such as processor compatibility. The conversation also involves sharing environment details and logs to facilitate further investigation, with the user eventually providing a zip file of their virtual environment for more in-depth analysis by the support team.
- Number of comments this week: None
- GPU MaxPool gradient ops do not yet have a deterministic XLA implementation: This issue is about the lack of a deterministic XLA implementation for GPU MaxPool gradient operations in TensorFlow, which causes runtime exceptions when attempting to ensure deterministic behavior during model training. The problem persists even with the latest TensorFlow and Keras versions, and users are seeking a solution to achieve reproducible results without encountering errors.
- The comments discuss the issue of deterministic operations in TensorFlow, particularly with MaxPooling layers, and users share their experiences and attempts to resolve the problem. Some users suggest disabling XLA by setting `jit_compile=False` in the model's compile method, which resolves the error for some. Others report encountering additional issues when trying to implement this solution, such as errors related to symbolic tensors. The conversation includes requests for updates on the issue and suggestions for potential workarounds.
- Number of comments this week: None
- JIT compliation failed: This issue involves a bug where TensorFlow code runs successfully on a CPU but fails when executed on a GPU, specifically with a Radeon 7900XT, due to a JIT compilation error. The user is using TensorFlow version 2.16.1 on a Linux Ubuntu 24.04 system with custom code, and the problem seems to be related to the ROCm version compatibility with TensorFlow.
- The comments discuss troubleshooting steps, including attempts to replicate the issue on Colab, which did not reproduce the error, and suggestions to try different TensorFlow versions. The user identifies that the problem occurs when using the "Run File" button in PyCharm but not when running the code via the terminal, indicating a potential issue with PyCharm's run configuration. The user also mentions that their ROCm version is only compatible with specific TensorFlow versions, and another user reports experiencing a similar issue with a similar setup.
- Number of comments this week: None
- TensorFlow Stable Delegate Python API: This issue is about the lack of support for running stable delegates using the TensorFlow Python API, with the user inquiring about possible workarounds or future plans for support. The user has provided a code snippet demonstrating the current behavior and is seeking guidance on whether the feature will be implemented in the future.
- The comments reveal that the TensorFlow Python API does not currently support stable delegates, although the C++ API does. The stable delegate API is no longer experimental but is only supported in C, C++, and Java APIs. There is a suggestion that adding support for the Python API would be beneficial, and a contributor has expressed interest in working on this feature. The discussion includes a call for collaboration and hints that a patch could facilitate the process.
- Number of comments this week: None
- tflite int8 export is twice as large as saved_model.pb: This issue is about a discrepancy in file size when exporting a TensorFlow Lite model in int8 format, which results in a file that is twice as large as the original saved_model.pb. The user provides system information, code snippets, and a screenshot to illustrate the problem and seeks assistance in resolving the unexpected increase in file size.
- The comments involve a request for additional resources to replicate the issue, including the saved model and a Google Colab notebook. The user responds by sharing links to the saved model and the PyTorch file used to create it, along with code for exporting the model. There is a follow-up request for access to the shared files, which the user then grants.
- Number of comments this week: None
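The `jit_compile=False` workaround mentioned in the MaxPool discussion can be sketched as follows. This is a minimal illustration, not the reporters' code: the tiny model architecture, the `COMPILE_KWARGS` constant, and the `build_and_compile_model` helper are assumptions made for the example, and the thread itself notes that the workaround only helps some users.

```python
# Keyword arguments for Keras `Model.compile`; `jit_compile=False` turns XLA off.
COMPILE_KWARGS = {"jit_compile": False}


def build_and_compile_model():
    """Build a tiny CNN with a MaxPooling layer and compile it without XLA.

    The architecture is hypothetical and exists only to exercise MaxPool.
    """
    import tensorflow as tf  # imported lazily so this file loads without TF

    tf.keras.utils.set_random_seed(0)               # seed Python, NumPy, and TF
    tf.config.experimental.enable_op_determinism()  # request deterministic ops

    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(28, 28, 1)),
        tf.keras.layers.Conv2D(8, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        **COMPILE_KWARGS,  # keep XLA disabled for this model
    )
    return model
```

Calling `build_and_compile_model()` in an environment with TensorFlow installed returns a compiled model that can be passed to `fit` as usual. Whether this avoids the runtime exception on a given setup is not guaranteed.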
2.2 Top 5 Stale Issues:
We consider stale issues to be issues that have had no activity within the last 30 days. The team should work together to get these issues resolved and closed as soon as possible.
As of our latest update, there are no stale issues for the project this week.
2.3 Open Issues
This section lists, groups, and then summarizes issues that were created within the last week in the repository.
Issues Opened This Week: 28
Summarized Issues:
- Installation and Compatibility Issues: Users are facing various installation and compatibility issues with TensorFlow, particularly on Windows and specific hardware setups. One user is unable to install the TensorFlow GPU version on Windows 11 Home, while another encounters an ImportError in Jupyter Lab due to potential compatibility issues. Additionally, there are challenges with TensorFlow not detecting a GPU in a Docker environment on Ubuntu 20.04, likely due to CUDA driver mismatches.
- Compilation and Build Failures: Several issues involve compilation and build failures with TensorFlow, often related to specific configurations or environments. A user reports a build failure when compiling TensorFlow with CUDA support on a Jetson Orin Nano, while another faces a problem with TensorFlow Lite library compilation due to incorrect regular expressions in CMakeLists.txt. Additionally, there are difficulties in building and preloading the `tfrt_session` due to missing symbols.
- Bugs in TensorFlow Operations: Multiple bugs have been reported in TensorFlow operations, particularly in version 2.17, causing crashes with "Aborted (core dumped)" errors. These issues occur with operations like `tf.raw_ops.Cholesky`, `tf.raw_ops.MatrixDeterminant`, and `tf.raw_ops.MatrixInverse` when executed with empty input shapes on systems with available GPUs. The problems are demonstrated with standalone code and log outputs.
- issues/tensorflow/tensorflow/issues/80312
- issues/tensorflow/tensorflow/issues/80315
- issues/tensorflow/tensorflow/issues/80316
- issues/tensorflow/tensorflow/issues/80331
- issues/tensorflow/tensorflow/issues/80332
- issues/tensorflow/tensorflow/issues/80334
- issues/tensorflow/tensorflow/issues/80528
- issues/tensorflow/tensorflow/issues/80529
- Performance and Efficiency Concerns: Users have raised concerns about performance inefficiencies in TensorFlow, particularly on TPUs. The `tf.gather` function is reported to be significantly slower on TPUs compared to GPUs during model training, prompting users to seek solutions to enhance performance. Additionally, there are questions about the necessity of `num_parallel_calls` in the `tf.data` API for optimization.
- Documentation and Usability Issues: Several issues highlight the need for improved documentation and usability in TensorFlow. Users have pointed out discrepancies in the TensorFlow Lite Interpreter's documentation regarding `num_threads` settings and the lack of explanation for `tf.py_function()` in the `tf.data` API guide. Additionally, there are suggestions for refactoring large functions in Keras model code to improve readability.
- Model and Data Handling Issues: Users are experiencing issues related to model and data handling in TensorFlow. One user reports a bug in the `model.fit` function when training on extremely large NumPy arrays, while another encounters a discrepancy in the size of an exported TensorFlow Lite model. Additionally, there are concerns about the `tfl.quantize` operation's support for QI4 data types.
- Error Handling and Debugging: Users are encountering various errors and seeking guidance on debugging and error handling in TensorFlow. Issues include a "Could not find variable" error when loading a SavedModelBundle in Java, and a floating point exception in the `tf.raw_ops.Reshape` operation. There are also challenges with resolving PATH environment variable conflicts during installation.
- Miscellaneous Issues: Other issues include a PEP8 violation in TensorFlow's `training.py` file, a compiler warning in TensorFlow Lite's `graph_info.h` file, and a problem with blank waiting times in TensorBoard's Trace Viewer on ARM machines. These issues highlight the need for attention to coding standards, compiler warnings, and performance analysis tools.
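For the crashes reported above with empty input shapes, a caller-side guard is one possible stopgap until the operations themselves are fixed. The sketch below is an assumption-laden illustration rather than a fix from the TensorFlow team: `is_empty_shape` and `guarded_call` are hypothetical helper names, and the guard simply refuses to call the op when any statically known dimension is zero.

```python
from typing import Any, Callable, Optional, Sequence


def is_empty_shape(shape: Sequence[Optional[int]]) -> bool:
    """Return True when any statically known dimension is zero.

    Unknown dimensions (None) are not treated as empty.
    """
    return any(dim == 0 for dim in shape)


def guarded_call(op: Callable[..., Any],
                 shape: Sequence[Optional[int]],
                 **kwargs: Any) -> Any:
    """Call `op(**kwargs)` only if `shape` has no zero-sized dimension."""
    if is_empty_shape(shape):
        # Raise a Python exception instead of risking a process abort.
        raise ValueError(f"refusing to call op on empty input shape {tuple(shape)}")
    return op(**kwargs)
```

With TensorFlow installed, a call might look like `guarded_call(tf.raw_ops.Cholesky, x.shape, input=x)`, which raises a catchable `ValueError` for an empty `x` instead of reaching the kernel.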
2.4 Closed Issues
This section lists, groups, and then summarizes issues that were closed within the last week in the repository. This section also links the associated pull requests if applicable.
Issues Closed This Week: 17
Summarized Issues:
- Apple Silicon GPU Support Issues: Users have reported issues with utilizing Apple Silicon GPU support in TensorFlow version 2.16.1 on macOS 14.3.1, which was functional in version 2.15.0. A temporary workaround involves switching from the default Keras 3 backend to Keras 2 to restore GPU functionality. This highlights the need for consistent support across TensorFlow versions for Apple Silicon users.
- TensorFlow Model Compatibility and Errors: Several issues have been reported regarding TensorFlow model compatibility and errors across different versions. Users have encountered errors such as "No OpKernel was registered to support Op 'BoostedTreesBucketize'" due to deprecated operations, and runtime errors like 'CUDA_ERROR_INVALID_HANDLE' on Linux platforms. These issues suggest a need for better backward compatibility and error handling in TensorFlow updates.
- TensorFlow Lite and Quantization Concerns: Users have raised concerns about TensorFlow Lite's support for int8 quantization and the precision loss associated with post-training quantization. There are inquiries about quantizing specific layers and significant accuracy loss in internal state models when using int8 inference. These issues highlight the need for improved quantization support and documentation in TensorFlow Lite.
- TensorFlow Installation and Dependency Issues: Users have experienced difficulties with TensorFlow installation and dependency management, such as rollback issues with tensorflow-text and errors during source builds on Linux systems. These problems emphasize the importance of clear documentation and compatibility checks for TensorFlow and its dependencies.
- TensorFlow Runtime and Execution Errors: There are reports of runtime and execution errors in TensorFlow, including DLL initialization errors and unexpected tensor operation assignments on GPUs. These issues suggest a need for improved error diagnostics and handling in TensorFlow's runtime environment.
- TensorFlow Feature Requests and Improvements: Users have requested features such as warnings for mixed data types in model layers and better linking of the Flex library on Android. These requests indicate areas where TensorFlow can enhance user experience through improved features and documentation.
- TensorFlow Web and CORS Policy Issues: Integration of TensorFlow models with web applications has been hindered by CORS policy errors, blocking access to model files. Discussions include potential solutions like hosting models on personal servers, highlighting the need for better web integration support in TensorFlow.
- TensorFlow GPU Utilization Challenges: Users have faced challenges in utilizing GPUs for TensorFlow model inference, particularly with quantized models and compatibility issues with Bazel and TensorFlow versions. These challenges underscore the need for streamlined GPU support and compatibility across different platforms and configurations.
- TensorFlow Logging and Warning Suppression: There are issues with TensorFlow version 2.17.0 where runtime warning messages persist despite attempts to suppress them. This suggests that certain TensorFlow modules might override logging configurations, indicating a need for more robust logging control mechanisms.
- TensorFlow Library and Module Errors: Users have encountered errors such as `ModuleNotFoundError` for libraries like `h5py` during TensorFlow operations. These errors highlight the importance of ensuring proper library installations and compatibility checks in TensorFlow environments.
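A `ModuleNotFoundError` like the `h5py` one above can be caught before any TensorFlow code runs by checking which optional packages are importable. The following is a minimal sketch using only the standard library; the `missing_modules` helper name is hypothetical, and it handles top-level module names only.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found.

    `importlib.util.find_spec` returns None for a top-level module that is
    not installed, without actually importing it.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]
```

For example, `missing_modules(["h5py"])` returns `["h5py"]` in an environment where the package is not installed and an empty list where it is, which makes it easy to print an actionable message before starting a long job.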