Weekly GitHub Report for Tensorflow - 2024-12-16 12:00:01
Thank you for subscribing to our weekly newsletter! Each week, we deliver a comprehensive summary of your GitHub project's latest activity right to your inbox, including an overview of your project's issues, pull requests, contributors, and commit activity.
I. News
1.1 Recent Version Releases:
The current version of this repository is v2.18.0
1.2 Other Noteworthy Updates:
II. Issues
2.1 Top 5 Active Issues:
We consider active issues to be issues that have been commented on most frequently within the last week.
- Division by zero error at random places if GPU is used: This issue involves a division by zero error occurring randomly when a program using TensorFlow's C API is executed on a GPU, specifically a Quadro RTX 6000, while it runs without issues on a CPU. The problem is challenging to diagnose as it appears to be related to deep interactions between TensorFlow and CUDA libraries, with inconsistent behavior even when the program is run multiple times under the same conditions.
  - The comments discuss potential collaboration to resolve the issue, with one user offering to contribute and suggesting a screen-sharing session to better understand the problem. Another user suggests that the issue might be due to incompatible software versions and recommends updating them according to TensorFlow's documentation. There is also a discussion about scheduling a meeting to further investigate the issue, and an attempt to update CUDA libraries, which led to additional compatibility issues.
  - Number of comments this week: None
- InvalidArgumentError when using MirroredStrategy but not with tf.distribute.get_strategy(): This issue involves an InvalidArgumentError encountered when using TensorFlow's MirroredStrategy, which does not occur when using tf.distribute.get_strategy(). The error arises specifically when deploying the code on multiple GPUs, and the user is seeking a solution to ensure the code runs without errors under MirroredStrategy, similar to its behavior with tf.distribute.get_strategy().
  - The comments reveal that the issue is reproducible with specific versions of TensorFlow and Keras, and a temporary workaround involving downgrading to older versions of these libraries resolves the problem. The user confirms the workaround's effectiveness and inquires about a permanent fix in future releases, to which the support team responds affirmatively, indicating plans to address the issue in upcoming updates.
  - Number of comments this week: None
- error: defining a type within 'offsetof' is a Clang extension [-Werror,-Wgnu-offsetof-extensions]: This issue involves a user encountering a build error while attempting to compile the Selective Framework for iOS using TensorFlow Lite, specifically related to a Clang extension warning about defining a type within 'offsetof'. The error occurs during the compilation of the 'upb.c' file, and the user is seeking assistance to resolve this problem as they are new to the process.
  - The comments section includes a request for more detailed information about the user's environment and the exact steps they followed, including TensorFlow version and environment details. The user provides some additional steps they took, and another commenter suggests trying a specific Bazel flag to bypass the warning. The conversation reflects ongoing troubleshooting efforts, with apologies for delayed responses and requests for further details to help replicate and resolve the issue.
  - Number of comments this week: None
- The warning "The structure of inputs doesn't match the expected structure" when training a functional model: This issue involves a warning message encountered when training a functional model in TensorFlow, indicating a mismatch between the structure of the inputs and the expected structure, which raises concerns about the model's functionality. The user has identified that the warning persists regardless of the data format used and has traced the issue to a specific line in the Keras library, seeking clarification and resolution.
  - The comments discuss the cause of the warning, which is due to a mismatch between the input structure and TensorFlow's expectations, and provide a code solution to avoid the warning by ensuring inputs are correctly structured. The conversation also highlights that using tuples instead of lists can resolve the issue, but saving and reloading the model reintroduces the warning. A user confirms that the suggested fix works, while another user reports not encountering the warning in a different environment, prompting further discussion about environmental differences.
  - Number of comments this week: None
- Tflite x86 lib and dll for windows: This issue is about a user attempting to build an x86 library for TensorFlow Lite on Windows, but encountering difficulties and requesting assistance in obtaining the necessary x86 library and DLL files. The user has tried following existing guides and suggestions but has not been successful, and is seeking further guidance or a direct provision of the files.
  - The comments section involves a request for more details on the steps and environment used by the user, followed by suggestions to use official documentation and specific CMake commands. The user reports build failures and requests the library and DLL files directly. A responder notes that Bazel does not support x86 builds on Windows and suggests using CMake, referencing a similar issue where a user succeeded. They mention the lack of official x86 builds and recommend building from source.
  - Number of comments this week: None
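The list-versus-tuple point raised in the input-structure warning discussion above can be pictured with a small plain-Python sketch. It is only an illustration of how a strict structure comparison might tell a list from a tuple with the same contents; the helper names `structure_signature` and `same_structure` are hypothetical and this is not the check Keras actually performs.

```python
def structure_signature(obj):
    """Describe the nesting of lists/tuples, keeping the container type.

    Hypothetical helper for illustration only; anything that is not a
    list or tuple is treated as a leaf.
    """
    if isinstance(obj, (list, tuple)):
        return (type(obj).__name__,
                tuple(structure_signature(item) for item in obj))
    return "leaf"


def same_structure(actual, expected):
    """Return True only if nesting AND container types match."""
    return structure_signature(actual) == structure_signature(expected)


expected_inputs = ("image", "label")       # structure declared as a tuple
as_list = ["image_batch", "label_batch"]   # same length, but a list
as_tuple = ("image_batch", "label_batch")  # same length and same type

print(same_structure(as_list, expected_inputs))
print(same_structure(as_tuple, expected_inputs))
```

Under this assumption, data supplied as a list would be flagged while the equivalent tuple would pass, which is consistent with the workaround reported in the comments.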
2.2 Top 5 Stale Issues:
We consider stale issues to be issues that have had no activity within the last 30 days. The team should work together to get these issues resolved and closed as soon as possible.
As of our latest update, there are no stale issues for the project this week.
2.3 Open Issues
This section lists, groups, and then summarizes issues that were created within the last week in the repository.
Issues Opened This Week: 13
Summarized Issues:
- Linking and Compilation Errors: Issues related to linking and compilation errors are prevalent, affecting various platforms and configurations. One issue involves undefined symbol errors when linking an Android static library with TensorFlow Lite GPU using CMake on Ubuntu, hindering proper installation. Another issue highlights a compile error on Windows 11 due to a template declaration conflict in the LLVM project. Additionally, a compilation error occurs with ndk-build for a custom C++ file, indicating a problem with linking the NNAPI delegate in TensorFlow Lite.
- Python and Docker Compatibility: Ensuring compatibility with updated software versions is crucial for TensorFlow's functionality. An issue was raised to update the Python version in TensorFlow's Docker images from an outdated release to a stable one, aiming to enhance compatibility. Another issue points out an outdated link for NVIDIA Docker support on the TensorFlow Docker installation page, which should be updated to guide users correctly.
- Runtime and Execution Errors: Various runtime and execution errors have been reported, affecting TensorFlow's usability. A ValueError occurs in TensorFlow 2.18.0 when using a dataset with an unknown shape in the model.evaluate function, which was not present in earlier versions. A division by zero error is reported when using a GPU with TensorFlow's C API, suggesting potential incompatibilities with CUDA libraries. Additionally, an ImportError is encountered in PyCharm due to a DLL initialization error when running TensorFlow.
- Documentation and Warning Messages: Documentation inaccuracies and disruptive warning messages have been identified in TensorFlow. The tf.raw_ops.MaxPoolGradWithArgmax function documentation incorrectly states the supported data types for the argmax parameter, leading to user confusion. Additionally, a persistent warning message during the model.fit() operation disrupts the progress bar display, despite attempts to suppress it.
- ImageDataGenerator Parameter Swap: A bug in TensorFlow 2.18.0 affects the ImageDataGenerator parameters, causing unexpected behavior. The 'height_shift_range' and 'width_shift_range' parameters are swapped, leading to incorrect image transformations. This issue disrupts the intended functionality of image augmentation processes.
- Cross-Compile Toolchain Compatibility: Compatibility issues with the cross-compile toolchain for TensorFlow Lite on Raspberry Pi Zero have been reported. The provided toolchain is designed for armv7 architecture, while the Raspberry Pi Zero requires armv6, resulting in build failures. This mismatch leads to illegal instruction errors during the build process.
- TensorFlowLite Version Publication: There is a request for the publication of TensorFlowLite version 2.18.0 to Maven Central and CocoaPods Specs. This publication is necessary for Android and iOS platforms before transitioning to LiteRT, ensuring developers have access to the latest version.
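To make the ImageDataGenerator parameter swap summarized above more concrete, the following plain-Python sketch shows what a height shift and a width shift are each meant to do to a small grid. The `shift_image` helper is hypothetical and does not use TensorFlow; it only illustrates why swapping the two axes produces a visibly different result on a non-square image.

```python
def shift_image(image, dy=0, dx=0, fill=0):
    """Return a copy of a 2D grid moved dy rows down and dx columns right.

    Hypothetical helper for illustration; cells shifted outside the grid
    are dropped and vacated cells are set to `fill`.
    """
    height = len(image)
    width = len(image[0])
    shifted = [[fill] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            new_row, new_col = row + dy, col + dx
            if 0 <= new_row < height and 0 <= new_col < width:
                shifted[new_row][new_col] = image[row][col]
    return shifted


image = [
    [1, 2, 3],
    [4, 5, 6],
]

vertical = shift_image(image, dy=1)    # intended effect of a height shift
horizontal = shift_image(image, dx=1)  # intended effect of a width shift

print(vertical)
print(horizontal)
```

A check of this shape, applied to an image whose height and width differ, is one way a swapped pair of parameters could be detected.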
2.4 Closed Issues
This section lists, groups, and then summarizes issues that were closed within the last week in the repository. This section also links the associated pull requests if applicable.
Issues Closed This Week: 25
Summarized Issues:
- TensorFlow 2.17 GPU Crashes: Several issues have been reported with TensorFlow 2.17 where operations like Cholesky, MatrixDeterminant, LogMatrixDeterminant, RaggedTensorToVariantGradient, and TridiagonalSolve cause crashes with "Aborted (core dumped)" errors on GPUs. These problems occur when executed with empty input shapes or specific inputs, and they persist even in TensorFlow Nightly versions. Users have provided code snippets and error logs to demonstrate these crashes, highlighting the need for fixes in future updates.
- TensorFlow 2.17 Ubuntu Crashes: TensorFlow 2.17 and 2.16.1 on Ubuntu 20.04 have been reported to crash with operations like SparseTensorDenseMatMul, ParameterizedTruncatedNormal, and UnBatch under specific conditions. These crashes are often due to invalid input shape dimensions or specific input conditions, leading to "Aborted (core dumped)" errors. While some issues are resolved in TensorFlow 2.18.0, others persist, requiring further investigation.
- TensorFlow Lite Compatibility Issues: Users have encountered compatibility issues with TensorFlow Lite, such as the absence of certain versions on Maven and the incompatibility of operations like tf.where with TFLite Micro. These issues hinder developers from implementing specific versions in Android projects and converting models for TFLite Micro. Community input and project redirection are sought to address these challenges.
- Build and Compilation Errors: Several build and compilation errors have been reported, including a Gradle project build failure due to invalid parameter usage, a compiler warning in TensorFlow Lite's graph_info.h, and a Microsoft Visual C++ error related to STL changes. These issues require codebase patches or adjustments to resolve compatibility problems with different development environments.
- TensorFlow Performance and Precision: Users have reported discrepancies in numerical precision and unexpected GPU performance with TensorFlow. The tf.math.cumsum operation shows varying precision across platforms, while newer GPUs like the RTX 4060Ti do not consistently outperform older models like the GTX 1660Ti. These findings suggest the need for careful consideration of data types and hardware compatibility in TensorFlow applications.
- TensorFlow 2.14 and 2.15 Issues: TensorFlow 2.14 has a bug where zipped datasets with Tensors and RaggedTensors cause a TypeError in the .fit() method, while TensorFlowLiteC 2.15.0 significantly increases app size on iOS. These issues highlight the challenges of input type compatibility and efficient app deployment, prompting users to seek solutions without custom training loops or excessive app size.
- Miscellaneous Issues: Other issues include a request for limiting GPU memory usage in TensorFlow Lite, a PEP8 violation in training.py, a heap-buffer-overflow bug in ConjugateTranspose, and a spam entry on GitHub. These diverse issues range from feature requests and coding standards to security concerns, reflecting the wide array of challenges faced by the TensorFlow community.
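The remark above about data types and cumulative-sum precision can be illustrated without TensorFlow. The sketch below is a plain-Python approximation, not the tf.math.cumsum implementation: it emulates 32-bit accumulation by rounding every partial sum through the standard `struct` module and compares it with ordinary 64-bit accumulation. The helper names are hypothetical.

```python
import itertools
import struct


def to_float32(value):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", value))[0]


def cumsum_float64(values):
    """Running sum using ordinary 64-bit Python floats."""
    return list(itertools.accumulate(values))


def cumsum_float32(values):
    """Running sum where each partial result is rounded to 32 bits."""
    results = []
    total = 0.0
    for value in values:
        total = to_float32(total + to_float32(value))
        results.append(total)
    return results


values = [0.1] * 1000
last_32 = cumsum_float32(values)[-1]
last_64 = cumsum_float64(values)[-1]

# The two running totals drift apart as rounding error accumulates.
print(last_32, last_64, abs(last_32 - last_64))
```

The same inputs give different final totals depending on the accumulation width, which is one plausible source of platform-to-platform differences of the kind reported; the actual cause in the reported issue may differ.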