Weekly GitHub Report for Tensorflow: December 07, 2024 - December 14, 2024
Thank you for subscribing to our weekly newsletter! Each week, we deliver a comprehensive summary of your GitHub project's latest activity right to your inbox, including an overview of your project's issues, pull requests, contributors, and commit activity.
I. News
1.1 Recent Version Releases:
The current version of this repository is v2.18.0
1.2 Other Noteworthy Updates:
II. Issues
2.1 Top 5 Active Issues:
We consider active issues to be the issues that received the most comments within the last week.
- Division by zero error at random places if GPU is used: This issue involves a division by zero error occurring randomly when a program using TensorFlow's C API is executed on a GPU, specifically a Quadro RTX 6000, while it runs without issues on a CPU. The problem is challenging to diagnose as it appears to be related to deep interactions between TensorFlow and CUDA libraries, with inconsistent behavior even when the program is run multiple times under the same conditions.
  - The comments discuss potential collaboration to resolve the issue, with one user offering to contribute and suggesting a screen-sharing session to better understand the problem. Another user suggests that the issue might be due to incompatible software versions and recommends updating them according to TensorFlow's documentation. There is also a discussion about scheduling a meeting to further investigate the issue, and an attempt to update CUDA libraries, which led to additional compatibility problems.
  - Number of comments this week: 6
- InvalidArgumentError when using MirroredStrategy but not with tf.distribute.get_strategy(): This issue involves an InvalidArgumentError encountered when using TensorFlow's MirroredStrategy, which does not occur when using tf.distribute.get_strategy(). The error arises specifically when deploying the code on multiple GPUs, and the user is seeking a solution to ensure the code runs without errors under MirroredStrategy, similar to its behavior with tf.distribute.get_strategy().
  - The comments discuss attempts to reproduce the issue, with some users unable to replicate the error on different hardware setups. The original poster clarifies the error occurs with specific package versions and hardware, leading to a suggested workaround of using older versions of TensorFlow and Keras, which resolves the issue temporarily. The discussion concludes with a plan to investigate further for a permanent fix in future releases.
  - Number of comments this week: 5
- error: defining a type within 'offsetof' is a Clang extension [-Werror,-Wgnu-offsetof-extensions]: This issue involves a user encountering a build error while attempting to compile the Selective Framework for iOS using TensorFlow Lite, specifically related to a Clang extension warning about defining a type within 'offsetof'. The error occurs during the compilation of the 'upb.c' file, and the user is seeking assistance to resolve this problem as the build fails to complete successfully.
  - The comments section includes a request for more detailed information about the steps and environment used by the user, including TensorFlow version and environment details. The user provides some additional steps they followed, and another commenter suggests trying a specific Bazel flag to bypass the warning. The conversation reflects ongoing troubleshooting efforts, with requests for further details to help replicate and resolve the issue.
  - Number of comments this week: 5
- The warning "The structure of `inputs` doesn't match the expected structure" when training a functional model: This issue is about a warning message that appears when training a functional model in TensorFlow, indicating a mismatch between the structure of the inputs and the expected structure, which causes uncertainty about the model's functionality. The user has identified that the warning persists regardless of the data type used and has traced the source of the issue to a specific line in the Keras library, but the training continues without apparent errors.
  - The comments discuss the cause of the warning, which is due to a mismatch between the input structure and what TensorFlow expects, and suggest solutions such as ensuring input data types match Keras expectations. A user points out that using tuples instead of lists can resolve the issue, but the problem reappears when the model is saved and loaded. Another user confirms that the suggested fix resolves the warning for them, while a different user reports not encountering the warning in their environment, indicating potential environment-specific behavior.
  - Number of comments this week: 5
- Tflite x86 lib and dll for windows: This issue is about a user attempting to build an x86 library for TensorFlow Lite on Windows, but encountering difficulties and requesting assistance in obtaining the necessary x86 library and DLL files. The user has tried following existing guides and suggestions but has not been successful, and is seeking further guidance or a direct provision of the files.
  - The comments section involves a request for more details on the steps and environment used by the user, followed by suggestions to use CMake with specific commands. The user reports an error during the build process, and there is a discussion about the lack of official support for x86 builds on Windows, with a suggestion to try building from a specific commit. The conversation highlights the challenges of building TensorFlow Lite for x86 on Windows and the need for community or unofficial solutions.
  - Number of comments this week: 4
2.2 Top 5 Stale Issues:
We consider stale issues to be issues that have had no activity within the last 30 days. The team should work together to get these issues resolved and closed as soon as possible.
As of our latest update, there are no stale issues for the project this week.
2.3 Open Issues
This section lists, groups, and then summarizes issues that were created within the last week in the repository.
Issues Opened This Week: 14
Summarized Issues:
- TensorFlow Lite and Android Integration Issues: Several issues have been reported regarding the integration of TensorFlow Lite with Android applications. One issue involves an unexpected failure during tensor allocations due to tensor dimension mismatches in the PAD operation. Another issue highlights undefined symbol errors when linking an Android static library with TensorFlow Lite's GPU delegate using CMake. Additionally, there is a compilation error related to an undefined symbol in the TensorFlow Lite library when using ndk-build for a custom C++ file.
- TensorFlow 2.18.0 Functionality and Compatibility Problems: Users have encountered various issues with TensorFlow 2.18.0, including a `ValueError` when using datasets with unknown shapes in the `model.evaluate` function. Another problem involves the 'height_shift_range' and 'width_shift_range' parameters being swapped in the ImageDataGenerator. Additionally, there is a request for the publication of TensorFlowLite version 2.18.0 to Maven Central and CocoaPods Specs.
- TensorFlow and PyCharm Integration Errors: Users have reported errors when running TensorFlow code in PyCharm, despite the code executing correctly. One issue involves PyCharm indicating that certain TensorFlow functions do not exist. Another issue is an ImportError due to a failure in loading the native TensorFlow runtime caused by a DLL initialization error.
- TensorFlow Compilation and Build Errors: Compilation errors have been reported when building TensorFlow from source on different platforms. One issue involves a non-class template declaration conflict in the `TrieRawHashMap.cpp` file on Windows 11. Another issue highlights a compatibility problem with the cross-compilation toolchain for building TensorFlow Lite on a Raspberry Pi Zero.
- TensorFlow GPU and CUDA Compatibility Issues: A division by zero error occurs randomly when using a GPU with TensorFlow's C API, specifically on a Quadro RTX 6000. This issue does not occur when the program is run on a CPU, suggesting potential incompatibilities between TensorFlow, CUDA libraries, and the GPU driver versions.
- TensorFlow Documentation and Warning Messages: There are issues related to TensorFlow's documentation and warning messages. One issue highlights a documentation bug in the `tf.raw_ops.MaxPoolGradWithArgmax` function regarding the `argmax` parameter's supported data types. Another issue describes an annoying warning message that disrupts the readability of the console output during the `model.fit()` operation.
- TensorFlow Docker Image Updates: An issue has been raised about updating the Python version in TensorFlow's Docker images from the outdated `3.11.0rc1` to the stable `3.11.x` release. This update is necessary to ensure compatibility and improved functionality.
2.4 Closed Issues
This section lists, groups, and then summarizes issues that were closed within the last week in the repository. This section also links the associated pull requests if applicable.
Issues Closed This Week: 25
Summarized Issues:
- TensorFlow 2.17 Bugs and Crashes: Several issues have been reported in TensorFlow version 2.17, where various operations such as `tf.raw_ops.Cholesky`, `tf.raw_ops.MatrixDeterminant`, and `tf.raw_ops.RaggedTensorToVariantGradient` cause crashes with "Aborted (core dumped)" errors when executed with specific inputs on a GPU. These problems have been confirmed to occur in TensorFlow Nightly as well, and workarounds have been suggested for some of them. The issues highlight the need for careful handling of input shapes and conditions to prevent such crashes.
- TensorFlow 2.17 and 2.16.1 Crashes: TensorFlow versions 2.17.0 and 2.16.1 have been reported to crash under specific conditions, such as using the `ConjugateTranspose` operation or the `ParameterizedTruncatedNormal` operation on Ubuntu 20.04. These crashes are often due to invalid input shapes or conditions, leading to heap-buffer-overflow or core dump errors. The issues have been resolved in TensorFlow 2.18.0 and nightly versions, indicating improvements in handling these operations.
- TensorFlow Lite and Android Compatibility Issues: There are several issues related to TensorFlow Lite on Android, including the absence of certain versions on Maven and execution failures due to unsupported opcodes. These issues create challenges for developers trying to implement or execute models on Android devices, often requiring updates or changes to the TensorFlow Lite version used. Community input and updates to the repository are sought to address these compatibility problems.
- Build Failures and Compiler Errors: Multiple issues have been reported regarding build failures and compiler errors in TensorFlow projects, often related to Gradle or MSVC. These issues include incorrect use of annotations or changes in the Microsoft Standard Library, leading to build failures or compiler warnings. Solutions often involve updating or downgrading tools or applying patches to the codebase.
- Numerical Precision and Performance Discrepancies: Issues have been raised about numerical precision discrepancies in TensorFlow operations like `tf.math.cumsum` and unexpected performance results when comparing different GPUs. These issues highlight the differences in precision across platforms and libraries, as well as the need for insights into hardware performance. Users seek solutions to achieve consistent precision and performance across different environments.
- TensorFlow 2.14 and 2.15 Issues: Users have reported issues with TensorFlow 2.14 and 2.15, including a bug in the `.fit()` method and significant app size increases on iOS. These issues involve incompatibilities and inefficiencies that users are trying to resolve without resorting to custom solutions. Community input and updates are sought to address these challenges.
- Miscellaneous Issues: Other issues include a request for TensorFlow Lite to limit GPU memory usage, a compiler warning in TensorFlow Lite, and a spam entry on a GitHub project. These issues vary in nature but highlight the diverse challenges and requests from the TensorFlow community. Solutions and updates are needed to address these varied concerns.
2.5 Issue Discussion Insights
This section will analyze the tone and sentiment of discussions within this project's open and closed issues that occurred within the past week. It aims to identify potentially heated exchanges and to maintain a constructive project environment.
Based on our analysis, there are no instances of toxic discussions in the project's open issues from the past week.
III. Pull Requests
3.1 Open Pull Requests
This section lists and summarizes pull requests that were created within the last week in the repository.
Pull Requests Opened This Week: 5
Pull Requests:
Key Open Pull Requests
1. [[ROCM] Add nanoo fp8 data type to cast op](https://github.com/tensorflow/tensorflow/pull/82748)
This pull request introduces the nanoo fp8 data type to the cast operation within the ROCm platform, building upon previous work to enhance TensorFlow's capabilities. The primary purpose of this update is to enable fp8 acceleration in the gemm rewriter pass in XLA, which is crucial for optimizing matrix multiplication operations. The changes include the addition of support for the nanoo fp8 data type, allowing for efficient dequantization and computation in fp32 before performing matrix multiplication. This enhancement is expected to be fused into a hipblaslt call, improving performance in relevant computational workflows.
Associated Commits:
- cast op
2. [remove non-maintained tensorflow-io-gcs-filesystem dependency from pip_package](https://github.com/tensorflow/tensorflow/pull/82771)
The purpose of this pull request is to remove the non-maintained `tensorflow-io-gcs-filesystem` dependency from the `pip_package` in the TensorFlow project. This dependency has been problematic for over a year, with no viable solution in sight, as its wheels are not being built and it appears to be poorly maintained. TensorFlow only relies on this dependency for optional purposes, as indicated by its conditional inclusion in a previous pull request. The community has been actively seeking solutions to the issues caused by this dependency, but without success. Removing it is expected to enhance TensorFlow's multiplatform support and reliability, addressing concerns raised in various community discussions and issue reports.
Associated Commits:
- [Update setup.py remove non-maintained tensorflow-...](https://github.com/tensorflow/tensorflow/commit/f70aa7891715212f2b1f872ce334fec0d913fa31)
3. [fix to export symbol correctly on shared library for windows ( fix to bug )](https://github.com/tensorflow/tensorflow/pull/82798)
This pull request addresses a bug related to the incorrect export of symbols in shared libraries on Windows platforms when compiling without Bazel or MSYS. The issue arises from the use of the PUBLIC keyword in the compiler definition within the `tensorflow/lite/CMakeLists.txt` file, which inadvertently affects the `tensorflow/lite/c/CMakeLists.txt` compile definition. This results in a conflict where the intended SHARED library is incorrectly compiled as a STATIC library, leading to missing symbols in the `tensorflowlite_c.dll`. The proposed fix involves modifying the keyword to prevent inheritance of definitions to other libraries or adjusting the condition statement in the `c_api_types.h` file. More detailed information about the error can be found in the provided error history link.
Associated Commits:
- fix to export symbol correctly on shared library f...
Other Open Pull Requests
- This pull request addresses the issue of two broken hyperlinks in the `audio_classifier.md` documentation file by updating them to functional links, specifically for the TensorFlow Lite Model Maker for Audio Classification and AudioRecord, and requests a review and merge of these changes. Fix 02 broken links in audio_classifier.md
- This pull request addresses the issue of two broken hyperlinks in the `bert_nl_classifier.md` file by updating them to functional links, specifically for the TensorFlow Lite Model Maker for text classification and MobileBert documentation. Fix 02 broken links in bert_nl_classifier.md
3.2 Closed Pull Requests
This section lists and summarizes pull requests that were closed within the last week in the repository. Similar pull requests are grouped, and associated commits are linked if applicable.
Pull Requests Closed This Week: 13
Pull Requests:
Key Closed Pull Requests
1. [Update multinomial_op logits invalid arguments check description](https://github.com/tensorflow/tensorflow/pull/64651)
This pull request aims to enhance the error handling and messaging within the multinomial operation by updating the `InvalidArgument` descriptions. The changes ensure that overflow checks are applied independently to the logits dimensions, specifically `batch_size` and `num_classes`, rather than just the overall `logits.shape`. Additionally, the update incorporates the use of `absl::InvalidArgumentError` for constructing error messages, aligning with reviewer suggestions for improved clarity and consistency.
Associated Commits:
- op_requires operation for each logits index
- rm dim_batch_size and dim_num_classes names
- use absl::InvalidArgumentError in error messages
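The per-dimension check described for the multinomial pull request can be illustrated with a minimal Python sketch. The helper name `check_logits_dims`, the 32-bit bound `INT32_MAX`, and the error text are assumptions made for illustration only; the actual change is in TensorFlow's C++ kernel and may differ in its limits and wording.

```python
INT32_MAX = 2**31 - 1  # assumed bound; the real kernel's limit may differ


def check_logits_dims(batch_size: int, num_classes: int) -> None:
    """Validate each logits dimension on its own (hypothetical helper)."""
    for name, dim in (("batch_size", batch_size), ("num_classes", num_classes)):
        if dim < 0 or dim > INT32_MAX:
            # Naming the offending dimension makes the error easier to act on
            # than a single check against the overall shape.
            raise ValueError(f"{name} = {dim} does not fit in a 32-bit int")


check_logits_dims(batch_size=8, num_classes=1000)  # passes silently

try:
    check_logits_dims(batch_size=2**31, num_classes=4)
    rejected = False
except ValueError as err:
    rejected = True
    message = str(err)
```

Checking each dimension separately lets the message point at the specific dimension that is too large.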
2. [Update pywrap_mlir.py](https://github.com/tensorflow/tensorflow/pull/77940)
This pull request focuses on enhancing the `pywrap_mlir.py` module by introducing several key improvements. It adds input validation for parameters such as `graphdef`, `pass_pipeline`, and `input_names` to ensure robustness. The documentation has been enriched with detailed docstrings that clarify the function arguments, return values, and potential exceptions. To improve cross-platform compatibility, file path handling has been updated to use `pathlib.Path` instead of string-based paths. The handling of default arguments for mutable lists, specifically `output_names`, has been refined. Additionally, logging functionality has been incorporated to provide debug information when `show_debug_info` is enabled. Lastly, the module now supports batch processing, allowing for the import of multiple `GraphDef` objects simultaneously.
Associated Commits:
- Update pywrap_mlir.py
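The summary does not show how the `output_names` default was refined, so the following is only a sketch of the general Python pitfall with mutable default arguments and the usual fix. The function names `add_output_shared` and `add_output_fresh` are hypothetical and do not come from `pywrap_mlir.py`.

```python
def add_output_shared(name, output_names=[]):
    # Pitfall: the default list is created once, at function definition time,
    # and is then shared by every call that omits output_names.
    output_names.append(name)
    return output_names


def add_output_fresh(name, output_names=None):
    # Common fix: use None as a sentinel and build a new list per call.
    if output_names is None:
        output_names = []
    output_names.append(name)
    return output_names


shared_first = list(add_output_shared("a"))
shared_second = list(add_output_shared("b"))  # still carries "a" from the first call
fresh_first = add_output_fresh("a")
fresh_second = add_output_fresh("b")  # independent of the first call
```

With the sentinel pattern, callers that omit the argument never see state left behind by earlier calls.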
3. [Error handling for missing variable nodes](https://github.com/tensorflow/tensorflow/pull/78042)
This pull request focuses on improving error handling and memory management within the TensorFlow codebase. It introduces enhanced error logging for scenarios involving missing nodes or variable issues, utilizing the LOG(ERROR) mechanism. Additionally, it optimizes memory handling by eliminating the use of a static variable, kVariableTypes, and replacing it with a local unordered_set. The update also includes modifications to ensure compatibility with modern versions of TensorFlow by fixing outdated API calls. These changes aim to enhance the robustness and maintainability of the code.
Associated Commits:
- Error handling for missing variable nodes
Other Closed Pull Requests
- This pull request involves adding a Keras import specifically for type checking to ensure that linting and syntax highlighting function correctly for the dynamically imported Keras library, along with a fix for indentation to pass PyLint checks. Added Keras Import for Linting
- This pull request addresses an integer overflow issue in the `ragged_range_op.cc` file of the TensorFlow project by modifying previous fixes from rolled-back pull requests and adding a condition to prevent unnecessary value increments during the final iteration of a loop, which was detected by ASan but did not affect the operation's output or test results. Fix integer overflow in range
- This pull request addresses and corrects several typographical errors in the cumulative_logsumexp function within the TensorFlow project, as detailed in the commit linked at https://github.com/tensorflow/tensorflow/commit/b362f96437cb3a5b9a94eac88f958d31066c9db8. Fix typos in cumulative_logsumexp
- This pull request updates the `tensor_util_test.py` file to enhance the test for `bfloat16` by adding support for the s390x architecture, which uses big-endian byte order, in addition to the existing default little-endian check, as detailed in the commit found at the provided URL. Update tensor_util_test.py for testBfloat16
- This pull request involves the addition of new headers to improve code clarity and readability by breaking up a long function, as part of addressing an issue in the TensorFlow project. Updating Headers
- This pull request updates the .clang-format file in the TensorFlow project to allow customization of the code formatting style based on Google's style guide and to disable automatic pointer alignment derivation, providing instructions for further modifications. Update .clang-format
- This pull request involves updating the `abstract_context.h` file by replacing `enum` with `enum class` to enhance type safety and provide scoped enumeration, as well as using `std::string` instead of `string` for improved clarity and compatibility. Update abstract_context.h
- This pull request involves updating the `cc_op_gen.h` file by adding descriptive comments to clarify the functionality of the `WriteCCOps` function and replacing `string` with `std::string` to align with modern C++ standards for clarity and consistency. Update cc_op_gen.h
- This pull request adds examples to the `tf.math.truediv` function in the TensorFlow library, as detailed in the commits and described in the pull request body. Example is added to tf.math.truediv function
- This pull request involves updating the `.clang-format` file in the TensorFlow project, as indicated by the commit message, although the title and body of the pull request are labeled as 'spam'. spam
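For the `tensor_util_test.py` update listed above, the reason a `bfloat16` test needs separate little-endian and big-endian expectations can be sketched with the standard library alone. This is an illustration, not the test's actual code: it assumes bfloat16 is formed by truncating a float32 to its upper 16 bits (no rounding), and the helper names are hypothetical.

```python
import struct
import sys


def float32_to_bfloat16_bits(value: float) -> int:
    """Keep the upper 16 bits of the float32 encoding (truncation, no rounding)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    return bits >> 16


def bfloat16_bytes(value: float, byteorder: str) -> bytes:
    """Serialize the 16-bit pattern in the requested byte order."""
    return float32_to_bfloat16_bits(value).to_bytes(2, byteorder)


little = bfloat16_bytes(1.0, "little")  # layout on little-endian hosts such as x86
big = bfloat16_bytes(1.0, "big")        # layout on big-endian hosts such as s390x
native = bfloat16_bytes(1.0, sys.byteorder)
```

The same value yields the same two bytes in opposite order, so a test that compares raw tensor content has to select its expected bytes by host byte order.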
3.3 Pull Request Discussion Insights
This section will analyze the tone and sentiment of discussions within this project's open and closed pull requests that occurred within the past week. It aims to identify potentially heated exchanges and to maintain a constructive project environment.
- Update .clang-format
- Toxicity Score: 0.55 (Defensive responses, persistent disagreement, escalating tension.)
- This GitHub conversation involves username1 expressing dissatisfaction with the proposed changes, while username2 responds defensively, leading to a tense exchange. The tone shifts from collaborative to confrontational as username1 insists on a different approach, and username2 becomes increasingly frustrated.
IV. Contributors
4.1 Contributors
Active Contributors:
We consider an active contributor in this project to be any contributor who has made at least 1 commit, opened at least 1 issue, created at least 1 pull request, or made more than 2 comments in the last month.
Contributor | Commits | Pull Requests | Issues | Comments |
---|---|---|---|---|
gaikwadrahul8 | 13 | 11 | 1 | 88 |
tilakrayal | 0 | 0 | 0 | 51 |
Venkat6871 | 3 | 3 | 0 | 43 |
mihaimaruseac | 0 | 1 | 0 | 21 |
LongZE666 | 0 | 0 | 9 | 5 |
LakshmiKalaKadali | 4 | 1 | 0 | 7 |
x0w3n | 0 | 0 | 7 | 5 |
pkgoogle | 0 | 0 | 0 | 12 |
yuvashrikarunakaran | 0 | 3 | 0 | 5 |
phpYj | 0 | 0 | 2 | 5 |
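The active-contributor rule stated above can be written as a small predicate. This is a sketch: the `Activity` type and `is_active` function are names introduced here for illustration, and the sample rows are copied from the table.

```python
from typing import NamedTuple


class Activity(NamedTuple):
    commits: int
    pull_requests: int
    issues: int
    comments: int


def is_active(a: Activity) -> bool:
    # At least 1 commit, issue, or pull request, or more than 2 comments.
    return a.commits >= 1 or a.issues >= 1 or a.pull_requests >= 1 or a.comments > 2


rows = {
    "gaikwadrahul8": Activity(13, 11, 1, 88),
    "tilakrayal": Activity(0, 0, 0, 51),
    "pkgoogle": Activity(0, 0, 0, 12),
}
active = {name: is_active(activity) for name, activity in rows.items()}
```

Under this rule, contributors such as tilakrayal and pkgoogle qualify through comments alone.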
ReadMe Summary: TensorFlow is an open-source platform for machine learning, offering a flexible ecosystem of tools and libraries for researchers and developers to build and deploy ML applications. It supports Python and C++ APIs, with installation options for GPU and CPU, and provides nightly binaries for testing. Stay updated with release announcements and contribute to the project by following the guidelines and engaging with the community through various forums and resources.