Weekly GitHub Report for TensorFlow: December 07, 2024 - December 14, 2024
Thank you for subscribing to our weekly newsletter! Each week, we deliver a comprehensive summary of your GitHub project's latest activity right to your inbox, including an overview of your project's issues, pull requests, contributors, and commit activity.
I. News
1.1 Recent Version Releases:
The current version of this repository is v2.18.0
1.2 Other Noteworthy Updates:
II. Issues
2.1 Top 5 Active Issues:
We consider active issues to be the issues that were commented on most frequently within the last week.
- Division by zero error at random places if GPU is used: This issue involves a division by zero error occurring randomly when a program using TensorFlow's C API is executed on a GPU, specifically a Quadro RTX 6000, while it runs without issues on a CPU. The problem is challenging to diagnose as it appears to be related to deep interactions between TensorFlow and CUDA libraries, with inconsistent behavior observed even when the program is run multiple times under the same conditions.
  - The comments discuss potential collaboration to resolve the issue, with one user offering to contribute and suggesting a screen-sharing session to better understand the problem. Another user suggests that the issue might be due to incompatible software versions and advises updating them according to TensorFlow's documentation. There is also a discussion about scheduling a meeting to further investigate the issue, and an attempt to update CUDA libraries is mentioned, which led to a new error related to version mismatches.
  - Number of comments this week: 6
- InvalidArgumentError when using MirroredStrategy but not with tf.distribute.get_strategy(): This issue involves an InvalidArgumentError encountered when using TensorFlow's MirroredStrategy, which does not occur when using tf.distribute.get_strategy(). The error arises specifically when deploying the code on multiple GPUs, and the user is seeking a solution to ensure the code runs without errors under MirroredStrategy.
  - The comments reveal that the issue was initially not reproducible by others, but further investigation showed it was related to specific TensorFlow and Keras version combinations. A temporary workaround was found by downgrading to TensorFlow 2.16.1 and Keras 3.3.0, which resolved the error. The maintainers acknowledged the issue and plan to address it in future releases, while the user confirmed the workaround was effective and inquired about a permanent fix.
  - Number of comments this week: 5
- error: defining a type within 'offsetof' is a Clang extension [-Werror,-Wgnu-offsetof-extensions]: This issue involves a user encountering a build error while attempting to compile the Selective Framework for iOS using TensorFlow Lite, specifically related to a Clang extension warning about defining a type within 'offsetof'. The error occurs during the compilation of the 'upb.c' file, and the user is seeking assistance to resolve this problem as they are new to the process.
  - The comments section includes a request for more detailed information about the steps and environment used by the user, followed by the user providing some additional details about their setup. Another commenter suggests a potential solution by adding a specific flag to the bazel command to bypass the warning, and requests further information if the issue persists. There is also a follow-up asking for updates on the issue's status.
  - Number of comments this week: 5
- The warning "The structure of `inputs` doesn't match the expected structure" when training a functional model: This issue involves a warning message encountered when training a functional model in TensorFlow, indicating a mismatch between the structure of the inputs and the expected structure, which raises concerns about the model's functionality despite the training proceeding normally. The user has identified that the warning arises from a comparison failure in the code and has provided a standalone code snippet to reproduce the issue, noting that the warning persists regardless of the data format used.
  - The comments discuss the cause of the warning, which is due to a mismatch between the input structure and TensorFlow's expectations, and suggest code modifications to resolve it. A user points out that using tuples instead of lists for inputs can avoid the warning, but saving and reloading the model reintroduces the issue. Another user confirms that the suggested fix resolves the warning for them, while a different user reports not encountering the warning in a different environment, prompting a discussion about environmental differences.
  - Number of comments this week: 5
- Tflite x86 lib and dll for windows: This issue is about a user who is attempting to build an x86 library and DLL for TensorFlow Lite on Windows but is encountering difficulties and is requesting assistance or pre-built files. The user has tried following existing guides and suggestions but has not been successful in resolving the issue.
  - The comments section involves a request for more details on the steps and environment used, suggestions to follow official documentation, and specific CMake commands to try. The user reports errors during the build process and requests pre-built files. It is noted that Bazel does not support x86 builds on Windows, and the user is advised to try building from a specific commit using CMake, as there is no official x86 build available.
  - Number of comments this week: 4
2.2 Top 5 Stale Issues:
We consider stale issues to be issues that have had no activity within the last 30 days. The team should work together to get these issues resolved and closed as soon as possible.
As of our latest update, there are no stale issues for the project this week.
2.3 Open Issues
This section lists, groups, and then summarizes issues that were created within the last week in the repository.
Issues Opened This Week: 14
Summarized Issues:
- TensorFlow Runtime Errors: Several issues have been reported regarding runtime errors in TensorFlow across different environments. One issue involves an unexpected failure during tensor allocations in an Android Studio application using TensorFlow Lite, caused by tensor dimension mismatches. Another issue describes a division by zero error occurring randomly on a GPU with TensorFlow's C API, which does not occur on a CPU. Additionally, an ImportError is encountered in PyCharm due to a failure in loading the native TensorFlow runtime, caused by a DLL initialization error.
- Compilation and Linking Issues: Users have faced various compilation and linking issues when working with TensorFlow and TensorFlow Lite. A compile error is reported when building TensorFlow from source on Windows 11 due to a non-class template declaration conflict. Another issue involves undefined symbol errors when linking an Android static library with TensorFlow Lite's GPU delegate using CMake on Ubuntu. Additionally, a compilation error occurs when using ndk-build for a custom C++ file, indicating a problem with linking the TensorFlow Lite library in an Android environment.
- TensorFlow Functionality and Compatibility: Several issues highlight functionality and compatibility problems in TensorFlow. A `ValueError` is encountered when using a dataset with an unknown shape in the `model.evaluate` function, which did not occur in earlier versions. Another issue involves a bug in TensorFlow 2.18.0 where the 'height_shift_range' and 'width_shift_range' parameters in the ImageDataGenerator are swapped. Additionally, a compatibility problem arises with the cross-compilation toolchain for building TensorFlow Lite on a Raspberry Pi Zero, due to architecture mismatches.
- Documentation and Versioning: Issues related to documentation and versioning have been identified in TensorFlow. A documentation bug is reported in the `tf.raw_ops.MaxPoolGradWithArgmax` function, where the `argmax` argument is incorrectly documented to accept both `int32` and `int64` data types. Another issue requests the update of the Python version in TensorFlow's Docker images to ensure compatibility and improved functionality. Additionally, there is a request for the publication of TensorFlowLite version 2.18.0 to Maven Central and CocoaPods Specs for Android and iOS platforms.
- Development Environment Errors: Errors in development environments have been reported by users working with TensorFlow. An error is encountered when running TensorFlow 2.18.0 code in PyCharm on Ubuntu, where the code executes correctly despite PyCharm indicating missing functions. Another issue describes a warning message that appears during the `model.fit()` operation when using Keras 3.x with TensorFlow 2.17+, disrupting console output readability.
2.4 Closed Issues
This section lists, groups, and then summarizes issues that were closed within the last week in the repository. This section also links the associated pull requests if applicable.
Issues Closed This Week: 25
Summarized Issues:
- TensorFlow 2.17 Bugs and Crashes: Several issues have been reported in TensorFlow 2.17, where operations like `tf.raw_ops.Cholesky`, `tf.raw_ops.MatrixDeterminant`, and `tf.raw_ops.RaggedTensorToVariantGradient` cause crashes with "Aborted (core dumped)" errors when executed with specific inputs on a GPU. These problems have been confirmed in TensorFlow Nightly as well, indicating a broader issue with the version. Users have suggested workarounds and are seeking fixes to prevent these crashes.
- TensorFlow Lite and Android Issues: Developers have encountered various issues with TensorFlow Lite on Android, including build failures due to Gradle mismatches and missing versions on Maven. These issues hinder the development process, prompting requests for better support and updates. Additionally, there are concerns about compatibility between TensorFlow and TensorFlow Lite versions.
- Numerical Precision and Performance Discrepancies: Issues have been raised regarding numerical precision discrepancies in TensorFlow's `tf.math.cumsum` operation and unexpected GPU performance results. These discrepancies highlight differences in precision across platforms and libraries, with TensorFlow showing varied results compared to NumPy and PyTorch. Users are seeking insights into these precision and performance issues to optimize their workflows.
- Compilation and Build Errors: Several issues involve compilation and build errors in TensorFlow projects, including a compiler warning in TensorFlow Lite's `graph_info.h` and a compiler error with MSVC due to changes in the C++ STL. These errors disrupt the build process, requiring patches and adjustments to resolve them. Developers are actively seeking solutions to these technical challenges.
- TensorFlow 2.17.0 and 2.16.1 Crashes: TensorFlow versions 2.17.0 and 2.16.1 have been reported to crash on Ubuntu 20.04 with operations like `SparseTensorDenseMatMul` and `UnBatch`. These crashes are linked to invalid input shapes and specific conditions, causing significant disruptions for users. The issues have been resolved in TensorFlow 2.18.0, but users are still seeking stable solutions for earlier versions.
- TensorFlow 2.18 GPU Utilization Issue: A bug in TensorFlow 2.18 prevents the software from utilizing the GPU for certain operations on a Linux system, despite the GPU being recognized. This issue affects performance and has been resolved by updating the LD_LIBRARY_PATH. Users are advised to ensure their environment variables are correctly set to avoid similar problems.
- Spam and Misleading Content on GitHub: A spam entry on a GitHub project falsely advertises the availability of a movie, "Pushpa 2 The Rule," for free download or streaming. This misleading content includes false links and information, prompting the community to address and remove such spam to maintain the integrity of the platform.
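One common source of the cumulative-sum precision discrepancies mentioned above is the precision of the accumulator. The sketch below uses only the Python standard library, not TensorFlow, NumPy, or PyTorch, and the helper names (`to_float32`, `cumsum_float32`, `cumsum_float64`) are hypothetical. It assumes a simple left-to-right summation; real libraries may use other orderings (for example parallel scans on a GPU), which can shift the results further.

```python
import math
import struct


def to_float32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def cumsum_float32(values):
    """Left-to-right cumulative sum, rounding the accumulator to float32 after each add."""
    total = 0.0
    out = []
    for v in values:
        total = to_float32(total + to_float32(v))
        out.append(total)
    return out


def cumsum_float64(values):
    """Left-to-right cumulative sum in float64 (Python's native float)."""
    total = 0.0
    out = []
    for v in values:
        total += v
        out.append(total)
    return out


values = [0.1] * 100_000
low = cumsum_float32(values)[-1]
high = cumsum_float64(values)[-1]
reference = math.fsum(values)  # correctly rounded sum of the float64 inputs

print(f"float32 accumulator error: {abs(low - reference):.6g}")
print(f"float64 accumulator error: {abs(high - reference):.6g}")
```

The float32 accumulator drifts much further from the reference than the float64 one, so two libraries that pick different accumulator widths or summation orders can legitimately disagree on the last few digits.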
2.5 Issue Discussion Insights
This section will analyze the tone and sentiment of discussions within this project's open and closed issues that occurred within the past week. It aims to identify potentially heated exchanges and to maintain a constructive project environment.
Based on our analysis, there are no instances of toxic discussions in the project's open issues from the past week.
III. Pull Requests
3.1 Open Pull Requests
This section lists and summarizes pull requests that were created within the last week in the repository.
Pull Requests Opened This Week: 5
Pull Requests:
Key Open Pull Requests
This pull request introduces the nanoo fp8 data type to the cast operation within the ROCm platform, building upon previous work to enhance TensorFlow's capabilities. The primary purpose of this update is to enable fp8 acceleration in the gemm rewriter pass in XLA, which is crucial for optimizing matrix multiplication operations. The changes include the addition of support for the nanoo fp8 data type, allowing for efficient dequantization and computation in fp32 before performing matrix multiplication. This enhancement is expected to improve performance by fusing operations into a hipblaslt call, as demonstrated in the provided test case.
This pull request aims to remove the non-maintained `tensorflow-io-gcs-filesystem` dependency from the `pip_package` due to ongoing issues and lack of support. The dependency has been problematic for over a year, with no viable solutions or updates, and is only required for optional purposes in TensorFlow. The community has struggled to address the issues caused by this dependency, and its removal is expected to enhance TensorFlow's multiplatform support and reliability. The decision is supported by multiple discussions and issues raised within the community, highlighting the need for this change.
URL: https://github.com/tensorflow/tensorflow/pull/82771
Associated Commits:
- [Update setup.py remove non-maintained tensorflow-...](https://github.com/tensorflow/tensorflow/commit/f70aa7891715212f2b1f872ce334fec0d913fa31)
This pull request addresses a bug related to the incorrect export of symbols in shared libraries on Windows platforms when compiling without Bazel or MSYS. The issue arises from the use of the PUBLIC keyword in the compiler definition within the `tensorflow/lite/CMakeLists.txt` file, which inadvertently affects the `tensorflow/lite/c/CMakeLists.txt` compile definition. This results in a conflict where the intended SHARED library is incorrectly compiled as a STATIC library, leading to missing symbols in the `tensorflowlite_c.dll`. The proposed fix involves modifying the keyword to prevent the inheritance of definitions to other libraries or adjusting the condition statement in the `c_api_types.h` file. More detailed information about the error can be found in the provided error history link.
URL: https://github.com/tensorflow/tensorflow/pull/82798
Associated Commits:
- fix to export symbol correctly on shared library f...
Other Open Pull Requests
This pull request addresses the issue of two broken hyperlinks in the `audio_classifier.md` documentation file by updating them to functional links, specifically for the TensorFlow Lite Model Maker for Audio Classification and AudioRecord, and requests a review and merge of these changes.
URL: https://github.com/tensorflow/tensorflow/pull/82841
This pull request addresses the issue of two broken hyperlinks in the `bert_nl_classifier.md` file by updating them to functional links, specifically for the TensorFlow Lite Model Maker for text classification and MobileBert documentation.
URL: https://github.com/tensorflow/tensorflow/pull/82845
3.2 Closed Pull Requests
This section lists and summarizes pull requests that were closed within the last week in the repository. Similar pull requests are grouped, and associated commits are linked if applicable.
Pull Requests Closed This Week: 13
Pull Requests:
Key Closed Pull Requests
This pull request aims to enhance the error handling and messaging within the multinomial operation by updating the `InvalidArgument` descriptions. The changes ensure that overflow checks are applied independently to the logits dimensions, specifically `batch_size` and `num_classes`, rather than just the overall `logits.shape`. Additionally, the update incorporates the use of `absl::InvalidArgumentError` for constructing error messages, aligning with reviewer suggestions. These improvements are intended to provide clearer and more precise error reporting for developers working with the multinomial operation.
URL: https://github.com/tensorflow/tensorflow/pull/64651
Associated Commits:
- op_requires operation for each logits index
- rm dim_batch_size and dim_num_classes names
- use absl::InvalidArgumentError in error messages
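The idea of checking each logits dimension on its own, so that the error message can name the offending dimension, can be sketched in plain Python. This is an illustration only and not the C++ code from the pull request: the function name `validate_logits_dims`, the 32-bit signed limit, and the message wording are all assumptions.

```python
INT32_MAX = 2**31 - 1  # assumed limit; the real op's bound may differ


def validate_logits_dims(batch_size: int, num_classes: int) -> None:
    """Check each dimension separately and name the one that is out of range."""
    for name, value in (("batch_size", batch_size), ("num_classes", num_classes)):
        if value < 0 or value > INT32_MAX:
            raise ValueError(f"{name} = {value} does not fit in a 32-bit int")


# A valid shape passes silently.
validate_logits_dims(batch_size=8, num_classes=1000)

# An oversized dimension is reported by name.
try:
    validate_logits_dims(batch_size=8, num_classes=2**31)
except ValueError as err:
    print(err)
```

Reporting per dimension tells the caller which argument to change, whereas a single check on the whole shape only reports that something is too large.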
This pull request focuses on enhancing the `pywrap_mlir.py` module by introducing several key improvements. It adds input validation for parameters such as `graphdef`, `pass_pipeline`, and `input_names` to ensure robustness. The documentation has been enriched with detailed docstrings that clarify the function arguments, return values, and potential exceptions. To improve cross-platform compatibility, string-based file paths have been replaced with `pathlib.Path`. The handling of default arguments for mutable lists, specifically `output_names`, has been updated to prevent unintended side effects. Additionally, logging functionality has been incorporated to provide debug information when `show_debug_info` is enabled. Lastly, the module now supports batch processing, allowing multiple `GraphDef` objects to be processed simultaneously, enhancing efficiency.
URL: https://github.com/tensorflow/tensorflow/pull/77940
Associated Commits:
- Update pywrap_mlir.py
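The side effect that a mutable default list can cause, and the usual `None`-sentinel fix, can be shown in a few lines of Python. The functions `collect_bad` and `collect_good` are hypothetical and are not taken from `pywrap_mlir.py`; only the parameter name `output_names` comes from the summary above.

```python
def collect_bad(name, output_names=[]):
    """Pitfall: the default list is created once and shared by every call."""
    output_names.append(name)
    return output_names


def collect_good(name, output_names=None):
    """Fix: use None as a sentinel and build a fresh list on each call."""
    if output_names is None:
        output_names = []
    output_names.append(name)
    return output_names


first = collect_bad("a")
second = collect_bad("b")
print(first is second)  # True: both calls returned the same shared list
print(second)           # ['a', 'b']: the first call's value leaked into the second

print(collect_good("a"), collect_good("b"))  # ['a'] ['b']
```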
This pull request focuses on improving error handling and memory management within the TensorFlow codebase. It introduces enhanced error logging for scenarios where nodes or variables are missing, utilizing the LOG(ERROR) mechanism to provide clearer diagnostics. Additionally, it optimizes memory handling by eliminating the use of a static variable, kVariableTypes, and replacing it with a local unordered_set, which enhances efficiency and reduces potential memory issues. The update also includes modifications to outdated TensorFlow API calls, ensuring that the code remains compatible with the latest versions of TensorFlow, thereby improving overall stability and performance.
URL: https://github.com/tensorflow/tensorflow/pull/78042
Associated Commits:
- Error handling for missing variable nodes
Other Closed Pull Requests
This pull request involves adding a Keras import specifically for type checking to ensure that linting and code highlighting function correctly for the dynamically imported Keras library, and it also includes a fix for indentation to pass PyLint checks.
URL: https://github.com/tensorflow/tensorflow/pull/78837
This pull request addresses an integer overflow issue in the `ragged_range_op.cc` file of the TensorFlow project by modifying previous fixes from rolled-back pull requests and adding a condition to prevent unnecessary value increments during the final iteration of a loop, which was detected by ASan but did not affect the operation's output or test results.
URL: https://github.com/tensorflow/tensorflow/pull/80133
This pull request addresses and corrects several typographical errors in the cumulative_logsumexp function within the TensorFlow project, as detailed in the commit found at https://github.com/tensorflow/tensorflow/commit/b362f96437cb3a5b9a94eac88f958d31066c9db8.
URL: https://github.com/tensorflow/tensorflow/pull/81030
This pull request updates the `tensor_util_test.py` file to enhance the test for `bfloat16` by adding support for the s390x architecture, which uses big-endian byte order, in addition to the existing default little-endian check.
URL: https://github.com/tensorflow/tensorflow/pull/82286
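Why a `bfloat16` byte-level test needs a separate expectation on big-endian machines can be illustrated with the standard library alone. The sketch below is not the code from `tensor_util_test.py`; the helper `float_to_bfloat16_bits` is hypothetical, and it simply truncates a float32 to its upper 16 bits, whereas real conversions typically round.

```python
import struct
import sys


def float_to_bfloat16_bits(x: float) -> int:
    """Return a 16-bit bfloat16 pattern by truncating the float32 encoding of x."""
    (bits32,) = struct.unpack(">I", struct.pack(">f", x))
    return bits32 >> 16  # keep sign, exponent, and the top 7 mantissa bits


bits = float_to_bfloat16_bits(1.0)
little = struct.pack("<H", bits)  # byte layout on little-endian hosts such as x86
big = struct.pack(">H", bits)     # byte layout on big-endian hosts such as s390x

print(hex(bits))     # 0x3f80
print(little.hex())  # 803f
print(big.hex())     # 3f80

# A portable test picks the expected bytes from the host byte order.
expected = little if sys.byteorder == "little" else big
```

The numeric value is identical on both architectures; only the order of the two bytes in memory differs, which is why a test that compares raw bytes has to branch on `sys.byteorder`.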
This pull request involves the addition of new headers to improve code clarity and readability by breaking up a long function, as detailed in the related issue on the TensorFlow GitHub repository.
URL: https://github.com/tensorflow/tensorflow/pull/82438
This pull request updates the .clang-format file in the TensorFlow project to allow customization of the code formatting style based on Google's style guide and to disable automatic pointer alignment derivation, providing instructions for further modifications.
URL: https://github.com/tensorflow/tensorflow/pull/82440
This pull request involves updating the `abstract_context.h` file by replacing an `enum` with an `enum class` to enhance type safety and provide scoped enumeration, as well as substituting `string` with `std::string` for improved clarity and compatibility, as detailed in the commit found at https://github.com/tensorflow/tensorflow/commit/4f08b531272868c360a3ec704815823622af43b7.
URL: https://github.com/tensorflow/tensorflow/pull/82441
This pull request involves updating the `cc_op_gen.h` file by adding descriptive comments to clarify the functionality of the `WriteCCOps` function and replacing `string` with `std::string` to align with modern C++ standards for clarity and consistency.
URL: https://github.com/tensorflow/tensorflow/pull/82442
This pull request adds examples to the `tf.math.truediv` function in the TensorFlow project, as detailed in the commits which include initial additions and subsequent updates to the `math_ops.py` file.
URL: https://github.com/tensorflow/tensorflow/pull/82466
This pull request involves updating the `.clang-format` file in the TensorFlow project, as indicated by the commit message, although the title and body of the pull request are labeled as 'spam'.
URL: https://github.com/tensorflow/tensorflow/pull/82745
3.3 Pull Request Discussion Insights
This section will analyze the tone and sentiment of discussions within this project's open and closed pull requests that occurred within the past week. It aims to identify potentially heated exchanges and to maintain a constructive project environment.
Based on our analysis, there are no instances of toxic discussions in the project's open pull requests from the past week.
IV. Contributors
4.1 Contributors
Active Contributors:
We consider an active contributor in this project to be any contributor who has made at least 1 commit, opened at least 1 issue, created at least 1 pull request, or made more than 2 comments in the last month.
Contributor | Commits | Pull Requests | Issues | Comments |
---|---|---|---|---|
gaikwadrahul8 | 13 | 11 | 1 | 88 |
tilakrayal | 0 | 0 | 0 | 51 |
Venkat6871 | 3 | 3 | 0 | 43 |
mihaimaruseac | 0 | 1 | 0 | 21 |
LongZE666 | 0 | 0 | 9 | 5 |
LakshmiKalaKadali | 4 | 1 | 0 | 7 |
x0w3n | 0 | 0 | 7 | 5 |
pkgoogle | 0 | 0 | 0 | 12 |
yuvashrikarunakaran | 0 | 3 | 0 | 5 |
phpYj | 0 | 0 | 2 | 5 |
ReadMe Summary: TensorFlow is an open-source platform for machine learning, offering a flexible ecosystem of tools and libraries for researchers and developers to build and deploy ML applications. It supports Python and C++ APIs, with installation options for GPU support and Docker containers, and provides nightly binaries for testing. Stay updated with release announcements and contribute to the project by following the guidelines and engaging with the community through various forums and resources.