Weekly Project News

Weekly GitHub Report for Tensorflow: July 28, 2025 - August 04, 2025 (12:03:10)

Thank you for subscribing to our weekly newsletter! Each week, we deliver a comprehensive summary of your GitHub project's latest activity right to your inbox, including an overview of your project's issues, pull requests, contributors, and commit activity.


Table of Contents

  • I. News
    • 1.1. Recent Version Releases
    • 1.2. Version Information
  • II. Issues
    • 2.1. Top 5 Active Issues
    • 2.2. Top 5 Stale Issues
    • 2.3. Open Issues
    • 2.4. Closed Issues
    • 2.5. Issue Discussion Insights
  • III. Pull Requests
    • 3.1. Open Pull Requests
    • 3.2. Closed Pull Requests
    • 3.3. Pull Request Discussion Insights
  • IV. Contributors
    • 4.1. Contributors

I. News

1.1 Recent Version Releases:

The current version of this repository is v2.19.0

1.2 Version Information:

Released on March 5, 2025, TensorFlow version 2.19.0 introduces breaking changes to the TensorFlow Lite (tf.lite) API, including the deprecation of tf.lite.Interpreter in Python with a migration path to ai_edge_litert.interpreter, and changes to certain C++ API constants for improved compatibility. Key updates also include runtime support for the bfloat16 data type in the tfl.Cast operation. Standalone libtensorflow packages are no longer published, although libtensorflow can still be unpacked from the PyPI package.

II. Issues

2.1 Top 5 Active Issues:

We consider active issues to be issues that have been commented on most frequently within the last week. Bot comments are omitted.

  1. Mixed precision results in lower performance on AMD GPUs: This issue reports that enabling mixed precision with the mixed_float16 policy on AMD GPUs significantly degrades performance, with step times increasing from approximately 210 ms to 3 seconds. The user provides a reproducible example that uses no custom code, running TensorFlow 2.18.1 on Ubuntu 24.04.2 LTS (WSL2), but the report does not include specific GPU details.

    • The comments discuss a possible cause related to AMD GPUs lacking optimized float16 kernels, leading ROCm to cast computations to float32 and back, which could slow performance. However, this explanation is questioned because other frameworks like PyTorch and MIGraphX reportedly achieve much faster FP16 inference on the same hardware, and further attempts to diagnose the issue have not yielded a clear cause.
    • Number of comments this week: 1
  2. tf.image.combined_non_max_suppression crashed with unmatched scores: This issue reports a crash occurring in the TensorFlow function tf.image.combined_non_max_suppression when the scores tensor provided does not match the expected shape, leading to a tensor shape check failure and program abort. The user provides a minimal reproducible example showing that supplying a scores tensor of shape [batch_size] instead of the required [batch_size, num_boxes, num_classes] causes the function to fail with a core dump.

    • The comment explains that the root cause is the incorrect shape of the scores tensor, which must be three-dimensional as per the TensorFlow documentation. The responder confirms the issue by testing with the correct shape and provides a reference gist to help the user fix the problem.
    • Number of comments this week: 1
  3. Mismatch Between Quantized TFLite Layer Outputs and Expected Mathematical Values: This issue concerns a discrepancy between the outputs obtained from intermediate layers of a quantized TFLite model using the TFLite interpreter and the mathematically expected values calculated manually, despite accounting for quantization parameters. The user seeks confirmation on whether their approach to extracting intermediate outputs via interpreter._get_ops_details() and interpreter.get_tensor() is correct and inquires about any limitations of these methods when applied to quantized models.

    • The comment explains that get_tensor() cannot be used to read intermediate tensor results as per the official documentation, indicating a limitation in accessing all intermediate tensors through this method. The commenter requests a complete reproducible example including the quantized model to further investigate the issue.
    • Number of comments this week: 1
  4. Issue while converting a model with GRU to .tflite: This issue concerns difficulties encountered when converting a GRU-based model to the TensorFlow Lite (.tflite) format, specifically related to the presence of a while loop in the converted model and challenges running it with TensorFlow Lite Micro (TFLM). The user is seeking clarification on the effects of the "stateful" and "unroll" arguments in the GRU layer, particularly how these settings influence the decomposition of the GRU, the presence of quantization and dequantization operations, and the resulting data types in the converted model.

    • The user shared a text version of their GRU model code after being unable to upload the original Python file, and described their observations and questions about the model conversion process, focusing on the behavior of the "stateful" and "unroll" parameters and their impact on the TFLite model's structure and quantization.
    • Number of comments this week: 1
  5. EXC_BAD_ACCESS when using GemmImplUsingEigen in MacOS: This issue reports a crash with an EXC_BAD_ACCESS error occurring on MacOS when using the GemmImplUsingEigen component in TensorFlow 2.19.0 while compiling a shared library for the Blosc2-Btune project. The user notes that the same code works on Linux and Windows, and previously worked on MacOS with TensorFlow 2.14.0, but now fails even with the older TensorFlow version due to what appears to be an incompatibility with the modern MacOS toolchain.

    • The single comment describes an attempt to build the project using the main TensorFlow branch, which fails during CMake configuration due to a missing build target for "xnnpack-delegate," resulting in a build error and wheel creation failure.
    • Number of comments this week: 1
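
For item 3 above, the "mathematically expected" value of a quantized tensor is usually recovered with the affine mapping real = scale * (q - zero_point). The following is a minimal plain-Python sketch of that arithmetic only. The function names, the int8 range, and the sample scale and zero point are hypothetical and are not taken from the issue or from the TFLite interpreter.

```python
def dequantize(q_values, scale, zero_point):
    """Map quantized integers back to real values: scale * (q - zero_point)."""
    return [scale * (q - zero_point) for q in q_values]


def quantize(real_values, scale, zero_point, qmin=-128, qmax=127):
    """Map real values to int8-style integers, clamping to [qmin, qmax]."""
    out = []
    for r in real_values:
        q = round(r / scale) + zero_point
        out.append(max(qmin, min(qmax, q)))
    return out


# Hypothetical parameters, chosen so the arithmetic is exact in floating point.
SCALE = 0.5
ZERO_POINT = 3

quantized = [-128, 3, 5, 127]
restored = dequantize(quantized, SCALE, ZERO_POINT)
```

When comparing against interpreter output, the scale and zero point must be the ones attached to the specific tensor being read. A mismatch there is one possible source of the discrepancy described in the issue, though the report does not confirm the cause.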

2.2 Top 5 Stale Issues:

We consider stale issues to be issues that have had no activity within the last 30 days. The team should work together to get these issues resolved and closed as soon as possible.

  1. TF-TRT Warning: Could not find TensorRT: This issue describes a bug where the user encounters a warning indicating that TensorRT could not be found when running TensorFlow on an Ubuntu 22.04 system with an RTX 3050 Ti GPU. Despite using the NVIDIA 535 driver and CUDA 12.4, the user is unable to resolve the error after multiple reinstallations, and seeks assistance to fix the TensorRT detection problem in their Anaconda environment.
  2. SystemError in tf.ensure_shape and tf.compat.v1.ensure_shape when dtype of shape is tf.uint64 and its value is too large.: This issue reports a bug in TensorFlow where using tf.ensure_shape or tf.compat.v1.ensure_shape with a shape tensor of type tf.uint64 containing very large values close to 2^64 causes a SystemError and OverflowError. Specifically, when such large uint64 values are passed in eager execution mode, the functions fail with an internal error related to the built-in isinstance function, indicating improper handling of large unsigned integer shapes.
  3. Feature Request: Integrate different Digital Signal Processing into tf.signal: This issue is a feature request proposing the integration of advanced digital signal processing (DSP) functionalities, similar to those found in the julius library, into TensorFlow's tf.signal module. The requester highlights the current lack of sophisticated audio data augmentation tools within TensorFlow compared to PyTorch and suggests that adding these capabilities would enhance audio model training by enabling native, efficient preprocessing and augmentation workflows.
  4. [DOCS] Missing complex input for Round op: This issue highlights a discrepancy in the TensorFlow documentation for the Round operation, where it states that complex tensors are supported as input, but in practice, attempting to use a complex tensor with this operation results in an error. The user reports that they must manually apply the Round operation to the real and imaginary parts separately, indicating that the current implementation does not handle complex inputs as documented, which suggests a documentation bug that needs correction.
  5. tf.raw_ops.Unbatch aborts with "Check failed: d < dims()": This issue reports a bug in TensorFlow version 2.17 where the tf.raw_ops.Unbatch operation aborts with a fatal check failure error "Check failed: d < dims()" when invoked with certain inputs. The problem occurs on Linux Ubuntu 20.04.3 LTS using Python 3.11.8, and the user has provided a minimal reproducible example demonstrating that the operation crashes due to an invalid dimension check in the tensor shape handling code.
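
The workaround mentioned in item 4, rounding the real and imaginary parts separately, can be illustrated without TensorFlow. This is a plain-Python sketch of the idea using the built-in complex type; the helper name round_complex is hypothetical. Python's built-in round rounds halves to the nearest even integer, so confirm that this is the rounding mode you need before relying on it.

```python
def round_complex(z):
    """Round the real and imaginary parts of a complex number separately."""
    return complex(round(z.real), round(z.imag))


values = [1.2 + 3.7j, 2.5 - 0.5j, -1.5 + 4.5j]
rounded = [round_complex(z) for z in values]
```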

2.3 Open Issues

This section lists, groups, and then summarizes issues that were created within the last week in the repository.

Issues Opened This Week: 10

Summarized Issues:

  • Tensor shape and format errors: Several issues report crashes or failures due to incorrect tensor shapes or formats, such as the tf.image.combined_non_max_suppression function failing when given a scores tensor with an unexpected shape. These shape mismatches lead to tensor shape check failures and program aborts, highlighting the importance of strict input format adherence.
  • issues/97672
  • Quantization and intermediate tensor correctness: There are concerns about the accuracy of intermediate outputs extracted from quantized TensorFlow Lite models using the TFLite interpreter’s get_tensor() method. The extracted values do not match mathematically expected results after accounting for quantization parameters, raising questions about the method's correctness and limitations.
  • issues/97677
  • Gradient computation and memory management bugs: Issues include a memory leak caused by the @custom_gradient decorator capturing objects that prevent garbage collection, and a feature request for deterministic tie-breaking in tf.math.reduce_min gradient computations. These problems affect both resource usage and the reproducibility of gradient calculations.
  • issues/97688, issues/97697
  • GPU and CUDA related crashes and precision errors: Multiple issues describe GPU-specific problems, including out-of-memory crashes when running tf.linalg.eigh on CUDA, precision discrepancies between GPU and CPU results in segment operations, and GPU delegate tensor extraction bugs causing tensor corruption and accuracy degradation. These highlight challenges in GPU memory management, numerical precision, and delegate implementation.
  • issues/97780, issues/97805, issues/98103
  • Model conversion and compatibility challenges: Difficulties arise when converting GRU-based TensorFlow models to TensorFlow Lite, especially regarding while loops versus unrolled operations and the effects of "stateful" and "unroll" arguments on quantization and operator decomposition. These issues complicate model behavior understanding and compatibility with TensorFlow Lite Micro.
  • issues/97941
  • Platform-specific crashes and incompatibilities: A crash on MacOS involving the GemmImplUsingEigen component during shared library compilation contrasts with successful builds on Linux and Windows, suggesting incompatibilities between TensorFlow’s implementation and the MacOS toolchain. This points to platform-specific stability issues.
  • issues/98002
  • GPU support detection failures: Installation of TensorFlow 2.19.0 on an Alma Linux server with CUDA 12.3 and an NVIDIA GH200 GPU fails to enable GPU support, as indicated by tf.test.is_built_with_gpu_support() returning false despite proper hardware and drivers. This indicates issues in build or detection mechanisms for GPU support on certain platforms.
  • issues/98115
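
For the tensor shape errors summarized above, one defensive option is to validate the shape of scores in Python before calling the op, so that a mismatch surfaces as an exception rather than a process abort. The sketch below assumes only what the issue states, namely that scores must have shape [batch_size, num_boxes, num_classes]. The helper check_scores_shape is hypothetical and is not part of TensorFlow.

```python
def check_scores_shape(scores_shape, batch_size, num_boxes):
    """Return num_classes if scores_shape is [batch_size, num_boxes, num_classes].

    Raises ValueError otherwise. This is a hypothetical pre-check, not a
    TensorFlow API.
    """
    shape = tuple(scores_shape)
    if len(shape) != 3:
        raise ValueError(
            "scores must have rank 3 [batch_size, num_boxes, num_classes], "
            f"got shape {shape}"
        )
    if shape[0] != batch_size or shape[1] != num_boxes:
        raise ValueError(
            f"scores shape {shape} does not match batch_size={batch_size}, "
            f"num_boxes={num_boxes}"
        )
    return shape[2]


num_classes = check_scores_shape((2, 10, 3), batch_size=2, num_boxes=10)

# The shape reported in the issue, [batch_size], is rejected up front.
try:
    check_scores_shape((2,), batch_size=2, num_boxes=10)
    rejected = False
except ValueError:
    rejected = True
```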

2.4 Closed Issues

This section lists, groups, and then summarizes issues that were closed within the last week in the repository. This section also links the associated pull requests if applicable.

Issues Closed This Week: 4

Summarized Issues:

  • GPU Detection and Utilization Issues: TensorFlow fails to detect or utilize available NVIDIA GPUs despite correct hardware and software setup, resulting in no GPUs being listed and limiting model training performance. This issue affects users relying on GPU acceleration for efficient computation.
  • issues/96707
  • Version Compatibility and Module Import Errors: Downgrading TensorFlow from version 2.19.0 to earlier versions resolves some malfunctions but causes a ModuleNotFoundError for the 'tensorflow.keras' module, which should be included in all these versions. Additionally, importing TensorFlow's internal modules on Windows 10 can fail due to DLL load errors caused by Python version incompatibility, which can be fixed by using Python 3.10.9 in a new environment.
  • issues/97620, issues/97812
  • TensorFlow Lite Memory Page Size Support: Certain TensorFlow Lite libraries do not support 16 KB memory page sizes required by newer Android devices, potentially causing deployment issues and Play Store submission failures. Users seek information on plans or workarounds to address this limitation.
  • issues/97935

2.5 Issue Discussion Insights

This section will analyze the tone and sentiment of discussions within this project's open and closed issues that occurred within the past week. It aims to identify potentially heated exchanges and to maintain a constructive project environment.

Based on our analysis, there are no instances of toxic discussions in the project's open or closed issues from the past week.


III. Pull Requests

3.1 Open Pull Requests

This section provides a summary of pull requests that were opened in the repository over the past week. The top three pull requests with the highest number of commits are highlighted as 'key' pull requests. Other pull requests are grouped based on similar characteristics for easier analysis. Up to 25 pull requests are displayed in this section, while any remaining pull requests beyond this limit are omitted for brevity.

Pull Requests Opened This Week: 7

Key Open Pull Requests

1. [tosa] updated scatter_nd to support duplicate indices: This pull request updates the TOSA scatter_nd operation to support duplicate indices by creating individual scatters with unique indices, summing the resulting tensors, and casting intermediate tensors to supported types such as f32 and i32 when necessary.

  • URL: pull/97717
  • Merged: No
  • Associated Commits: 7fccf, dd98f

2. [FIX] multi_process_runner_test times out after 2400 seconds on gfx11xx and gfx12xx GPUs: This pull request addresses the issue of the multi_process_runner_test timing out after 2400 seconds on gfx11xx and gfx12xx GPUs by implementing a total timeout of 300 seconds and individual process timeouts of 120 seconds within MultiProcessPoolRunner.run(), running conn.recv() in a separate thread to prevent blocking, and adding more descriptive logging to improve test reliability and debugging.

  • URL: pull/97680
  • Merged: No
  • Associated Commits: 111ff

3. GPU: Fix overflow inconsistency in cumsum on CPU/GPU: This pull request addresses a platform-dependent overflow issue in the TensorFlow Lite GPU delegate's cumsum operation by implementing higher-precision accumulation and modifying GPU shader logic to ensure consistent and overflow-safe results between CPU and GPU executions.

  • URL: pull/97693
  • Merged: No
  • Associated Commits: 858e9
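
The approach described for the first pull request above (split the updates into scatters whose indices are unique, then sum the results) can be sketched in plain Python. This is a simplified one-dimensional illustration under stated assumptions and is not the TOSA lowering itself. The function names are hypothetical, and the type casting mentioned in the pull request is not modeled.

```python
def split_into_unique_passes(indices, updates):
    """Group (index, update) pairs so that no index repeats within a pass.

    The k-th occurrence of an index is placed in pass k.
    """
    passes = []
    seen_count = {}  # how many times each index has appeared so far
    for idx, upd in zip(indices, updates):
        k = seen_count.get(idx, 0)
        seen_count[idx] = k + 1
        if k == len(passes):
            passes.append([])
        passes[k].append((idx, upd))
    return passes


def scatter_unique(size, pairs):
    """Scatter into a zero-filled list; indices in pairs are assumed unique."""
    out = [0.0] * size
    for idx, upd in pairs:
        out[idx] = upd
    return out


def scatter_add_via_unique_passes(size, indices, updates):
    """Sum the results of one unique-index scatter per pass."""
    total = [0.0] * size
    for pairs in split_into_unique_passes(indices, updates):
        partial = scatter_unique(size, pairs)
        total = [a + b for a, b in zip(total, partial)]
    return total


result = scatter_add_via_unique_passes(
    5, [1, 3, 1, 1, 4], [10.0, 2.0, 5.0, 1.0, 7.0]
)
```

Index 1 appears three times in this example, so three passes are generated and their outputs are added element by element.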

Other Open Pull Requests

  • Thread pool and resource management improvements: These pull requests focus on optimizing resource usage in TensorFlow by refactoring the oneDNN matmul operation to avoid unnecessary thread pool creation for small inputs and implementing a thread-safe mechanism to prevent duplicate registrations of plugin factories. Both changes enhance efficiency and reduce misleading warnings without impacting performance.
  • pull/97701, pull/97724
  • Code quality and bug fixes: These pull requests address various issues in the TensorFlow codebase, including fixing critical Python bugs such as variables used before assignment and constructor argument mismatches, correcting typographical errors in documentation, and improving code robustness with added type annotations and PEP8 formatting. Together, they enhance the clarity, correctness, and maintainability of the project.
  • pull/98001, pull/98108

3.2 Closed Pull Requests

This section provides a summary of pull requests that were closed in the repository over the past week. The top three pull requests with the highest number of commits are highlighted as 'key' pull requests. Other pull requests are grouped based on similar characteristics for easier analysis. Up to 25 pull requests are displayed in this section, while any remaining pull requests beyond this limit are omitted for brevity.

Pull Requests Closed This Week: 9

Key Closed Pull Requests

1. Fix: Prevent Duplicate Plugin Factory Registration (cuDNN/cuFFT/cuBLAS): This pull request proposes a thread-safe fix to prevent duplicate registration of cuDNN, cuFFT, and cuBLAS plugin factories in TensorFlow by introducing atomic flags that ensure each factory is registered only once, thereby eliminating recurring error-level warnings caused by concurrent or repeated registration attempts without affecting functionality or performance.

  • URL: pull/97723
  • Merged: No
  • Associated Commits: 6370f, 2b647, e2767, b012e, 6f8b6, 1f1c5, 643d6, 67d0b, ac72e, 19d8d, 63768, b47f6, 2a001, 1a5c7, d1fe3, a26fe, f0a5f, ca1fc, 40a14, 1e589, 7e9e5, 59e4e, ad9c8, 6a9d5, 6b048, d5b1f, f3d16, ed05e, 1f67a, f4d86, 2eab5, 9a015, 0335d, 734c9, 5480c, d5b11, f4841, 90e09, 8293a, a0f11, c3f6b, 52314, 2498b, 0e1fa, 81650, ffb99, 54735, 8c470, f63c7, 51f55, cf7f2, 71c00, cce00, c5000, 8f2bd, 5edcf, 890b0, ea1cc, 9b5b0, fe721, bd34b, ac7e3, 7f909, 27e1e, 25f1e, 232ec, 75d7e, ee6c5, b4bfd, 4a2e3, 46150, 76df4, 518ca, a0c62, 9e6b0, 0b5a1, e43ae, 380ca, a37ef, 80bfa, 3499d, 975b0, 618ba, 732b0, 8d02e, 23c8b, 7c1a0, 0b3dd, 7229c, 5eb2c, 1961c, 31e00, 823d0, 7070e, 903da, 463d3, c0aea, 691ea, be048, edb05, c212d, 2dd32, 462df, 8b0d4, 3c56a, 7df9f, d812b, ea308, f21ec, 58edf, 397ce, 2f752, 465fe, ad262, 9113b, 3ee65, 82f8d, 1d7ab, 319be, b696f, 8ae69, b9a85, 665e8, 78f93, 2be00, a153f, e6dbc, 092ac, ecd21, 672bb, 4e94b, 353be, 26215, b6c7e, 8f3a0, 22548, 1344c, 9e473, 82e46, 4b60d, 83801, 55c35, daf3d, 804d8, 87fee, 6972c, 5a796, d7b59, 94475, 2df1c, 84683, 6d9f2, ca82a, 62bf5, e1454, 7b2fc, 2cac0, 4567c, 4e985, 69e59, ca62a, c8a09, f5477, 0dae7, 69cef, 93a64, 05c8a, 5fe8d, 95256, 07d09, 31d00, 93e4d, 3dc04, 703fd, db687, 53683, ce2bb, d1d5e, 4c615, c81d9, 84811, e956a, 9f6d1, 05878, a4927, 82c69, d8c90, 4fe26, 664d4, 2fa03, 358cc, 76d4a, db274, b26d6, 2ed90, d81d6, 27ab8, 12780, abfe1, d5b89, 6a3c7, f0c9a, 2b64e, 02e93, 72fdf, 744d3, 1b8a4, fa101, 29828, 0926e, 518cb, 82248, d7c84, 9aca9, 65d98, 044b0, 1110c, 2c1d5, a9bd1, 64e7e, 856c8, f1966, 50808, 8935a, 61ff2, 1ea6d, 903b3, 2e711, 69a6a, 32103, 13072, d4055, bb505, 6f299, 480be, e41b8, d14b9, b95e8, eb41e, 89d28, 492df, 8ffa9, c8b0f, 339bf, 7033d, e174a, 23cf0, 4cfae, 476b3, a785e

2. Deterministic reduce min: This pull request proposes adding a deterministic option to the tf.math.reduce_min function in TensorFlow, which, when enabled, ensures that the gradient is routed exclusively to the first occurrence of the minimum value in cases of ties, rather than splitting the gradient evenly among all tied minimum values, thereby addressing issue #97688.

  • URL: pull/97840
  • Merged: No
  • Associated Commits: 70e89, a7d64, 23d23, fc510, 6ea96, f5f75, b336a, 7e8c3, c7df5, 8475d, 097f1, cc479, aea34

3. 3 Dateein geändert ("3 files changed"): This pull request proposes updates to three files involving dependency version bumps, specifically upgrading the google/osv-scanner-action GitHub Action from 2.0.3 to 2.1.0 and updating the Ubuntu base images in multiple Docker-related directories, but it was not merged.

  • URL: pull/98129
  • Merged: No
  • Associated Commits: d795b, d8685, 2808f, 2e243, 03983, 6d2e5, 1b8c1
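
The difference between the two gradient behaviors described in the second pull request above can be shown with a small plain-Python sketch. It models only a one-dimensional reduction with a scalar upstream gradient. Both function names are hypothetical and are not the API proposed in the pull request.

```python
def reduce_min_grad_split(values, upstream=1.0):
    """Split the upstream gradient evenly among all tied minimum entries."""
    m = min(values)
    num_ties = sum(1 for v in values if v == m)
    share = upstream / num_ties
    return [share if v == m else 0.0 for v in values]


def reduce_min_grad_first(values, upstream=1.0):
    """Route the whole upstream gradient to the first minimum entry."""
    first = values.index(min(values))
    return [upstream if i == first else 0.0 for i in range(len(values))]


values = [3.0, 1.0, 1.0, 2.0]
split_grad = reduce_min_grad_split(values)
first_grad = reduce_min_grad_first(values)
```

With two tied minima, the split variant assigns half of the gradient to each, while the deterministic variant assigns all of it to the entry at index 1.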

Other Closed Pull Requests

  • Code cleanup and compatibility fixes: Multiple pull requests address code quality and compatibility issues, including enabling explicit conversion from llvm::StringRef to absl::string_view to resolve compatibility problems, and removing unused variables to fix compiler warnings. These changes improve code maintainability and reduce build-time warnings.
  • pull/97525, pull/97484
  • Build and release improvements: Some pull requests focus on improving the build and release process, such as fixing the release wheels build by cherry-picking a specific commit and enabling 16KB page size alignment in a GPU library to resolve warnings and maintain ecosystem consistency. These updates ensure smoother builds and better runtime behavior.
  • pull/97728, pull/96702
  • Plugin registration safety: One pull request introduces a thread-safe atomic flag to prevent duplicate registration of cuDNN, cuFFT, and cuBLAS plugin factories, eliminating recurring error-level warnings without impacting functionality or performance. This change enhances the robustness of plugin management in TensorFlow.
  • pull/97720
  • Spam and inappropriate submission: One pull request was identified as spam; its commit references a local file path related to artificial intelligence materials, and it was left unmerged. This highlights the need for careful review of contributions.
  • pull/97994

3.3 Pull Request Discussion Insights

This section will analyze the tone and sentiment of discussions within this project's open and closed pull requests that occurred within the past week. It aims to identify potentially heated exchanges and to maintain a constructive project environment.

  1. I'm a spammer
    • Toxicity Score: 0.75 (Rapid escalation, dismissive tone, calls for moderation, unresponsiveness)
    • This GitHub conversation involves a user whose contributions are perceived as spam by others, leading to a generally negative and dismissive tone from multiple participants. The original poster appears unresponsive or indifferent to feedback, which triggers frustration and calls for moderation from other users. The interaction remains focused on managing the disruptive behavior rather than technical discussion, with some users expressing concern about the impact on the community.

IV. Contributors

4.1 Contributors

Active Contributors:

We consider an active contributor in this project to be any contributor who has made at least 1 commit, opened at least 1 issue, created at least 1 pull request, or made more than 2 comments in the last month.

If there are more than 10 active contributors, the list is truncated to the top 10 based on contribution metrics for better clarity.

Contributor            Commits  Pull Requests  Issues  Comments
tensorflower-gardener       90              0       0         0
harshithn31                 71              0       0         0
CloudSmallInsect             0              0       8        27
mihaimaruseac                0              0       0        28
ezhulenev                   20              0       0         0
Venkat6871                   1              1       0        13
No author found             11              0       0         0
dhruv-dhiman122              6              1       0         2
gaikwadrahul8                3              2       0         4
jiren-the-gray               0              0       8         0
