Weekly GitHub Report for Tensorflow: February 03, 2025 - February 10, 2025
Thank you for subscribing to our weekly newsletter! Each week, we deliver a comprehensive summary of your GitHub project's latest activity right to your inbox, including an overview of your project's issues, pull requests, contributors, and commit activity.
I. News
1.1 Recent Version Releases:
The current version of this repository is v2.18.0
1.2 Version Information:
The TensorFlow 2.18.0 release, created on October 21, 2024, introduces several key updates, including the addition of a fourth parameter to the `TfLiteOperatorCreate` function for a cleaner API, the disabling of TensorRT support in CUDA builds, and the implementation of Hermetic CUDA for more reproducible builds. Notable improvements include default support for NumPy 2.0, enhancements in `tf.lite` such as support for `TensorType_INT4` and `TensorType_INT16`, and new features in `tf.data` for improved memory and throughput management.
II. Issues
2.1 Top 5 Active Issues:
We consider active issues to be issues that have been commented on most frequently within the last week. Bot comments are omitted.
- DRQ (Dynamic Range Quantization) - which ops are affected?: A user seeks clarification on which operations are affected by Dynamic Range Quantization (DRQ) when it is applied to transformer models, specifically whether computations are performed in int8 precision for both weights and activations. The user also wants to understand the impact of DRQ on fully connected layers within transformers and asks how to verify these changes using available tools and documentation.
- The comments provide detailed explanations about how DRQ affects fully connected layers in transformers, confirming that computations occur in int8 precision while activations are stored in float32. The discussion includes suggestions for tools to visualize model changes and references to specific TensorFlow Lite source code files for further exploration. The user expresses gratitude for the detailed responses and seeks additional guidance on locating relevant code within the TensorFlow Lite framework.
- Number of comments this week: 6
- Stateful LSTM bug with batch size: This issue pertains to a bug in TensorFlow 2.18 where the `batch_input_shape` argument is not recognized as valid when implementing a stateful LSTM model with a fixed batch size, causing errors similar to a previously reported issue. Additionally, the `model.reset_states()` function appears to be malfunctioning, and the user provides code to reproduce the problem.
- The comments discuss the validity of the `batch_input_shape` parameter in TensorFlow 2.18, suggesting that it should still work if batch sizes are consistent. A user reports encountering the same error with the provided code, and another suggests using an alternative Keras version to resolve compatibility issues. The conversation highlights confusion due to different Keras versions in TensorFlow 2.18 and 2.13, and the need for further clarification from the Keras team.
- Number of comments this week: 5
- inconsistent result of `tf.raw_ops.BiasAddGrad` on CPU and GPU: This issue reports a bug in TensorFlow where the `tf.raw_ops.BiasAddGrad` operation produces inconsistent results when executed on CPU versus GPU, specifically with TensorFlow version 2.18 on a Linux Ubuntu 22.04 system using a Tesla T4 GPU. The problem is demonstrated with a code snippet that shows a discrepancy in the output tensors' values and a failed consistency check between the two processing units.
- The comments discuss the potential cause of the issue, suggesting it might be specific to GPU due to precision errors in calculations, and note that using float64 precision resolves the problem. There is a reference to a similar issue and a suggestion to close duplicate issues for better tracking. Additionally, there is a discussion about the acceptable tolerance thresholds for the data type used and a clarification that the core issue might differ from a similar reported issue.
- Number of comments this week: 4
- User Guide: Deprecated Nvidia Docker Link: This issue highlights a documentation bug in the TensorFlow project, where the user guide incorrectly links to the now-archived Nvidia Docker repository instead of the active Nvidia Container Toolkit repository. The problem requires the TensorFlow documentation team to update the outdated link to ensure users are directed to the correct resources for enabling GPU support on Linux.
- The comments discuss the incorrect link in the TensorFlow documentation, with suggestions to update it to the new Nvidia Container Toolkit repository. A user provides the correct link and suggests transitioning to the new setup. Another user thanks the reporter and confirms that an internal request has been made to update the documentation.
- Number of comments this week: 4
- Wheels have different metadata on different platforms: This issue highlights a problem with the metadata of TensorFlow wheels being inconsistent across different platforms, which affects the ability of Python resolvers like poetry and uv to create universal lockfiles that work on any platform. The inconsistency arises because the `METADATA` file for TensorFlow wheels differs between Windows and Unix-based systems, leading to different lockfiles depending on the platform from which the metadata is read.
- The comments discuss the responsibility of maintaining certain TensorFlow packages and suggest trying the latest version to see if the issue persists. The original poster clarifies their request for consistent metadata across platforms, while another commenter argues that the issue lies with poetry and uv, not TensorFlow. The discussion also touches on the lack of PEP specifications regarding this behavior and the performance implications of the current setup.
- Number of comments this week: 3
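To make the wheel metadata item more concrete, here is a minimal sketch of how a resolver-style tool might compare the dependency lists declared by two wheels of the same release. The package names, versions, and `METADATA` fragments are hypothetical and are not taken from the actual TensorFlow wheels; the helper names `requires_dist` and `metadata_diff` are also made up for illustration. The sketch relies only on the fact that wheel `METADATA` files use an email-style header format.

```python
from email.parser import Parser


def requires_dist(metadata_text: str) -> set[str]:
    """Return the Requires-Dist entries of a METADATA file, whitespace-normalized."""
    msg = Parser().parsestr(metadata_text)
    return {" ".join(v.split()) for v in (msg.get_all("Requires-Dist") or [])}


def metadata_diff(a: str, b: str) -> set[str]:
    """Entries declared by one wheel but not by the other."""
    return requires_dist(a) ^ requires_dist(b)


# Hypothetical fragments: the two platform wheels declare different dependencies.
LINUX_METADATA = """\
Metadata-Version: 2.1
Name: example-pkg
Version: 1.0
Requires-Dist: numpy>=1.26
Requires-Dist: example-io-backend>=0.3
"""

WINDOWS_METADATA = """\
Metadata-Version: 2.1
Name: example-pkg
Version: 1.0
Requires-Dist: numpy>=1.26
"""

# Hypothetical alternative: every wheel carries the same list, and an
# environment marker expresses the platform difference instead.
UNIFORM_METADATA = """\
Metadata-Version: 2.1
Name: example-pkg
Version: 1.0
Requires-Dist: numpy>=1.26
Requires-Dist: example-io-backend>=0.3; sys_platform != "win32"
"""

print(metadata_diff(LINUX_METADATA, WINDOWS_METADATA))
print(metadata_diff(UNIFORM_METADATA, UNIFORM_METADATA))
```

A non-empty difference means a lockfile built from one wheel's metadata may not describe the other platform, which is the behavior the issue reporter describes. Whether the fix belongs in the wheels or in the resolvers is the open question in the discussion.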
2.2 Top 5 Stale Issues:
We consider stale issues to be issues that have had no activity within the last 30 days. The team should work together to get these issues resolved and closed as soon as possible.
As of our latest update, there are no stale issues for the project this week.
2.3 Open Issues
This section lists, groups, and then summarizes issues that were created within the last week in the repository.
Issues Opened This Week: 5
Summarized Issues:
- Inconsistent Results Between CPU and GPU in TensorFlow 2.18: Several issues have been reported regarding TensorFlow 2.18 where operations like `tf.raw_ops.LogSoftmax`, `tf.raw_ops.Tan`, and `tf.raw_ops.Rsqrt` produce inconsistent results between CPU and GPU executions. These inconsistencies are demonstrated by code snippets showing significant differences in output values and failed consistency checks, highlighting potential problems in cross-platform computation accuracy.
- Unexpected Import Path in TensorFlow's mypy-protobuf: An issue has been identified with mypy-protobuf generating unexpected import paths during the creation of stub files for TensorFlow's .proto files. The expected import path is incorrectly generated, leading to confusion about Bazel configuration settings and whether manual adjustments are necessary to resolve the import path discrepancies.
- Model Training Freeze on macOS with Multi-GPU Strategy in TensorFlow 2.13: A bug in TensorFlow 2.13 causes the process of training a model using multiple GPUs with `tf.distribute.MultiWorkerMirroredStrategy` on macOS systems to freeze without any error messages. This issue persists even when using TensorFlow Nightly, despite following the official documentation, indicating a potential problem with the strategy's compatibility on macOS.
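For readers following the CPU/GPU inconsistency reports, one possible contributor to such differences is the order in which float32 values are accumulated, since floating-point addition is not associative. The sketch below is illustrative only: it emulates float32 rounding in pure Python and compares a left-to-right sum with a tree-shaped sum. It does not reproduce any TensorFlow kernel, and the function names are hypothetical.

```python
import math
import struct


def to_f32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sequential_sum_f32(values):
    """Left-to-right float32 accumulation, as a simple serial loop might do."""
    total = 0.0
    for v in values:
        total = to_f32(total + to_f32(v))
    return total


def pairwise_sum_f32(values):
    """Tree-shaped float32 accumulation, loosely resembling a parallel reduction."""
    vals = [to_f32(v) for v in values]
    if not vals:
        return 0.0
    while len(vals) > 1:
        nxt = [to_f32(vals[i] + vals[i + 1]) for i in range(0, len(vals) - 1, 2)]
        if len(vals) % 2:
            nxt.append(vals[-1])  # carry the unpaired element to the next round
        vals = nxt
    return vals[0]


values = [1e8, 1.0, -1e8, 1.0]
seq = sequential_sum_f32(values)
tree = pairwise_sum_f32(values)
exact = math.fsum(values)  # float64 reference

print(seq, tree, exact)  # 1.0 0.0 2.0
print(math.isclose(seq, tree, rel_tol=1e-6))  # False
```

Both float32 orderings lose information here while the float64 reference does not, which is why consistency checks are usually written with a tolerance appropriate to the data type rather than as an exact comparison.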
2.4 Closed Issues
This section lists, groups, and then summarizes issues that were closed within the last week in the repository. This section also links the associated pull requests if applicable.
Issues Closed This Week: 19
Summarized Issues:
- TensorFlow Lite Compilation Issues: The TensorFlow Lite library faced a compilation problem due to a regular expression in the build configuration that incorrectly matched directory names containing "test_.*". This caused certain files not to be compiled, leading to linking issues, which were resolved by updating the regular expression for strict filename matching.
- TensorFlow Custom Layer Output Shape Bug: A bug in TensorFlow's `model.summary()` function failed to display the correct output shape for custom layers like `DenseQKan` and `Rescale`. The output shape was incorrectly shown as `None` instead of `[(None, 10)]`, and the issue was resolved by reshaping the output tensor within the custom layer's `call` method.
- TensorFlow Lite Interpreter Initialization Error: Users encountered a `java.lang.UnsatisfiedLinkError` when initializing a TensorFlow Lite interpreter on a Samsung S20 device using version 2.15.0. This error was due to the failure to load the native TensorFlow Lite methods, specifically the "libtensorflowlite_jni_gms_client.so" library, indicating missing or improperly loaded native libraries.
- TensorFlow Compatibility and Installation Issues: Users faced compatibility issues with TensorFlow on Windows 10 using Python 3.10, particularly due to the lack of support for Python 3.12, resulting in errors during model training. Additionally, installation difficulties were reported on macOS using Poetry, where TensorFlow version 2.18.0 could not find installation candidates, and on Windows 11 Pro, where the absence of Visual C++ build tools caused a build error.
- TensorFlow Convolution and Pooling Function Bugs: Several bugs were reported in TensorFlow's convolution and pooling functions, including `tf.nn.conv3d_transpose`, `tensorflow.keras.backend.conv2d`, and `tensorflow.nn.max_pool1d`, where specific inputs caused crashes due to invalid argument errors or deprecated API usage. These issues were often related to stride and kernel size parameters that exceeded acceptable ranges.
- TensorFlow Operation and Argument Errors: Bugs in TensorFlow operations like `tensorflow.raw_ops.RecordInput` and `ScatterNdNonAliasingAdd` caused crashes due to invalid argument errors, such as negative batch sizes or mismatched dimensions. These issues were reported by users running custom code on Ubuntu 20.04 with TensorFlow versions 2.16.1 and 2.17.0.
- TensorFlow Documentation and Feature Requests: Users requested documentation improvements for TensorFlow, particularly regarding the installation of `tensorflow[and-cuda]` on Linux Fedora 41 with Python 3.12, where the `LD_LIBRARY_PATH` was not automatically configured to include necessary NVIDIA shared libraries. A workaround was suggested to dynamically set the `LD_LIBRARY_PATH` to resolve the issue.
- TensorFlow Compilation Guidance: Users sought guidance on compiling TensorFlow on Debian 12 with specific compiler flags to enable AVX2, AVX512F, and FMA instructions. They encountered messages suggesting a rebuild with appropriate flags when using Jupyter, indicating a need for proper compilation instructions.
- TensorFlow Hub KerasLayer Integration Error: A ValueError occurred when adding a TensorFlow Hub KerasLayer to a `tf.keras.Sequential` model, with an error message incorrectly stating that only instances of `keras.Layer` can be added. The issue was resolved by using a compatible version of Keras (tf-keras) with the Sequential model.
- TensorFlow Python Version Compatibility Concerns: Users expressed concerns about TensorFlow version 2.18.0's lack of compatibility with Python versions 3.12 and 3.13. They questioned why TensorFlow had not been updated to support these newer Python versions despite Python 3.12 being available for over a year.
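The TensorFlow Lite compilation item in the list above comes down to a pattern that was applied to whole paths when only filenames were intended. The following sketch illustrates that class of bug with Python's `re` module. The patterns and paths are hypothetical stand-ins, not the expressions from the TensorFlow Lite build configuration.

```python
import re
from pathlib import PurePosixPath

# Hypothetical loose pattern: searched anywhere in the full path, so a
# directory whose name contains "test_" also triggers it.
LOOSE = re.compile(r"test_.*")

# Hypothetical strict pattern: matched against the filename only.
STRICT = re.compile(r"^test_.*\.cc$")


def excluded_loose(path: str) -> bool:
    """True if the loose pattern would drop this source file from the build."""
    return LOOSE.search(path) is not None


def excluded_strict(path: str) -> bool:
    """True if the filename itself looks like a test source file."""
    return STRICT.match(PurePosixPath(path).name) is not None


paths = [
    "kernels/test_util.cc",            # a genuine test helper
    "my_test_project/kernels/add.cc",  # regular source under a "test_" directory
]

for p in paths:
    print(p, excluded_loose(p), excluded_strict(p))
```

With the loose pattern, the regular source file under `my_test_project/` is excluded and would later surface as a missing symbol at link time. Anchoring the match to the filename keeps it in the build.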
2.5 Issue Discussion Insights
This section will analyze the tone and sentiment of discussions within this project's open and closed issues that occurred within the past week. It aims to identify potentially heated exchanges and to maintain a constructive project environment.
Based on our analysis, there are no instances of toxic discussions in the project's open or closed issues from the past week.
III. Pull Requests
3.1 Open Pull Requests
This section provides a summary of pull requests that were opened in the repository over the past week. The top three pull requests with the highest number of commits are highlighted as 'key' pull requests. All other pull requests are grouped based on similar characteristics for easier analysis.
Pull Requests Opened This Week: 6
Key Open Pull Requests
1. Qualcomm AI Engine Direct - Support DUS and Pack Op in LiteRT: This pull request adds support for the DUS and Pack operations in LiteRT for the Qualcomm AI Engine Direct. DUS is supported only for specific cases, and Pack is implemented with QNN Concat to address type support issues. The pull request depends on another pull request (https://github.com/tensorflow/tensorflow/pull/85477), and its own changes are primarily in the latest four commits.
- URL: pull/86630
- Merged: No
- Associated Commits: 4833a, 725f5, dd3f2, 06e06, c726a, 17b9a, 813ad, 508ac, 1faad, d349e, 6bd5f, 7f2f2, 03109, 10d04
2. nvrtc-builtins are private don't link to them: This pull request addresses CUDA compatibility issues by proposing a patch to avoid linking to the private `lib/libnvrtc-builtins.so.12.6.85` library, which is not compatible across different CUDA versions, as discussed in a related comment on the conda-forge/tensorflow-feedstock repository.
- URL: pull/86413
- Merged: No
3. [ROCm] Enable unsafe fp atomics and cleanup gpu_device_functions.h: This pull request aims to enable unsafe floating-point atomics and clean up the `gpu_device_functions.h` file in the TensorFlow project, as indicated by the commit message and the associated changes.
- URL: pull/86704
- Merged: No
- Associated Commits: d8114
Other Open Pull Requests
- Qualcomm AI Engine Direct table updates: This topic involves validating existing system-on-chip (SoC) information and adding new SoC details to the Qualcomm AI Engine Direct table in the TensorFlow project. The pull request aims to ensure the table is up-to-date with the latest SoC information, enhancing the accuracy and utility of the table for developers.
- Documentation link fixes: This topic addresses the issue of broken documentation links in the `play_services.md` file of the TensorFlow project. The pull request updates three non-functional links to ensure users can access the correct documentation, improving the overall user experience.
- Performance enhancement for GNN models: This topic focuses on enhancing performance by preventing unnecessary copies of large constants across multiple clusters. The pull request particularly benefits GNN models with substantial constant data from graph embeddings and addresses issues from a previous pull request.
3.2 Closed Pull Requests
This section provides a summary of pull requests that were closed in the repository over the past week. The top three pull requests with the highest number of commits are highlighted as 'key' pull requests. All other pull requests are grouped based on similar characteristics for easier analysis.
Pull Requests Closed This Week: 15
Key Closed Pull Requests
1. Gfx950 platform arch support: This pull request aims to add support for the AMD GPU gfx950 platform to the TensorFlow project, as indicated by the body of the pull request. It was closed without being merged.
- URL: pull/86675
- Merged: No
- Associated Commits: cc4a2, b3fcb, b1a42, a3440, 82c8c, c3a1f, 50990, bfe7f, a8220, bca3b, 70706, 1cfeb, 9c905, 50a78, 16101, aae89, ab856, 74698, 60a2d, 18778, 60fc5, fa3a0, abaaa, 184dc, b0af0, c4679, 6d4a5, dc439, f8a8e, 285cc, 096d7, e9be6, 5d5d0, 6a78b, 075b5, 60b8a, 7b52e, 8b7fc, 150bb, f1d1a, 7219d, 426c0, d88d5, 1c607, 0d724, 90403, b7c50, b00c7, 78a7e, 5c652, b488d, ba815, 2dba9, 34d80, ea53c, 40e3e, f65e5, 5a71c, 83713, 70b4d, abb24, 642ae, b6821, 15640, c4958, b6b66, 6d19b, 4c853, d1038, 54e1f, d8198, d860d, c0b36, 18d5c, 7946a, 24c78, 0c992, 2bab5, 89810, 4cc12, 6aa74, 63752, 57188, dc9fb, a91e4, 75ee5, 1e3c6, d40d9, 31d9c, ec39e, 94d31, f42a7, 94f61, 71a8f, d2e43, 1e3ef, ef2d4, 5bad4, 8f1cb, 37d76, e8844, 4556d, cf094, a2036, 33542, 4717f, b9bf3, 3ed87, cd4b0, f40da, a26fb, 038a6, 52994, 79309, 84c13, 4396a, 3c61e, 06702, 6e9f2, da5df, ff760, de1e4, b4e5e, ce1b2, 874c4, 451d1, 8fc12, c1526, 5dbad, 5c1aa, f7991, d377c, 7c927, f5f98, b5400, 1537a, cece9, 7ae16, e624e, 09335, 2211b, 2a929, 6c915, 07cab, 4d444, 575fd, 67b5d, a2bad, 468cf, 7ccb1, 58fc0, db5ae, 9e21f, da24c, 3515d, 8cf48, 09b52, ccced, a382c, 532df, b854f, 58d80, f4770, 3c370, e5dd7, 9f051, 1afb6, f854d, 554f8, 68c43, 2baac, 33a41, e554c, 1c5d7, 4d453, d7f29, 4ca78, 3ac1e, ed3e4, 5788e, 75503, 04c3e, e2acb, 188d1, 1d388, 87930, c1375, 66ca7, b0d33, 713b6, 09053, b6971, d9163, 76f24, 252a9, 7ec4c, 2464a, 8aedd, 071ee, 3a90e, 67d8f, ac4cb, 02686, 25619, e60b7, 70057, fd54d, b74d5, 72a9f, d71ec, f22b6, 8df41, 22772, b1e81, 34e1c, b45da, f8452, 03daf, 68e1e, e1d97, b3932, a5407, 28861, 79765, 467f7, cf3c4, 0e049, aeabf, 4c21b, 03b3d, 62c0d, b664f, 9019a, 496f7, d5f0c, 16c82, ae453, 9de27, eec13, 15ab9, 4bc26, a3f8d, 34a99, 89a93, 7b820, a80fd, 656ad, 5063b, 21b23, 1d17c
2. Replacement PR for #76210 Add support for quint8 type for uniform_quantize and uniform_ dequantize ops: This pull request, which successfully merged into the TensorFlow project, serves as a replacement for a previous pull request (#76210) and introduces support for the quint8 data type in the uniform_quantize and uniform_dequantize operations, along with several minor fixes and code adjustments.
- URL: pull/84497
- Merged: Yes
3. [NFC] Fix some minor typos.: This pull request addresses minor typographical errors in the TensorFlow project, as indicated by its title, and includes several commits aimed at fixing typos, making code compliant with pylint, adjusting keyword arguments, and reverting unrelated changes, ultimately resulting in a successful merge.
- URL: pull/85711
- Merged: Yes
Other Closed Pull Requests
- Optimization of `astype()` function in TensorFlow: This pull request modifies the `astype()` function to prevent unnecessary data copying by propagating `copy=None` instead of the default `copy=True`. It optimizes performance while still allowing copies when explicitly requested, and includes related unit tests and build file updates.
- Updates to align with TOSA v1.0 specification: The pull request updates TensorFlow to align with the TOSA v1.0 specification by adding NaN propagation mode support. It also changes the shift of the MUL operation to a tensor type and modifies the start and size of the slice operation to the TOSA shape type.
- Documentation enhancements in TensorFlow: This pull request updates the TensorFlow repository's documentation by adding a 'Common Issues and Troubleshooting' section to the README file. It also includes a 'QUICK START GUIDE' section to the CONTRIBUTING file, aimed at enhancing readability and maintainability for future contributors.
- Implementation of nested namespace in TensorFlow: The pull request involves the implementation of a nested namespace within the TensorFlow project. The changes were successfully merged into the main codebase, as indicated by the commit messages.
- Proposed addition of "stablehlo_case" operation: This pull request proposes the addition of a new operation called "stablehlo_case" along with its corresponding unit test to the TensorFlow project. However, it was not merged into the main codebase.
- Update to `execute.cc` file: This pull request involves an update to the `execute.cc` file in the TensorFlow project, associated with issue #58676. Despite the proposed changes, it was not merged into the main codebase.
- Correction of broken hyperlink in `create.md`: This pull request addresses the issue of a broken hyperlink in the `create.md` file by updating the link for the TensorFlow Lite Model Maker for text classification. The updated link has been successfully merged into the project.
- "Partition sample" pull request: This pull request, titled "partition sample," was submitted to the TensorFlow project on GitHub. It includes a single commit with the message "partition sample," but it was not merged into the main codebase.
- Validation and addition of SoC details: This pull request aims to validate existing system-on-chip (SoC) information and add new SoC details to the Qualcomm AI Engine Direct table in the TensorFlow project. The changes are intended to enhance the accuracy and comprehensiveness of the SoC data.
- Correction of typographical errors in documentation: This pull request addresses and corrects several typographical errors in the documentation strings of the TensorFlow project. The corrections have been successfully merged into the main codebase, as indicated by the commit with SHA 80bc3cfb868b7778217c4c1155a1d6039f2c9bd8.
- Introduction of basic wrappers for QNN types: This pull request introduces basic wrappers for QNN types to handle dynamic resources and make them independent of LiteRT/tflite. It includes specific wrappers for scalar parameters, tensor parameters, quantization parameters, tensors, and operation configurations.
- Support for Qualcomm Op Builders in LiteRT: The pull request introduces support for Qualcomm Op Builders in LiteRT by implementing operation wrappers and utilizing a `TensorPool` for managing intermediate tensors. It ensures the builders are independent of LiteRT/tflite while supporting a wide range of operations such as Add, Mul, and Softmax, among others.
3.3 Pull Request Discussion Insights
This section will analyze the tone and sentiment of discussions within this project's open and closed pull requests that occurred within the past week. It aims to identify potentially heated exchanges and to maintain a constructive project environment.
-
- Toxicity Score: 0.55 (Defensive responses, unresolved issues, critical tone)
- This GitHub conversation involves username1 expressing dissatisfaction with the progress of a pull request, while username2 responds defensively, leading to a tense exchange. Username3 attempts to mediate by suggesting a compromise, but username1 remains unconvinced, maintaining a critical tone. The conversation is marked by a lack of resolution and increasing frustration from username1.
- Qualcomm AI Engine Direct - Validate and update soc table
- Toxicity Score: 0.55 (Defensive responses, unresolved issues, escalating frustration.)
- This GitHub conversation involves username1 expressing concern over the accuracy of the information provided, while username2 responds defensively, leading to a tense exchange. Username3 attempts to mediate by suggesting a compromise, but username1 remains unsatisfied, escalating the tension. The tone shifts from collaborative to confrontational, with username1's frustration being a key trigger.
IV. Contributors
4.1 Contributors
Active Contributors:
We consider an active contributor in this project to be any contributor who has made at least 1 commit, opened at least 1 issue, created at least 1 pull request, or made more than 2 comments in the last month.
If there are more than 10 active contributors, the list is truncated to the top 10 based on contribution metrics for better clarity.
Contributor | Commits | Pull Requests | Issues | Comments |
---|---|---|---|---|
mihaimaruseac | 14 | 1 | 0 | 66 |
Venkat6871 | 3 | 3 | 0 | 31 |
tilakrayal | 4 | 3 | 0 | 26 |
gaikwadrahul8 | 2 | 2 | 0 | 26 |
weilhuan-quic | 21 | 4 | 0 | 0 |
arzoo0511 | 0 | 0 | 0 | 14 |
LongZE666 | 0 | 0 | 12 | 1 |
c8ef | 6 | 1 | 0 | 4 |
alekstheod | 9 | 1 | 0 | 0 |
i-chaochen | 10 | 0 | 0 | 0 |