Weekly GitHub Report for Tensorflow - 2024-12-02 12:00:41
Thank you for subscribing to our weekly newsletter! Each week, we deliver a comprehensive summary of your GitHub project's latest activity right to your inbox, including an overview of your project's issues, pull requests, contributors, and commit activity.
Table of Contents
I. News
1.1 Recent Version Releases:
The current version of this repository is v2.18.0
1.2 Other Noteworthy Updates:
II. Issues
2.1 Top 5 Active Issues:
We consider active issues to be issues that have been commented on most frequently within the last week.
- Results on PC and on Android are very different: This issue describes a problem where the user is experiencing significantly different results when running a classification model, EfficientNetB0, on a PC compared to an Android device, despite using the same image and model. The user is seeking assistance to understand if there is an error in their implementation, particularly in the way images are processed and predictions are made on Android.
  - The comments section involves a discussion where the user is asked to provide the TensorFlow version used, and they share the versions for both PC and Android. The user is requested to provide a minimal reproducible example, and they share a GitHub repository link and model files. There is a back-and-forth about the operating system used and attempts to replicate the issue, with suggestions to check image processing steps. The conversation includes troubleshooting steps, such as verifying image cropping and sharing additional code snippets, but the issue remains unresolved as the discrepancy persists.
  - Number of comments this week: None
- tf.gather and workarouds are very slow on TPU: This issue highlights a performance problem with the tf.gather function when used on TPUs, which is significantly slower compared to its performance on GPUs. The user is seeking a solution to improve the efficiency of the tf.one_hot and tf.einsum workaround, which could greatly enhance the pretraining of DeBERTa models on TPUs.
  - The comments discuss the need for reproducible code to better understand the issue, with the user providing several code snippets to demonstrate the performance differences between GPUs and TPUs. The conversation includes detailed timing results for different versions of the code, showing that the TPU takes significantly longer to execute the same operations compared to the GPU. The discussion also explores different implementations, such as using tf.one_hot with tf.einsum and tf.gather, to identify the most efficient approach for TPUs.
  - Number of comments this week: None
- This method creates a model with a 100% memory leak loop using model. fit(): This issue reports a memory leak problem when using the model.fit() method in TensorFlow version 2.18, which persists despite attempts to clear memory, as evidenced by the increasing memory usage over time. The problem is identified as a bug in TensorFlow, where resources created by TensorFlow functions are not being garbage collected, leading to memory accumulation.
  - The comments discuss the memory leak issue, with one user confirming the problem on their setup and suggesting it be reported to the Keras repository. Another user identifies the source of the leak as a TensorFlow bug, noting that similar issues do not occur with JAX or PyTorch, and provides links to related discussions and responses from the Keras team, indicating the issue is indeed with TensorFlow.
  - Number of comments this week: None
- Very serious! Using this method will definitely result in memory leaks, I hope you can provide support: This issue reports a memory leak problem when using a specific method in TensorFlow version 2.18, which persists despite attempts to clear memory, as evidenced by a consistent upward trend in memory usage over time. The user has provided a detailed code snippet to reproduce the issue, indicating that the problem occurs on both Ubuntu 2.2 and Mac M1 platforms with Python 3.11.
  - The comments discuss a user's interest in contributing to the issue, with guidance provided on how to submit a pull request for TensorFlow. Another comment mentions feedback from the Keras team, confirming that the issue has been long-standing and originates from TensorFlow, with references to related issues in Keras repositories.
  - Number of comments this week: None
- TF_SelectV2Op gets legalized to TFL_SelectOp: This issue involves the incompatibility of TensorFlow's tf.where function when converted to TFLite Micro, because TFLite Micro supports only the newer SELECT_V2 operation and not the older SELECT operation. The problem arises from a change made five years ago to improve compatibility with older runtimes, which now disrupts compatibility with TFLite Micro, leading to a need for resolution.
  - The comments discuss the need for someone to look into the issue, with a suggestion to involve a specific developer. The issue is considered more relevant to AI-Edge-Torch, and it is suggested to move the discussion there. The original poster expresses concern about the potential need to migrate from TensorFlow to PyTorch, as their current project relies on TensorFlow.
  - Number of comments this week: None
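The tf.gather report above contrasts a direct lookup with the tf.one_hot and tf.einsum workaround. The following is a minimal, framework-free sketch of why the two give the same result and why the workaround does more arithmetic. It uses plain Python lists rather than TensorFlow, and the helper names (gather_rows, one_hot, gather_via_one_hot) are hypothetical, chosen only for illustration. It does not reproduce the timings reported in the issue.

```python
def gather_rows(table, indices):
    """Direct lookup: read one row of `table` per index."""
    return [list(table[i]) for i in indices]


def one_hot(index, depth):
    """Return a list of length `depth` with 1.0 at `index` and 0.0 elsewhere."""
    return [1.0 if j == index else 0.0 for j in range(depth)]


def gather_via_one_hot(table, indices):
    """Same lookup expressed as a one-hot matrix times the table.

    Every output element sums over all rows of the table, so the work per
    lookup grows with the number of rows instead of staying constant.
    """
    depth = len(table)
    width = len(table[0])
    result = []
    for i in indices:
        weights = one_hot(i, depth)
        row = [
            sum(weights[v] * table[v][d] for v in range(depth))
            for d in range(width)
        ]
        result.append(row)
    return result


if __name__ == "__main__":
    # Hypothetical 3-row embedding table, for illustration only.
    table = [[0.5, 1.5], [2.0, -3.0], [4.25, 0.0]]
    indices = [2, 0, 2]
    assert gather_via_one_hot(table, indices) == gather_rows(table, indices)
    print(gather_rows(table, indices))
```

The one-hot path multiplies every row of the table by a weight for each lookup, whereas the direct path reads a single row per index. How either path maps onto TPU hardware is outside what this sketch can show.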
2.2 Top 5 Stale Issues:
We consider stale issues to be issues that have had no activity within the last 30 days. The team should work together to get these issues resolved and closed as soon as possible.
As of our latest update, there are no stale issues for the project this week.
2.3 Open Issues
This section lists, groups, and then summarizes issues that were created within the last week in the repository.
Issues Opened This Week: 144
Summarized Issues:
- Documentation Bugs: The documentation for TensorFlow contains errors that need addressing. One issue involves a non-functional contact email address, leading to error messages when used. Another issue highlights outdated documentation for tfl.reverse_v2, which inaccurately states support for only a single axis, while the kernel implementation supports multiple axes.
- TensorFlow Operation Bugs: Several bugs have been identified in TensorFlow operations across different versions. The ParameterizedTruncatedNormal operation causes crashes on specific Linux setups, while the UnBatch operation also leads to crashes when used with a scalar batch_index. Additionally, the RaggedBincount operation suffers from integer overflow errors, and the SparseMatrixSparseCholesky operation has a heap-buffer-overflow bug.
- Memory Leaks in TensorFlow: Memory leaks have been reported in TensorFlow, particularly in version 2.18. The model.fit() method causes memory usage to increase over time, even with attempts to clear sessions and collect garbage. This issue is linked to the improper garbage collection of resources created by TF functions.
- TensorFlow Lite Conversion Issues: Users have encountered issues when converting TensorFlow models to TensorFlow Lite. Problems include execution failures due to unsupported operations and discrepancies in model performance post-conversion. These issues suggest potential mismatches between TensorFlow and TensorFlow Lite versions or conversion process errors.
- Mathematical Inconsistencies in TensorFlow: TensorFlow version 2.17.1 has been reported to produce inconsistent results in mathematical operations. The tf.math.log1p function shows discrepancies with NumPy's np.log1p when handling complex inputs with infinity. Additionally, operations like sin, cos, and exp yield different results on CPU and GPU for complex numbers containing inf.
- Compilation and Build Errors: Compilation errors have been reported in TensorFlow projects due to various reasons. These include issues with Microsoft Visual C++ and the nsync library, as well as problems with TensorFlow Lite delegates when using specific compilers. Additionally, building TensorFlow from source on Windows can fail due to path discrepancies.
- Privacy Breaches and Social Media Backlash: Multiple issues involve privacy breaches of Pakistani TikTokers, whose explicit videos were leaked and went viral. This led to intense trolling and the deactivation of their social media accounts, raising significant concerns about the privacy and security of social media influencers.
- issues/tensorflow/tensorflow/issues/81539
- issues/tensorflow/tensorflow/issues/81566
- issues/tensorflow/tensorflow/issues/81571
- issues/tensorflow/tensorflow/issues/81576
- issues/tensorflow/tensorflow/issues/81580
- issues/tensorflow/tensorflow/issues/81588
- issues/tensorflow/tensorflow/issues/81609
- issues/tensorflow/tensorflow/issues/81614
- issues/tensorflow/tensorflow/issues/81619
- issues/tensorflow/tensorflow/issues/81625
- issues/tensorflow/tensorflow/issues/81630
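The memory leak reports summarized in this section describe memory usage that keeps rising across repeated model.fit() calls, even after clearing sessions and collecting garbage. One way to check for that pattern is to record memory after each repetition and look for a steady upward trend. The sketch below does this with Python's standard tracemalloc module. The names measure_growth, leaky_step, and clean_step are hypothetical, and the two step functions are stand-ins for a training call, not code from the issues. tracemalloc only sees allocations made through Python's memory allocators, so memory held by native libraries may need a process-level measurement instead.

```python
import gc
import tracemalloc


def measure_growth(step, iterations=5):
    """Run `step` repeatedly and return traced memory in bytes after each run."""
    tracemalloc.start()
    readings = []
    try:
        for _ in range(iterations):
            step()
            gc.collect()  # mirror the "collect garbage" attempt from the reports
            current, _peak = tracemalloc.get_traced_memory()
            readings.append(current)
    finally:
        tracemalloc.stop()
    return readings


_retained = []  # hypothetical stand-in for resources that are never released


def leaky_step():
    """Stand-in workload that keeps a reference to every buffer it allocates."""
    _retained.append(bytearray(100_000))


def clean_step():
    """Stand-in workload whose buffer is freed as soon as the call returns."""
    bytearray(100_000)


if __name__ == "__main__":
    print("leaky:", measure_growth(leaky_step))
    print("clean:", measure_growth(clean_step))
```

A series that climbs by a roughly constant amount per iteration, as leaky_step produces here, matches the upward trend the reports describe. A flat series suggests the workload releases what it allocates.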
2.4 Closed Issues
This section lists, groups, and then summarizes issues that were closed within the last week in the repository. This section also links the associated pull requests if applicable.
Issues Closed This Week: 116
Summarized Issues:
- TensorFlow Lite Conversion Issues: Several issues have been reported regarding the conversion of models to TensorFlow Lite (TFLite), highlighting challenges such as non-converted operations, discrepancies in quantization processes, and compatibility with specific operations. Users have encountered problems with model conversion, including failures during the calibration step, runtime errors due to non-broadcastable shapes, and the need for selective builds to reduce binary size. These issues emphasize the need for improved conversion processes and support for various operations to ensure efficient and accurate model deployment.
- TensorFlow Lite Inference Performance: Users have reported significant performance discrepancies when running TensorFlow Lite models on different platforms, particularly between Android and iOS devices. Issues include slower inference times on Android compared to iOS, unexpected crashes during inference, and inefficiencies in GPU delegation policies. These performance challenges highlight the need for optimization and consistent performance across different devices and platforms.
- TensorFlow Lite Build and Compatibility Issues: Various issues have been reported related to building TensorFlow Lite on different platforms, including Windows, macOS, and Android. Users have encountered build failures due to missing symbols, linker errors, and compatibility with specific architectures or toolchains. These challenges underscore the need for comprehensive build documentation and support for diverse development environments.
- TensorFlow Lite Feature Requests: Users have submitted feature requests to enhance TensorFlow Lite's capabilities, such as supporting new data types, improving delegate support, and enabling dynamic shapes. These requests aim to expand TensorFlow Lite's functionality to better accommodate diverse use cases and improve model performance and compatibility across different platforms.
- TensorFlow Lite Runtime Errors and Crashes: Several issues have been reported regarding runtime errors and crashes when using TensorFlow Lite, often related to specific operations or delegates. Users have experienced segmentation faults, null pointer dereferences, and memory leaks, which can hinder model deployment and require debugging and resolution to ensure stable and reliable inference.