Weekly GitHub Report for Pytorch: April 27, 2026 - May 04, 2026 (14:32:38)
Thank you for subscribing to our weekly newsletter! Each week, we deliver a comprehensive summary of your GitHub project's latest activity right to your inbox, including an overview of your project's issues, pull requests, contributors, and commit activity.
I. News
1.1 Recent Version Releases:
The current version of this repository is v2.6.0
1.2 Version Information:
Released on January 29, 2025, PyTorch 2.6 introduces significant enhancements including torch.compile support for Python 3.13, a new dynamic compilation control API torch.compiler.set_stance, and improved AOTInductor packaging and ABI compatibility. Notable highlights also include FP16 support on x86 CPUs, expanded Intel GPU support, FlexAttention for x86 CPUs targeting LLMs, and a backward-incompatible security improvement flipping the default of torch.load to weights_only=True, alongside the deprecation of official Conda package publishing.
II. Issues
2.1 Top 5 Active Issues:
We consider active issues to be issues that have been commented on most frequently within the last week. Bot comments are omitted.
As of our latest update, there are no active issues with ongoing comments this week.
2.2 Top 5 Stale Issues:
We consider stale issues to be issues that have had no activity within the last 30 days. The team should work together to get these issues resolved and closed as soon as possible.
As of our latest update, there are no stale issues for the project this week.
2.3 Open Issues
This section lists, groups, and then summarizes issues that were created within the last week in the repository.
Issues Opened This Week: 0
Summarized Issues:
As of our latest update, no issues were opened in the project this week.
2.4 Closed Issues
This section lists, groups, and then summarizes issues that were closed within the last week in the repository. This section also links the associated pull requests if applicable.
Issues Closed This Week: 2
Summarized Issues:
- ROCm CI Test Failures and Instability: The ROCm trunk distributed tests are timing out because of the rocshmem tests, which have been temporarily disabled while a fix is developed. Additionally, ROCm trunk CI jobs have become unstable: test failures were initially masked by a Kineto submodule update, and subsequent updates re-exposed them, so the ROCm jobs are marked as unstable while a fix is pursued.
- [issues/178884, issues/179911]
2.5 Issue Discussion Insights
This section will analyze the tone and sentiment of discussions within this project's open and closed issues that occurred within the past week. It aims to identify potentially heated exchanges and to maintain a constructive project environment.
Based on our analysis, there are no instances of toxic discussions in the project's open or closed issues from the past week.
III. Pull Requests
3.1 Open Pull Requests
This section provides a summary of pull requests that were opened in the repository over the past week. The top three pull requests with the highest number of commits are highlighted as 'key' pull requests. Other pull requests are grouped based on similar characteristics for easier analysis. Up to 25 pull requests are displayed in this section, while any remaining pull requests beyond this limit are omitted for brevity.
Pull Requests Opened This Week: 0
As of our latest update, no pull requests were opened in the project this week.
3.2 Closed Pull Requests
This section provides a summary of pull requests that were closed in the repository over the past week. The top three pull requests with the highest number of commits are highlighted as 'key' pull requests. Other pull requests are grouped based on similar characteristics for easier analysis. Up to 25 pull requests are displayed in this section, while any remaining pull requests beyond this limit are omitted for brevity.
Pull Requests Closed This Week: 16
Key Closed Pull Requests
1. [overlap] pre-bucketing of fsdp collectives: This pull request introduces a pre-bucketing strategy for Fully Sharded Data Parallel (FSDP) collectives in the overlap scheduling algorithm. It improves bucketing efficiency by calibrating bucket sizes based on bandwidth and latency, and it enables reliable detection of FSDP collectives even with irregular patterns from autoparallel shardings. The change includes configurable parameters and thorough testing to ensure correctness.
- URL: pull/179935
2. [ROCm] - Reduce generated CK kernel files and build by default: This pull request updates the ROCm build configuration to enable the CK kernel build by default while implementing various filters to reduce the number of generated CK kernel files, optimizing the build process.
- URL: pull/178310
3. torch.backends.fp32_precision setter propagate to cudnn.conv/rnn: This pull request addresses the issue where the torch.backends.fp32_precision setter did not propagate to cudnn.conv and cudnn.rnn modules by implementing a default handling mechanism, adding try-except blocks to prevent runtime errors, and providing a workaround suggestion.
- URL: pull/179750
Other Closed Pull Requests
- Cache isolation in PyTorch Dynamo: This pull request introduces a `region_id` flag to cache entries in PyTorch's Dynamo compiler, enabling multiple `torch.compile()` calls targeting the same function to maintain separate, region-specific caches. This prevents cross-region interference in cache lookups, recompile limits, and execution strategies while still allowing shared profile-guided optimizations across regions.
- Iterator protocol implementation: A `generic_iternext` function is added to `object_protocol.py` implementing CPython's `PyIter_Next` semantics, including checks for the `tp_iternext` slot and dispatching to `iternext_impl` on VariableTracker subclasses. This provides a standardized iterator next method and raises a `TypeError` if the object is not an iterator.
- Pipeline RECV deferral on AMD ROCm: This pull request adds a configurable flag to defer pipeline RECV operations on platforms like AMD ROCm, postponing RECVs until just before the compute operations that consume their data. This eliminates pipeline bubbles and avoids deadlocks by using a rank-parity peer-to-peer ordering strategy.
- Distributed and inductor test configurations for ROCm CI: Distributed tests with 3 shards on 4-GPU runners and inductor tests with 2 shards on single-GPU runners are added to the rocm-nightly continuous integration workflow. This setup mirrors existing periodic workflows to improve test coverage on ROCm platforms.
- Handling stale CUDA streams in autograd during CUDA graph capture: The pull request introduces a default behavior that raises a clear RuntimeError when autograd nodes hold stale references to the default CUDA stream during graph capture. It also adds an opt-in override flag that redirects stale non-capturing streams to the producer’s capturing stream to prevent crashes and enable successful graph capture.
- Source file mirroring moved from setup.py to CMake: Source file mirroring is moved from the setup.py script to CMake by copying source files into specific directories included in the wheel. This removes the previous mirroring logic from setup.py and handles file mirroring through cmake/FileMirroring.cmake for both scikit-build-core and setuptools builds.
- Package data installation support for scikit-build-core migration: A new cmake/PackageData.cmake file is added to support installing non-Python package data into the wheel, replacing setup.py's package_data configuration. It also introduces a fallback for the unused SKBUILD_PLATLIB_DIR to ensure the module remains self-contained.
- Fixes for PyTorch Inductor compiler CUDA issues: Two bugs are fixed: a device string comparison error causing the `e8m0_rceil_log2` pattern to fail on CUDA devices, and incorrect `uint8` outputs from `ceil(log2(...))` pipelines on pre-SM100 GPUs. The faulty PTX instruction is replaced with an exact IEEE 754 bit-manipulation method to ensure correct behavior across all CUDA hardware.
- Pluggable MaterializeFn hook in StorageImpl: A pluggable `MaterializeFn` function pointer hook is introduced to replace the hard-coded copy-on-write materialization logic in `StorageImpl`. This enables any backend to intercept write-path data pointer access in a general and extensible way while preserving existing copy-on-write semantics and improving modularity.
- Enhanced PyObject_GetItem dispatch in vt_getitem: The second branch of CPython's PyObject_GetItem dispatch is implemented by adding a `sq_item` branch and replacing the Python-level `hasattr(key_type, "__index__")` check with a more efficient C-level slot detection using `has_slot(NB_INDEX)`. This enables support for types that implement `sq_item` without `mp_subscript`, such as `collections.deque`.
- AOTInductor fallback ops support for grid sampler operations: C-shim support is added for `aten.grid_sampler_3d`, `aten.grid_sampler_3d_backward`, `aten.cudnn_grid_sampler`, and `aten.cudnn_grid_sampler_backward` operations to the AOTInductor fallback ops. This prevents fallback to the proxy executor and enables correct execution through `torch.compile(backend='inductor')` on CPU.
- Fix CUDA IPC deserialization mismatch: The pull request ensures that the handle type used in `ExpandableSegment::share()` is properly communicated and recognized during deserialization in `ExpandableSegment::fromShared()`. This prevents errors caused by uninitialized handle type states in newly spawned consumer processes.
- Pure-Python polyfill for unittest.TestCase.assertRaisesRegex in TorchDynamo: A pure-Python polyfill is added to bypass untraceable C-extension calls during regex assertions in TorchDynamo. This enables successful tracing of standard CPython 3.13 tests like `test_keyword_args` by redirecting `assertRaisesRegex` to the traceable `assertRaises` while preserving the core test intent.
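For readers unfamiliar with the iterator-next semantics mentioned in the iterator protocol item above, the following pure-Python sketch may help. It is not the code from the pull request: `iter_next_sketch` and `_EXHAUSTED` are hypothetical names, and looking up `__next__` on the object's type is used here only as a stand-in for a `tp_iternext` slot check.

```python
_EXHAUSTED = object()  # hypothetical sentinel standing in for "no more items"


def iter_next_sketch(obj):
    """Return the next item of obj, or _EXHAUSTED when iteration is finished."""
    # Look the method up on the type, not the instance, to mirror a type-slot check.
    next_method = getattr(type(obj), "__next__", None)
    if next_method is None:
        raise TypeError(f"'{type(obj).__name__}' object is not an iterator")
    try:
        return next_method(obj)
    except StopIteration:
        # Exhaustion is reported as a sentinel value rather than an exception.
        return _EXHAUSTED


it = iter([1, 2])
first = iter_next_sketch(it)
second = iter_next_sketch(it)
done = iter_next_sketch(it) is _EXHAUSTED
```

Passing a plain list (which is iterable but not an iterator) raises `TypeError` in this sketch, which matches the behavior the summary describes.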
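The Inductor item above mentions computing `ceil(log2(...))` through exact IEEE 754 bit manipulation. As a minimal illustration of that general idea (not the kernel code from the pull request), the sketch below reads the exponent and mantissa fields of a float32 directly. The function name `ceil_log2_float32` is hypothetical, and the sketch assumes a positive, normal input.

```python
import struct


def ceil_log2_float32(x: float) -> int:
    """ceil(log2(x)) for a positive, normal float32 value, using only its bits."""
    # Reinterpret the float32 encoding of x as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    sign = bits >> 31
    biased_exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    # This sketch rejects negative values, zero, subnormals, infinities, and NaN.
    if sign or biased_exponent == 0 or biased_exponent == 0xFF:
        raise ValueError("sketch only handles positive normal float32 values")
    exponent = biased_exponent - 127  # equals floor(log2(x)) for normal values
    # Any set mantissa bit means x lies strictly above 2**exponent, so round up.
    return exponent + (1 if mantissa else 0)
```

Because the result depends only on integer fields of the encoding, it does not inherit rounding error from a floating-point `log2` evaluation. Note that `x` is first rounded to float32 by `struct.pack`.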
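The `sq_item` branch described in the PyObject_GetItem item above can be modeled roughly in pure Python. This is a simplified sketch, not the Dynamo implementation: `sequence_getitem_sketch` is a hypothetical name, `operator.index` stands in for the C-level index-slot check, and `seq[index]` stands in for the `sq_item` call. In CPython this branch is reached only when the type provides no mapping subscript slot.

```python
import operator
from collections import deque


def sequence_getitem_sketch(seq, key):
    """Model the sequence branch of item lookup: integer-like keys only."""
    try:
        # Succeeds only when the key's type defines __index__.
        index = operator.index(key)
    except TypeError:
        raise TypeError(
            f"sequence index must be integer, not '{type(key).__name__}'"
        ) from None
    if index < 0:
        index += len(seq)  # a negative index is offset by the length once
    if not 0 <= index < len(seq):
        raise IndexError("index out of range")
    return seq[index]


d = deque([10, 20, 30])
middle = sequence_getitem_sketch(d, 1)
last = sequence_getitem_sketch(d, -1)
```

A slice key is rejected with `TypeError` here, which mirrors how `collections.deque` itself refuses slices.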
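The polyfill item above describes redirecting `assertRaisesRegex` to `assertRaises`. One way such a redirect could look is sketched below. The function name `assert_raises_regex_polyfill` is hypothetical and this is not the code from the pull request; it only shows the idea of keeping the exception-type check while skipping the regex match.

```python
import unittest


def assert_raises_regex_polyfill(
    test_case, expected_exception, expected_regex, *args, **kwargs
):
    """Pure-Python stand-in: check the exception type only, ignoring the regex."""
    # expected_regex is accepted for signature compatibility but deliberately unused.
    return test_case.assertRaises(expected_exception, *args, **kwargs)


case = unittest.TestCase()
# The message does not match the pattern, yet the assertion passes,
# because only the exception type is checked in this sketch.
with assert_raises_regex_polyfill(case, ValueError, "never matches"):
    raise ValueError("some other message")
passed = True
```

The trade-off in this sketch is that the message pattern is no longer verified; only the exception type is.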
3.3 Pull Request Discussion Insights
This section will analyze the tone and sentiment of discussions within this project's open and closed pull requests that occurred within the past week. It aims to identify potentially heated exchanges and to maintain a constructive project environment.
Based on our analysis, there are no instances of toxic discussions in the project's open or closed pull requests from the past week.
IV. Contributors
4.1 Contributors
Active Contributors:
We consider an active contributor in this project to be any contributor who has made at least 1 commit, opened at least 1 issue, created at least 1 pull request, or made more than 2 comments in the last month.
If there are more than 10 active contributors, the list is truncated to the top 10 based on contribution metrics for better clarity.
| Contributor | Commits | Pull Requests | Issues | Comments |
|---|---|---|---|---|
| bobrenjc93 | 201 | 0 | 0 | 0 |
| anijain2305 | 163 | 0 | 0 | 0 |
| huydhn | 65 | 0 | 0 | 0 |
| malfet | 53 | 0 | 0 | 0 |
| yushangdi | 50 | 0 | 0 | 0 |
| aorenste | 46 | 0 | 0 | 0 |
| daisyden | 43 | 0 | 0 | 0 |
| weifengpy | 33 | 1 | 0 | 0 |
| colesbury | 31 | 1 | 0 | 0 |
| fxdawnn | 27 | 1 | 0 | 0 |
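The activity criteria stated above the table can be expressed as a small predicate. The sketch below is illustrative only: the `Activity` class and function names are hypothetical, and ranking by commit count is an assumption, since the report does not define its "contribution metrics".

```python
from dataclasses import dataclass


@dataclass
class Activity:
    name: str
    commits: int = 0
    pull_requests: int = 0
    issues: int = 0
    comments: int = 0


def is_active(a: Activity) -> bool:
    """At least 1 commit, issue, or pull request, or more than 2 comments."""
    return a.commits >= 1 or a.issues >= 1 or a.pull_requests >= 1 or a.comments > 2


def top_active(activities, limit=10):
    """Keep active contributors and truncate to the first `limit` entries."""
    active = [a for a in activities if is_active(a)]
    # Assumption: rank by commits; the report does not specify its ranking metric.
    active.sort(key=lambda a: a.commits, reverse=True)
    return active[:limit]


rows = [
    Activity("weifengpy", commits=33, pull_requests=1),
    Activity("bobrenjc93", commits=201),
    Activity("example_commenter", comments=2),  # hypothetical: 2 comments is not enough
]
ranked = [a.name for a in top_active(rows)]
```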