Weekly Project News

Weekly GitHub Report for PyTorch: May 26, 2025 - June 02, 2025 (12:00:59)

Weekly GitHub Report for PyTorch

Thank you for subscribing to our weekly newsletter! Each week, we deliver a comprehensive summary of your GitHub project's latest activity right to your inbox, including an overview of your project's issues, pull requests, contributors, and commit activity.


Table of Contents

  • I. News
    • 1.1. Recent Version Releases
    • 1.2. Version Information
  • II. Issues
    • 2.1. Top 5 Active Issues
    • 2.2. Top 5 Stale Issues
    • 2.3. Open Issues
    • 2.4. Closed Issues
    • 2.5. Issue Discussion Insights
  • III. Pull Requests
    • 3.1. Open Pull Requests
    • 3.2. Closed Pull Requests
    • 3.3. Pull Request Discussion Insights
  • IV. Contributors
    • 4.1. Contributors

I. News

1.1 Recent Version Releases:

The current version of this repository is v2.6.0.

1.2 Version Information:

Released on January 29, 2025, PyTorch 2.6 introduces significant updates, including support for torch.compile with Python 3.13, a new performance-related feature torch.compiler.set_stance, and enhancements to AOTInductor. Notably, this version also adds FP16 support on X86 CPUs and marks a shift away from publishing on Conda, directing users to alternative package sources.
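
As a brief illustration of the new stance control, here is a minimal sketch; the toy function and the choice of stances are placeholders rather than anything taken from the release notes:

    import torch

    @torch.compile
    def f(x):
        return torch.sin(x) + torch.cos(x)

    # set_stance adjusts how torch.compile behaves at call time, e.g. temporarily
    # forcing eager execution without discarding previously compiled artifacts.
    torch.compiler.set_stance("force_eager")
    print(f(torch.randn(4)))  # runs eagerly under this stance
    torch.compiler.set_stance("default")
    print(f(torch.randn(4)))  # uses the normal compiled path again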

II. Issues

2.1 Top 5 Active Issues:

We consider active issues to be those that have been commented on most frequently within the last week. Bot comments are omitted.

  1. MPS Memory Leak: This issue reports a memory leak when using the Metal Performance Shaders (MPS) backend in PyTorch, where memory usage steadily increases over time during model training, unlike the stable memory usage observed with the CPU backend. The problem is demonstrated with two minimal scripts, highlighting specific lines of code that seem to contribute to the memory leak, and the issue persists even when using the latest nightly build of PyTorch.

    • The comments discuss attempts to reproduce the issue, with some users unable to replicate the memory leak while others confirm its presence. Suggestions include checking memory statistics, trying different PyTorch versions, and examining potential causes related to MPS's memory management. A potential fix is identified involving MPSGraph caching, and a pull request is mentioned to address part of the memory growth, though some memory behavior is attributed to macOS's handling of autorelease pools. A minimal memory-monitoring sketch appears after this list.
    • Number of comments this week: 17
  2. bfloat16 Conv2d slower than float16 on 4090: This issue reports that on an NVIDIA 4090 GPU, the Conv2d operation using bfloat16 precision is consistently slower than when using float16, which the user did not expect based on available documentation. The user provides a test script and environment details to illustrate the performance discrepancy and seeks clarification on whether this behavior is expected.

    • The comments reveal that this performance difference is expected on consumer GPUs like the 4090 due to better optimization for FP16, while BF16 is more optimized on data center GPUs. Suggestions include trying different settings like torch.backends.cudnn.benchmark_limit = 0 and using channels-last memory layout, but these did not yield significant performance improvements. Further investigation is suggested to determine if the issue is due to heuristics or kernel coverage. A rough timing sketch in this spirit appears after this list.
    • Number of comments this week: 6
  3. torch.compile regression: it cause recompile when int value changed: This issue describes a regression in the torch.compile function, which causes unnecessary recompilation when an integer value changes, negatively impacting performance. The problem arises when an integer value, which changes every epoch, is passed directly to the model, leading to recompilation issues that were not present in earlier versions of the library.

    • The comments discuss the issue of recompilation caused by a specific commit, with requests for a unit test using pure PyTorch to help diagnose the problem. A simple example is provided to reproduce the issue, and a suspected commit is identified as the cause. There is a suggestion to explore methods to avoid recompilation when an integer value is involved, and a script is shared to demonstrate the recompilation problem, highlighting the changes before and after the commit.
    • Number of comments this week: 6
  4. [BUG] DataLoader low GPU utilization and extremely slow compared to manual batching: This issue highlights a significant performance discrepancy between using PyTorch's DataLoader and manual batching, where DataLoader exhibits low GPU utilization and is substantially slower, especially when using bfloat16 data types. The user provides a reproducible code sample demonstrating that DataLoader is 7-22x or even 50x slower than direct data access, despite attempts to optimize DataLoader settings.

    • The comments discuss the impact of large batch sizes and the effect of shuffling on performance, with suggestions to adjust DataLoader parameters like shuffle, num_workers, and prefetch_factor to improve speed. Despite these adjustments, the DataLoader remains significantly slower, with one commenter noting a 600% slowdown and low GPU utilization. The conversation also touches on the potential for DataLoader to perform unnecessary tasks for simple use cases, and the challenges of balancing performance with memory usage.
    • Number of comments this week: 5
  5. grouped_mm optional zero initialization of the output: This issue discusses the optional zero initialization of the output tensor in the grouped_mm kernel, which currently allocates the output without initialization, potentially leading to uninitialized sections when using operations like scatter_add(). The proposal suggests adding functionality to initialize all or only the uninitialized parts of the output with zeros to prevent unintended behavior, especially when padding is involved due to alignment requirements.

    • The comments include requests for more detailed examples and discussions about the performance impact of zero-initialization, with one user noting a significant performance degradation in a specific use case. Another comment suggests using embedding_backward for ignored indices and criticizes the inefficiency of the current implementation, highlighting unnecessary data copying and inefficient handling of indices.
    • Number of comments this week: 5
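
For the MPS memory-leak report (item 1), a minimal monitoring sketch is shown below. It assumes an Apple-silicon machine with the MPS backend available; the model, batch size, and step count are placeholders, not the reporter's script.

    import torch

    device = torch.device("mps")
    model = torch.nn.Linear(1024, 1024).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

    for step in range(100):
        x = torch.randn(64, 1024, device=device)
        loss = model(x).sum()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if step % 20 == 0:
            # torch.mps exposes allocator- and driver-level counters; values that
            # grow steadily here would match the behavior described in the issue.
            print(step,
                  torch.mps.current_allocated_memory(),
                  torch.mps.driver_allocated_memory())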
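
For the bfloat16-versus-float16 Conv2d report (item 2), a rough timing sketch is shown below. It assumes a CUDA-capable GPU and uses arbitrary layer and input sizes, not the reporter's exact configuration.

    import time
    import torch

    def bench(dtype, iters=50):
        conv = torch.nn.Conv2d(64, 64, kernel_size=3, padding=1).cuda().to(dtype)
        x = torch.randn(32, 64, 224, 224, device="cuda", dtype=dtype)
        for _ in range(10):  # warm-up so cuDNN algorithm selection settles
            conv(x)
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            conv(x)
        torch.cuda.synchronize()
        return (time.perf_counter() - start) / iters

    print("float16: ", bench(torch.float16))
    print("bfloat16:", bench(torch.bfloat16))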

2.2 Top 5 Stale Issues:

We consider stale issues to be those that have had no activity within the last 30 days. The team should work together to get these issues resolved and closed as soon as possible.

  1. ImportError: cannot import name 'triton_key' from 'triton.compiler.compiler': This issue involves an ImportError encountered when attempting to import 'triton_key' from 'triton.compiler.compiler', which is causing a backend compiler failure in a PyTorch environment. The error occurs within a script that utilizes the OotdPipeline and attempts to compile certain components with Torch's compile function, specifically when using the 'inductor' backend, and is likely related to compatibility or versioning issues with the Triton library.
  2. Alternate algorithm for computing MaxPool2D under specific condition: This issue proposes an alternative algorithm for computing the MaxPool2D operation in PyTorch when the stride is equal to 1, suggesting that a kernel size of 5 can be represented by two MaxPool2D operations with a kernel size of 3, and similarly for other kernel sizes. The motivation behind this approach is to reduce computational costs on the CPU by modifying the MaxPool2D layer directly, as demonstrated by testing code that shows a significant speedup in execution time. A small numerical check of this equivalence appears after this list.
  3. cuda_utils.so: failed to map segment from shared object: This issue involves a bug encountered when running a model in a Docker environment with a tmpfs permission set to 1777, where the execution of the cached cuda_utils.so file in the /tmp directory fails due to the absence of the execution bit, despite the directories having the correct permissions. The error occurs during the execution of a PyTorch model, specifically when using the torch.compile function, and is related to the inability to map a segment from the shared object, which is crucial for the model's operation.
  4. Enable UFMT on all files in PyTorch: This issue involves enabling uniform formatting (UFMT) across all files in the PyTorch codebase, as currently, approximately 1,500 files are not formatted according to the UFMT standards. The process requires removing file names from the exclude_patterns in the UFMT section of the .lintrunner.toml file and running a specific command to apply the formatting, with additional preparatory work needed to resolve known issues such as import cycles and misplaced annotations before the UFMT changes can be committed.
  5. [JIT archive] Add a flag to not include debug files: This issue addresses the need for a feature in the PyTorch library that allows users to exclude debug files when saving models using the torch.jit.save() function, as these files significantly increase the size of the saved model without affecting its functionality. The proposal suggests adding a flag to the function to prevent the inclusion of .debug_pkl files, which are primarily used for debugging purposes, thereby reducing the file size and making it more suitable for deployment on resource-constrained devices like mobile phones.
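
For the MaxPool2D proposal (item 2), the claimed equivalence for stride 1 can be checked numerically with a short sketch; the input size is arbitrary:

    import torch

    x = torch.randn(1, 1, 32, 32)
    direct = torch.nn.MaxPool2d(kernel_size=5, stride=1)(x)
    # Two stacked 3x3 max-pools with stride 1 cover the same 5x5 window.
    composed = torch.nn.MaxPool2d(kernel_size=3, stride=1)(
        torch.nn.MaxPool2d(kernel_size=3, stride=1)(x))
    print(torch.allclose(direct, composed))  # expected: True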

2.3 Open Issues

This section lists, groups, and then summarizes issues that were created within the last week in the repository.

Issues Opened This Week: 92

Summarized Issues:

  • Bugs in PyTorch's torch.distributed and torch.fake_quantize_per_tensor_affine functions: These issues highlight bugs in PyTorch's torch.distributed.checkpoint.state_dict.get_model_state_dict and torch.fake_quantize_per_tensor_affine functions. The former fails to update _metadata keys after removing the "module." prefix, while the latter handles +inf inconsistently between CPU and CUDA devices, mapping it incorrectly on CPU.
    • issues/154327, issues/154328
  • Memory and Performance Issues in PyTorch: PyTorch faces a memory leak issue with the Metal Performance Shaders (MPS) backend on macOS, leading to increased memory usage over time. Additionally, a performance discrepancy is noted where Conv2d using bfloat16 is slower than float16 on an NVIDIA RTX 4090 GPU.
    • issues/154329, issues/154351
  • Enhancements and Proposals for PyTorch: There are proposals to implement Jacobian-vector product for flex attention and to remove explicit backend references from torch.distributed. These enhancements aim to improve performance in few-step diffusion models and simplify usage by inferring backend from device_id.
    • issues/154332, issues/154345
  • Bugs in PyTorch's torch.qr and torch.prod functions: PyTorch's torch.qr function with the out parameter causes crashes, suggesting a switch to torch.linalg.qr. Similarly, using .prod(dtype=torch.int16) with autograd results in an internal assertion failure, indicating issues with differentiable types.
    • issues/154356, issues/154357
  • Memory Inefficiency and Graph Breaks in PyTorch: PyTorch's scaled_dot_product_attention function shows significant memory inefficiency in a Generalized Query Attention setting. Additionally, a graph break occurs during torch compilation of a forward pass in an attention layer using flash attention.
    • issues/154363, issues/154365
  • Errors and Crashes in PyTorch's TorchScript and Sparse Operations: PyTorch's TorchScript raises a RuntimeError with view(dtype) due to invalid shape transformations. Additionally, torch.sparse.softmax crashes due to deprecated torch.sparse.FloatTensor with invalid indices.
    • issues/154407, issues/154419
  • Bugs in PyTorch's torch.fmod and torch.remainder functions: Using torch.fmod and torch.remainder with uint8 tensors causes a ZeroDivisionError in eager mode and a crash with a "Floating point exception" when compiled using the Inductor backend.
    • issues/154420
  • Segmentation Faults in PyTorch Functions: Segmentation faults occur in torch.fx.experimental.partitioner_utils.map_arg and torch.jit.ignore with drop=True, highlighting issues with recursive data structures and TorchScript compilation.
    • issues/154422, issues/154423
  • Segmentation Faults in PyTorch's Sparse Operations: Segmentation faults occur in torch.matmul and torch.sparse.addmm with sparse CSR tensors due to invalid crow_indices, suggesting the use of check_invariants=True to identify input errors. A minimal sketch of this suggestion appears after this list.
    • issues/154424
  • Floating Point Exception in PyTorch's ONNX Export: A floating point exception occurs during torch.onnx.export with a PixelShuffle layer, likely due to an incorrect input tensor shape with zero input channels.
    • issues/154425
  • GitHub Bug Affecting PyTorch's Pytorchbot: The pytorchbot incorrectly identifies a pull request as already merged due to a GitHub bug, suggesting a flaw in the logic or GraphQL queries.
    • issues/154427
  • Discrepancies in PyTorch's Sine Function: A significant discrepancy is noted in the output of the sine function when applied to the exponential of a tensor on CPU versus CUDA, particularly for very large arguments.
    • issues/154428
  • Bugs in PyTorch's torch._dynamo.optimize and torch.jit.script functions: torch._dynamo.optimize incorrectly converts a torch.Size object to a tuple, while torch.jit.script raises a RuntimeError with view(dtype) due to invalid shape transformations.
    • issues/154432, issues/154407
  • Proposals for PyTorch's Dependency Management: A proposal suggests removing the ideep git submodule dependency and directly integrating the oneDNN API to improve transparency and efficiency, particularly for x86_64, aarch64 CPUs, and Intel GPUs.
    • issues/154444
  • Failures in PyTorch's Inductor Jobs: Inductor jobs fail related to the opacus_cifar10 test after updating to opacus version 1.5.4, as indicated by a specific pull request and a linked HUD error report.
    • issues/154446
  • Bugs in PyTorch's Android Libraries: PyTorch Android Torch Vision and PyTorch Lite libraries do not support the required 16KB page size alignment mandated by Google for apps targeting Android 15+ from November 1st, 2025.
    • issues/154449
  • Logging and Cache Issues in PyTorch's Dynamo: The logging system for re-raising exceptions in PyTorch's Dynamo is confusing due to duplicate graph breaks and chained exceptions. Additionally, torch.compile does not utilize the cache effectively, leading to prolonged warmup times.
    • issues/154454, issues/154456
  • Regression in PyTorch's Slow-Autograd Tests: A regression in the slow-autograd tests is identified, prompting a pull request to exclude the problematic test file from running slow gradcheck.
    • issues/154459
  • Inconsistent Behavior in PyTorch's Pytorchbot: The pytorchbot experiences inconsistent behavior when handling interactively rebased commits, resulting in incorrect commit references.
    • issues/154461
  • Bugs in PyTorch's torch.compiler.save_cache_artifacts() API: The torch.compiler.save_cache_artifacts() API fails due to an ImportError caused by a circular import in the torch.compiler._cache module.
    • issues/154463
  • Kernel Hash Key Issues in PyTorch's Inductor: A kernel hash key for the ChoiceCaller in PyTorch's Inductor is needed to be independent of runtime parameters to prevent incorrect cache usage.
    • issues/154467
  • Inconsistencies in PyTorch's torch.fft.fft and torch.fft.irfft functions: The torch.fft.fft function produces inconsistent results for infinite input values between CPU and GPU devices. Similarly, torch.fft.irfft handles non-Hermitian inputs inconsistently across CPU and GPU.
    • issues/154474, issues/154496
  • Regression in PyTorch's FFT Operators: A regression in FFT operators occurs when using the 2024 version of MKL, resulting in an error due to inconsistent configuration parameters.
    • issues/154477
  • Bugs in PyTorch's torch.jit.trace_module and torch.svd_lowrank functions: The torch.jit.trace_module API fails to respect __jit_ignored_attributes__, while torch.svd_lowrank produces inconsistent U matrix outputs on CPU versus CUDA.
    • issues/154478, issues/154479
  • Feature Request for PyTorch's Compiler: A feature request suggests enhancing the PyTorch compiler's ability to infer data-dependent information from tensor constructor calls to prevent data-dependent errors.
    • issues/154489
  • Regression in PyTorch's torch.compile Function: A regression in the torch.compile function leads to unnecessary recompilation when changing an integer value during each epoch of model generation.
    • issues/154490
  • Bugs in PyTorch's torch.randn and torch.add Functions: Using torch.randn with device='mkldnn' results in an "INTERNAL ASSERT FAILED" error. Additionally, torch.add and torch.sub return incorrect results for complex128 tensors with infinite components.
    • issues/154491, issues/154501
  • Bugs in PyTorch's torch.fft.irfft and torch.add Functions: The torch.fft.irfft function handles non-Hermitian inputs inconsistently across CPU and GPU. Similarly, torch.add and torch.sub return incorrect results for complex128 tensors with infinite components.
    • issues/154496, issues/154501
  • Bugs in PyTorch's torch.jit.script and torch.compile Functions: The torch.jit.script function does not recognize axis as an alias for dim, causing runtime errors. Additionally, torch.compile does not utilize the cache effectively, leading to prolonged warmup times.
    • issues/154613, issues/154456
  • Bugs in PyTorch's torch.compile and torch.export.export Functions: A NotImplementedError occurs when running torch.compile with flex_attention and NJT inputs. Additionally, torch.export.export fails due to a data-dependent error related to unbacked symbols.
    • issues/154556, issues/154559
  • Bugs in PyTorch's torch.export.export and torch.jit.script Functions: The torch.export.export function generates invalid code for Tensor.split with the meta device. Additionally, the torch.jit.script function does not recognize axis as an alias for dim.
    • issues/154721, issues/154613
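
For the sparse CSR reports above, the check_invariants suggestion can be illustrated with a small sketch; the indices are deliberately malformed for demonstration:

    import torch

    crow_indices = torch.tensor([0, 3, 2])  # non-monotonic, hence invalid
    col_indices = torch.tensor([0, 1, 0])
    values = torch.tensor([1.0, 2.0, 3.0])

    # With check_invariants=True the constructor validates the indices up front and
    # raises a clear error, instead of a later sparse matmul crashing on bad input.
    torch.sparse_csr_tensor(crow_indices, col_indices, values,
                            size=(2, 2), check_invariants=True)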

2.4 Closed Issues

This section lists, groups, and then summarizes issues that were closed within the last week in the repository. This section also links the associated pull requests if applicable.

Issues Closed This Week: 19

Summarized Issues:

  • RuntimeError with MaxUnpool2d in PyTorch: This issue involves a RuntimeError encountered when using the MaxUnpool2d module in PyTorch. The error arises from a non-contiguous tensor being used in a backward pass and can be resolved by making the tensor contiguous before setting requires_grad to True; a minimal sketch of this workaround appears after this list.
    • issues/154341
  • NCCL Communication Failures on NVIDIA H100 GPUs: A bug where NCCL communications fail with an internal error when using PyTorch version 2.7.0 on a system with 8 NVIDIA H100 GPUs. The issue was resolved by manually adding arch and vendor fields to the topology XML file provided by AWS Hyperpod configuration.
    • issues/154342
  • Shared State in Wishart Distribution Instances: This issue describes a bug in the PyTorch library where creating a second instance of the Wishart distribution modifies the constraints on the first instance. This occurs because arg_constraints is defined as a class variable, resulting in shared state across instances; a generic illustration of this pitfall appears after this list.
    • issues/154355
  • Compatibility Issue with torch.get_default_device(): An error is encountered when attempting to load a model using the transformers library because the installed PyTorch build lacks torch.get_default_device(). The fix is to upgrade to a newer PyTorch release that provides this function.
    • issues/154362
  • Discrepancy in F1-score Across PyTorch Versions: A problem where the user observes different outputs and a 3% drop in F1-score when running the same code under different PyTorch versions. The discrepancy persists despite seed control and identical data, and the user seeks clarification, since PyTorch does not guarantee bitwise reproducibility across versions.
    • issues/154411
  • Release Highlight Feature Testing for PyTorch 2.8.0: This issue tracks release-highlight testing for a feature proposed for the PyTorch 2.8.0 release. Plans include documentation and tutorial submissions, marketing coverage, and testing support, specifically targeting the 2.8.0 release on Linux.
    • issues/154462
  • Inconsistent Results in torch.fft Functions: Several issues highlight discrepancies in torch.fft functions, such as hfft2, ifft2, and others, where infinite input values result in inconsistent outputs between CPU and GPU devices. These inconsistencies lead to different patterns of inf and nan values in the output tensor for identical inputs.
    • issues/154520, issues/154521
  • Disabled Tests Due to Failures on Main Branch: Multiple tests, such as test_promotes_int_to_float_ldexp_cuda_int16 and test_linear, have been disabled due to consistent failures on the main branch. These issues involve requests to specify affected platforms and contributions from various developers.
    • issues/154550, issues/154684, issues/154760
  • Inconsistent Conversion of torch.inf Values: Several issues highlight bugs in PyTorch where methods like .int(), .char(), and .type_as() produce inconsistent results for torch.inf values between CPU and GPU devices, leading to different numerical results across hardware platforms; a small repro sketch appears after this list.
    • issues/154726, issues/154727
  • Bug in flex_attention with Nested Jagged Tensor Inputs: This issue highlights a bug in the flex_attention function when used with Nested Jagged Tensor (NJT) inputs, resulting in inconsistent outputs compared to other data layouts. The problem may be related to the lack of a block_mask for NJT inputs, which should be addressed by using create_nested_block_mask.
    • issues/154554
  • Compilation Error on Windows 11 with CUDA 12.9: A compilation error occurs when attempting to build PyTorch's static library on Windows 11 using CUDA 12.9, due to an LLVM out-of-memory error and an nvcc error. The issue was closed due to lack of actionable information, with a suggestion that the problem might be related to the CUDAToolkit.
    • issues/154604
  • Memory Leakage in TorchTitan llama3 8b Training: This issue involves a memory leakage problem during the training of TorchTitan llama3 8b on H100 machines, specifically when Selective Activation Checkpointing (SAC) is used and torch.compile() is disabled. The leakage leads to out-of-memory (OOM) errors due to reference cycles that include tensors.
    • issues/154642
  • Bug in speculate_subgraph with torch.func.functionalize: This issue pertains to a bug in the PyTorch library where the speculate_subgraph function fails to detect input mutations when a function is wrapped with torch.func.functionalize. The problem is demonstrated by a Python code snippet in the issue; a simplified sketch of the ingredients appears after this list.
    • issues/154669
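
The sketches below illustrate a few of the closed issues above. They are simplified reconstructions based on the summaries, not the exact reproducers from the issue tracker.

A minimal sketch of the contiguity workaround described for the MaxUnpool2d report (issues/154341); the shapes and pooling parameters are illustrative assumptions:

```python
import torch
import torch.nn as nn

pool = nn.MaxPool2d(2, stride=2, return_indices=True)
unpool = nn.MaxUnpool2d(2, stride=2)

x = torch.randn(1, 3, 8, 8)
# A transpose makes the tensor non-contiguous; calling .contiguous() before
# requires_grad_() was reported to avoid the RuntimeError in the backward pass.
x = x.transpose(2, 3).contiguous()
x.requires_grad_()

out, indices = pool(x)
recon = unpool(out, indices)
recon.sum().backward()   # succeeds once the input is contiguous
```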
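
A generic illustration of the shared-state pitfall behind the Wishart report (issues/154355); this shows only the Python pattern described, not the actual Wishart code:

```python
# A mutable class attribute is shared by every instance, so mutating it
# through one instance is visible through all others.
class SomeDistribution:
    arg_constraints = {"df": "positive"}   # class-level, shared

    def __init__(self, df_constraint):
        # Mutates the shared dict instead of creating a per-instance copy.
        self.arg_constraints["df"] = df_constraint

a = SomeDistribution("greater_than(1)")
b = SomeDistribution("greater_than(3)")
print(a.arg_constraints["df"])   # "greater_than(3)" -- changed by creating b
```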
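
A small repro sketch for the torch.inf conversion reports (issues/154726, issues/154727); the printed values are platform-dependent because converting inf to an integer type is undefined behavior:

```python
import torch

x = torch.tensor([torch.inf, -torch.inf])
print(x.int(), x.char())                  # CPU results

if torch.cuda.is_available():
    y = x.cuda()
    print(y.int().cpu(), y.char().cpu())  # GPU results may differ from the CPU ones
```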
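
A simplified sketch of the ingredients in the speculate_subgraph report (issues/154669): a function with an input mutation wrapped in torch.func.functionalize. The issue's reproducer exercises this through Dynamo, which is not shown here:

```python
import torch

def mutating_fn(x):
    x.add_(1)          # in-place mutation of the input
    return x * 2

# functionalize rewrites in-place ops into out-of-place equivalents; the report
# is that Dynamo's speculate_subgraph then fails to flag the input mutation.
wrapped = torch.func.functionalize(mutating_fn)
print(wrapped(torch.ones(3)))   # tensor([4., 4., 4.])
```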

2.5 Issue Discussion Insights

This section will analyze the tone and sentiment of discussions within this project's open and closed issues that occurred within the past week. It aims to identify potentially heated exchanges and to maintain a constructive project environment.

Based on our analysis, there are no instances of toxic discussions in the project's open or closed issues from the past week.


III. Pull Requests

3.1 Open Pull Requests

This section provides a summary of pull requests that were opened in the repository over the past week. The top three pull requests with the highest number of commits are highlighted as 'key' pull requests. Other pull requests are grouped based on similar characteristics for easier analysis. Up to 25 pull requests are displayed in this section, while any remaining pull requests beyond this limit are omitted for brevity.

Pull Requests Opened This Week: 176

Key Open Pull Requests

1. [ONNX] Implements converter for higher order ops scan: This pull request implements a converter for higher order operations, specifically the "scan" operation, in the ONNX framework within the PyTorch project, addressing issue #151327.

  • URL: pull/154513
  • Merged: No
  • Associated Commits: 22dd1, 11523, e7ab7, 882d5, 249f4, 0b004, 44999, 0f250, 0caa1, ad6bb, 04f01, f169d, dd484, d90f9, 7b457, 45937, 402a3, 1da86, acc6d, ed383, f3aa4, bc23f, 4decd, 69e32, 773b7, 59f15, 082f4, 0880a, 23c60, c11eb, 5d4b5, 4798e, b814f, 03857, b0db1, 11e9b, 8442c, 59e09

2. [ONNX] Create support for rotary embeddings: This pull request introduces support for rotary embeddings in the PyTorch ONNX exporter by registering the RotaryEmbedding operator in the torch.ops.onnx namespace, allowing the exporter to recognize and export ONNX operators, and providing user-friendly, unversioned functions for native use in PyTorch models.

  • URL: pull/154745
  • Merged: No
  • Associated Commits: 7027d, c498a, fe716, 60a4f, 7cf8d, 3afc5, c5199, 13660, 99aa1, c8f84, b8ce1, db32e, 3774f, 4f2ea, 59c70

3. Type hints for distributions/constraints: This pull request introduces type hints to the distributions and constraints modules in the PyTorch project, addressing issues #144196 and #144219, by making several enhancements such as making Independent and MixtureSameFamily generic, annotating attributes, adding type aliases to __all__, and incorporating various other improvements across multiple commits.

  • URL: pull/154711
  • Merged: No
  • Associated Commits: caade, 6fef4, a89be, da8cf, 46c16, b72e7, eaa7b, a9a40, b73dd, e9d8d, 74f20, fb2d6, 7c1ab

Other Open Pull Requests

  • Dynamo Enhancements: These pull requests improve the tracing capabilities of PyTorch's Dynamo component, ensuring that explicit dunder method calls can be traced, as part of a series of updates managed through the ghstack tool.
    • pull/154366
  • CUDA Integration Updates: These pull requests address updates and fixes related to CUDA integration in the PyTorch project. They utilize newer CMake syntax and targets, remove outdated modules, fix MSVC issues, update version references, and adopt the official CUDA module.
    • pull/154595
  • PrecompileContext Implementation: This pull request introduces the PrecompileContext, a specialized CacheArtifactManager for managing precompile artifacts during Torch compilation. It focuses on testing and basic AOTAutograd logic, with future updates planned for dynamo-related features.
    • pull/154415
  • ONNX Exporter Enhancements: The ONNX exporter in PyTorch is enhanced to export the Scaled Dot-Product Attention (SDPA) to the ONNX Attention operator. This includes adding unit tests, duplicating functions for varied opsets, fixing lint issues, and updating relevant files.
    • pull/154596
  • Sparse Tensor Validation Control: A new feature allows users to control the validation of sparse tensor invariants when loading data from external sources. The default setting disables this validation to avoid computational expense.
    • pull/154610
  • Documentation Format Conversion: This pull request involves converting PyTorch's documentation from reStructuredText (.rst) to Markdown (.md). It includes several updates to refine the conversion process.
    • pull/154438
  • Large Indices Data Type Update: An issue with large indices in PyTorch is addressed by changing the data type from torch.int32 to torch.int64, preventing invalid indexing when the tensor's number of elements exceeds 2^31; a short sketch of the overflow appears after this list.
    • pull/154575
  • Intel GPU Support in AOTInductor: Support for Intel GPU's xpu mkldnn operations is added within the AOTInductor framework. This enhancement involves multiple contributors and reviewers.
    • pull/154586
  • Sparse Tensor Pinning Check: The pinning check is disabled when loading sparse tensors to address a specific issue in PyTorch. This change is intended to be merged two weeks after a related pull request.
    • pull/154638
  • Wheel File Reuse: An issue in PyTorch is addressed by reusing an old wheel file and replacing its version string, as indicated by the pull request title and the referenced issue number.
    • pull/154773
  • Cudagraph CPU Tensor Support: The issue of cudagraph not supporting CPU tensors is addressed by updating the graph to move CPU tensors to the GPU when beneficial. This involves graph partitioning to enable cudagraphification of remaining GPU operations.
    • pull/154464
  • CI Failures and CUDA 12.8: CI failures in specific tests caused by a CUDA 12.8 update are addressed. The pull request includes fixes for graph breaks and is discussed in a related pull request.
    • pull/154497
  • NVSHMEM Version Addition: NVSHMEM version 3.2.5 is proposed to be added to the PYTORCH_EXTRA_INSTALL_REQUIREMENTS. This version supports both cu11 and cu12 builds, as detailed in the associated commits.
    • pull/154568
  • Symbolic Integers in Graph Partitioning: Symbolic integers (symints) are added to the get_graph_inputs function during graph partitioning. This prevents errors in the codegen_input_symbol_assignment process when tensor shapes involve expressions.
    • pull/154679
  • Guard Overhead Measurement: The issue of inaccurately measured guard overhead during compilation is addressed by flushing the cache. This ensures profiling results are more realistic and consistent with runtime observations.
    • pull/154764
  • Profiler Traces Enhancement: A new event is introduced in profiler traces to efficiently record pre-graph bytecode. This aims to identify models where the pre-graph bytecode is particularly resource-intensive.
    • pull/154769
  • CUDA 12.8 CI Tests: New continuous integration tests for CUDA 12.8 in eager execution mode are introduced. This includes updates to Docker builds, specific tests, and fixes for rebase and linting issues.
    • pull/154469
  • Printf Function Replacement: Certain call sites in PyTorch are proposed to be replaced with the fmtlib printf function. This aims to achieve a faster and memory-safe implementation.
    • pull/154533
  • Compile-Time Performance Improvement: Building the main and kernel code in separate threads is proposed to improve compile-time performance on the CPU. The change produces no observable differences on the TorchInductor dashboard.
    • pull/154551
  • ConvTranspose3D Support on MacOS: ConvTranspose3D is enabled for FP32 and Complex64 data types on MacOS 14 and 15. Half-precision data types remain unsupported due to discrepancies between CPU and GPU implementations.
    • pull/154696
  • FP8 GEMM Bias Argument: Support for a bias argument in the fp8 GEMM operation within the Cutlass library is introduced. This is part of a series of updates tracked through the ghstack tool.
    • pull/154761
  • Draft Pull Request: A draft pull request in the PyTorch GitHub repository involves multiple updates and contributors. It is part of a stack of changes managed by ghstack but has not yet been merged.
    • pull/154388
  • mm.out Function Migration: The mm.out function is migrated from an out-of-tree implementation to an in-tree one. A new file, MTIAOps.cpp, is added to handle the dispatching of mm.out operations separately.
    • pull/154393
  • Padding Validation Update: The padding validation in the max_pool1d, max_pool2d, and max_pool3d functions is updated to correctly account for dilation, so that valid padding values are no longer incorrectly rejected; an illustrative sketch appears after this list.
    • pull/154395
  • Metal Kernel Migration: The remaining inverse trigonometric and hyperbolic unary operations are moved to Metal kernels in PyTorch. This includes kernels for atan, asin, and acos, along with formatting fixes.
    • pull/154465
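
Two short sketches related to the pull requests above follow; both are illustrative rather than taken from the pull requests themselves. First, why int32 indices overflow for very large tensors (pull/154575); the element count is an arbitrary example just past the int32 limit:

```python
import torch

numel = 2**31 + 10                     # hypothetical tensor size
print(torch.iinfo(torch.int32).max)    # 2147483647 -- cannot address numel elements
print(torch.iinfo(torch.int64).max)    # large enough for any practical tensor
```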
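
Second, an illustrative case behind the pooling padding fix (pull/154395): with dilation=2 and kernel_size=3 the effective window is dilation*(kernel_size-1)+1 = 5, so padding=2 is meaningful, yet a check against the nominal kernel size alone rejects it. Whether this call is rejected depends on the PyTorch version:

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 16)
try:
    y = F.max_pool1d(x, kernel_size=3, padding=2, dilation=2)
    print(y.shape)                     # accepted once dilation is accounted for
except RuntimeError as e:
    print("rejected:", e)              # older validation: pad vs. kernel_size only
```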

3.2 Closed Pull Requests

This section provides a summary of pull requests that were closed in the repository over the past week. The top three pull requests with the highest number of commits are highlighted as 'key' pull requests. Other pull requests are grouped based on similar characteristics for easier analysis. Up to 25 pull requests are displayed in this section, while any remaining pull requests beyond this limit are omitted for brevity.

Pull Requests Closed This Week: 170

Key Closed Pull Requests

1. Release 2.6 test distributed spawn failed: This pull request addresses a failed distributed-spawn test in the PyTorch 2.6 release. Across multiple commits it applies a series of fixes and updates, including use of a validate-docker-images workflow, adjustments to release-specific configurations, and various code optimizations and bug fixes to ensure compatibility and performance across different platforms and environments.

  • URL: pull/154508
  • Merged: No
  • Associated Commits: c69ea, af92b, aad1c, f3c08, 5363f, 5fbc4, 2b84d, c92f6, 1d3ff, f9e99, 46f55, 0cdf8, 6628b, c953e, 22775, 9b688, 4b9b7, f61bf, 31b52, b1a10, 5eb54, d9eed, 41811, 23e39, f01a6, 929ef, 4e418, 478a9, 7d329, 3a3de, f35ab, 7092d, 8c034, be126, e1858, d155d, 4d9de, a99cc, 51829, 983ea, 47f4e, eb304, 6e304, e2067, 57421, a61b5, 4658a, e19c1, 232eb, 1d2c2, a2639, cd15d, 9c34a, 8d4b8, dcb8a, 7be6b, ca3c3, 32070, 2236d, 1eba9, 93864, 88b97, bbd00, 1f32b, d33dd, f1481, ea546, ac7d6, ed487, 66dfe, 3783d, 8adc1, 5c4fa, 639ee, b445b, 8d72c, 374e5, 6a3b5, e607b, aafc7, d5947, 1b753, ba1ba, 70f30, 3398f, 8354d, 737cf, 4202f, 7c27e, 2e2c7, 3a818, 53ad2, 8eb5d, dbe8c, fcdff, 92b55, f6789, 2e1ed, 13339, 82ac2, 3608e, bfb23, 86b0a, 03714, 34caa, ac032, 5dd61, 7d528, d9a03, 7c072, 73dd0, b08d9, d70a9, 7ad5a, 2fd46, ed8c6, bf084, 20ad8, 2fb0a, 8cfa9, 50a04, 45896, 9d0a4, 1a808, 6fe84, a3632, 68180, fb24f, e53a9, 2cda1, 9d566, a7044, c7ba8, cbd7b, f4c96, 9cf15, 5c42a, 1a150, 1ded2, 2ff80, b6e5f, 50924, 4642c, 5de86, f0b4a, 96c61, 751e4, 684f6, 0afd4, 93ff7, e22ae, 2eff7, 2a634, 90ab5, 26b82

2. [Inductor] Record Triton’s Base32 Cache Key in .best_config for Debugging: This pull request modifies TorchInductor's autotuning flow to include Triton's "base32" cache key in each best_config JSON file, facilitating debugging and analysis by allowing developers to easily match compiled binaries and intermediate representations with their corresponding best configurations, while minimizing impact by adding only an extra field.

  • URL: pull/154618
  • Merged: No
  • Associated Commits: f2460, 7dc8d, 9ee61, bcf84, 09b13, ef40e, 1dc8c, 7c3fb, f9fdc, 163a7, 5c6ae, 96414, 1a4b6, 1af01, 15038, 88b22, a4bc0

3. [upstream triton] support build with setup.py in ./python/ or in ./: This pull request addresses the relocation of the setup.py file in the upstream Triton project from the python/ directory to the root directory (./), enabling the build process to adapt by determining the current working directory based on the new location of setup.py.

  • URL: pull/154635
  • Merged: No
  • Associated Commits: 049bb, 775f8, c934a, 7710e, d0e2a, e37b5, 7457a, 0b9e3, d4b64, 34bf9, 54a1c, 7a842, cc29c, 9719c, 67113, 35ea7

Other Closed Pull Requests

  • Removal of 'allow-untyped-defs' option: Several pull requests aimed to remove the 'allow-untyped-defs' option from various files in the PyTorch project as part of a series of updates managed through the ghstack tool. Despite the efforts, these changes were ultimately not merged into the main codebase.
    • pull/154626, pull/154625, pull/154624, pull/154623, pull/154622
  • Addition of docblocks: Multiple pull requests focused on adding docblocks to various functions and components within the PyTorch project. These changes were part of a larger stack of related updates managed by the ghstack tool, but none of them were merged.
    • pull/154397, pull/154399, pull/154396, pull/154398, pull/154400, pull/154379, pull/154380, pull/154381, pull/154402, pull/154403
  • XPU Compatibility and Enhancements: A pull request aimed to ensure compatibility of the XPU with toolchain version 2025.2, including updates to specific files. Another pull request enhanced unit tests for the elapsed_time function in the XPUEvent class to prevent incorrect elapsed time issues.
    • pull/154359, pull/154494
  • Custom AMD Triton Kernels: A pull request addressed the issue of custom AMD Triton kernels erroring out due to special keyword arguments not appearing in the kernel signature. The proposed solution was to ignore such kwargs when absent, enhancing compatibility with PT2.
    • pull/154605
  • Inductor Lowering Dictionary: A pull request introduced the capability to pass a custom lowering dictionary to the register_lowering() function in Inductor. This change allows systems like Helion to manage their own lowering dictionaries independently of the global lowerings dictionary.
    • pull/154344
  • Common Subexpression Elimination (CSE): A pull request addressed the issue of not performing CSE on unbacked nodes in the PyTorch project. Despite multiple updates and commits, this change was ultimately not merged.
    • pull/154387
  • Multi-architecture Kernel Binaries: A pull request added support for multi-architecture kernel binaries in the "package_cpp_only" mode. This involved generating specific CMake targets to compile .ptx files into .fatbin files and embedding them into the final shared library or binary.
    • pull/154414
  • Tensor Mutation Reflection: A pull request addressed an issue where the system needed to reflect back mutations to the input tensor when a cloned, misaligned tensor is mutated. This ensures that the mutation is preserved even when the tensor's alignment changes.
    • pull/154442
  • Distributed 3D Composability Testing: A pull request involved adding an "h100_distributed" label to facilitate testing of distributed 3D composability on an 8*H100 GPU node. Despite several commits, including label addition and YAML file corrections, it was not merged.
    • pull/154562
  • Memory-efficient Attention in CUDA: A pull request addressed issues in the backward pass of memory-efficient attention for large tensors in CUDA. It fixed identified problems and added support for logsumexp computation in the forward pass.
    • pull/154663
  • Typing and ABI-compatible Dispatching: A pull request focused on improving the typing of a cpp_wrapper interface and preparing for ABI-compatible AOTI C-shim dispatching. It involved removing unnecessary asserts and control flow while ensuring functional neutrality.
    • pull/154371

3.3 Pull Request Discussion Insights

This section will analyze the tone and sentiment of discussions within this project's open and closed pull requests that occurred within the past week. It aims to identify potentially heated exchanges and to maintain a constructive project environment.

  1. test
    • Toxicity Score: 0.55 (Escalating tension, Defensive responses, Unresolved issues)
    • This GitHub conversation involves a series of interactions where username1 expresses frustration over a solution proposed by username2, which did not resolve the issue at hand. The tone of the conversation is initially neutral but becomes tense as username1's dissatisfaction grows. Username2 responds defensively, which further escalates the tension. The conversation is marked by a lack of resolution and increasing frustration from both parties.

IV. Contributors

4.1 Contributors

Active Contributors:

We consider an active contributor in this project to be any contributor who has made at least 1 commit, opened at least 1 issue, created at least 1 pull request, or made more than 2 comments in the last month.

If there are more than 10 active contributors, the list is truncated to the top 10 based on contribution metrics for better clarity.

Contributor Commits Pull Requests Issues Comments
malfet 142 14 9 125
bobrenjc93 201 46 7 28
Skylion007 92 26 3 152
guilhermeleobas 235 16 2 1
laithsakka 80 24 10 34
anijain2305 112 7 1 8
eellison 55 16 2 36
pianpwk 67 16 4 15
henrylhtsang 63 8 8 16
ngimel 34 6 0 55

Don't miss what's next. Subscribe to Weekly Project News:
Powered by Buttondown, the easiest way to start and grow your newsletter.