Weekly GitHub Report for Xla: January 16, 2026 - January 23, 2026 (21:05:03)

Thank you for subscribing to our weekly newsletter! Each week, we deliver a comprehensive summary of your GitHub project's latest activity right to your inbox, including an overview of your project's issues, pull requests, contributors, and commit activity.

Table of Contents

I. News
1.1. Recent Version Releases
1.2. Other Noteworthy Updates

II. Issues
2.1. Top 5 Active Issues
2.2. Top 5 Stale Issues
2.3. Open Issues
2.4. Closed Issues
2.5. Issue Discussion Insights

III. Pull Requests
3.1. Open Pull Requests
3.2. Closed Pull Requests
3.3. Pull Request Discussion Insights

IV. Contributors
4.1. Contributors

I. News
1.1 Recent Version Releases:
No recent version releases were found.
1.2 Version Information:
No version information is available for this period.

II. Issues
2.1 Top 5 Active Issues:
We consider active issues to be issues that have been commented on most frequently within the last week. Bot comments are omitted.

[ERR:BUILD] ./configure.py --backend=CPU does not setup all required packages: This issue reports a build failure when running ./configure.py --backend=CPU due to a missing BUILD file required by a .bzl file in the ROCm local configuration, which prevents the package from loading correctly. The user expected the configuration to work based on the developer guide but encounters an error indicating that every .bzl file must have a corresponding package, highlighting a potential misconfiguration or missing file in the repository.  

The comment suggests that the problem might be related to the --noenable_bzlmod flag but notes that the "rocm" repository does not exist in the main workspace file, implying the issue could stem from an absent or misconfigured ROCm setup.
Number of comments this week: 1

[ERR:BUILD] bazel build //xla/... fails in OSS build due to missing xla_data_proto_py_pb2 target: This issue describes a build failure when running bazel build //xla/... on the XLA repository due to a missing target xla_data_proto_py_pb2 that is required by the //xla/python/tools:types target but is not declared in the xla/BUILD file. The problem arises because the missing target is conditionally included via copybara:uncomment blocks and the build tag filters only exclude the target at build time, not during the analysis phase, causing Bazel to fail resolving dependencies.  

The single comment simply tags another user to bring attention to the issue without providing additional information or solutions.
Number of comments this week: 1

Since there were fewer than 5 open issues, all of the open issues have been listed above.
2.2 Top 5 Stale Issues:
We consider stale issues to be issues that have had no activity within the last 30 days. The team should work together to get these issues resolved and closed as soon as possible.
As of our latest update, there are no stale issues for the project this week. 
2.3 Open Issues
This section lists, groups, and then summarizes issues that were created within the last week in the repository. 
Issues Opened This Week: 3
Summarized Issues:

Build configuration and missing dependencies: Several issues highlight problems with build configurations causing missing dependencies and build failures. One issue reports that running ./configure.py --backend=CPU does not set up all required packages, leading to missing BUILD files for Bazel packages related to @local_config_rocm. Another issue describes a missing xla_data_proto_py_pb2 target that is required during analysis but is conditionally excluded, causing dependency resolution errors during the build process.
[issues/36692, issues/36720]

Test suite execution challenges: There are difficulties running the XLA test suite in an open-source environment, particularly on Linux with Bazel and Clang. Most tests are skipped or fail to build despite attempts to adjust filters and configurations, and users seek guidance on the intended way to run these tests and which tests or filters are appropriate for OSS builds.
[issues/36756]

2.4 Closed Issues
This section lists, groups, and then summarizes issues that were closed within the last week in the repository. This section also links the associated pull requests if applicable. 
Issues Closed This Week: 0
Summarized Issues:
As of our latest update, there were no issues closed in the project this week.
2.5 Issue Discussion Insights
This section will analyze the tone and sentiment of discussions within this project's open and closed issues that occurred within the past week. It aims to identify potentially heated exchanges and to maintain a constructive project environment. 
Based on our analysis, there are no instances of toxic discussions in the project's open or closed issues from the past week. 

III. Pull Requests
3.1 Open Pull Requests
This section provides a summary of pull requests that were opened in the repository over the past week. The top three pull requests with the highest number of commits are highlighted as 'key' pull requests. Other pull requests are grouped based on similar characteristics for easier analysis. Up to 25 pull requests are displayed in this section, while any remaining pull requests beyond this limit are omitted for brevity.

Pull Requests Opened This Week: 14
Key Open Pull Requests
1. [ROCm] Enable backends/gpu/autotuner unit tests on ROCm: This pull request enables the autotuner unit tests for the GPU backend to run on the ROCm platform by making them platform-independent through the removal of direct NVPTXCompiler dependencies and ensuring tests like custom_kernel_test, triton_test, native_emitter_test, and fission_backend_test pass or are appropriately skipped on ROCm, thereby expanding ROCm test coverage.

URL: pull/36553

Associated Commits: 08bc5, 995d7, 7fcfb, e25cc, 27b72, d501c

2. [GPU] Add default perf table for GB200/GB300: This pull request adds default performance tables for GB200 and GB300 GPUs and updates the data fetching logic to differentiate between B200 and GB200, enabling improved out-of-the-box performance within the NVLink domain on GB machines when using the static cost model.

URL: pull/36518

Associated Commits: 98dd7, 85321, b1550, 37992

3. [xla:gpu] Use command buffer resources to track command executor record state: This pull request enhances the GPU backend of the XLA project by using command buffer resources to track the command executor's record state, introduces a record ID to differentiate multiple recordings of the same executor, and updates the related documentation accordingly.

URL: pull/36521

Associated Commits: c3e72, 80e62

Other Open Pull Requests

Memory management and offloading fixes: These pull requests fix handling of memory space information for outputs offloaded to host memory and address race conditions by making variables atomic to ensure thread safety. They enable workloads like MaxText to offload computations correctly and improve stability in ROCm code.

[pull/36525, pull/36564]

GPU backend improvements and bug fixes: Multiple pull requests enhance the GPU backend by fixing build breaks, adding tests for cuDNN convolution, and improving logging with device ordinals and ranks for collective operations. These changes increase test coverage, debugging clarity, and overall backend robustness.

[pull/36599, pull/36609, pull/36721]

SYCL and oneAPI executor enhancements: These pull requests introduce a stub BLAS plugin for SYCL and improve the SYCL stream executor by populating device descriptions using the Level Zero API. They lay groundwork for future BLAS support and integrate detailed device info into the oneAPI backend.

[pull/36704, pull/36708]

Collective operations and communication improvements: Pull requests in this topic integrate the NCCL put host API into collective permute operations for better performance and fix the CollectivePipelineParallelismTest by adding missing attributes to pipelined send/receive operations. These changes optimize communication and prevent deadlock checks from failing incorrectly.

[pull/36616, pull/36700]

3.2 Closed Pull Requests
This section provides a summary of pull requests that were closed in the repository over the past week. The top three pull requests with the highest number of commits are highlighted as 'key' pull requests. Other pull requests are grouped based on similar characteristics for easier analysis. Up to 25 pull requests are displayed in this section, while any remaining pull requests beyond this limit are omitted for brevity.
Pull Requests Closed This Week: 24
Key Closed Pull Requests
1. Add docs for debugging OOM errors with XProf: This pull request proposes adding a new documentation page that guides users on debugging out-of-memory (OOM) errors using XProf's memory viewer, including updates to the introduction, images, and content for clarity.

URL: pull/36275

Associated Commits: 31984, d4e72, 99970, d3452, 4310f, 08473

2. [xla:gpu] Add shared cancellation token to GPU comms and clique: This pull request introduces a shared cancellation token to the GPU communications and clique components within the xla::gpu collectives library, implementing a first-class CancellationToken and ensuring it is properly allocated and passed to all communications associated with a clique.

URL: pull/36673

Associated Commits: aa973, a77c4, fc426, 3ffba, 23916, d7e2b
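
The idea of a single token shared by every communicator in a clique can be illustrated with a small Python model. This is only a sketch of the pattern described in the summary, not the code from the pull request: the class names `CancellationToken`, `Communicator`, and `Clique` and all of their methods are hypothetical stand-ins, and `threading.Event` is assumed here as the thread-safe flag.

```python
import threading


class CancellationToken:
    """Hypothetical token: a thread-safe flag that many holders can observe."""

    def __init__(self) -> None:
        self._event = threading.Event()

    def cancel(self) -> None:
        self._event.set()

    def is_cancelled(self) -> bool:
        return self._event.is_set()


class Communicator:
    """Hypothetical stand-in for one rank's communicator."""

    def __init__(self, rank: int, token: CancellationToken) -> None:
        self.rank = rank
        self.token = token

    def run_collective(self) -> str:
        # Consult the shared token before starting any work.
        if self.token.is_cancelled():
            return "cancelled"
        return "ok"


class Clique:
    """Hypothetical clique: allocates one token and passes it to every communicator."""

    def __init__(self, num_ranks: int) -> None:
        self.token = CancellationToken()
        self.comms = [Communicator(rank, self.token) for rank in range(num_ranks)]


clique = Clique(num_ranks=4)
before = [comm.run_collective() for comm in clique.comms]
clique.token.cancel()  # a single call is observed by every communicator
after = [comm.run_collective() for comm in clique.comms]
```

Because each communicator holds a reference to the same token object, cancelling once at the clique level is enough; no per-communicator bookkeeping is needed in this model.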

3. [ROCm][XLA:GPU] Add ROCm-specific tuning for transpose emitter: This pull request introduces ROCm-specific tuning for the transpose emitter by using different thread and vectorization parameters tailored for AMD GPUs, enhancing the performance of the transpose kernel on AMD hardware while preserving existing behavior on NVIDIA and other platforms.

URL: pull/36413

Associated Commits: e2f5b, d8685, d42d8

Other Closed Pull Requests

GPU Clique and Collective Communication Improvements: Multiple pull requests focus on enhancing GPU clique management and collective communication in XLA. These include removing deprecated global config flags, adding end-to-end tests for collective cliques with lazy device communicator allocation, consolidating GPU collectives tests into a generic API without NCCL/RCCL dependencies, and proposing removal of unsafe persistent collective cliques to reduce complexity and deadlocks.  
[pull/36632, pull/36417, pull/36663, pull/36388]

GPU Autotuning and Experimental Features: Several pull requests introduce and expand GPU autotuning capabilities and experimental flags. This includes adding an experimental flag to autotune all fusions with the Triton backend, and enabling Tensor Memory Accelerator (TMA) configurations in GPU autotuning without impacting performance or existing behavior.  
[pull/36345, pull/36401]

Backend and Platform Support Enhancements: Pull requests address backend improvements such as implementing a generic asynchronous memcpy for the SYCL backend to support profiling, fixing MIOpen linking issues for RNN kernels on ROCm, and adding support for the s390x architecture in CPU intrinsic device detection to enable specialized implementations for IBM Z platforms.  
[pull/36473, pull/36418, pull/36482]

Code Quality, Formatting, and Tooling Updates: Several pull requests improve code quality and tooling by adding IWYU pragma export directives, updating .clang-format configurations to include riegeli headers and adjust sort priorities, wrapping the hlo_extractor tool into a command line API, and proposing replacement of absl errors with xla/util errors for better stack trace capture.  
[pull/36471, pull/36722, pull/36467, pull/36633]

Test Fixes and CI/CD: Some pull requests fix failing tests and improve CI/CD processes. These include correcting buffer update offsets to fix a failing test, increasing tensor dimensions in reduction emitter tests to prevent segmentation faults on ROCm, and adding a test to trigger the CI/CD pipeline without merging functional changes.  
[pull/36524, pull/36612, pull/36744]

Logging and Documentation Improvements: A few pull requests aim to reduce logging noise in the GPU component and add documentation for development environment setup.  
[pull/36710, pull/36469]

Process Identification Migration: One pull request migrates distributed process identification from integer-based node IDs to strongly-typed ProcessId, renaming nodes to processes to better reflect multi-process setups on the same node.  
[pull/36529]
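
To show why a strongly typed identifier can be preferable to a bare integer, here is a minimal Python sketch. It does not reproduce the XLA type: the `ProcessId` dataclass and the `describe` helper below are hypothetical, written only to illustrate the general pattern of wrapping an integer so that it cannot be confused with other integer indices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessId:
    """Hypothetical strongly typed wrapper around an integer process index."""

    value: int


def describe(process: ProcessId) -> str:
    # Reject bare integers so an unrelated index cannot be passed by accident.
    if not isinstance(process, ProcessId):
        raise TypeError(f"expected ProcessId, got {type(process).__name__}")
    return f"process {process.value}"


label = describe(ProcessId(3))
```

In this sketch, `describe(3)` raises `TypeError`, and `ProcessId(3)` does not compare equal to the plain integer `3`, so mixing the two kinds of value is caught early.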

Utility Function Enhancements: A pull request proposes adding an overload for the HumanReadableElapsedTime function to accept an absl::Duration parameter, enhancing utility function flexibility.  
[pull/36522]

3.3 Pull Request Discussion Insights
This section will analyze the tone and sentiment of discussions within this project's open and closed pull requests that occurred within the past week. It aims to identify potentially heated exchanges and to maintain a constructive project environment. 
Based on our analysis, there are no instances of toxic discussions in the project's open or closed pull requests from the past week. 

IV. Contributors
4.1 Contributors
Active Contributors:
We consider an active contributor in this project to be any contributor who has made at least 1 commit, opened at least 1 issue, created at least 1 pull request, or made more than 2 comments in the last month. 
If there are more than 10 active contributors, the list is truncated to the top 10 based on contribution metrics for better clarity.

Contributor          Commits  Pull Requests  Issues  Comments
ezhulenev            64       17             1       6
alekstheod           14       1              0       3
bhavani-subramanian  15       2              0       0
pavithraes           8        1              0       0
mwhittaker           0        0              0       9
Tixxx                5        3              0       0
Eetusjo              6        1              0       0
nurmukhametov        4        2              0       0
shawnwang18          5        0              0       0
i-chaochen           3        1              0       1
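
The activity rule stated at the top of this section (at least 1 commit, issue, or pull request, or more than 2 comments) can be written as a simple predicate. The sketch below is illustrative only: the function name `is_active` is an assumption, the first three rows are copied from the table, and the `example-user` row is a hypothetical entry added to show a contributor who falls just short of the comment threshold.

```python
def is_active(commits: int, pull_requests: int, issues: int, comments: int) -> bool:
    """Return True if a contributor meets the newsletter's stated activity rule."""
    # "At least 1" for commits, pull requests, and issues; "more than 2" for comments.
    return commits >= 1 or pull_requests >= 1 or issues >= 1 or comments > 2


# (name, commits, pull requests, issues, comments)
rows = [
    ("ezhulenev", 64, 17, 1, 6),
    ("mwhittaker", 0, 0, 0, 9),
    ("shawnwang18", 5, 0, 0, 0),
    ("example-user", 0, 0, 0, 2),  # hypothetical row, not from the report
]

active_names = [name for name, *counts in rows if is_active(*counts)]
```

Note that mwhittaker qualifies through comments alone, while exactly 2 comments with no other activity would not qualify.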
