Weekly GitHub Report for Kubernetes: November 17, 2025 - November 24, 2025 (12:06:25)
Thank you for subscribing to our weekly newsletter! Each week, we deliver a comprehensive summary of your GitHub project's latest activity right to your inbox, including an overview of your project's issues, pull requests, contributors, and commit activity.
I. News
1.1 Recent Version Releases:
The current version of this repository is v1.32.3
1.2 Version Information:
The Kubernetes version released on March 11, 2025, introduces key updates detailed in the official CHANGELOG, with additional binary downloads available. For comprehensive information on new features and changes, users are encouraged to consult the Kubernetes announce forum and the linked CHANGELOG.
II. Issues
2.1 Top 5 Active Issues:
We consider active issues to be issues that have been commented on most frequently within the last week. Bot comments are omitted.
- Conntrack entries cleanup efficiency degradation ( > 1.32 ) causing rule synchronization delays and service access issues: This issue describes a significant degradation in the efficiency of conntrack entries cleanup in Kubernetes versions greater than 1.32, caused by a change from batch deletion using the conntrack -D command to individual entry deletion via the netlink API. This inefficiency leads to prolonged rule synchronization delays and service access problems, especially under high DNS query loads, resulting in multi-minute cleanup times that block timely service rule updates and cause DNS resolution failures during endpoint rollouts.
  - The comments discuss that the problem is inherent to the current conntrack cleanup logic, which deletes entries one by one and causes exponential matching overhead due to multiple filters. It is clarified that switching proxy modes (iptables vs nftables) does not resolve the issue since both use the same slow cleanup method. Contributors share detailed profiling results pinpointing the exponential calls to the matching function as the main bottleneck and propose optimizing the filter logic by consolidating multiple filters into a single efficient filter, with one contributor reporting a local improvement reducing cleanup time from 17 seconds to 1 second for 50,000 entries and planning to submit a PR.
  - Number of comments this week: 12
- [NodeDeclaredFeatures] GuaranteedQoSPodCPUResize version skew problem: This issue addresses a version skew problem with the GuaranteedQoSPodCPUResize node declared feature (NDF) in Kubernetes, where the feature's enforcement depends on both a feature gate and node configuration, causing inconsistent behavior during cluster upgrades. Specifically, nodes running older versions without the declared feature can cause regressions when the control plane is upgraded and the feature becomes beta and enabled by default, leading to unexpected rejections of CPU resize requests for guaranteed pods.
  - The comments discuss the root cause being the interplay between feature gates and node declared features across versions, with suggestions to invert the logic of feature enforcement to reject resizes based on a "disabled" feature rather than requiring a feature to be set. Participants consider the implications of this approach, the limitations of NDF in this use case, and the acceptability of shifting error detection earlier in the admission process, concluding that while the current behavior is inconsistent, it should stabilize once NDF is fully enabled on all nodes.
  - Number of comments this week: 11
- Inconsistent behavior when switching between Headless and ClusterIP Services: This issue describes an inconsistency in Kubernetes service behavior when switching a Service from Headless (clusterIP: None) to a regular ClusterIP type. Specifically, modifying a Headless Service by removing or changing the clusterIP field appears to succeed without error, but the change does not actually take effect, whereas the reverse operation (regular to Headless) correctly triggers a validation error.
  - The comments clarify that the observed behavior is due to how the API server handles patches: removing the clusterIP field in a patch does not actually delete it but retains the original value, so no real change occurs and no error is raised. It is confirmed that the clusterIP field is immutable once set, and attempts to assign a new IP fail with validation errors. The discussion concludes that this is working as intended, though it can be confusing to users expecting consistent validation or automatic reassignment during updates.
  - Number of comments this week: 9
- HandleError: broken downstream handler implementations: This issue addresses the problem of downstream error handler implementations not fully adapting to a Kubernetes API change introduced in version 1.31, which added new parameters to error handling functions. The proposal suggests breaking the API again in version 1.36 to introduce a more structured ErrorHandler interface using an ErrLog type, aiming to prevent crashes caused by nil errors and to encourage downstream consumers to properly update their implementations.
  - The comments reflect general agreement on the problem and the proposed structured API change, but some express concern that breaking the API again might be worse than occasional crashes; instead, expanding documentation to clarify that nil errors are not caller bugs is suggested. There is consensus on making downstream consumers responsible for fixing their handlers, with offers to help file bug reports and provide code snippets to ease the transition, and no objections to postponing a PR until feedback is fully gathered.
  - Number of comments this week: 7
- DATA RACE: test/integration/volumescheduling TestVolumeBindingRescheduling: This issue reports a data race detected in the TestVolumeBindingRescheduling integration test, specifically within the structured-merge-diff library used by the Kubernetes apiserver. The problem involves concurrent read and write operations on a map structure, and a fix has been made upstream in the structured-merge-diff repository but still needs to be vendored into Kubernetes and included in an upcoming release.
  - The comments identify the exact lines in the code where the race occurs, confirm that the bug was fixed upstream, discuss tagging a new release of the dependency, agree on including the fix in Kubernetes version 1.35, and track progress with a pull request to vendor the updated library and resolve the issue.
  - Number of comments this week: 7
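The filter consolidation proposed in the conntrack cleanup discussion can be illustrated with a small Go sketch. It is not kube-proxy's code: the `flowKey` type and both functions are hypothetical, and the sketch only shows the general idea of replacing a per-filter match on every entry with a single set lookup per entry.

```go
package main

import "fmt"

// flowKey is a hypothetical identifier for a conntrack entry.
// It is an assumption for illustration, not kube-proxy's data structure.
type flowKey struct {
	DstIP   string
	DstPort uint16
	Proto   string
}

// naiveStale checks every entry against every filter, so the work grows
// with len(entries) * len(filters).
func naiveStale(entries, filters []flowKey) []flowKey {
	var stale []flowKey
	for _, e := range entries {
		for _, f := range filters {
			if e == f {
				stale = append(stale, e)
				break
			}
		}
	}
	return stale
}

// consolidatedStale folds all filters into one set first, then performs a
// single map lookup per entry.
func consolidatedStale(entries, filters []flowKey) []flowKey {
	set := make(map[flowKey]struct{}, len(filters))
	for _, f := range filters {
		set[f] = struct{}{}
	}
	var stale []flowKey
	for _, e := range entries {
		if _, ok := set[e]; ok {
			stale = append(stale, e)
		}
	}
	return stale
}

func main() {
	entries := []flowKey{
		{DstIP: "10.0.0.10", DstPort: 53, Proto: "udp"},
		{DstIP: "10.0.0.11", DstPort: 80, Proto: "tcp"},
		{DstIP: "10.0.0.12", DstPort: 53, Proto: "udp"},
	}
	filters := []flowKey{
		{DstIP: "10.0.0.10", DstPort: 53, Proto: "udp"},
		{DstIP: "10.0.0.12", DstPort: 53, Proto: "udp"},
	}
	fmt.Println(len(naiveStale(entries, filters)), len(consolidatedStale(entries, filters)))
}
```

Both functions select the same entries; only the amount of matching work differs. The real filters may match on ranges or partial tuples, in which case a plain map key would not be sufficient.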
2.2 Top 5 Stale Issues:
We consider stale issues to be issues that have had no activity within the last 30 days. The team should work together to get these issues resolved and closed as soon as possible.
- Zone-aware down scaling behavior: This issue describes a problem with the zone-aware downscaling behavior of Kubernetes Horizontal Pod Autoscaler (HPA) workloads, where during scale-in events the distribution of pods across availability zones becomes unbalanced despite using topology spread constraints with maxSkew: 1. Specifically, the reporter observes that pods are unevenly terminated, resulting in one zone having significantly fewer pods and causing high CPU usage on the remaining pod in that zone, which contradicts the expected behavior of maintaining an even pod spread across zones.
- apimachinery's unstructured converter panics if the destination struct contains private fields: This issue describes a panic occurring in the apimachinery's DefaultUnstructuredConverter when it attempts to convert an unstructured object into a destination struct that contains private (non-exported) fields. The reporter highlights that the converter should ideally skip these private fields instead of panicking, as this problem arises notably with protobuf-generated gRPC structs that include private fields for internal state, causing the conversion process to fail unexpectedly.
- Integration tests for kubelet image credential provider: This issue proposes adding integration tests for the kubelet image credential provider, similar to the existing tests for client-go credential plugins. It suggests that since there are already integration tests for pod certificate functionality, implementing tests for kubelet credential plugins would be a logical and beneficial extension.
- conversion-gen generates code that leads to panics when fields are accessed after conversion: This issue describes a bug in the conversion-gen tool where it generates incorrect conversion code for structs that have changed field types between API versions, specifically causing unsafe pointer conversions instead of proper recursive conversion calls. As a result, accessing certain fields like ExclusiveMaximum after conversion leads to runtime panics, highlighting the need for conversion-gen to produce safe and correct conversion functions.
- Failure cluster [ff7a6495...] TestProgressNotify fails when etcd in k/k upgraded to 3.6.2: This issue describes a failure in the TestProgressNotify test that occurs when the etcd component in the Kubernetes project is upgraded to version 3.6.2. The test times out after 30 seconds waiting on a result channel, with multiple errors indicating that the embedded etcd server fails to set up serving due to closed network connections and server shutdowns.
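The behavior requested in the unstructured converter issue above, skipping private fields rather than panicking, can be sketched with Go's `reflect` package. This is a minimal illustration under assumptions: `target` and `fillFromMap` are hypothetical names, and the code is not the apimachinery converter.

```go
package main

import (
	"fmt"
	"reflect"
)

// target mimics a generated struct that mixes exported fields with
// unexported internal state. Hypothetical type for illustration.
type target struct {
	Name  string
	Count int64
	state int // unexported: reflection cannot set this field
}

// fillFromMap copies matching map values into exported struct fields and
// skips unexported or non-settable fields instead of panicking on them.
func fillFromMap(src map[string]interface{}, dst interface{}) error {
	v := reflect.ValueOf(dst)
	if v.Kind() != reflect.Ptr || v.IsNil() || v.Elem().Kind() != reflect.Struct {
		return fmt.Errorf("dst must be a non-nil pointer to a struct")
	}
	v = v.Elem()
	t := v.Type()
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if !f.IsExported() || !v.Field(i).CanSet() {
			continue // private field: leave it untouched
		}
		raw, ok := src[f.Name]
		if !ok {
			continue
		}
		rv := reflect.ValueOf(raw)
		if !rv.IsValid() || !rv.Type().AssignableTo(f.Type) {
			return fmt.Errorf("field %s: cannot assign %T", f.Name, raw)
		}
		v.Field(i).Set(rv)
	}
	return nil
}

func main() {
	var out target
	src := map[string]interface{}{"Name": "demo", "Count": int64(3), "state": 7}
	err := fillFromMap(src, &out)
	fmt.Println(out.Name, out.Count, out.state, err)
}
```

Calling `Set` on an unexported field through reflection panics, which is why the guard comes before any assignment.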
2.3 Open Issues
This section lists, groups, and then summarizes issues that were created within the last week in the repository.
Issues Opened This Week: 23
Summarized Issues:
- Certificate Management: This issue seeks a straightforward method to manually update a near-expiring root certificate in Kubernetes, highlighting the need for better certificate lifecycle management. Without such a method, administrators may face challenges in maintaining secure cluster operations.
- issues/135321
- Conntrack Cleanup Efficiency: Starting with Kubernetes v1.32, the switch from batch deletion of conntrack entries to individual deletions via netlink API has drastically degraded cleanup efficiency. This causes prolonged rule synchronization delays and service access failures, especially under high UDP DNS query loads.
- issues/135323
- Test Failures and Performance Regressions: Multiple test failures have been reported, including ClusterLoaderV2 API responsiveness exceeding latency thresholds, flaky POD Resources API tests under CPU manager Static policy, and node-kubelet-cri-proxy-all-alpha job failures related to container restarts and image pullbacks. These failures impact reliability and performance validation in Kubernetes CI.
- issues/135326, issues/135373, issues/135374
- Version Skew and Feature Gate Issues: The GuaranteedQoSPodCPUResize feature causes inconsistent behavior due to dependency on both feature gates and node configuration, leading to regressions when upgrading clusters. This version skew is a beta-blocker for Kubernetes 1.36 and complicates cluster upgrades.
- issues/135329
- Service Object Validation and Behavior Inconsistencies: Creating a Service without a name results in invalid IP allocation due to premature ClusterIP assignment, and modifying Headless Services to regular ClusterIP Services silently fails without error. These issues cause invalid resource states and confusing user experiences.
- issues/135333, issues/135345
- Admission and Validation Improvements: Proposals include moving InPlacePodVerticalScaling feasibility checks to the admission controller to prevent infeasible requests and requests for testing utilities to better validate CEL expressions used in CRD validation and admission policies. These aim to improve request validation and testing capabilities.
- issues/135341, issues/135351
- Batch Processing Enhancements: Adding batch update functionality to the cloud node controller is proposed to improve throughput and reduce API request volume by processing multiple node status updates simultaneously. This would optimize controller efficiency and reduce load.
- issues/135344
- Error Handling and API Stability: Downstream error handler implementations have not fully adapted to prior API changes, causing incomplete error handling and potential crashes. A proposal to intentionally break the ErrorHandler API again aims to enforce proper updates and improve structured error logging.
- issues/135349
- Namespace Lookup Race Condition: A race condition in ValidatingAdmissionPolicy namespace lookup causes spurious "namespace not found" errors when creating workloads, due to informer delays and lack of fallback to live client lookups. This leads to intermittent failures in workload creation.
- issues/135352
- API Resource Definition and Code Generation Issues: Missing k8s:optional tags on API resource fields can cause generated code to mishandle nil pointers, leading to potential runtime errors. Proper tagging is necessary for safe code generation.
- issues/135354
- Node Object Update Inefficiencies: Unsorted lists of runtime OCI handlers from CRI runtimes cause redundant Kubernetes node updates due to order changes, even when handler sets remain unchanged. This inefficiency leads to unnecessary node object updates that should be avoided.
- issues/135357
- Scheduler Preemption Test Failures: The Kubernetes scheduler's asynchronous preemption e2e test fails because low priority pods are not consistently preempted, as some pods never have their DeletionTimestamp set before test timeout. This causes unreliable test outcomes.
- issues/135370
- Kubelet and CRI-O Test Failures: The node-kubelet-serial-crio CI test fails with errors related to CRI Proxy and SSH commands, possibly due to Service Account issues, causing ongoing test instability.
- issues/135375
- Validation Option Computation Cleanup: The current computation of validation options suffers from improper overriding, inconsistent feature gate checks, and confusing interactions. A cleanup is proposed to set initial states based on feature gates and simplify logic, potentially with a new framework.
- issues/135376
- Cache Management for Persistent Volumes and Claims: Adding a persistent volume and persistent volume claim cache manager to the device plugin sync worker pool is proposed to reduce excessive API server requests during frequent resyncs. This would improve performance and prevent pod creation delays in clusters with many PVCs.
- issues/135379
- Data Race in Volume Binding Tests: A data race detected in the TestVolumeBindingRescheduling integration test is caused by concurrent access in the structured-merge-diff library. Although fixed upstream, Kubernetes requires vendoring the updated dependency to resolve this.
- issues/135384
- Persistent Volume Reclaim Policy Enhancement: A new reclaimPolicy is requested to retain PVs but remove claimRef UUIDs, allowing new PVCs with the same metadata but different UUIDs to bind existing PVs. This supports workflows like the Phoenix pattern and frequent infrastructure rebuilds.
- issues/135387
- Flaky Test Due to Timing-Dependent Errors: The TestApplyCRDuringCRDFinalization test intermittently fails because the expected error about disallowing creation during CRD finalization is not observed, instead timing out with a context deadline exceeded error. This causes unreliable test results.
- issues/135403
- Device Resource Quota Bypass Loophole: A loophole in DRA extended resource quota verification allows users to bypass quota limits by creating multiple resource claims falsely attributed to the same pod, circumventing device request restrictions imposed by system administrators.
- issues/135404
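For the node object update inefficiency listed above (issues/135357), one way to avoid updates triggered purely by ordering is to compare sorted copies of the handler lists. The sketch below assumes a hypothetical `runtimeHandler` type and helper names; it does not reflect the kubelet's actual types or the fix that may be adopted.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

// runtimeHandler is a hypothetical stand-in for a handler entry reported
// by a CRI runtime.
type runtimeHandler struct {
	Name string
}

// normalized returns a sorted copy so that ordering differences alone do
// not register as a change. The input slice is left unmodified.
func normalized(in []runtimeHandler) []runtimeHandler {
	out := make([]runtimeHandler, len(in))
	copy(out, in)
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

// handlersChanged reports whether the two handler lists differ in content,
// ignoring the order in which the runtime returned them.
func handlersChanged(prev, next []runtimeHandler) bool {
	return !reflect.DeepEqual(normalized(prev), normalized(next))
}

func main() {
	prev := []runtimeHandler{{Name: "runc"}, {Name: "crun"}}
	reordered := []runtimeHandler{{Name: "crun"}, {Name: "runc"}}
	extended := []runtimeHandler{{Name: "crun"}, {Name: "runc"}, {Name: "kata"}}

	fmt.Println(handlersChanged(prev, reordered)) // same set, different order
	fmt.Println(handlersChanged(prev, extended))  // a handler was added
}
```

Sorting a copy keeps the caller's slice intact, which matters if the original order is still needed elsewhere.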
2.4 Closed Issues
This section lists, groups, and then summarizes issues that were closed within the last week in the repository. This section also links the associated pull requests if applicable.
Issues Closed This Week: 0
Summarized Issues:
As of our latest update, there were no issues closed in the project this week.
2.5 Issue Discussion Insights
This section will analyze the tone and sentiment of discussions within this project's open and closed issues that occurred within the past week. It aims to identify potentially heated exchanges and to maintain a constructive project environment.
Based on our analysis, there are no instances of toxic discussions in the project's open or closed issues from the past week.
III. Pull Requests
3.1 Open Pull Requests
This section provides a summary of pull requests that were opened in the repository over the past week. The top three pull requests with the highest number of commits are highlighted as 'key' pull requests. Other pull requests are grouped based on similar characteristics for easier analysis. Up to 25 pull requests are displayed in this section, while any remaining pull requests beyond this limit are omitted for brevity.
Pull Requests Opened This Week: 57
Key Open Pull Requests
1. [1.36] Remove intree volume plugin portworx: This pull request removes the Portworx in-tree volume plugin from Kubernetes, including the associated feature gates and dependencies on the storage.alpha.kubernetes.io/migrated-plugins annotation, marking it as deprecated and completing the migration to CSI for Portworx volumes.
- URL: pull/135322
- Merged: No
2. RFC: apimachinery + client-go + device taint eviction unit test: context-aware Start/WaitFor, waiting through channels: This pull request improves the unit tests for device taint eviction in Kubernetes by introducing context-aware Start and WaitForCacheSync methods that wait through channels instead of polling, resulting in clearer, less noisy logs, more predictable test timing, and better synchronization visibility, while also updating related client-go and apimachinery components to support these enhancements.
- URL: pull/135395
- Merged: No
3. Emit an event when the result of a probe for a container changes: This pull request introduces a new event emitted by the probe worker in Kubernetes that notifies users when the result of a container's readiness, liveness, or startup probe changes between Success and Failure states, providing contextual failure information to improve observability and address the lack of failure details in existing probe mechanisms.
- URL: pull/135401
- Merged: No
Other Open Pull Requests
- Feature gate removals: Multiple pull requests propose the removal of generally available and permanently enabled feature gates such as AnyVolumeDataSource, HonorPVReclaimPolicy, MemoryManager, ServiceAccountTokenPodNodeInfo, CustomResourceFieldSelectors, and StructuredAuthorizationConfiguration. These removals reflect their locked status since Kubernetes versions 1.32 and 1.33 and are part of cleanup efforts documented in various KEPs.
- Etcd version updates: Two pull requests update the Kubernetes project to build and use etcd client library version 3.6.6. These updates ensure that Kubernetes integrates the latest etcd release and SDK improvements.
- Scheduler improvements and fixes: Several pull requests address scheduler-related issues including fixing a memory leak in the scheduler cache, correcting queue hint logic for inter-pod anti-affinity to avoid scheduling delays, adding a performance test for scheduling pods with anti-affinity, and introducing parallel execution of PreBind plugins to reduce binding cycle latency.
- Bug fixes and reliability improvements: Pull requests fix various bugs such as increasing CBOR array limits to prevent controller failures, fixing kube-proxy endpoint selection to prefer active endpoints, and improving the kubectl autoscale command fallback logic for CPUPercent-only configurations. These changes enhance stability and correctness in different Kubernetes components.
- Test enhancements and flakiness investigations: Some pull requests improve test reliability and clarity by injecting delays to reproduce race conditions, replacing generic contexts with specific test contexts in CEL tests, and migrating manual DeepCopy methods to auto-generated implementations to fix lint errors and prevent issues.
- Code quality and build optimizations: Pull requests include refactoring command flag implementations to improve code quality and optimizing container image builds by skipping weak dependencies and cleaning up cache directories, which reduces image size and improves build efficiency.
- Logging and message clarity improvements: One pull request improves the clarity of log messages emitted by the client-go reflector when the WatchListClient feature is disabled, reducing confusion about which client does not support WatchList semantics and the required actions.
- Automated cherry picks for fixes: Two pull requests are automated cherry picks of fixes from the main branch to release-1.33, addressing a data race in ResolverTypeProvider and spurious alpha API warning log messages caused by patch version differences.
3.2 Closed Pull Requests
This section provides a summary of pull requests that were closed in the repository over the past week. The top three pull requests with the highest number of commits are highlighted as 'key' pull requests. Other pull requests are grouped based on similar characteristics for easier analysis. Up to 25 pull requests are displayed in this section, while any remaining pull requests beyond this limit are omitted for brevity.
Pull Requests Closed This Week: 5
Key Closed Pull Requests
1. thisiasdjadfljsdf: This pull request is about adding simple text files named test.txt and bharat.txt to the Kubernetes project, as indicated by the commit messages, but it was not merged.
- URL: pull/135383
- Merged: No
2. Fix alpha API warnings for patch version differences: This pull request fixes spurious alpha API warning log messages caused by comparing full semantic versions including patch numbers by modifying the version comparison logic to only consider major and minor versions, thereby preventing unnecessary warnings when the binary and emulation versions differ solely in patch version.
- URL: pull/135327
- Merged: Yes
- Associated Commits: e08c1
3. Build etcd image with 3.6.6: This pull request proposes updating the Kubernetes master branch to build the etcd image using version 3.6.6, incorporating the latest release from the etcd project.
- URL: pull/135332
- Merged: No
- Associated Commits: 4a3c7
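The comparison described for pull/135327, treating two versions as equal when they differ only in patch level, can be sketched as follows. The helper names are hypothetical and the string parsing is a simplified stand-in; the pull request itself may rely on Kubernetes' own version utilities.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// majorMinor extracts the major and minor components from a version string
// such as "v1.33.4" or "1.33". Simplified parsing for illustration only.
func majorMinor(v string) (int, int, error) {
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) < 2 {
		return 0, 0, fmt.Errorf("version %q has no minor component", v)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return 0, 0, fmt.Errorf("version %q: bad major: %w", v, err)
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil {
		return 0, 0, fmt.Errorf("version %q: bad minor: %w", v, err)
	}
	return major, minor, nil
}

// sameMajorMinor reports whether two versions differ at most in patch level.
func sameMajorMinor(a, b string) (bool, error) {
	aMajor, aMinor, err := majorMinor(a)
	if err != nil {
		return false, err
	}
	bMajor, bMinor, err := majorMinor(b)
	if err != nil {
		return false, err
	}
	return aMajor == bMajor && aMinor == bMinor, nil
}

func main() {
	same, _ := sameMajorMinor("v1.33.4", "v1.33.0")
	diff, _ := sameMajorMinor("v1.33.4", "v1.32.9")
	fmt.Println(same, diff)
}
```

With a check of this shape, a warning would be emitted only when the binary and emulation versions differ in major or minor version.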
Other Closed Pull Requests
- Bug Fixes in Benchmark Tests: This topic covers pull requests that address issues causing failures in benchmark tests, specifically fixing a bug in the scheduler_perf benchmark where tests fail due to nil featureGates. The fix prevents a panic caused by assigning entries to a nil map, ensuring stability in test execution.
- pull/135365
- Feature Placeholder: This topic includes pull requests that serve as placeholders or templates for new features without specific implementation details or commits. These submissions are not merged and do not contribute functional changes to the project.
- pull/135382
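The class of bug fixed in the scheduler_perf benchmark, assigning entries to a nil map, can be reproduced and guarded against in a few lines of Go. The `workload` type and `setGate` method below are hypothetical and only mirror the description above, not the benchmark's code.

```go
package main

import "fmt"

// workload is a hypothetical test-case struct with an optional feature
// gate map that may be left nil by the caller.
type workload struct {
	FeatureGates map[string]bool
}

// setGate records a gate value, allocating the map on first use.
// Writing to a nil map panics in Go, so the nil check prevents the crash.
func (w *workload) setGate(name string, enabled bool) {
	if w.FeatureGates == nil {
		w.FeatureGates = make(map[string]bool)
	}
	w.FeatureGates[name] = enabled
}

func main() {
	var w workload // FeatureGates is nil at this point
	w.setGate("ExampleGate", true)
	fmt.Println(w.FeatureGates["ExampleGate"])
}
```

Reading from a nil map is safe and returns the zero value; only writes panic, which is why such bugs tend to surface only on code paths that mutate the map.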
3.3 Pull Request Discussion Insights
This section will analyze the tone and sentiment of discussions within this project's open and closed pull requests that occurred within the past week. It aims to identify potentially heated exchanges and to maintain a constructive project environment.
Based on our analysis, there are no instances of toxic discussions in the project's open or closed pull requests from the past week.
IV. Contributors
4.1 Contributors
Active Contributors:
We consider an active contributor in this project to be any contributor who has made at least 1 commit, opened at least 1 issue, created at least 1 pull request, or made more than 2 comments in the last month.
If there are more than 10 active contributors, the list is truncated to the top 10 based on contribution metrics for better clarity.
| Contributor | Commits | Pull Requests | Issues | Comments |
|---|---|---|---|---|
| pohly | 37 | 4 | 17 | 52 |
| bwsalmon | 55 | 3 | 2 | 14 |
| liggitt | 9 | 2 | 0 | 52 |
| neolit123 | 4 | 3 | 1 | 52 |
| carlory | 32 | 10 | 0 | 17 |
| macsko | 29 | 3 | 3 | 20 |
| tchap | 37 | 2 | 0 | 8 |
| michaelasp | 7 | 5 | 0 | 26 |
| BenTheElder | 11 | 2 | 0 | 25 |
| darshansreenivas | 26 | 6 | 0 | 4 |