Weekly Project News

Weekly GitHub Report for Kubernetes: September 15, 2025 - September 22, 2025 (12:08:26)

Weekly GitHub Report for Kubernetes

Thank you for subscribing to our weekly newsletter! Each week, we deliver a comprehensive summary of your GitHub project's latest activity right to your inbox, including an overview of your project's issues, pull requests, contributors, and commit activity.


Table of Contents

  • I. News
    • 1.1. Recent Version Releases
    • 1.2. Version Information
  • II. Issues
    • 2.1. Top 5 Active Issues
    • 2.2. Top 5 Stale Issues
    • 2.3. Open Issues
    • 2.4. Closed Issues
    • 2.5. Issue Discussion Insights
  • III. Pull Requests
    • 3.1. Open Pull Requests
    • 3.2. Closed Pull Requests
    • 3.3. Pull Request Discussion Insights
  • IV. Contributors
    • 4.1. Contributors

I. News

1.1 Recent Version Releases:

The current version of this repository is v1.32.3

1.2 Version Information:

The Kubernetes version released on March 11, 2025, introduces key updates detailed in the official CHANGELOG, with additional binary downloads available. For comprehensive information on new features and changes, users are encouraged to consult the Kubernetes announce forum and the linked CHANGELOG.

II. Issues

2.1 Top 5 Active Issues:

We consider active issues to be issues that have been commented on most frequently within the last week. Bot comments are omitted.

  1. No longer testing CgroupV1 with fedora-coreos images: This issue addresses the problem that Fedora CoreOS images are booting with cgroupv2 by default despite configuration attempts to enable cgroupv1, causing certain Kubernetes tests intended for cgroupv2 to run unexpectedly. It highlights the upstream changes in systemd versions 257 and 258 that reduce and ultimately remove support for cgroupv1, suggesting that continued testing and support for cgroupv1 on Fedora images may no longer be viable and should be phased out.

    • The comments explain that systemd 257 relaxed cgroupv1 restrictions but does not fully honor legacy settings, while systemd 258 removes cgroupv1 support entirely; contributors agree on discontinuing cgroupv1 testing on Fedora images and propose a follow-up KEP to formally deprecate cgroupv1 support and remove related test lanes, emphasizing the need for a coordinated approach rather than piecemeal changes.
    • Number of comments this week: 9
  2. require probes: This issue discusses whether the kubectl CLI, various static analysis tools, and the kubelet should require pods to have liveness and readiness probes by default, proposing to default these probes to the common /healthz HTTP(S) endpoint. The concern is that many real-world deployments lack basic health checks, which can lead to failures, but enforcing such a requirement would be a breaking change and is currently not supported in Kubernetes core.

    • The comments clarify that kubectl does not expose probes and that not all workloads need them, with alternatives available for liveness and readiness checks. The proposal to require probes by default is seen as a breaking change unlikely to be accepted, but users can enforce probe requirements via admission policies or webhooks. Suggestions include creating a new resource type if strict enforcement is desired, and there is openness to contributing admission policies, though such efforts require maintenance and human resources. The issue is being treated as a support request rather than a feature change, with a potential new feature request encouraged for publishing recommended baseline policies.
    • Number of comments this week: 8
  3. Failure cluster [ed3d2e2b...]: must reuse one gRPC connection for service and health-monitoring calls: This issue concerns a failing end-to-end node test in Kubernetes related to dynamic resource allocation (DRA), specifically verifying that a single gRPC connection is reused for both service and health-monitoring calls. The test fails because some resource claims remain in use when the driver is uninstalled, indicating pods are not properly deleted during cleanup, and the discussion explores potential causes and fixes around pod deletion timing and test cleanup ordering.

    • The comments reveal that reverting a recent test cleanup commit and explicitly deleting test pods during the test resolves the failure, but questions remain about why pods cannot be deleted later in the deferred cleanup phase. Further investigation suggests the DRA driver might be uninstalled before pods are deleted, causing cleanup issues, and attempts to log pod status during cleanup show pods still present despite deletion calls, but no definitive root cause is identified.
    • Number of comments this week: 6
  4. get rid of old conntrack setup workaround in kube-proxy: This issue addresses the removal of an outdated workaround in kube-proxy related to setting a conntrack parameter at startup, which was originally implemented due to a kernel bug affecting hostNetwork containers with sysfs mounts. The problem is believed to be resolved in newer versions of containerd and cri-o, but since kube-proxy does not yet require these versions, the workaround remains; the issue calls for investigation to confirm if the bug still exists and to decide whether the workaround can be safely removed.

    • The comments include labeling the issue for network and kube-proxy teams and prioritizing it, a reminder to consider rootless containers and containerd support timelines, requests for clarification on what problems remain, and confirmation that the workaround only applies to fully-privileged containers after an init step, helping to clarify the scope and next steps for removing the hack.
    • Number of comments this week: 5
  5. Sig-node-cri-o Serial PSI metrics tests failing for Cgroupv2 on Fedora Coreos: This issue concerns the failure of the Sig-node-cri-o Serial PSI metrics tests specifically for cgroupv2 on Fedora CoreOS, with the memory PSI test not reporting correctly while the CPU and I/O PSI tests pass. The problem appears isolated to Fedora CoreOS and may be related to kernel or operating system behavior, as it does not occur on Ubuntu or in containerd environments, and ongoing investigations involve collaboration with Red Hat engineers.

    • The comments include acknowledgments and triage acceptance, detailed observations that memory PSI pressure is momentary and not properly reflected in PodStats, reports of OOMKills affecting pod states, clarification of the specific test focus, and coordination on issue assignment and investigation efforts.
    • Number of comments this week: 5

2.2 Top 5 Stale Issues:

We consider stale issues to be issues that have had no activity within the last 30 days. The team should work together to get these issues resolved and closed as soon as possible.

  1. Zone-aware down scaling behavior: This issue describes a problem with the horizontal pod autoscaler (HPA) scale-in behavior in a Kubernetes deployment that uses topology spread constraints to evenly distribute pods across multiple zones. Specifically, during scale-in events, the pods become unevenly distributed with one zone having significantly fewer pods than allowed by the maxSkew: 1 setting, causing high CPU usage on the lone pod in that zone and violating the expected balanced pod distribution.
  2. apimachinery's unstructured converter panics if the destination struct contains private fields: This issue describes a panic occurring in the apimachinery's DefaultUnstructuredConverter when it attempts to convert an unstructured object into a destination struct that contains private (non-exported) fields. The reporter highlights that the converter should ideally ignore these private fields instead of panicking, especially since protobuf-generated gRPC structs often include private fields that cause this failure despite the unstructured data only containing public fields.
  3. Integration tests for kubelet image credential provider: This issue proposes adding integration tests for the kubelet image credential provider, similar to the existing tests for client-go credential plugins. It suggests that since there are already integration tests for pod certificate functionality, implementing tests for the kubelet credential plugins would be a logical and beneficial extension.
  4. conversion-gen generates code that leads to panics when fields are accessed after conversion: This issue describes a bug in the conversion-gen tool where it generates incorrect conversion code for structs that have changed field types between API versions, specifically causing unsafe pointer conversions instead of properly calling the conversion functions. As a result, accessing certain fields like ExclusiveMaximum after conversion leads to runtime panics, highlighting the need for conversion-gen to produce safe and correct code to prevent such crashes.
  5. Failure cluster [ff7a6495...] TestProgressNotify fails when etcd in k/k upgraded to 3.6.2: This issue describes a failure in the TestProgressNotify test that occurs when the etcd component in the Kubernetes project is upgraded to version 3.6.2. The test times out after 30 seconds waiting on a result channel, with multiple errors indicating that the embedded etcd server fails to set up serving due to closed network connections and server shutdowns.

2.3 Open Issues

This section lists, groups, and then summarizes issues that were created within the last week in the repository.

Issues Opened This Week: 24

Summarized Issues:

  • Security Vulnerability in Kubernetes C# Client: A security vulnerability (CVE-2025-9708) exists in the Kubernetes C# client due to improper certificate validation in custom CA mode, allowing attackers to perform man-in-the-middle attacks by accepting forged certificates. This flaw potentially compromises communication with the Kubernetes API server.
  • issues/134063
  • Dependency Update Causing Compilation Errors: Updating the indirect dependency go.opentelemetry.io/auto/sdk from version 1.1.0 to 1.2.0 causes integer overflow compilation errors on certain platforms, breaking the kind-master-dependencies job in Kubernetes. This update introduces build failures that affect the project's CI pipeline.
  • issues/134066
  • Time Adjustment Impacting Kubelet Health Checks: After a manual system time adjustment due to a malfunctioning external clock source, the kubelet fails to continuously execute health check scripts, causing services to fail without automatic restart. This issue highlights the kubelet's inability to handle time jumps properly.
  • issues/134077
  • Kube-apiserver Connection Errors to Local Etcd: The kube-apiserver logs are flooded every 15 seconds with connection errors to the local etcd endpoint despite the cluster functioning normally. This occurs in Kubernetes v1.34.1 with Talos 1.11.0 and etcd 3.6.4, indicating spurious connection failures.
  • issues/134080
  • Memory Consumption from Sequential Exec Commands: Executing multiple sequential exec commands in a loop using the Kubernetes client consumes a large amount of memory, as demonstrated by a Go code example. This behavior can lead to resource exhaustion during repeated exec operations.
  • issues/134082
  • Concurrency and Data Race Issues in Kubernetes Components: Multiple concurrency problems exist, including slice operations without synchronization in the volume manager causing lost errors, and data races in e2e Job tests where parallel goroutines write to the same error variable without proper synchronization. These issues risk unstable behavior and test flakiness.
  • issues/134083, issues/134091
  • Dynamic Resource Allocation Bugs and Improvements: The dynamic resource allocation system incorrectly skips devices that have consumed capacity but still have remaining resources, preventing multiple allocations despite allowMultipleAllocations being enabled. Additionally, the DRA plugin does not reuse a single gRPC connection properly, causing test timeouts, and driver names are case-sensitive but lack documentation warnings, leading to user confusion.
  • issues/134100, issues/134125, issues/134131
  • Dynamic Resources Scheduler Enhancement Proposal: A proposal exists to add a generic scorer for all dynamic resources in the dynamic resources scheduler plugin to improve resource binpacking, replacing the current DRA extended resources scorer in the noderesources plugin. This aims to enhance scheduling efficiency.
  • issues/134135
  • Cgroupv2 and Fedora CoreOS Testing Issues: The Sig-node-cri-o serial PSI metrics tests fail on Fedora CoreOS due to kernel or OS-specific problems with memory pressure PSI metrics under cgroupv2. Fedora CoreOS images default to cgroupv2 despite attempts to enable cgroupv1, prompting discussions about deprecating cgroupv1 support and related tests.
  • issues/134141, issues/134142
  • Kube-proxy Conntrack and IPVS Issues: Kube-proxy prematurely flushes conntrack entries for pods marked ready=false, disrupting valid UDP flows like SIP over UDP, and fails to remove stale IPVS virtual server entries outside excludeCIDRs, causing persistent invalid entries and potential traffic drops. These bugs affect network reliability and service continuity.
  • issues/134143, issues/134147
  • Conntrack Proxy Test Flakiness Due to Keying Bug: The conntrack proxy test incorrectly reinjects invalid TCP packets for closed connections because the server uses an IP-only key in its pending connection map. This causes flaky test failures that can be fixed by keying the map with both IP and client port.
  • issues/134116
  • Removal of Outdated Kube-proxy Workaround: An outdated kube-proxy workaround for a kernel bug affecting conntrack parameter settings at startup may no longer be necessary due to fixes in newer container runtimes like containerd 2.0.3+ and cri-o. The issue seeks to confirm if the bug still exists to safely remove the workaround.
  • issues/134108
  • Kubernetes Probe and Admission Plugin Issues: Kube-probe sends a non-RFC-compliant Host header containing only a port number during HTTP probes instead of including the Pod IP as hostname. Also, the Mutating Admission Plugin rejects pods with duplicate environment variable names due to parse errors, despite such pods being valid without MAP or with a Validating Admission Plugin.
  • issues/134170, issues/134167
  • Proposal to Standardize Liveness and Readiness Probes: There is a proposal to require liveness and readiness probes by default in Kubernetes tools, standardizing them to the common /healthz HTTP(S) endpoint to improve deployment reliability. Concerns about breaking changes have led to suggestions for alternative enforcement via admission policies or webhooks.
  • issues/134119
  • Etcd Image Missing etcdutl Tool: The Kubernetes etcd image from version 1.34 onwards lacks the etcdutl tool, preventing users from running etcdctl restore within the image as before. This omission requires adding etcdutl to restore previous functionality without external tools.
  • issues/134148
  • Kubelet Container Creation Retry Bug After Clock Shift: After a node clock shifts backwards during a container restart triggered by a liveness failure, the kubelet retries container creation with the same attempt number, causing containerd to fail reserving the container name due to conflicts with the existing container. This leads to container creation failures.
  • issues/134153
  • High Flake Rate in Horizontal Pod Autoscaling E2E Test: The pull-kubernetes-e2e-autoscaling-hpa-cpu test job experiences a high flake rate due to unexpected values exceeding thresholds during autoscaling validation. This has caused persistent test failures since September 18, 2025.
  • issues/134171
  • SIG Node E2E Test Feature Usage Cleanup: There is a need to review and clean up feature usage in SIG Node end-to-end tests to ensure proper categorization, replacement with FeatureGates when necessary, documentation for special environments, and passing features as arguments for test lanes to enable automated validation.
  • issues/134172

2.4 Closed Issues

This section lists, groups, and then summarizes issues that were closed within the last week in the repository. This section also links the associated pull requests if applicable.

Issues Closed This Week: 2

Summarized Issues:

  • Resource Management Failures: This topic covers issues related to resource exhaustion and cleanup failures that impact system stability and test environment preparation. One issue involves the kubelet logging errors about orphaned pod volume paths causing blocked threads and high resource usage, while another describes a failing test due to inability to acquire Google Cloud project resources from Boskos after a problematic update. Both highlight challenges in managing and cleaning up resources effectively to maintain system health and test reliability.
  • [issues/134075, issues/134115]

2.5 Issue Discussion Insights

This section analyzes the tone and sentiment of discussions within this project's open and closed issues that occurred within the past week. It aims to identify potentially heated exchanges and to maintain a constructive project environment.

Based on our analysis, there are no instances of toxic discussions in the project's open or closed issues from the past week.


III. Pull Requests

3.1 Open Pull Requests

This section provides a summary of pull requests that were opened in the repository over the past week. The top three pull requests with the highest number of commits are highlighted as 'key' pull requests. Other pull requests are grouped based on similar characteristics for easier analysis. Up to 25 pull requests are displayed in this section, while any remaining pull requests beyond this limit are omitted for brevity.

Pull Requests Opened This Week: 60

Key Open Pull Requests

1. WIP: DRA: device taints: new ResourceSlice API, new features: This pull request implements new features and an updated ResourceSlice API for device taints in the Dynamic Resource Allocation (DRA) scheduler, including changes to feature gate handling, allocator selection, and test coverage, as described in the related Kubernetes enhancement proposal.

  • URL: pull/134152
  • Merged: No
  • Associated Commits: 3872f, a52cc, f3d5e, 32dd1, 55739, f6dac, 472f6, bc37a, f1006, 4bd34

2. [WIP] fix unit test for watchlist: This pull request is a work-in-progress aimed at fixing unit tests related to the watchlist functionality in the Kubernetes client-go tools by defining new synthetic methods and addressing unsupported watchlist semantics.

  • URL: pull/134180
  • Merged: No
  • Associated Commits: 684a4, 6b249, 328bd, 3d8ad, afe7b, c4549, cb466, 30e85, 801a6, fb474

3. Fix/remove duplicate workflow: This pull request addresses the removal of duplicate CodeQL workflows in the Kubernetes repository by consolidating and optimizing the security scanning workflow to improve scan coverage, reduce runtime through caching, and ensure accurate analysis of the Go codebase with a manual build step, while eliminating previous workflow failures and maintaining flexibility for future enhancements.

  • URL: pull/134123
  • Merged: No
  • Associated Commits: 9d9a2, bb6d0, 65ed8, 811bd, 1dd19, 7ceb9, 8e77a, 7eae2

Other Open Pull Requests

  • Feature Gate Updates: Several pull requests focus on updating and graduating feature gates within Kubernetes. These include graduating the "Pop from backoffQ" feature to GA by locking the feature flag enabled and proposing the removal of the deprecated RootlessControlPlane feature gate following the GA of UserNamespacesSupport.
    [pull/134070, pull/134178]
  • Dynamic Resource Allocation (DRA) Enhancements: Multiple pull requests address improvements and bug fixes related to Dynamic Resource Allocation. They fix a bug allowing allocation of the same device with multiple enabled features, add scoring for extended resources backed by DRA in the scheduler, and support virtualized PCI device topologies for correct device attribute resolution.
    [pull/134103, pull/134058, pull/134074]
  • Code Quality and Refactoring: Several pull requests improve code quality by addressing linting issues, replacing deprecated functions, and cleaning up unused code. These include replacing string comparisons with strings.EqualFold, removing deprecated WaitForServiceEndpointsNum calls, and cleaning up unused functions in autoscaling.
    [pull/134096, pull/134176]
  • Testing and Validation Improvements: Some pull requests enhance testing and validation processes. These include updating ConfigMap update tests to include conformance labels, adding declarative validation for ResourceClaim status fields, and fixing race conditions by waiting for quota usage before PVC creation.
    [pull/134062, pull/134113, pull/134087]
  • Kubelet and Controller Enhancements: Pull requests improve kubelet and controller components by adding asynchronous node status updates, improving tracing with error recording, and replacing reflector run calls to support contextual logging.
    [pull/134175, pull/134060, pull/134163]
  • System Validators Pinning: Multiple pull requests pin the system-validators component to specific versions across different release branches to address false failures caused by version-specific cgroup kernel configuration checks on cgroup v2 systems.
    [pull/134084, pull/134086, pull/134088]
  • Horizontal Pod Autoscaler (HPA) Improvements: Pull requests improve HPA by adding a new metric to track the number of HPAs and replacing error handling functions to improve consistency.
    [pull/134140, pull/134149]
  • Miscellaneous Bug Fixes and Improvements: Other pull requests fix concurrency bugs in volume management, update image pull policy behavior to prevent unnecessary pulls, and correct kube-proxy nftables mode interface name matching.
    [pull/134076, pull/134090, pull/134092]
  • Code Generation Automation: One pull request enhances the sample-controller by adding code generation tools to automate registration and OpenAPI spec file creation, improving maintainability.
    [pull/134081]
  • Feature Gate State Management: A pull request extends the feature gate system by adding Snapshot and Restore methods to support test isolation and deterministic behavior.
    [pull/134149]
  • Memory Testing Development: A work-in-progress pull request focuses on developing and fixing a test case related to memory testing within Kubernetes.
    [pull/134064]
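
The `strings.EqualFold` change mentioned under "Code Quality and Refactoring" above can be sketched as follows. The digest does not say which comparisons were replaced, so this example assumes the common case of comparing two lowercased copies; `sameNameLowered` and `sameName` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"strings"
)

// sameNameLowered compares two names case-insensitively by lowercasing both,
// which may allocate two temporary strings.
func sameNameLowered(a, b string) bool {
	return strings.ToLower(a) == strings.ToLower(b)
}

// sameName does the same comparison with strings.EqualFold, which compares
// under Unicode case folding without building lowercased copies.
func sameName(a, b string) bool {
	return strings.EqualFold(a, b)
}

func main() {
	fmt.Println(sameNameLowered("GPU.Example.com", "gpu.example.com")) // prints "true"
	fmt.Println(sameName("GPU.Example.com", "gpu.example.com"))        // prints "true"
}
```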

3.2 Closed Pull Requests

This section provides a summary of pull requests that were closed in the repository over the past week. The top three pull requests with the highest number of commits are highlighted as 'key' pull requests. Other pull requests are grouped based on similar characteristics for easier analysis. Up to 25 pull requests are displayed in this section, while any remaining pull requests beyond this limit are omitted for brevity.

Pull Requests Closed This Week: 37

Key Closed Pull Requests

1. Stream test refactor: This pull request refactors the streaming unit tests in the staging/src/k8s.io/apiserver/pkg/util/proxy package by centralizing the test server setup and consolidating helper functions to improve test stability, reduce code duplication, and adopt a more robust testing pattern, while maintaining the same test coverage and verifying stability through extensive stress and race condition testing.

  • URL: pull/134056
  • Merged: No
  • Associated Commits: 04266, 4828c, b1b19, bb8e2

2. Enable Declarative Validation for resource.k8s.io v1/v1beta1/v1beta2: This pull request enables declarative validation generation for the resource.k8s.io API group versions v1, v1beta1, and v1beta2 by adding +k8s:validation-gen tags and updating the ResourceClaim strategy to use the generated validation code, along with introducing declarative validation testing while not migrating any existing validation rules in this change.

  • URL: pull/134072
  • Merged: Yes
  • Associated Commits: 380c4, 7c45b, c0fcb, eca1c

3. Fix issue 133185: This pull request addresses Kubernetes issue #133185 by adding the missing /livez and /readyz health endpoints to kube-proxy's HTTP server, enhancing the /statusz endpoint to automatically display all available paths including /healthz, /livez, /readyz, and /metrics, and introducing test coverage to verify these changes.

  • URL: pull/134059
  • Merged: No
  • Associated Commits: 60113, 8c57d, 0e692

Other Closed Pull Requests

  • Go and Dependency Updates: Several pull requests focus on upgrading the Go version and managing dependencies in the Kubernetes project. These include bumping to Go 1.25.1 and 1.25, updating system-validators dependencies, pinning specific dependencies to stable versions, and adding dependency files to approvers lists to ensure proper management.
    • pull/134095, pull/134068, pull/134121, pull/134067, pull/134079, pull/134094, pull/134120
  • Validation and Declarative Validation Enhancements: Multiple pull requests improve validation mechanisms, including adding declarative validation support for DeviceClass API, introducing a new case-insensitive validation format, skipping redundant validation in Dynamic Resource Allocation, and adding reviewers for validation testing. These changes enhance validation efficiency and coverage in Kubernetes APIs.
    • pull/134078, pull/134085, pull/134089, pull/134093
  • Context and Cancellation Improvements: Several pull requests refactor components to use context.Context for cancellation and logging, replacing legacy stop channel patterns. This includes changes in the apiextensions-apiserver controller and cloud-specific node lifecycle and IPAM controllers, improving shutdown logic and contextual logging support.
    • pull/134061, pull/134097
  • Kubeadm Feature and Phase Updates: One pull request graduates the ControlPlaneKubeletLocalMode feature gate to GA, locks it enabled by default with opt-out, and modifies control-plane-join phases by deprecating and replacing subphases, making some steps permanent. These changes stabilize and improve kubeadm join workflows.
    • pull/134106
  • Storage and Key Schema Consistency: Pull requests unify directory protection for recursive storage requests and ensure consistent key schema requirements between cacher and etcd3 components. These changes improve security, reliability, and maintainability of storage-related code.
    • pull/134065, pull/134067
  • Kube-proxy nftables Fixes: Two automated cherry picks fix the kube-proxy nftables mode by correcting the use of iifname metadata for input interface name matches, resolving issues with local source traffic identification on nodes.
    • pull/134114, pull/134117
  • Quota and PVC Race Condition Fix: One pull request addresses race conditions between persistent volume claim creation and quota reporting by ensuring quota usage is reported before PVC creation, preventing related failures.
    • pull/134071
  • Code Refactoring and Cleanup: Several pull requests focus on code organization improvements, including extracting functions for maintainability, and proposing but not merging changes related to resourcePrefix handling and kubelet panic simulation.
    • pull/134104, pull/134109, pull/134110, pull/134111
  • Testing and Experimental Work: One pull request is a work-in-progress testing new kube-cross and go-runner images, while another is an unmerged initial attempt to add tests, indicating ongoing experimentation and test development efforts.
    • pull/134073, pull/134069

3.3 Pull Request Discussion Insights

This section analyzes the tone and sentiment of discussions within this project's open and closed pull requests that occurred within the past week. It aims to identify potentially heated exchanges and to maintain a constructive project environment.

Based on our analysis, there are no instances of toxic discussions in the project's open or closed pull requests from the past week.


IV. Contributors

4.1 Contributors

Active Contributors:

We consider an active contributor in this project to be any contributor who has made at least 1 commit, opened at least 1 issue, created at least 1 pull request, or made more than 2 comments in the last month.

If there are more than 10 active contributors, the list is truncated to the top 10 based on contribution metrics for better clarity.

Contributor   Commits   Pull Requests   Issues   Comments
pohly         44        10              6        64
BenTheElder   9         3               1        62
liggitt       5         1               1        58
pacoxu        11        7               1        39
dims          16        7               7        27
serathius     18        9               1        25
huww98        23        5               0        20
bart0sh       10        7               1        20
p0lyn0mial    35        2               0        0
jpbetz        20        2               0        13
