Weekly GitHub Report for Kubernetes: March 31, 2025 - April 07, 2025 (12:03:33)
Weekly GitHub Report for Kubernetes
Thank you for subscribing to our weekly newsletter! Each week, we deliver a comprehensive summary of your GitHub project's latest activity right to your inbox, including an overview of your project's issues, pull requests, contributors, and commit activity.
I. News
1.1 Recent Version Releases:
The current version of this repository is v1.32.3
1.2 Version Information:
The version release on March 11, 2025, introduces key updates and changes to Kubernetes, as detailed in the linked changelog, with additional binary downloads available for users. Notable highlights or trends from this release can be found in the Kubernetes announcement forum and the comprehensive changelog documentation.
II. Issues
2.1 Top 5 Active Issues:
We consider active issues to be issues that have been commented on most frequently within the last week. Bot comments are omitted.
- DaemonSets should be scheduled before Deployments on new Nodes: This issue highlights a problem in Kubernetes where DaemonSets are not being scheduled before Deployments on new nodes, leading to resource allocation issues. The user reports that when a single node is started after a period of no nodes being available, the kube-scheduler may prioritize Deployments, leaving insufficient resources for DaemonSets, which requires manual intervention to resolve.
  - The comments discuss the issue's origin and involve multiple assignment changes, with a suggestion to use Kubernetes' guaranteed scheduling for critical add-on pods. A reference is made to a related issue, and a discussion from a maintainer summit is mentioned, highlighting the importance of scheduling DaemonSets before other workloads for node readiness and application functionality.
  - Number of comments this week: 10
- Kubelet can host "phantom" pod upon etcd restore.: This issue describes a problem where the Kubelet can host a "phantom" pod after an etcd restore, causing resource allocation issues and preventing newly scheduled pods from running. The expected behavior is for the Kubelet to issue a warning and remove the "phantom" pod by reconnecting to the apiserver and treating it as an orphaned pod.
  - The comments discuss potential causes and solutions, including a suggestion that the Kubelet might not be receiving the signal to delete the pod due to an issue with the watch connection. A user reports not seeing expected log entries and updates the repro script for better logging. Another suggestion is made to use `--bump-revision` during etcd restore to see if it helps the Kubelet refresh the watch connection immediately.
  - Number of comments this week: 4
- [Failing test]LoadBalancers ExternalTrafficPolicy issues: This issue reports failing tests related to the LoadBalancers ExternalTrafficPolicy in the Kubernetes project, specifically affecting the master-informing job and gce-master-scale-correctness tests. The failures are due to timeouts when waiting for services to have a load balancer, as indicated by the error messages in the logs.
  - The comments discuss the scalability aspect of the issue and reference a potentially related existing issue. There is a suggestion that the problem might already be tracked elsewhere, with a specific issue number mentioned for further investigation.
  - Number of comments this week: 4
- Some Network interface names are not allowed by current DRA validation: This issue addresses the problem of certain network interface names being disallowed by the current Dynamic Resource Allocation (DRA) validation in Kubernetes, which forces network implementations to normalize these names, complicating operations that depend on them. The issue suggests relaxing the validation rules to accommodate network interface names, referencing a function that validates such names in the Linux kernel as a potential example.
  - The comments involve reassigning the issue to different contributors, indicating a need to involve specific individuals in resolving the problem.
  - Number of comments this week: 3
- When I run kubectl exec, I get nginx 404 error.: This issue describes a problem encountered after upgrading Kubernetes from version v1.31.6 to v1.31.7, where executing the `kubectl exec` command on a pod results in a 404 error from an nginx server. The user is seeking advice on whether redeploying kube-proxy could resolve the issue, as the error occurs consistently across all pods on a specific node.
  - The comments suggest that the 404 error is likely due to an nginx instance in front of the API, not a bug in Kubernetes, and recommend upgrading the client version to match the server version and restarting kubelet.
  - Number of comments this week: 2
2.2 Top 5 Stale Issues:
We consider stale issues to be issues that have had no activity within the last 30 days. The team should work together to get these issues resolved and closed as soon as possible.
- apimachinery resource.Quantity primitive values should be public for recursive hashing: This issue addresses the need for the primitive values within the `apimachinery` `resource.Quantity` struct to be made public to facilitate recursive hashing by libraries such as `hashstructure`, which is currently hindered by the private nature of these variables. The lack of public access to these values complicates the detection of changes in Custom Resource Definitions (CRDs) for projects like `kubernetes-sigs/karpenter`, which rely on hash comparisons to identify specification drifts, impacting resource allocation and necessitating cumbersome workarounds.
- APF borrowing by exempt does not match KEP: This issue highlights a discrepancy between the Kubernetes Enhancement Proposal (KEP) and its implementation regarding how the exempt priority level borrows concurrency limits from other levels. Specifically, the KEP outlines a distinct formula for calculating the minimum concurrency limit for exempt levels, which is not reflected in the current implementation, leading to potential inconsistencies in resource allocation.
- apimachinery's unstructured converter panics if the destination struct contains private fields: This issue describes a problem with the `DefaultUnstructuredConverter` in the Kubernetes `apimachinery` package, where it panics when attempting to convert a destination struct that contains private fields. The panic occurs because the converter tries to set values on these non-exported fields, which is not allowed in Go, and the user expects the converter to ignore such private fields to prevent the panic.
- Jsonpath impl does not support left match regex: This issue highlights a request for the addition of support for the `=~` operator in jsonpath filter expressions within a GitHub project, specifically to enable matching using Golang regular expressions. The feature is needed to simplify the process of locating desired resources among many by allowing users to perform regex-based searches, such as matching items whose descriptions start with a specific pattern, and the requester has expressed willingness to contribute to the implementation.

Since there were fewer than 5 open issues, all of the open issues have been listed above.
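The unstructured converter issue above comes down to a general Go rule: reflection cannot set unexported struct fields, and attempting to do so panics. The sketch below is not the `apimachinery` code. It uses a hypothetical `target` type and a hypothetical `setStringFields` helper, with only the standard `reflect` package, to show how a converter could check `CanSet` and skip such fields.

```go
package main

import (
	"fmt"
	"reflect"
)

// target is a hypothetical destination type with one exported and one
// unexported field.
type target struct {
	Name   string
	secret string
}

// setStringFields copies values from src into the string fields of the struct
// that dst points to, matching on the Go field name. Fields that reflection
// cannot set (unexported ones) are skipped and reported instead of causing a
// panic in SetString.
func setStringFields(dst any, src map[string]string) (skipped []string) {
	v := reflect.ValueOf(dst).Elem()
	t := v.Type()
	for i := 0; i < v.NumField(); i++ {
		field := v.Field(i)
		name := t.Field(i).Name
		if !field.CanSet() {
			skipped = append(skipped, name)
			continue
		}
		if field.Kind() != reflect.String {
			continue
		}
		if val, ok := src[name]; ok {
			field.SetString(val)
		}
	}
	return skipped
}

func main() {
	var out target
	skipped := setStringFields(&out, map[string]string{"Name": "web", "secret": "x"})
	fmt.Println(out.Name, skipped)
}
```

Whether silently skipping such fields is the right behavior for the real converter is a design question for the issue itself; the sketch only shows that the panic is avoidable.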
2.3 Open Issues
This section lists, groups, and then summarizes issues that were created within the last week in the repository.
Issues Opened This Week: 20
Summarized Issues:
- Kubelet Phantom Pod Issue: The Kubelet can host a "phantom" pod after an etcd restore to a previous snapshot, causing resource allocation issues. This prevents newly scheduled pods from running, as the Kubelet does not receive the signal to delete the pod and continues to treat it as if it exists.
- Replicaset Controller Rapid Pod Creation: The Kubernetes replicaset controller rapidly creates new pods in a loop when the kubelet marks pods as failed due to being out-of-sync. This leads to a high number of pods with an "OutOfcpu" status, suggesting the need for a backoff mechanism.
- Kubernetes Upgrade and 404 Error: After upgrading Kubernetes from version v1.31.6 to v1.31.7, executing the `kubectl exec` command on a pod results in a 404 error from an nginx server. This suggests a possible misconfiguration or an unintended nginx proxy in front of the Kubernetes API server.
- Kube-proxy IPVS Mode Syncing Issue: A potential flaw in the `syncProxyRules` function of kube-proxy's IPVS mode may lead to unnecessary or incorrect updates during the initial sync phase. This is due to not verifying if the existing destination matches the IP and port of the new destination.
- Kubernetes TopologySpreadConstraints Enhancement: A proposal to add a "maxReplicas" feature to Kubernetes' TopologySpreadConstraints aims to enhance workload distribution control. This would mitigate risks associated with node loss and ensure better resource management in large multi-tenant clusters.
- Containerd Eviction Test Flakiness: The flakiness of containerd eviction tests, specifically `PriorityPidEvictionOrdering` and `MemoryAllocatableEviction`, started occurring after the update to containerd 2.x. This update might be the cause, as it has already led to one test failure.
- Kubernetes Deployment Stuck After Node Shutdown: A Kubernetes deployment in the d8-system namespace becomes stuck for about 10 minutes after a graceful node shutdown and server restart. This is due to the deployment controller still considering terminated or error state pods from a previous replicaset as active.
- Kubernetes and Containerd Initialization Error: An error occurs during the initialization process with kubeadm in Kubernetes version 1.30.11 and containerd version 1.6.28. This is due to an "Unimplemented" RPC error indicating that the runtime service is unknown, suggesting that the container runtime is not running properly.
- Kubernetes Disruption Probe Proposal: A new "disruption probe" is proposed in Kubernetes to differentiate between when an application is ready to serve traffic and when it is safe to disrupt. This addresses the need for a mechanism that allows pods to be routable without being disruptable.
- Kubernetes OOM Score Adjustment Issue: The `oom_score_adj` calculation for Burstable pods does not consider `PriorityClass`, leading to critical system pods being prematurely terminated by the kernel OOM killer under memory pressure. This is contrary to the intended protection for higher-priority pods like `system-cluster-critical`.
- DaemonSets Scheduling Priority Issue: DaemonSets are not being prioritized over Deployments when scheduling on new nodes in Kubernetes, leading to resource allocation issues. This requires manual intervention to resolve, suggesting that resources for DaemonSets should be reserved before scheduling other workloads.
- Server-Side Apply No-op Calls Issue: No-op Server-Side Apply (SSA) calls on Kubernetes ClusterRoleBindings result in unnecessary updates to the `resourceVersion` and `metadata.managedFields[].time`. This causes infinite reconciliation loops in a controller-runtime based controller, despite the object remaining unchanged.
- LoadBalancers ExternalTrafficPolicy Test Failures: Failing tests related to the LoadBalancers ExternalTrafficPolicy in Kubernetes are causing specific e2e tests to time out. This has affected jobs like "master-informing" and "gce-master-scale-correctness" since late March 2025.
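To make the OOM score item above concrete, here is a minimal sketch of how a Burstable container's `oom_score_adj` can be derived from its memory request alone. It is a simplified model, not the kubelet source: the function name `burstableOOMScoreAdj` is hypothetical, and the constants (-997 for Guaranteed, 1000 for BestEffort) and the clamping rules are assumptions about the kubelet's QoS policy. The point is that pod priority is not an input to the arithmetic.

```go
package main

import "fmt"

// Assumed values modeled on the kubelet's QoS policy; treat them as
// illustrative rather than authoritative.
const (
	guaranteedOOMScoreAdj = -997
	bestEffortOOMScoreAdj = 1000
)

// burstableOOMScoreAdj is a hypothetical helper that scores a Burstable
// container purely by the share of node memory it requests. A larger request
// yields a lower score, which makes the kernel less likely to kill it.
// Nothing in this calculation looks at the pod's PriorityClass.
func burstableOOMScoreAdj(memoryRequest, memoryCapacity int64) int {
	adj := 1000 - (1000*memoryRequest)/memoryCapacity
	// Keep Burstable scores strictly above the Guaranteed band.
	if adj < 1000+guaranteedOOMScoreAdj {
		return 1000 + guaranteedOOMScoreAdj
	}
	// Keep Burstable scores strictly below the BestEffort score.
	if adj == bestEffortOOMScoreAdj {
		return int(adj - 1)
	}
	return int(adj)
}

func main() {
	const gib = int64(1) << 30
	fmt.Println(burstableOOMScoreAdj(1*gib, 4*gib)) // quarter of node memory requested
	fmt.Println(burstableOOMScoreAdj(0, 4*gib))     // no memory request
	fmt.Println(burstableOOMScoreAdj(4*gib, 4*gib)) // requests the whole node
}
```

Under these assumptions, a high-priority Burstable pod with a small memory request ends up with a score close to that of a BestEffort pod, which is the behavior the issue describes.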
2.4 Closed Issues
This section lists, groups, and then summarizes issues that were closed within the last week in the repository. This section also links the associated pull requests if applicable.
Issues Closed This Week: 9
Summarized Issues:
- InPlacePodVerticalScaling Flaking Tests: The Kubernetes project faces issues with the "InPlacePodVerticalScaling" feature, where tests intermittently fail due to container initialization errors and mismatched cgroup values. These problems lead to unexpected container restarts and failures in verifying resource limits, prompting discussions on improving error handling.
- Certificate and TLS Verification Errors: After updating the CA certificate using kubeadm, new nodes cannot join the Kubernetes cluster due to a TLS verification error. This "x509: certificate signed by unknown authority" error suggests a misconfiguration in the certificate update process.
- Ingress-Nginx-Controller Header Issues: Upgrading the ingress-nginx-controller and kube webhook results in a duplicate "Transfer-Encoding: chunked" header, causing a 502 error. This issue is resolved by reverting to a previous version of the ingress controller.
- Kubernetes Version and Etcd Upgrade: Updating Kubernetes version 1.33.0 to include etcd version 3.5.21 addresses an upgrade inconsistency between etcd versions 3.5 and 3.6. This update involves building and publishing the etcd image and updating the kubeadm package to prevent failures during Kubernetes upgrades.
- Cluster API Initialization Failures: The Kubernetes cluster API experiences repeated test timeouts due to the cluster not initializing or having ready replicas. This issue is highlighted by multiple recent failures in periodic end-to-end tests across different release versions.
- Broken Link in README: A broken link in the README file of a GitHub project leads to a 404 error when accessing the installation section. The issue suggests updating the link to '/scripts/install.sh' to resolve the problem.
- Namespace Deletion Context Issue: Deleting a Kubernetes namespace does not automatically switch the context back to the default namespace. This leads to confusion when running `kubectl get pods`, as it continues to search for resources in the deleted namespace, which the reporter perceived as a data inconsistency.
2.5 Issue Discussion Insights
This section will analyze the tone and sentiment of discussions within this project's open and closed issues that occurred within the past week. It aims to identify potentially heated exchanges and to maintain a constructive project environment.
Based on our analysis, there are no instances of toxic discussions in the project's open or closed issues from the past week.
III. Pull Requests
3.1 Open Pull Requests
This section provides a summary of pull requests that were opened in the repository over the past week. The top three pull requests with the highest number of commits are highlighted as 'key' pull requests. Other pull requests are grouped based on similar characteristics for easier analysis. Up to 25 pull requests are displayed in this section, while any remaining pull requests beyond this limit are omitted for brevity.
Pull Requests Opened This Week: 32
Key Open Pull Requests
1. Fix goroutine leak: This pull request addresses a goroutine leak in the Kubernetes project by completing the work from a previous pull request, adding necessary tests to verify the fix, and making additional code improvements such as fixing formatting and adding a missing header, as well as updating the vendor to include goleak for better test coverage.
- URL: pull/131170
- Merged: No
2. Optimise: Use StorageClassName field first: This pull request aims to optimize the handling of persistent volume claims in Kubernetes by prioritizing the use of the `storageClassName` field over the deprecated `volume.beta.kubernetes.io/storage-class` annotation, ensuring better compatibility with modern Kubernetes practices.
- URL: pull/131135
- Merged: No
3. remove assert/require lib from scheduler pkg: This pull request aims to clean up the Kubernetes scheduler package by removing the use of the assert and require libraries in favor of using cmp.Equal or cmp.Diff, as recommended by the SIG (Special Interest Group) guidelines, and addresses issue #130407.
- URL: pull/131145
- Merged: No
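The lookup order described in the second key pull request can be sketched as follows. This is an illustration only, not the code from the pull request: the `claim` type and the `storageClassOf` function are hypothetical stand-ins for the real PersistentVolumeClaim API type and helper, and only the annotation key is taken from the summary above.

```go
package main

import "fmt"

// betaStorageClassAnnotation is the deprecated annotation named in the
// pull request summary.
const betaStorageClassAnnotation = "volume.beta.kubernetes.io/storage-class"

// claim is a hypothetical, stripped-down stand-in for a PersistentVolumeClaim.
type claim struct {
	Annotations      map[string]string
	StorageClassName *string
}

// storageClassOf returns the storage class for a claim, checking the
// storageClassName field first and falling back to the deprecated annotation
// only when the field is unset.
func storageClassOf(c claim) string {
	if c.StorageClassName != nil {
		return *c.StorageClassName
	}
	if class, ok := c.Annotations[betaStorageClassAnnotation]; ok {
		return class
	}
	return ""
}

func main() {
	fast := "fast"
	both := claim{
		Annotations:      map[string]string{betaStorageClassAnnotation: "legacy"},
		StorageClassName: &fast,
	}
	annotationOnly := claim{
		Annotations: map[string]string{betaStorageClassAnnotation: "legacy"},
	}
	fmt.Println(storageClassOf(both))           // field wins when both are set
	fmt.Println(storageClassOf(annotationOnly)) // annotation is the fallback
}
```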
Other Open Pull Requests
- Debug Containers and Image Pull Secrets in `kubectl`: This topic covers enhancements to the `kubectl` tool, including a feature that allows users with clusterrole/edit permission to create debug containers and the addition of the `--image-pull-secret-config` option to support private repositories. These changes improve the debugging capabilities by granting necessary permissions and enabling the use of images from private repositories.
- Bug Fixes in Kubernetes Components: Several pull requests address bugs in various Kubernetes components, such as preventing division by zero in the apiserver, correcting endpoint deletion on Windows nodes, and ensuring accurate handling of the `observedGeneration` field in pod resize conditions. These fixes enhance the stability and reliability of Kubernetes by addressing specific issues identified in the codebase.
- Security and Dependency Updates: Updates to dependencies and security fixes are addressed, including the update of the `golang.org/x/net` package to fix CVE-2025-22870 and CVE-2025-22872, and the update of CoreDNS to version 1.12.1. These updates ensure that the Kubernetes project remains secure and benefits from the latest improvements and fixes.
- Documentation and Code Cleanup: Enhancements to documentation and code cleanup efforts are made, such as clarifying the usage of fields in `PodSandboxStatusResponse` and migrating deprecated syscall functions. These efforts improve the clarity and maintainability of the codebase, ensuring compatibility with future updates.
- Automated Cherry Picks for Bug Fixes: Automated cherry picks of a previous fix (#131020) are applied to multiple release branches to address a race condition in the kube-apiserver. These cherry picks ensure that the fix is consistently applied across different versions of Kubernetes.
- Testing and Performance Improvements: Improvements in testing and performance include updating the `kubelet_authz` component to a new test framework and optimizing the DRA scheduler by eliminating repeated conversions. These changes enhance the efficiency and accuracy of testing and scheduling processes in Kubernetes.
- Miscellaneous Updates and Features: Various updates and features are introduced, such as updating the SIG Autoscaling maintainers list, adding an option to the scheduler-perf tool, and simplifying the etcd3 watcher. These changes contribute to the overall improvement and functionality of the Kubernetes project.
3.2 Closed Pull Requests
This section provides a summary of pull requests that were closed in the repository over the past week. The top three pull requests with the highest number of commits are highlighted as 'key' pull requests. Other pull requests are grouped based on similar characteristics for easier analysis. Up to 25 pull requests are displayed in this section, while any remaining pull requests beyond this limit are omitted for brevity.
Pull Requests Closed This Week: 20
Key Closed Pull Requests
1. bump x/net to v0.37.0: This pull request aims to update the golang.org/x/net package to version 0.37.0 as part of a cleanup effort, addressing a related issue in the Kubernetes project and including a fix for narrow spaces of %e in the x/net bump, as referenced in a Google cel-go issue.
- URL: pull/130913
- Merged: No
2. rename DeploymentPodReplacementPolicy FG to DeploymentReplicaSetTerminatingReplicas: This pull request involves renaming the feature gate from "DeploymentPodReplacementPolicy" to "DeploymentReplicaSetTerminatingReplicas" to accommodate a new status field for Deployments and ReplicaSets, allowing for the feature to be split into two separate gates for independent graduation, as discussed in previous GitHub issues and Slack conversations, and includes updates to the changelog for the 1.33 release.
- URL: pull/131088
- Merged: 2025-04-01T08:22:42Z
3. Bump etcd 3.5.21 sdk: This pull request updates the Kubernetes project by bumping the etcd client SDK to version 3.5.21 as part of a cleanup effort, addressing issue #131101, and includes commits that fix narrow spaces in the %e format for x/net, with no user-facing changes introduced.
- URL: pull/131103
- Merged: 2025-04-01T07:10:36Z
Other Closed Pull Requests
- Library Updates: This topic includes pull requests that aim to update various libraries within the Kubernetes project. One pull request attempts to bump the version of the `sigs.k8s.io/json` library to support Go 1.24, while another seeks to update the golang-jwt library to version 4.5.2 to address specific security vulnerabilities. Neither pull request was merged.
- Bug Fixes: Several pull requests focus on addressing critical bugs in the Kubernetes project. These include fixing a race condition in the kube-apiserver, adding protection against division by zero in the ReconcileEndpoints function, and addressing a flake issue related to resourceVersion in list responses. These efforts aim to enhance the reliability and stability of the system.
- Etcd Updates and Cleanup: This topic covers pull requests related to updating and cleaning up the etcd component in the Kubernetes project. One pull request involves building the etcd 3.5.21 image to address an upgrade issue, while another focuses on cleaning up etcd version 3.6.0 in the master branch. These updates partially address issue #131101.
- Code and Test Improvements: Pull requests in this category aim to improve code organization and test efficiency. One pull request focuses on cleaning up code by organizing imports within controller packages, while another aims to parallelize cacher list tests to reduce runtime. These efforts contribute to better code maintainability and faster test execution.
- Documentation Updates: This topic includes pull requests that address documentation updates in the Kubernetes project. One pull request corrects the release notes for version 1.32, while another involves an automated cherry-pick of this change to the release-1.32 branch. These updates ensure accurate and clear information for users regarding upgrade processes.
- README Updates: Two pull requests involve updates to the README.md file in the Kubernetes project. Neither pull request was merged into the main branch, indicating ongoing efforts to improve project documentation.
- Test Optimization: A pull request addresses the termination of the cacher in the TestGetListRecursivePrefix to close a long-running watch connection. This change significantly reduces the test runtime, although the pull request was not merged.
- Error Message Enhancement: A pull request aims to enhance the error message formatting in the `kubectl drain` command. By eliminating redundant prefixes, the pull request improves the clarity and usability of error messages for users.
- Namespace Differentiation for Tests: A pull request proposes using a namespace to differentiate between ephemeral tests in OpenShift clusters. This approach utilizes an existing ClusterRole named `proxy` for kubelet authorization, addressing a specific bug in the testing process.
3.3 Pull Request Discussion Insights
This section will analyze the tone and sentiment of discussions within this project's open and closed pull requests that occurred within the past week. It aims to identify potentially heated exchanges and to maintain a constructive project environment.
Based on our analysis, there are no instances of toxic discussions in the project's open or closed pull requests from the past week.
IV. Contributors
4.1 Contributors
Active Contributors:
We consider an active contributor in this project to be any contributor who has made at least 1 commit, opened at least 1 issue, created at least 1 pull request, or made more than 2 comments in the last month.
If there are more than 10 active contributors, the list is truncated to the top 10 based on contribution metrics for better clarity.
Contributor | Commits | Pull Requests | Issues | Comments |
---|---|---|---|---|
BenTheElder | 18 | 3 | 8 | 89 |
pohly | 32 | 4 | 6 | 38 |
thockin | 40 | 1 | 1 | 14 |
danwinship | 27 | 0 | 1 | 28 |
liggitt | 18 | 7 | 0 | 29 |
serathius | 31 | 2 | 3 | 18 |
bart0sh | 19 | 1 | 1 | 32 |
thisisharrsh | 0 | 0 | 0 | 51 |
aojea | 8 | 2 | 6 | 34 |
dims | 6 | 0 | 6 | 36 |