LWKD: Week Ending August 25, 2024
Week: 2024-08-25
Developer News
KubeCon + CloudNativeCon + Open Source Summit China 2024 happened last week in Hong Kong. The event featured talks on AI, running AI workloads on Kubernetes and the CNCF ecosystem, and updates from maintainers of several CNCF projects. There was also a keynote by Linus Torvalds. Videos will be posted on the CNCF YouTube channel soon.
Release Schedule
Next Deadline: 1.32 cycle begins, September 9
We're in the period between releases. Shadow applications for the v1.32 release team are open until September 6. The tentative dates for the v1.32 cycle are from September 9th to December 11th, 2024.
Featured PRs
#126745: Improve PVC protection controller's scalability by batch-processing PVCs by namespace & caching live pod list results [fixed dead loop issue with idle work queue]
This PR improves the scalability of the PVC protection controller by batch-processing PVCs by namespace and caching the results of live pod lists. This reduces the number of API calls required for PVC deletion, which lowers kube-controller-manager CPU usage in large clusters, especially those with high pod and PVC churn. It also fixes a dead loop issue with an idle work queue.
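The batching idea can be illustrated with a minimal sketch. This is not the controller's actual code: `Pod`, `PVCKey`, `PodLister`, and `inUseByNamespace` are hypothetical names, and the lister function only stands in for a live pod LIST call. The sketch groups pending PVCs by namespace and reuses one pod list per namespace, rather than listing pods once per PVC.

```go
package main

import "fmt"

// Pod and PVCKey are simplified, hypothetical stand-ins for the real API objects.
type Pod struct {
	Namespace string
	Name      string
	Claims    []string // names of the PVCs this pod references
}

type PVCKey struct {
	Namespace string
	Name      string
}

// PodLister stands in for a live LIST of pods in one namespace.
type PodLister func(namespace string) []Pod

// inUseByNamespace groups the given PVCs by namespace and issues one pod
// LIST per namespace, reusing that result for every PVC in the batch.
func inUseByNamespace(pvcs []PVCKey, list PodLister) map[PVCKey]bool {
	byNS := map[string][]PVCKey{}
	for _, pvc := range pvcs {
		byNS[pvc.Namespace] = append(byNS[pvc.Namespace], pvc)
	}

	result := make(map[PVCKey]bool, len(pvcs))
	for ns, batch := range byNS {
		// Build the set of claim names referenced by any pod in this namespace.
		used := map[string]bool{}
		for _, pod := range list(ns) {
			for _, claim := range pod.Claims {
				used[claim] = true
			}
		}
		for _, pvc := range batch {
			result[pvc] = used[pvc.Name]
		}
	}
	return result
}

func main() {
	calls := 0
	lister := func(ns string) []Pod {
		calls++ // count how many LIST calls were needed
		if ns == "prod" {
			return []Pod{{Namespace: "prod", Name: "db-0", Claims: []string{"data-db-0"}}}
		}
		return nil
	}

	pvcs := []PVCKey{{"prod", "data-db-0"}, {"prod", "data-db-1"}, {"dev", "scratch"}}
	inUse := inUseByNamespace(pvcs, lister)
	fmt.Println(inUse[PVCKey{"prod", "data-db-0"}], inUse[PVCKey{"prod", "data-db-1"}], calls)
}
```

With three PVCs spread over two namespaces, the sketch issues two list calls instead of three; the saving grows with the number of PVCs per namespace.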
KEP of the Week
KEP 3998: Job success/completion policy
This KEP enhances Indexed Jobs by allowing custom success criteria, so a job can be marked as succeeded based on specific pod indexes, such as leader pods, rather than requiring all pods to succeed. It supports distributed computing frameworks like MPI and PyTorch, where only certain pods determine job success. In its first iteration, the proposal does not alter the default behavior for jobs without a SuccessPolicy, and does not extend this feature to NonIndexed Jobs.
This KEP is tracked for beta release in v1.31.
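As a rough illustration of the success criteria described above, the following sketch evaluates a simplified policy against the set of succeeded indexes. It is not the Job controller's implementation: `Rule`, `ruleMet`, and `jobSucceeded` are hypothetical names, and the rule shape (a list of indexes plus an optional count) is a simplification assumed for this example. Jobs without any rule fall back to the default behavior, which this sketch does not model.

```go
package main

import "fmt"

// Rule is a simplified, hypothetical stand-in for one success policy rule.
// Indexes restricts the rule to certain completion indexes (empty means any index).
// Count is how many of those indexes must succeed (0 means all listed indexes).
type Rule struct {
	Indexes []int
	Count   int
}

// ruleMet reports whether a single rule is satisfied by the succeeded indexes.
func ruleMet(r Rule, succeeded map[int]bool) bool {
	n := 0
	if len(r.Indexes) == 0 {
		// No index restriction: count every succeeded index.
		for _, ok := range succeeded {
			if ok {
				n++
			}
		}
		return r.Count > 0 && n >= r.Count
	}
	for _, idx := range r.Indexes {
		if succeeded[idx] {
			n++
		}
	}
	need := r.Count
	if need == 0 {
		need = len(r.Indexes)
	}
	return n >= need
}

// jobSucceeded returns true as soon as any rule is met. With no rules it
// returns false; the default "all pods succeed" path is out of scope here.
func jobSucceeded(rules []Rule, succeeded map[int]bool) bool {
	for _, r := range rules {
		if ruleMet(r, succeeded) {
			return true
		}
	}
	return false
}

func main() {
	// Hypothetical leader/worker job: only index 0 (the leader) decides success.
	leaderOnly := []Rule{{Indexes: []int{0}}}

	fmt.Println(jobSucceeded(leaderOnly, map[int]bool{0: true}))          // leader done, workers still running
	fmt.Println(jobSucceeded(leaderOnly, map[int]bool{1: true, 2: true})) // only workers done
}
```

In the leader-only case, the job counts as succeeded once index 0 finishes, regardless of the worker indexes.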
Other Merges
- kubeadm now sorts the result of MergeKubeadmEnvVars, and allows mixing of flags `--print-manifest` and `--config`
- Printer unit tests added for DRA resources
- transformation_operations_total metric gets additional resource label
- pkg/kubelet/cm/dra migrated to contextual logging
- Fix for estimated cost for Kubernetes defined CEL types for equals
- Common apiserver for all testcases in CEL tests
- kube-scheduler removes non-csi volumelimit plugins
- Scheduling throughput thresholds set in scheduler_perf tests
- Fix to DRA with structured params to make unschedulable pods schedulable again after ResourceSlice cluster events
- kube-proxy now uses field-selector clusterIP!=None on Services to avoid watching for Headless Services
- NominatedPodsForNode moved to scheduling queue to make the invocations more direct
- Events cached in the scheduling queue are cleared as soon as possible when SchedulerQueueingHints is enabled, so that the scheduler consumes less memory
- New e2e tests for Node endpoints
Deprecated
- Graduated feature gates being removed: ValidatingAdmissionPolicy, StableLoadBalancerNodeSet, CloudDualStackNodeIPs, LegacyServiceAccountTokenCleanUp
- kubeadm removes the deprecated flag '--experimental-output'
- kubeadm removes the deprecated sub-phase of 'init kubelet-finalize' called 'experimental-cert-rotation'
Version Updates
- corefile-migration to v1.0.24
Subprojects and Dependency Updates
- prometheus v2.54.1: allow multiple samples on same series, with explicit timestamps
- containerd v1.7.21: regenerate introspection UUID if state is empty
- grpc v1.66.1: enable EDS dualstack support by default; also v1.66.0