How Kubernetes 1.33 Enables In‑Place Pod Resizing Without Restarts
Kubernetes 1.33 introduces in‑place vertical pod resizing, which lets administrators adjust CPU and memory on running containers without restarting pods. This reduces downtime for stateful workloads, improves cost efficiency, and paves the way for VPA integration. This article covers the implementation details, supported runtimes, limitations, and a practical demo.
Feature Overview
Kubernetes 1.33 adds a beta feature called Pod In‑Place Vertical Scaling that lets administrators change a running container's CPU and memory requests and limits without restarting the pod.
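A quick way to confirm that a cluster exposes the feature is to check for the new field in the API schema; a minimal check, assuming kubectl v1.33+ pointed at the target cluster:

# Prints the schema for per-resource resize policies; the field is present
# when the cluster supports in-place resizing (beta, on by default in 1.33).
kubectl explain pod.spec.containers.resizePolicy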
Value and Use Cases
This capability eliminates the need to over‑provision resources or trigger a pod restart via the traditional Vertical Pod Autoscaler (VPA), which can cause service interruptions. It is especially valuable for stateful applications, databases, and services that require high availability.
Eliminate pod restart risk: Adjust resources without terminating the pod.
Optimize cost structure: Reduce the need for preventive over‑provisioning.
Improve availability of stateful workloads: Databases can receive more memory or CPU without downtime.
Practical Scenarios
Typical workloads that benefit include:
Database workloads: Increase memory for large analytical queries without breaking existing connections.
Node.js API services: Scale CPU and memory during traffic spikes without a restart.
Machine‑learning inference services: Allocate extra resources for larger batches or more complex models.
Service‑mesh sidecars: Adjust Envoy resources dynamically.
JVM‑based applications: Note that increasing memory limits does not automatically raise the JVM heap; additional configuration and a restart are required (a policy sketch follows this list).
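For JVM workloads, one workable pattern is to let CPU resize in place while requesting a container restart on memory changes, so the heap is sized against the new limit. A minimal sketch of that per‑resource policy; the container name and image are placeholders:

# Hypothetical container fragment: CPU resizes in place, memory changes
# restart the container so the JVM re-reads its memory budget at startup.
containers:
- name: jvm-app              # placeholder
  image: eclipse-temurin:21  # placeholder
  resizePolicy:
  - resourceName: cpu
    restartPolicy: NotRequired
  - resourceName: memory
    restartPolicy: RestartContainer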
Technical Implementation
The feature relies on several Kubernetes components:
Mutable resource fields: KEP‑1287 makes resources.requests and resources.limits mutable at runtime.
Kubelet resource validation: The kubelet checks node capacity against the new request before applying the change, setting the PodResizePending condition if capacity is insufficient.
CRI interaction: The kubelet instructs the container runtime (containerd or CRI‑O) to update cgroup settings asynchronously.
Status tracking: New pod conditions PodResizePending and PodResizeInProgress report progress (an illustrative status fragment follows this list).
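To make the status tracking concrete, a pod mid‑resize might report something like the fragment below; the exact values are illustrative:

# Illustrative pod status during a resize (values are examples).
status:
  conditions:
  - type: PodResizeInProgress
    status: "True"
  containerStatuses:
  - name: resource-watcher
    resources:          # the resources actually applied by the runtime
      limits:
        cpu: 100m
        memory: 128Mi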
Container Runtime Compatibility
containerd (v1.6+): Full support.
CRI‑O (v1.24+): Full support.
Docker Engine: Limited support; the built‑in dockershim integration was removed from Kubernetes in 1.24.
cgroup v2 provides better memory‑reduction handling compared with cgroup v1.
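To determine which cgroup version a node runs, inspect the filesystem type mounted at the cgroup root:

# On the node: prints "cgroup2fs" for cgroup v2, "tmpfs" for cgroup v1.
stat -fc %T /sys/fs/cgroup/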
Cloud Provider Support
Google Kubernetes Engine (GKE): Feature available in the rapid channel.
Amazon EKS: Planned for the 1.33 release (May 2025).
Azure AKS: Preview version of 1.33 includes the feature.
Self‑managed clusters: Supported on any 1.33+ cluster using containerd or CRI‑O.
Limitations and Considerations
Key constraints to be aware of:
Only Linux nodes are supported.
Pods using the static CPU Manager policy cannot be resized.
Only CPU and memory can be adjusted; GPU and local storage are not yet supported.
QoS class (Guaranteed, Burstable, BestEffort) cannot be changed by resizing.
Reducing memory limits may trigger the OOM killer; setting the memory resizePolicy to restartPolicy: RestartContainer is recommended.
Init containers and ephemeral containers do not support in‑place resizing.
Resize policy must be set at pod creation and cannot be modified later.
JVM‑based applications need explicit heap reconfiguration and a restart to use increased memory.
The node must have sufficient free capacity; otherwise the pod enters the PodResizePending state (a quick check is shown after this list).
Adjustment latency can reach several seconds because the kubelet processes resizes asynchronously.
The scheduler does not consider ongoing resize operations, which may lead to unexpected resource pressure.
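When a resize stalls, the reason is surfaced on the pod itself; a quick way to inspect it (the pod name is illustrative):

# Show any pending resize condition and its message.
kubectl get pod resize-demo -o jsonpath='{.status.conditions[?(@.type=="PodResizePending")]}'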
VPA Integration Outlook
Current VPA (Vertical Pod Autoscaler) still recreates pods for resource changes. Ongoing work (KEP‑4951 and autoscaler PR 7673) aims to integrate VPA with in‑place resizing, allowing VPA to attempt a resize first and fall back to pod recreation only when necessary.
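Once that work lands, a VPA object could opt into the behavior through its update mode. A speculative sketch, assuming the InPlaceOrRecreate mode proposed in KEP‑4951 (the name may change before release):

# Speculative: the updateMode value comes from KEP-4951 and is not yet GA.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: demo-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: demo-app       # placeholder target
  updatePolicy:
    updateMode: "InPlaceOrRecreate"   # try in-place first, recreate as fallback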
Future Directions
Full VPA integration with automatic in‑place scaling.
Support for additional resource types such as GPU and temporary storage.
Scheduler awareness of resize operations to protect resources.
Coordination with Cluster Autoscaler for smarter node‑level scaling.
Metric‑driven resizing based on application‑level indicators (latency, queue depth).
Demo Walkthrough
The demo below creates a monitoring pod, checks its initial resources, doubles the CPU and memory limits via kubectl patch pod --subresource resize, verifies the changes through kubectl describe and the pod logs, confirms that the restart count remains zero, and finally cleans up the test pod.
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: resize-demo
spec:
  containers:
  - name: resource-watcher
    image: ubuntu:22.04
    command: ["/bin/bash", "-c", "..."]
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired
    - resourceName: memory
      restartPolicy: NotRequired
    resources:
      requests:
        memory: "128Mi"
        cpu: "100m"
      limits:
        memory: "128Mi"
        cpu: "100m"
EOF

Subsequent kubectl patch commands adjust the resources, and

kubectl get pod resize-demo -o jsonpath='{.status.containerStatuses[0].restartCount}'

confirms a restart count of 0.
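For reference, the doubling step can be expressed as a single patch against the resize subresource; the values below simply double the original requests and limits:

kubectl patch pod resize-demo --subresource resize --patch \
  '{"spec":{"containers":[{"name":"resource-watcher","resources":{"requests":{"cpu":"200m","memory":"256Mi"},"limits":{"cpu":"200m","memory":"256Mi"}}}]}}'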
Overall, in‑place pod resizing in Kubernetes 1.33 provides a powerful, low‑downtime method for vertical scaling, but administrators should be mindful of the current platform, runtime, and workload constraints before adopting it in production.