How to Use Kubernetes 1.35 In‑Place Pod Resize with VPA – A Step‑by‑Step Guide
This tutorial walks you through enabling the In‑Place Pod Resize feature in Kubernetes 1.35, configuring Vertical Pod Autoscaler, deploying a sample NGINX workload, and verifying live resource adjustments without restarting pods, complete with commands, YAML manifests, and best‑practice tips.
Kubernetes 1.35 introduces the GA In‑Place Pod Resize capability, allowing CPU and memory limits of running containers to be changed without pod eviction. Combined with the Vertical Pod Autoscaler (VPA) in the new InPlaceOrRecreate mode, stateful workloads such as MySQL, Redis, and Kafka can scale dynamically without downtime.
Why the update matters
Prior to 1.35, adjusting a pod’s resources required VPA to evict the pod, wait for a new pod to start, and then resume the application—an approach that can break database connections, roll back transactions, and invalidate caches. The new feature keeps the pod UID and container ID unchanged, applies changes instantly via the /resize sub‑resource, and falls back to eviction only when the node lacks sufficient capacity.
Prerequisites
Minikube (latest version) – local Kubernetes cluster
kubectl v1.35+
Git – any version to clone the VPA repository
Allocate at least 4 CPU cores and 4 GiB memory to Minikube so the VPA components can run.
Step‑by‑step implementation
Step 1: Start a Minikube cluster
minikube start --cpus=4 --memory=4096
kubectl cluster-info
Step 2: Enable Metrics Server
minikube addons enable metrics-server
kubectl top nodes
Successful output shows CPU and memory usage for each node.
Step 3: Install VPA components
git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-up.sh
kubectl get pods -n kube-system | grep vpa
You should see three running pods: vpa-recommender, vpa-updater, and vpa-admission-controller.
Step 4: Deploy a deliberately under‑provisioned NGINX demo
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-vpa-demo
  namespace: vpa-demo
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-vpa-demo
  template:
    metadata:
      labels:
        app: nginx-vpa-demo
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        resources:
          requests:
            cpu: "50m"      # intentionally low
            memory: "64Mi"  # intentionally low
The low requests give VPA room to recommend higher values.
Step 5: Create a VPA in recommendation‑only mode
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-vpa
  namespace: vpa-demo
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-vpa-demo
  updatePolicy:
    updateMode: "Off"  # only recommend, no adjustment
  resourcePolicy:
    containerPolicies:
    - containerName: nginx
      minAllowed:
        cpu: "25m"
        memory: "32Mi"
      maxAllowed:
        cpu: "1"
        memory: "512Mi"
      controlledResources: ["cpu", "memory"]
After 2–3 minutes, retrieve the recommendation:
kubectl get vpa nginx-vpa -n vpa-demo -o yaml
The output shows a target of 25m CPU and 256Mi memory: the original 50m CPU request was higher than needed, while the 64Mi memory request was too low.
Step 6: Switch to InPlaceOrRecreate mode
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-vpa
  namespace: vpa-demo
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-vpa-demo
  updatePolicy:
    updateMode: "InPlaceOrRecreate"  # key mode
  resourcePolicy:
    containerPolicies:
    - containerName: nginx
      controlledValues: "RequestsAndLimits"
Setting controlledValues: RequestsAndLimits makes VPA adjust both the request and the limit while preserving their original ratio.
InPlaceOrRecreate works as follows:
Attempts in‑place resize via the /resize sub‑resource.
The pod UID and container IDs stay the same; running containers are resized without being restarted.
If the node lacks resources, it falls back to eviction and recreation.
Both requests and limits are updated proportionally.
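To see the mechanism directly, you can also trigger an in-place resize by hand through the resize sub-resource. This is a sketch, assuming your kubectl supports `--subresource resize` and the demo Deployment from Step 4 is running:

```shell
# Sketch: manually resize the first demo pod's CPU request in place.
# Assumes kubectl support for the "resize" subresource.
POD=$(kubectl get pod -n vpa-demo -l app=nginx-vpa-demo \
  -o jsonpath='{.items[0].metadata.name}')
kubectl patch pod "$POD" -n vpa-demo --subresource resize --patch \
  '{"spec":{"containers":[{"name":"nginx","resources":{"requests":{"cpu":"100m"}}}]}}'
# The pod keeps its UID; confirm the restart count is still 0:
kubectl get pod "$POD" -n vpa-demo \
  -o jsonpath='{.status.containerStatuses[0].restartCount}'
```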
Verify the change without pod restart:
kubectl describe pod -n vpa-demo -l app=nginx-vpa-demo | grep -A 3 "Requests:"
Key observations: the pod Age is unchanged, the Restart Count is 0, and the resource requests reflect VPA's recommendation.
Step 7: Load‑test to trigger dynamic resizing
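The load generator below fetches from a Service named nginx-service, which the earlier steps do not create. A minimal sketch of that Service, assuming the Deployment labels from Step 4, would be:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-service      # name assumed; it only needs to match the wget URL below
  namespace: vpa-demo
spec:
  selector:
    app: nginx-vpa-demo    # matches the Step 4 Deployment's pod labels
  ports:
  - port: 80
    targetPort: 80
```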
apiVersion: v1
kind: Pod
metadata:
  name: load-generator
  namespace: vpa-demo
spec:
  containers:
  - name: load-generator
    image: busybox
    command:
    - /bin/sh
    - -c
    - |
      while true; do
        wget -q -O- http://nginx-service
      done
Monitor VPA suggestions every 30 seconds:
kubectl get vpa nginx-vpa -n vpa-demo -o jsonpath='{.status.recommendation.containerRecommendations[0].target}'
As load rises, the CPU target increases; when the load stops, recommendations gradually fall back. The pod never restarts.
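To poll continuously rather than re-running the command by hand, a simple shell loop works (or use `watch` if available); this sketch assumes the VPA and namespace names from the earlier steps:

```shell
# Print the current VPA target every 30 seconds while the load test runs.
while true; do
  kubectl get vpa nginx-vpa -n vpa-demo \
    -o jsonpath='{.status.recommendation.containerRecommendations[0].target}'
  echo " ($(date +%H:%M:%S))"
  sleep 30
done
```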
Cleanup
# Delete the demo namespace
kubectl delete namespace vpa-demo
# Uninstall VPA
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-down.sh
Key takeaways and best practices
Kubernetes 1.35 enables true in‑place resource adjustments, ideal for stateful services.
InPlaceOrRecreate first tries live resize, falling back to eviction only when necessary, preserving pod identity.
Use RequestsAndLimits to keep request/limit ratios consistent.
Suitable scenarios: databases, caches, message queues, long‑running batch jobs, and any workload sensitive to restarts.
Unsuitable for short‑lived, stateless services where traditional scaling suffices.
Ensure nodes have enough free capacity and the container runtime supports in‑place resize; validate in a test environment before production rollout.
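Per-resource restart behavior can also be tuned with the container-level resizePolicy field. A sketch, assuming in-place resize is enabled on your cluster: apply CPU changes live, but restart the container when memory is resized.

```yaml
# Inside a pod or Deployment container spec:
containers:
- name: nginx
  image: nginx:latest
  resizePolicy:
  - resourceName: cpu
    restartPolicy: NotRequired      # resize CPU without a restart
  - resourceName: memory
    restartPolicy: RestartContainer # restart this container on memory resize
```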
Cloud Native Technology Community
The Cloud Native Technology Community, part of the CNBPA Cloud Native Technology Practice Alliance, focuses on evangelizing cutting‑edge cloud‑native technologies and practical implementations. It shares in‑depth content, case studies, and event/meetup information on containers, Kubernetes, DevOps, Service Mesh, and other cloud‑native tech, along with updates from the CNBPA alliance.