How to Deploy a Highly Available Application on Kubernetes
This article explains key Kubernetes configurations—such as pod replicas, pod anti‑affinity, deployment strategies, graceful termination, probes, resource allocation, scaling, and disruption budgets—to achieve high availability and zero‑downtime deployments for containerized applications in production.
Kubernetes is one of the most widely used container orchestration systems, adopted by major cloud providers such as AWS, Azure, GCP, and DigitalOcean.
Using Kubernetes involves more than just creating a cluster and deploying pods; many features improve resilience and high availability. The following topics are covered:
Pod replicas
Pod anti‑affinity
Deployment strategies
Graceful termination
Probes
Resource allocation
Scaling (VPA, HPA, Cluster Autoscaler, Karpenter)
Pod Disruption Budget
Pod Replicas
Running a single pod makes the workload unavailable if that pod fails; therefore at least two replicas are recommended. The example below shows a Deployment with two replicas.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: nginx:1.14.2
```

This creates two identical pods, so the application remains reachable even if one of them fails. (Note that a Deployment also requires a `selector` and matching pod labels, added above.)
Pod Anti‑Affinity
Pod anti‑affinity ensures pods are scheduled on different nodes, improving availability during node failures or upgrades. Example configuration:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-server
spec:
  selector:
    matchLabels:
      app: web-store
  replicas: 3
  template:
    metadata:
      labels:
        app: web-store
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - web-store
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: web-app
        image: nginx:1.16-alpine
```

The anti‑affinity rule prevents two of these pods from being placed on the same node, spreading them across the cluster. With `topologyKey: kubernetes.io/hostname` the spread is per node; to spread across availability zones instead, use `topology.kubernetes.io/zone` as the topology key.
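A complementary approach is a topology spread constraint, which caps the imbalance between zones rather than forbidding co‑location outright. A minimal sketch of the pod-template `spec` fragment, assuming the same `app: web-store` labels and a cloud provider that sets the standard zone label:

```yaml
# Pod-template spec fragment: keep web-store replicas evenly
# spread across availability zones (skew of at most 1).
spec:
  topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: ScheduleAnyway   # soft: still schedule if zones are uneven
    labelSelector:
      matchLabels:
        app: web-store
```

`ScheduleAnyway` makes the constraint a preference; use `DoNotSchedule` to make it hard, at the risk of leaving pods Pending when zones are full.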
Deployment Strategies
Kubernetes supports several deployment strategies; this article focuses on RollingUpdate, which replaces old pods with new ones only after the new pods are ready, providing seamless upgrades.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 5
  selector:
    matchLabels:
      app: nginx
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
```

With `maxSurge: 1` and `maxUnavailable: 0`, the rollout brings up one new pod at a time and never removes an old pod until its replacement is Ready, ensuring continuous service availability.
Graceful Termination
Pods receive a SIGTERM signal and have a configurable terminationGracePeriodSeconds to allow clean shutdown. Example:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      terminationGracePeriodSeconds: 60
      containers:
      - name: nginx-container
        image: nginx:latest
```

This gives the pod 60 seconds after SIGTERM to finish in‑flight requests before it is forcibly killed with SIGKILL. Note that `terminationGracePeriodSeconds` is a pod‑level field (under `template.spec`), not a container field.
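Some servers (nginx included) exit immediately on SIGTERM, so a common companion pattern is a `preStop` hook that briefly delays shutdown while load balancers and endpoints stop routing traffic to the pod. A sketch of the container-level fragment (the 10-second sleep is an arbitrary example value):

```yaml
# Container fragment: delay SIGTERM so in-flight traffic can drain.
lifecycle:
  preStop:
    exec:
      command: ["sh", "-c", "sleep 10"]
```

The `preStop` hook runs before SIGTERM is delivered, and its duration counts against `terminationGracePeriodSeconds`.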
Probes
Kubernetes probes (readiness, liveness, startup) verify that containers are healthy and ready to receive traffic. Example configuration:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp-container
        image: myapp-app:latest
        ports:
        - containerPort: 80
        readinessProbe:
          httpGet:
            path: /health
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /health
            port: 80
          initialDelaySeconds: 15
          periodSeconds: 20
        startupProbe:
          httpGet:
            path: /health
            port: 80
          initialDelaySeconds: 10
          periodSeconds: 5
```

The readiness probe ensures the pod only receives traffic once it is ready; the liveness probe restarts unhealthy containers; the startup probe holds off the other two probes until the application has started correctly. (Note that Deployments use `apiVersion: apps/v1`, not `v1`.)
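For slow-starting applications, the startup probe's time budget is `failureThreshold × periodSeconds`; liveness and readiness checks are suspended until it first succeeds. A sketch with illustrative numbers:

```yaml
# Allows up to 30 × 5 = 150 seconds for startup before the
# container is considered failed and restarted.
startupProbe:
  httpGet:
    path: /health
    port: 80
  failureThreshold: 30
  periodSeconds: 5
```

This keeps an aggressive liveness probe from killing a container that is simply still booting.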
Resource Allocation
Requests and limits define minimum and maximum CPU/memory for pods, preventing resource starvation. Example:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx-container
        image: nginx
        resources:
          requests:
            memory: "64Mi"
            cpu: "250m"
          limits:
            memory: "128Mi"
            cpu: "500m"
```

This guarantees each pod is scheduled with at least 64 MiB of memory and 250 millicores (0.25 CPU), and is capped at 128 MiB and 500 millicores.
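As a side note, when requests equal limits for every container, the pod is assigned the Guaranteed QoS class and is among the last to be evicted under node memory pressure, which is worth considering for availability-critical workloads. A sketch of the `resources` fragment:

```yaml
# requests == limits for all resources → Guaranteed QoS class
resources:
  requests:
    memory: "128Mi"
    cpu: "500m"
  limits:
    memory: "128Mi"
    cpu: "500m"
```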
Scaling
Kubernetes supports vertical and horizontal pod autoscaling as well as cluster‑level autoscaling. Examples:
Vertical Pod Autoscaler (VPA)
```shell
kubectl apply -f https://github.com/kubernetes/autoscaler/releases/download/vertical-pod-autoscaler-0.10.0/vertical-pod-autoscaler-crd.yaml
kubectl apply -f https://github.com/kubernetes/autoscaler/releases/download/vertical-pod-autoscaler-0.10.0/vertical-pod-autoscaler-deployment.yaml
```

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: "Deployment"
    name: "nginx-deployment"
  updatePolicy:
    updateMode: "Auto"   # valid modes: Off, Initial, Recreate, Auto
```

Horizontal Pod Autoscaler (HPA)
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 2
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
```

This scales the Deployment between 2 and 5 replicas, targeting 50% average CPU utilization. (The `autoscaling/v2beta2` API was removed in Kubernetes 1.26; use `autoscaling/v2`.)

Cluster Autoscaler
```shell
helm repo add cluster-autoscaler https://kubernetes.github.io/autoscaler
helm install my-cluster-autoscaler cluster-autoscaler/cluster-autoscaler --version 9.34.1
```

Cluster Autoscaler adds new nodes when pending pods cannot be scheduled; Karpenter provides a more cost‑optimized, workload‑aware alternative.
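For illustration, a minimal Karpenter NodePool might look like the sketch below. This assumes the AWS provider and the `karpenter.sh/v1` API; the `EC2NodeClass` named `default` is a hypothetical companion resource, so treat the field names as indicative rather than authoritative:

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
      - key: karpenter.sh/capacity-type
        operator: In
        values: ["spot", "on-demand"]   # prefer cheaper spot capacity when available
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default                   # hypothetical EC2NodeClass resource
  limits:
    cpu: "100"                          # cap total CPU Karpenter may provision
```

Unlike Cluster Autoscaler, Karpenter provisions right-sized nodes directly from pending pod requirements instead of scaling fixed node groups.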
Pod Disruption Budget
A Pod Disruption Budget (PDB) ensures a minimum number of pods remains available during voluntary disruptions such as node drains and upgrades. Example:
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app
```

At least two matching pods must stay running during voluntary disruptions, preventing accidental downtime during maintenance. (The `policy/v1beta1` API was removed in Kubernetes 1.25; use `policy/v1`.)
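Alternatively, a PDB can be expressed with `maxUnavailable`, which often works better when the replica count changes under autoscaling. A sketch reusing the same `app: my-app` selector (the resource name here is a hypothetical example):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-pdb-max-unavailable   # hypothetical name
spec:
  maxUnavailable: 1              # at most one pod down at a time
  selector:
    matchLabels:
      app: my-app
```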
Conclusion
Configuring all of the above features—replicas, anti‑affinity, deployment strategies, graceful termination, probes, resource limits, autoscaling, and disruption budgets—ensures a seamless, zero‑downtime deployment of containerized applications on Kubernetes.
DevOps Cloud Academy
Exploring industry DevOps practices and technical expertise.