How to Achieve Zero‑Downtime Deployments with Kubernetes
Learn how to configure Kubernetes for zero‑downtime applications by syncing container images, ensuring multiple pod replicas, using PodDisruptionBudgets, selecting appropriate deployment strategies, setting up liveness/readiness probes, handling graceful termination, applying pod anti‑affinity, and enabling autoscaling and proper resource limits.
I have worked with both local and hosted Kubernetes clusters for over seven years, and containers have completely changed the hosting landscape, offering features like rolling restarts, zero downtime, and health checks that previously required complex setups.
Container Image Location
While pulling images is easy with Docker, in production you often do not want to rely on an uncontrolled remote registry. Risks include registry disappearance, deleted tags, mutable tags causing inconsistent behavior, and security compliance requirements.
The common solution is to sync container images from the source registry to your own private registry.
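From the workload side, using a mirrored image could look like the sketch below. The registry host registry.example.com, the repository path, the tag, and the Secret name regcred are placeholder assumptions, not values from the original setup.

```yaml
# Fragment of a Deployment's Pod template (spec.template.spec).
spec:
  imagePullSecrets:
    - name: regcred            # hypothetical Secret of type kubernetes.io/dockerconfigjson
  containers:
    - name: nginx
      # Hypothetical private mirror; pin a specific version rather than "latest".
      image: registry.example.com/mirror/nginx:1.27.0
      imagePullPolicy: IfNotPresent
```

Referencing the image by digest (image@sha256:...) goes one step further, because a digest cannot be repointed the way a tag can.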
Pod Count (Application Instances)
For high availability you need at least two Kubernetes replicas (two Pods). Example deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 2 # tells deployment to run 2 pods matching the template
  template:
    ...
A common misconception is that rolling updates make multiple instances unnecessary. They do not: you still need more than one replica to stay available during node failures, scaling events, and whenever a pod receives SIGTERM.
Pod Disruption Budget
A PodDisruptionBudget (PDB) limits how many Pods may be unavailable at the same time during voluntary disruptions such as node drains and cluster upgrades, so the application stays available during maintenance.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-pdb
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: my-app
Deployment Strategies
Kubernetes supports two deployment strategies:
RollingUpdate (default): updates pods gradually.
Recreate: shuts down all pods before starting new ones.
When heavy traffic requires a controlled rollout speed, you can tune maxUnavailable and maxSurge, each given either as an absolute number of pods or as a percentage of the desired replica count.
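A conservative rolling update could be sketched as follows. The replica count and both values are illustrative assumptions and should be sized to your own traffic.

```yaml
# Fragment of a Deployment spec; values are illustrative.
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # never drop below the desired replica count
      maxSurge: 25%       # allow up to 1 extra pod (25% of 4) during the rollout
```

With maxUnavailable set to 0, an old pod is only removed after its replacement is Ready, at the cost of needing spare capacity for the surge pod. The two settings cannot both be 0.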
Automatic Rollback
Kubernetes does not provide automatic rollback out of the box; you need third‑party tools like Helm, Argo CD, or Spinnaker. Helm offers flags such as --wait, --wait-for-jobs, and --atomic to help.
Probes do not trigger a rollback by themselves. Properly configured probes keep a failing pod from ever becoming Ready, so the rollout stalls and the deployment tool can detect the failure and roll back.
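Although Kubernetes will not roll back for you, a Deployment can report a stalled rollout, which gives external tooling something to react to. A minimal sketch, with illustrative values:

```yaml
# Fragment of a Deployment spec; values are illustrative.
spec:
  minReadySeconds: 10            # a new pod must stay Ready this long to count as available
  progressDeadlineSeconds: 120   # report the rollout as failed if it makes no progress for 2 minutes
```

Once the deadline passes, the Deployment's Progressing condition is set to False with reason ProgressDeadlineExceeded and kubectl rollout status exits with a non-zero code. The old ReplicaSet keeps running; the actual rollback remains the job of your tooling or of a manual kubectl rollout undo.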
Probes
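Liveness probes determine whether a container is still alive; if they fail, the kubelet restarts the container. Readiness probes control whether traffic is sent to a pod: a pod that is not ready is removed from the Service endpoints but is not restarted. Custom application‑level probes are often more reliable than simple TCP checks.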
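A sketch of both probes on one container might look like this. The container name, image, port, and the /ready and /healthz paths are hypothetical; your application has to implement whatever endpoints you point the probes at.

```yaml
# Fragment of a Pod template's container list; names, port, and paths are hypothetical.
containers:
  - name: my-app
    image: registry.example.com/my-app:1.0.0
    ports:
      - containerPort: 8080
    readinessProbe:            # failing: pod is taken out of Service endpoints
      httpGet:
        path: /ready
        port: 8080
      periodSeconds: 5
      failureThreshold: 3
    livenessProbe:             # failing: container is restarted
      httpGet:
        path: /healthz
        port: 8080
      periodSeconds: 10
      failureThreshold: 3
```

Keeping the two endpoints separate lets the readiness check include dependencies (for example a database connection) without a dependency outage causing restart loops through the liveness probe.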
Initial Startup Delay
Applications that take longer to start (e.g., Java, heavy initialization, database schema loading) may need an increased initialDelaySeconds in their liveness probe.
livenessProbe:
  initialDelaySeconds: 60
  httpGet:
    ...
Graceful Termination (terminationGracePeriodSeconds)
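Graceful termination only works if the application handles SIGTERM. Kubernetes sends SIGTERM first and follows up with SIGKILL once the grace period expires, so an application that does not finish in‑flight work on SIGTERM is cut off abruptly, leading to poor user experience, data loss, or unrecoverable state. The default grace period (terminationGracePeriodSeconds) is 30 seconds, but you can extend it as needed.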
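On the manifest side, this could be sketched as below. The 60-second grace period, the 10-second sleep, and the container name and image are illustrative assumptions; the preStop command also assumes the image ships a sleep binary.

```yaml
# Fragment of a Pod template (spec.template.spec); values are illustrative.
spec:
  terminationGracePeriodSeconds: 60   # total budget before SIGKILL
  containers:
    - name: my-app
      image: registry.example.com/my-app:1.0.0
      lifecycle:
        preStop:
          exec:
            # Runs before SIGTERM is sent, giving load balancers time
            # to stop routing new requests to this pod.
            command: ["sleep", "10"]
```

The time spent in the preStop hook counts against the grace period, so the application here has roughly 50 seconds left to shut down after it receives SIGTERM.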
Pod Anti‑Affinity
Pod anti‑affinity keeps multiple instances of the same application off the same node (or zone), so that a single node‑level outage cannot take down every instance. It can be soft (preferred) or hard (required).
affinity:
  # hard rule: schedule into a zone that already runs pods labeled security=S1
  podAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: security
          operator: In
          values:
          - S1
      topologyKey: topology.kubernetes.io/zone
  # soft rule: prefer zones that do not run pods labeled security=S2
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: security
            operator: In
            values:
            - S2
        topologyKey: topology.kubernetes.io/zone
Resources
Insufficient memory leads to OOM kills; insufficient CPU can cause slow responses or failed readiness checks. Proper limits and requests are essential.
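A starting point for requests and limits could look like the following. The numbers, container name, and image are placeholders and should come from measuring your own workload.

```yaml
# Fragment of a Pod template's container list; all values are placeholders.
containers:
  - name: my-app
    image: registry.example.com/my-app:1.0.0
    resources:
      requests:          # what the scheduler reserves on the node
        cpu: 250m
        memory: 256Mi
      limits:            # hard ceilings enforced at runtime
        cpu: "1"
        memory: 512Mi
```

A container that exceeds its memory limit is OOM‑killed, while one that hits its CPU limit is throttled rather than killed. CPU requests also matter for autoscaling, because an HPA Utilization target is calculated as a percentage of the requested CPU.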
Autoscaling
Horizontal Pod Autoscaling (HPA) adds pods when CPU usage exceeds a threshold, helping avoid downtime under load.
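apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx # assumed name, matching the Deployment example above
spec:
  scaleTargetRef: # required: the workload to scale (assumed to be the nginx Deployment)
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 2 # keep at least two replicas for high availability
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
Conclusion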
Kubernetes can deliver magical reliability, but only when applications are truly cloud‑native and correctly configured. Key takeaways:
Run at least two instances.
Add health checks (probes).
Handle SIGTERM gracefully.
Configure autoscaling.
Allocate sufficient resources.
Use pod anti‑affinity.
Add a PodDisruptionBudget.
When everything is set up properly, the Kubernetes experience is seamless and downtime‑free.
MaGe Linux Operations
