
Achieving Zero‑Downtime Applications with Kubernetes

This article explains why and how to use Kubernetes features such as multiple pod replicas, PodDisruptionBudgets, deployment strategies, health probes, graceful termination, anti‑affinity, resource limits, and autoscaling to build zero‑downtime, highly available applications.

DevOps Cloud Academy

Container Image Location

If you have been using Docker for a while, pulling and running container images seems simple. In production, however, you often should not rely on remote, uncontrolled image registries: a registry can disappear, tags can be deleted, images can be mutated in place under the same tag, and security compliance may forbid external sources altogether.

One solution is to sync container images from the source registry to your own registry.
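Once images are mirrored, workloads should reference your own registry rather than the public one. A minimal sketch of the relevant part of a Pod spec (the registry host registry.example.com and the secret name are hypothetical):

spec:
  containers:
  - name: app
    image: registry.example.com/mirror/nginx:1.25 # mirrored copy, not the public registry
  imagePullSecrets:
  - name: my-registry-credentials # Secret holding credentials for the private registry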

Pod Count (Application Instances)

For high availability, your application needs at least two Kubernetes replicas (two Pods). Example deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 2 # tells deployment to run 2 pods matching the template
  template:
    ..

A common mistake is assuming a single instance is enough because Kubernetes performs rolling updates. Rolling updates only protect you during deployment updates; scenarios such as node loss or resource exhaustion still require multiple instances to avoid downtime.

Pod Disruption Budget

A PodDisruptionBudget (PDB) limits how many Pods may be unavailable during voluntary disruptions, such as node drains for maintenance, ensuring the application stays available even while some of its Pods are being evicted.

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-pdb
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: my-app

Deployment Strategies

Kubernetes supports two deployment strategies: RollingUpdate (default) and Recreate. RollingUpdate can be tuned with maxUnavailable and maxSurge to control rollout speed under heavy traffic.
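For example, the defaults (25% for both values) can be tightened so that capacity never drops during a rollout; the exact numbers below are illustrative and should be sized for your workload:

strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 0 # never remove a running Pod before its replacement is Ready
    maxSurge: 1       # create at most one extra Pod during the rollout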

Automatic Rollback

Automatic rollback is not built‑in; it requires third‑party tools like Helm, ArgoCD, or Spinnaker. Properly configured probes ensure that a failing Pod is not exposed to traffic and can trigger a rollback.

Probes

Liveness probes verify that a container is running, while readiness probes determine if it should receive traffic. Custom probes (e.g., HTTP endpoints) are often more reliable than simple TCP checks.
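A sketch of both probe types against hypothetical HTTP endpoints (the paths /healthz and /ready and port 8080 are assumptions; use whatever your application exposes):

livenessProbe:
  httpGet:
    path: /healthz # assumed health endpoint; restart the container if it fails
    port: 8080
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /ready # assumed readiness endpoint; stop routing traffic if it fails
    port: 8080
  periodSeconds: 5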

Initial Startup Delay

Applications with heavy startup cost may need an increased initialDelaySeconds for liveness probes:

livenessProbe:
  initialDelaySeconds: 60
  httpGet:
    ...

Graceful Termination (terminationGracePeriodSeconds)

Graceful termination only works if the application handles SIGTERM: Kubernetes sends SIGTERM, waits up to terminationGracePeriodSeconds (30 seconds by default), and then sends SIGKILL. An application that ignores SIGTERM is killed abruptly, potentially causing data loss or a poor user experience.
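A sketch of a Pod spec that gives the application more time to drain; the extended grace period and the short preStop sleep are assumptions, useful when endpoints need a moment to be removed from load balancers before SIGTERM arrives:

spec:
  terminationGracePeriodSeconds: 60 # time between SIGTERM and SIGKILL (default is 30)
  containers:
  - name: app
    lifecycle:
      preStop:
        exec:
          command: ["sleep", "5"] # let load balancers stop sending traffic first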

Pod Anti‑Affinity

Pod anti‑affinity prevents multiple instances of the same application from being scheduled on the same node, reducing the risk of simultaneous failure.

affinity:
  podAntiAffinity:
    # Hard rule: never schedule two Pods of this app on the same node
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: my-app
      topologyKey: kubernetes.io/hostname
    # Soft rule: additionally prefer spreading the Pods across zones
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app: my-app
        topologyKey: topology.kubernetes.io/zone

Resources

Insufficient memory leads to OOM kills; insufficient CPU can cause slow responses or prevent readiness probes from succeeding.
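A sketch of container resource settings; the numbers are placeholders and must be sized from observed usage of your workload:

resources:
  requests:
    memory: "256Mi" # guaranteed amount, used for scheduling decisions
    cpu: "250m"
  limits:
    memory: "512Mi" # exceeding this triggers an OOM kill
    cpu: "500m"     # exceeding this throttles the container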

Autoscaling

Horizontal Pod Autoscaling (HPA) adds Pods based on CPU utilization (or custom metrics) to handle traffic spikes, but it must be correctly configured.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
spec:
  ...
  minReplicas: 1 # for high availability, consider setting this to 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50

Conclusion

Kubernetes can provide zero‑downtime deployments when applications are cloud‑native and properly configured. Key practices include running at least two instances, adding health probes, handling SIGTERM, configuring autoscaling, allocating sufficient resources, using pod anti‑affinity, and adding a PodDisruptionBudget.

Tags: Kubernetes, autoscaling, zero downtime, Deployment Strategies, Pod Disruption Budget, Health Probes
Written by

DevOps Cloud Academy

Exploring industry DevOps practices and technical expertise.
