
How to Deploy a Highly Available Application on Kubernetes

This article explains key Kubernetes configurations—such as pod replicas, pod anti‑affinity, deployment strategies, graceful termination, probes, resource allocation, scaling, and disruption budgets—to achieve high availability and zero‑downtime deployments for containerized applications in production.

DevOps Cloud Academy

Kubernetes is one of the most widely used container orchestration systems, adopted by major cloud providers such as AWS, Azure, GCP, and DigitalOcean.

Using Kubernetes involves more than just creating a cluster and deploying pods; many features improve resilience and high availability. The following topics are covered:

Pod replicas

Pod anti‑affinity

Deployment strategies

Graceful termination

Probes

Resource allocation

Scaling (VPA, HPA, Cluster Autoscaler, Karpenter)

Pod Disruption Budget

Pod Replicas

Running a single pod makes the workload unavailable if that pod fails; therefore at least two replicas are recommended. The example below shows a Deployment with two replicas.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: nginx:1.14.2

This creates two identical pods, ensuring the application remains reachable even if one pod fails.

Pod Anti‑Affinity

Pod anti‑affinity ensures pods are scheduled on different nodes, improving availability during node failures or upgrades. Example configuration:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-server
spec:
  selector:
    matchLabels:
      app: web-store
  replicas: 3
  template:
    metadata:
      labels:
        app: web-store
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - web-store
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: web-app
        image: nginx:1.16-alpine

With topologyKey: "kubernetes.io/hostname", the anti‑affinity rule prevents two matching pods from being placed on the same node; using topology.kubernetes.io/zone as the topology key instead spreads them across availability zones.
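The required rule above is strict: if no eligible node remains, new pods stay Pending. A softer variant is preferred anti‑affinity, which the scheduler tries to honor but can violate when necessary. A sketch of the affinity fragment, reusing the app: web-store label and assuming nodes carry the standard topology.kubernetes.io/zone label:

```yaml
# Soft (preferred) anti-affinity across availability zones; the scheduler
# favors spreading but can still co-locate pods if zones are exhausted.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app: web-store
        topologyKey: "topology.kubernetes.io/zone"
```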

Deployment Strategies

Kubernetes supports several deployment strategies; this article focuses on RollingUpdate, which replaces old pods with new ones only after the new pods are ready, providing seamless upgrades.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 5
  selector:
    matchLabels:
      app: nginx
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80

The configuration updates one pod at a time, ensuring continuous service availability.

Graceful Termination

When a pod is deleted, its containers receive a SIGTERM signal, and Kubernetes waits up to terminationGracePeriodSeconds (30 seconds by default) for a clean shutdown before forcing termination. Example:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      terminationGracePeriodSeconds: 60
      containers:
      - name: nginx-container
        image: nginx:latest

This gives the pod up to 60 seconds to finish in-flight requests; if it has not exited by then, Kubernetes sends SIGKILL.
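A common companion pattern is a preStop hook that briefly delays SIGTERM, giving load balancers and endpoint controllers time to stop routing traffic to the terminating pod. A minimal sketch (the 10-second sleep is an illustrative value, not a recommendation from this article):

```yaml
spec:
  terminationGracePeriodSeconds: 60
  containers:
  - name: nginx-container
    image: nginx:latest
    lifecycle:
      preStop:
        exec:
          # Sleep briefly so endpoints are removed and in-flight
          # requests drain before the container receives SIGTERM.
          command: ["sh", "-c", "sleep 10"]
```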

Probes

Kubernetes probes (readiness, liveness, startup) verify that containers are healthy and ready to receive traffic. Example configuration:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp-container
        image: myapp-app:latest
        ports:
        - containerPort: 80
        readinessProbe:
          httpGet:
            path: /health
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /health
            port: 80
          initialDelaySeconds: 15
          periodSeconds: 20
        startupProbe:
          httpGet:
            path: /health
            port: 80
          initialDelaySeconds: 10
          periodSeconds: 5

Readiness ensures the pod only receives traffic when ready; liveness restarts unhealthy containers; startup checks that the application has started correctly.
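HTTP checks are not the only handler type; probes can also open a TCP socket or run an arbitrary command inside the container. A brief sketch (the /tmp/healthy path is an illustrative assumption, not part of any real image):

```yaml
# TCP probe: succeeds if the port accepts a connection.
livenessProbe:
  tcpSocket:
    port: 80
  periodSeconds: 20
# Exec probe: succeeds if the command exits with status 0.
readinessProbe:
  exec:
    command: ["cat", "/tmp/healthy"]
  periodSeconds: 10
```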

Resource Allocation

Requests and limits define minimum and maximum CPU/memory for pods, preventing resource starvation. Example:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx-container
        image: nginx
        resources:
          requests:
            memory: "64Mi"
            cpu: "250m"
          limits:
            memory: "128Mi"
            cpu: "500m"
This guarantees each pod is scheduled with at least 64 MiB of memory and 250 millicores (0.25 CPU), and is never allowed to exceed 128 MiB or 500 millicores.
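To avoid repeating these values in every manifest, a namespace-level LimitRange can supply defaults for containers that omit them. A sketch using the same values (the namespace name is an assumption):

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: production
spec:
  limits:
  - type: Container
    # Applied when a container specifies no requests.
    defaultRequest:
      memory: "64Mi"
      cpu: "250m"
    # Applied when a container specifies no limits.
    default:
      memory: "128Mi"
      cpu: "500m"
```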

Scaling

Kubernetes supports vertical and horizontal pod autoscaling as well as cluster‑level autoscaling. Examples:

Vertical Pod Autoscaler (VPA)

The VPA components are installed from the kubernetes/autoscaler repository:

git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-up.sh

Once installed, a VerticalPodAutoscaler resource targets the Deployment:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: "Deployment"
    name: "nginx-deployment"
  updatePolicy:
    updateMode: "Auto"

Horizontal Pod Autoscaler (HPA)

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 2
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
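The HPA can also evaluate several metrics at once; when multiple entries are listed, it scales to the highest replica count any single metric proposes. A sketch of a metrics list combining CPU and memory, in autoscaling/v2 syntax (the 70% memory target is an illustrative value):

```yaml
metrics:
- type: Resource
  resource:
    name: cpu
    target:
      type: Utilization
      averageUtilization: 50
- type: Resource
  resource:
    name: memory
    target:
      type: Utilization
      averageUtilization: 70
```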

Cluster Autoscaler

helm repo add cluster-autoscaler https://kubernetes.github.io/autoscaler
helm install my-cluster-autoscaler cluster-autoscaler/cluster-autoscaler --version 9.34.1

Cluster Autoscaler adds new nodes when pods cannot be scheduled; Karpenter provides a more cost‑optimized, workload‑aware alternative.
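For illustration, a minimal Karpenter provisioning sketch. This assumes Karpenter is installed on an AWS cluster and that an EC2NodeClass named "default" already exists; field names follow the karpenter.sh/v1 API:

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
      # Allow Karpenter to pick spot capacity when cheaper.
      - key: karpenter.sh/capacity-type
        operator: In
        values: ["spot", "on-demand"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
  # Cap the total CPU this pool may provision.
  limits:
    cpu: "1000"
  disruption:
    # Consolidate and remove nodes that are empty or underutilized.
    consolidationPolicy: WhenEmptyOrUnderutilized
```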

Pod Disruption Budget

PDB ensures a minimum number of pods remain available during voluntary disruptions such as node upgrades. Example:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app

At least two pods must stay running, preventing accidental downtime during maintenance.
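minAvailable and maxUnavailable are mutually exclusive, and both accept absolute numbers or percentages. The same intent can be expressed as a tolerated disruption instead (policy/v1 syntax):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-pdb
spec:
  # At most a quarter of matching pods may be disrupted at once.
  maxUnavailable: "25%"
  selector:
    matchLabels:
      app: my-app
```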

Conclusion

Configuring all of the above features—replicas, anti‑affinity, deployment strategies, graceful termination, probes, resource limits, autoscaling, and disruption budgets—ensures a seamless, zero‑downtime deployment of containerized applications on Kubernetes.

Tags: cloud-native, Deployment, High Availability, Kubernetes, Scaling, Probes, Pod Disruption Budget
Written by DevOps Cloud Academy

Exploring industry DevOps practices and technical expertise.