How to Prevent Traffic Loss During Kubernetes Deployments: Practical Cloud‑Native Solutions

This article examines why traffic loss occurs during Kubernetes application releases, breaks down the three main traffic paths—LoadBalancer Service, Nginx Ingress, and microservice registry—and offers concrete, cloud‑native configurations and lifecycle hooks to achieve loss‑free deployments.

Alibaba Cloud Native

Nov 4, 2022

Problem Overview

During application releases, traffic loss is a common problem. It shows up as sudden spikes in response time (RT), more HTTP 500 errors, or a degraded user experience. Careful scheduling and restricted release windows reduce the impact but cannot eliminate the risk, and the same holds for K8s workloads managed by EDAS.

Traffic Path Analysis

LoadBalancer Service Traffic

Traffic passes through an external load balancer and then through node-level ipvs/iptables rules. kube-proxy maintains the node-level rules, and the cloud-controller-manager (CCM) maintains the cloud provider's load balancer backends. When a new Pod becomes ready, it is added to the Service Endpoints; when a Pod starts terminating, it is removed. kube-proxy and the CCM then update their rules and backends to match.

External traffic policy influences how CCM and kube-proxy update rules:

Local mode : Only nodes that host a Pod of the Service are added to the load-balancer backend, and traffic arriving at a node is routed only to Pods on that node.

Cluster mode : All nodes are added, and traffic arriving at any node can be forwarded to Pods on any other node.
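
As a minimal sketch, the policy is a single field on the Service. The Service name, selector, and port numbers below are placeholders and do not come from the original article.

```yaml
# Hypothetical Service; the name, selector, and ports are placeholders.
apiVersion: v1
kind: Service
metadata:
  name: demo-app
spec:
  type: LoadBalancer
  # Local: only nodes running a ready Pod of this Service receive external traffic.
  # Cluster (the default): every node is a backend and may forward across nodes.
  externalTrafficPolicy: Local
  selector:
    app: demo-app
  ports:
    - name: http
      port: 80
      targetPort: 8080
```

Local mode also preserves the client source IP, which is a common reason to choose it even though the load balancer's backend list then changes as Pods move between nodes.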

Nginx Ingress Traffic

Ingress controllers act as reverse proxies. They watch Endpoints changes and update the Nginx upstream configuration so that traffic is forwarded to the current Pod IPs. If the upstream list is updated late, requests can still be sent to terminating Pods.

Microservice Registry Traffic

In a microservice deployment, a registry (for example, Nacos or ZooKeeper when using Dubbo) holds Provider addresses. When a Provider Pod starts, it registers its IP; when it stops, it deregisters. Consumers cache the address list, and a stale cache can cause traffic to be sent to Pods that have already terminated.

Root Causes and General Solutions

Traffic loss stems from mismatched timing between Pod lifecycle events and routing rule updates. Two categories are identified:

Upstream loss : New Pods receive traffic before they are fully ready.

Downstream loss : Old Pods stop serving requests before they are removed from routing tables, so requests that are still routed to them fail.

Upstream Loss Mitigation

Use readiness probes to ensure a Pod is only added to Service Endpoints after it passes health checks. Example readiness probe for Spring Boot:

readinessProbe:
  httpGet:
    path: /actuator/health/readiness
    port: ${server.port}   # placeholder: substitute the application's actual port
  ...
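
A fuller version of the probe might look like the sketch below. Kubernetes does not expand `${server.port}` itself, so this example refers to a named container port instead. The container name, image, port 8080, and all timing values are illustrative assumptions and should be tuned to the application's real startup time.

```yaml
# Fragment of a container spec; names, port, and timings are assumptions.
containers:
  - name: demo-app
    image: registry.example.com/demo-app:1.0.0   # hypothetical image
    ports:
      - name: http
        containerPort: 8080
    readinessProbe:
      httpGet:
        path: /actuator/health/readiness
        port: http            # refers to the named container port above
      initialDelaySeconds: 10 # wait before the first check
      periodSeconds: 5        # check interval
      timeoutSeconds: 2       # per-check timeout
      failureThreshold: 3     # consecutive failures before marking unready
```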

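For microservice scenarios, enable Dubbo's delayed registration and warm-up features. Delayed registration keeps a Provider out of the registry until an initialization period has passed, and warm-up lets Consumers ramp up traffic to a newly registered Provider gradually instead of sending it a full share at once.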

dubbo:
  provider:
    warmup: 120000   # warm-up period in milliseconds
    delay: 5000      # delay before registration in milliseconds

Downstream Loss Mitigation

Configure preStop hooks to delay SIGTERM, allowing in‑flight requests to finish and giving the control plane time to update routing tables.

lifecycle:
  preStop:
    exec:
      command:
        - /bin/sh
        - -c
        - curl http://localhost:54199/offline; sleep 30;
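
One point worth checking is that the Pod's termination grace period is longer than the hook. The grace period covers both the preStop hook and the shutdown that follows SIGTERM, and its default is 30 seconds, so a 30-second sleep leaves no time for the application to exit cleanly. The sketch below assumes a 60-second grace period. That value, the container name, and the image are assumptions, and the hook requires `curl` to be present in the image.

```yaml
# Fragment of a Pod template; the 60-second grace period is an assumed value
# chosen to exceed the 30-second sleep in the hook.
spec:
  terminationGracePeriodSeconds: 60
  containers:
    - name: demo-app
      image: registry.example.com/demo-app:1.0.0   # hypothetical image
      lifecycle:
        preStop:
          exec:
            command:
              - /bin/sh
              - -c
              # Deregister first, then keep serving while routing tables converge.
              - curl http://localhost:54199/offline; sleep 30;
```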

Additional tactics per traffic path:

LoadBalancer Service : Set externalTrafficPolicy to Cluster so that every node stays in the load balancer's backend list and Pod changes do not require backend updates. If Local mode is required, upgrade Pods in place on the same node (for example, pinned with node affinity) so that the backend list does not change.

Nginx Ingress : Set the annotation nginx.ingress.kubernetes.io/service-upstream: "true" so that the Ingress controller proxies to the Service's stable ClusterIP instead of tracking individual Pod IPs.

Microservice Registry : Make sure the Provider deregisters before it terminates and that Consumers have refreshed their address list before the process exits, for example by calling http://localhost:54199/offline in preStop and then waiting.
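
For the Nginx Ingress tactic, the annotation could be applied as in the following sketch. The Ingress name, host, ingress class, backend Service name, and port are placeholders; only the annotation key comes from the text above.

```yaml
# Hypothetical Ingress; host, class, Service name, and port are placeholders.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo-app
  annotations:
    # Route to the Service's ClusterIP instead of individual Pod IPs.
    nginx.ingress.kubernetes.io/service-upstream: "true"
spec:
  ingressClassName: nginx
  rules:
    - host: demo.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: demo-app
                port:
                  number: 80
```

The trade-off is that load balancing then happens in kube-proxy rather than in Nginx, so Nginx-level features such as sticky sessions and custom balancing algorithms no longer apply to this backend.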

EDAS Integrated Solutions

EDAS provides a non‑intrusive, console‑driven way to achieve loss‑free deployments without modifying application code. Features include:

Configurable external traffic policy for LoadBalancer Services.

Ingress annotation management for service‑upstream routing.

Dubbo delay registration and warm‑up parameters.

Pre‑stop hook configuration via the EDAS UI.

Combined with observability tools (Ingress monitoring, application metrics) and deployment strategies (canary, batch releases), EDAS enables seamless, zero‑loss rollouts across diverse traffic paths.
