How We Migrated a High‑Traffic Business Gateway to Kubernetes with Kong
This article details the step‑by‑step evolution of a company's business gateway from a single Nginx instance to a cloud‑native, Kubernetes‑based Kong deployment, covering the initial architecture, identified risks, custom controller implementation, performance testing, graceful shutdown, and smooth traffic switching strategies.
Background and Initial Architecture
The business gateway was originally a single Nginx instance on one server, forwarding traffic to backend services running in Docker containers on separate ECS instances. Route changes required manual edits to Nginx configuration files on the host.
Problems and Risks
Nginx was a single point of failure, and a single instance could not keep up with growing traffic.
As backend services proliferated, updating and scheduling Docker containers manually via Ansible became cumbersome.
Frequent route changes forced engineers to log into servers and edit configuration files by hand, an error‑prone process with no audit trail.
Architectural Evolution
To mitigate the first issue, Nginx was placed in an Auto Scaling Group (ASG) behind an SLB. Backend services were migrated to Kubernetes, exposed via NodePort Services, and deployed with ArgoCD for self‑service delivery. Because Nginx configuration was file‑based, a more automated routing solution was needed.
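For reference, a backend exposed this way looks roughly like the sketch below; the service name demo-api and the port numbers are placeholders, not values from the actual setup.

# Hypothetical NodePort Service: the backend pod becomes reachable on a fixed
# port of every node, so the gateway outside the cluster can forward to it.
apiVersion: v1
kind: Service
metadata:
  name: demo-api
  namespace: default
spec:
  type: NodePort
  selector:
    app: demo-api
  ports:
  - name: http
    port: 80          # Service port inside the cluster
    targetPort: 8080  # container port of the backend
    nodePort: 30080   # port opened on every node (default range 30000-32767)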
Kong API Gateway (built on OpenResty) replaced Nginx. Kong stores routes in a database and provides a RESTful API for configuration, enabling automated updates. A custom Kubernetes controller watches Ingress and Service resources, generates routing information, and calls Kong’s API to keep routes in sync.
Deployment Workflow
Developers modify a Jsonnet‑based tool to generate the required Kubernetes YAML (including an Ingress and a NodePort Service) and push the changes to GitLab.
The push triggers a webhook that calls the ArgoCD API.
ArgoCD applies the YAML to the target cluster, deploying the application.
The custom controller detects the new Ingress/Service, extracts the annotations, host, and path, and creates the corresponding Kong route via the Kong API (an illustrative Ingress is sketched after this list).
Kong updates its routing table without manual file edits.
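To make the controller step concrete, here is a sketch of the kind of Ingress it might consume; the annotation key and all names are assumptions for illustration, not the actual conventions of the in‑house controller.

# Hypothetical Ingress watched by the custom controller, which reads the
# annotations, host, and path, then creates a matching route via Kong's API.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo-api
  namespace: default
  annotations:
    example.com/kong-route: "enabled"  # assumed marker annotation
spec:
  rules:
  - host: demo.example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: demo-api
            port:
              number: 80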
Why a Custom Controller?
Multiple Kubernetes clusters are used for disaster recovery. The custom controller not only updates routes but also handles service circuit‑breaking and cross‑cluster traffic shifting, providing extensibility beyond a standard Ingress controller.
Operational Pain Points After Kong in ASG
ASG scale‑out takes 2–5 minutes, too slow to absorb sudden traffic spikes.
Updating Kong configuration required rebuilding EC2 images and rolling out the ASG, which is heavyweight.
Rollback was slow because a new image had to be built and deployed.
Update strategies lacked fine‑grained control.
Kong Migration to Kubernetes
By leveraging Kubernetes Deployments and the Horizontal Pod Autoscaler (HPA), Kong can scale far more responsively (a minimal HPA sketch follows the questions below). The design was guided by three questions:
Will the call‑chain latency increase?
Can Kong‑proxy pods stop gracefully?
How to achieve a smooth traffic switch?
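For reference, the responsive scaling mentioned above can be expressed as an HPA over the kong-proxy Deployment; the CPU target and replica bounds below are placeholders, not production values.

# Hypothetical HPA scaling kong-proxy on CPU utilization.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: kong-proxy
  namespace: kong-proxy
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: kong-proxy
  minReplicas: 2       # keep two pods so one can drain gracefully
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60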
Latency Benchmark
Seven days of continuous load testing showed identical P95 and P99 latencies (0.004–0.005 s) for Kong running in the ASG and in Kubernetes, indicating no added latency from the migration.
Graceful Shutdown of Kong Pods
Kong pod termination follows the standard Kubernetes pod lifecycle:
Pod is marked Terminating and removed from Service endpoints.
The preStop hook runs (can invoke a custom command).
Kubernetes sends a SIGTERM signal to containers.
Kubernetes waits for the termination grace period (default 30 s, configurable via terminationGracePeriodSeconds).
If containers are still running after the grace period, a SIGKILL is sent.
Kong provides a built‑in quit command that can be called from the preStop hook:
Usage: kong quit [OPTIONS]
Gracefully quit a running Kong node (Nginx and other configured services) in given prefix directory.
Options:
-p,--prefix (optional string) prefix Kong is running at
-t,--timeout (default 10) timeout before forced shutdown
-w,--wait (default 0) wait time before initiating the shutdown
$ kong quit -p ${PREFIX_DIR} -t ${TIMEOUT}
Example Deployment YAML configuring the hook and graceful‑shutdown parameters:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kong-proxy
  namespace: kong-proxy
spec:
  replicas: 2
  selector:
    matchLabels:
      app: kong
      name: kong-proxy
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
      labels:
        app: kong
        name: kong-proxy
    spec:
      containers:
      - name: kong-proxy
        image: our.registry.com/fake-repo-here/custom-kong-docker:version
        imagePullPolicy: Always
        command: ["start_command"]
        args: ["some", "args", "here"]
        ports:
        - name: http-proxy
          containerPort: 80
        readinessProbe:
          exec:
            command: ["kong_health_check.sh"]
          initialDelaySeconds: 8
        livenessProbe:
          exec:
            command: ["kong_liveness_check.sh"]
          periodSeconds: 10
        lifecycle:
          preStop:
            exec:
              command: ["kong", "quit", "-p", "/kong/prefix/path", "-t", "2700"]
      terminationGracePeriodSeconds: 3600
      nodeSelector:
        app: kong
      tolerations:
      - key: app
        operator: Equal
        value: kong
        effect: NoSchedule
Running at least two Kong‑proxy pods is essential; otherwise, when a single pod terminates, new traffic would be routed to a pod that is still starting, causing 5xx errors.
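One way to enforce that replica floor during voluntary disruptions such as node drains or cluster upgrades is a PodDisruptionBudget; the manifest below is our own sketch, not part of the original deployment.

# Hypothetical PodDisruptionBudget keeping at least one kong-proxy pod
# serving while nodes are drained or upgraded.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: kong-proxy
  namespace: kong-proxy
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: kong
      name: kong-proxy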
Smooth Traffic Switching
To switch traffic without changing the entry point, a LoadBalancer‑type Service with SLB annotations is used. Adjusting the SLB weight allows gradual traffic migration between the ASG deployment and the Kubernetes deployment, and enables rapid rollback.
# Add Kong pods to the same virtual server group as the ASG deployment
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/alibaba-cloud-loadbalancer-id: lb-loadbalancer-id-that-currently-used
    service.beta.kubernetes.io/alibaba-cloud-loadbalancer-vgroup-port: rsp-virtual-server-port:80
    service.beta.kubernetes.io/alibaba-cloud-loadbalancer-weight: "10"
  labels:
    app: kong-proxy
    name: kong-proxy
  name: kong-proxy
  namespace: kong-proxy
spec:
  ports:
  - name: http-proxy
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: kong
    name: kong-proxy
  type: LoadBalancer
Conclusion
The business gateway has been fully migrated to a Kubernetes‑based Kong deployment. CronHPA pre‑scales Kong pods ahead of known traffic peaks, the Cluster Autoscaler adds nodes when capacity runs short, and a GitOps pipeline with ArgoCD automates image delivery and rolling updates (see the CronHPA sketch below). Deployments are now safer, more flexible, and easier to maintain. Future work includes replacing Kong with an Istio Gateway to improve observability.
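As an illustration of the pre‑scaling step, a CronHPA manifest might look like the sketch below, assuming the open‑source kubernetes-cronhpa-controller; the schedules and target sizes are invented for the example.

# Hypothetical CronHorizontalPodAutoscaler (kubernetes-cronhpa-controller)
# that scales kong-proxy up before a daily peak and back down afterwards.
# The controller uses a six-field cron format with a leading seconds field.
apiVersion: autoscaling.alibabacloud.com/v1beta1
kind: CronHorizontalPodAutoscaler
metadata:
  name: kong-proxy
  namespace: kong-proxy
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: kong-proxy
  jobs:
  - name: scale-up-before-peak
    schedule: "0 30 7 * * *"   # 07:30 every day
    targetSize: 10
  - name: scale-down-after-peak
    schedule: "0 0 23 * * *"   # 23:00 every day
    targetSize: 2

Scheduled jobs like these handle predictable peaks, while the metric‑driven HPA shown earlier absorbs unexpected load.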