Elastic Scaling Architecture for a Smart Delivery System During Peak Holiday Traffic
This article describes how an operations engineer transformed a complex, multi-language smart delivery platform into an elastic, container-native system that automatically scales, registers its services, and collects their logs during the high-load Chinese New Year period, using Kubernetes, Docker, init containers, and a configuration center.
The smart delivery platform processes millions of orders daily, especially during Chinese New Year, and suffers from massive traffic spikes, a heterogeneous tech stack (PHP, Go, C++, Python), manual VM‑based scaling, and resource waste.
To achieve elastic scaling, the team rewrites deployment processes using Docker images. A generic Dockerfile builds containers for each service, adding code and start scripts, and compiles C++ binaries before inclusion.
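The Dockerfile itself is not reproduced in the article; a minimal sketch of such a generic build, in which the base image, paths, and start-script name are all assumptions, might look like:

```dockerfile
# Hypothetical generic Dockerfile; base image, paths, and the
# start.sh name are illustrative assumptions, not the team's actual file.
FROM alpine:3.18

# Application code is copied in at build time; for C++ services the
# binary is compiled beforehand and added here as a build artifact.
COPY ./src /app
COPY ./start.sh /app/start.sh

WORKDIR /app
RUN chmod +x /app/start.sh

# Each service supplies its own start script as the entrypoint.
ENTRYPOINT ["/app/start.sh"]
```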
For services with third‑party dependencies (e.g., PHP‑FPM, MySQL, Redis), an initContainers step copies pre‑built dependency packages into a shared volume before the main container starts:
initContainers:
- name: copy-dependencies
  image: dependency-image:latest
  command: ["sh", "-c", "set -ex; cp -rp /opt/data/* /app/dependencies"]
  volumeMounts:
  - mountPath: /app/dependencies
    name: app-volume
# The shared volume backing the mount (not shown in the original snippet):
volumes:
- name: app-volume
  emptyDir: {}

Configuration files are no longer stored in numerous ConfigMaps. Instead, a custom configuration center pulls them from a Git repository via Jenkins and serves them over HTTP; each container downloads and unpacks its configuration at startup with a shell script:
# Download the configuration bundle from the configuration center
wget -r -nH --no-parent "http://${config_server_url}/${config_name}.tar.gz" -P /conf/
# Unpack only if the download succeeded
if [ $? -eq 0 ]; then
    tar -xzf "/conf/${config_name}.tar.gz" -C /conf/config/
    rm -f "/conf/${config_name}.tar.gz"
else
    echo "Download failed"
    exit 1
fi

Service registration is automated with Kubernetes lifecycle hooks. The pod IP is exposed as an environment variable via the downward API, posted to a custom agent-service during postStart, and removed during preStop:
env:
- name: POD_IP
  valueFrom:
    fieldRef:
      apiVersion: v1
      fieldPath: status.podIP
lifecycle:
  postStart:
    exec:
      command:
        - /bin/sh
        - -c
        # Quoting matters: $POD_IP must sit inside double quotes so the
        # shell expands it before curl sends the payload.
        - 'curl -X POST http://agent-service --data "{\"ip\": \"$POD_IP\", \"port\": \"80\"}"'
  preStop:
    exec:
      command:
        - /bin/sh
        - -c
        # Deregistration on shutdown; the DELETE method is an assumption,
        # as the agent-service API is not specified.
        - 'curl -X DELETE http://agent-service --data "{\"ip\": \"$POD_IP\", \"port\": \"80\"}"'

To ensure registration reliability, the postStart script checks the HTTP response and exits nonzero on failure, which causes Kubernetes to restart the container:
response=$(curl -s -o /dev/null -w "%{http_code}" -X POST http://agent-service --data "{\"ip\": \"$POD_IP\", \"port\": \"80\"}")
if [ "$response" != "200" ]; then
    echo "Registration failed with HTTP status $response"
    exit 1
fi

Logging is collected by a sidecar filebeat container that mounts the application log directory and forwards logs to an ELK stack using a ConfigMap-defined filebeat.yml:
containers:
- name: log-collector
  image: {harbor_host}/ops/filebeat:5.6.16
  args: ["-c", "/opt/filebeat/filebeat.yml", "-e"]
  volumeMounts:
  - name: log-volume
    mountPath: /var/log/apps

Resource requests and limits keep any single pod from monopolizing CPU or memory, and a TCP-socket liveness probe restarts containers that stop responding:
resources:
  limits:
    cpu: 0.01       # 10 millicores
    memory: 0.05Gi  # roughly 51 Mi
  requests:
    cpu: 0.01
    memory: 0.05Gi
livenessProbe:
  tcpSocket:
    port: 8080
  initialDelaySeconds: 60
  periodSeconds: 30
  failureThreshold: 3

After the migration, the platform scales up automatically during lunch and dinner peaks, releases resources during off-hours, and needs only a single command to trigger scaling, letting the engineer enjoy the holiday while the system absorbs traffic elastically.
Yum! Tech Team
How we support the digital platform of China's largest restaurant group—technology behind hundreds of millions of consumers and over 12,000 stores.