
Case Study: Full Cloud Migration of the "Smart Citizen Pass" Mini‑Program Using Kubernetes and Cloud Services

This article covers the project background, the initial ECS-based deployment, the performance bottlenecks that surfaced during rapid user growth, and the comprehensive cloud-native migration (DNS load balancing, managed Kubernetes, CDN acceleration, RDS/Redis replacement, and NAS/OSS storage) that delivered significant scalability and operational improvements for the Smart Citizen Pass mini-program.

Zhengtong Technical Team

1. Project Background

During the COVID‑19 pandemic, the company quickly adapted its WeChat mini‑program "Smart Citizen Pass" in mid‑February to support epidemic control modules such as self‑check, isolation check, work‑return declaration, electronic pass, and knowledge base, providing technical support for local governments.

2. Initial Deployment Architecture

In the early stage, the system ran on a traditional ECS setup: a self-managed MySQL database, a Redis cache, and Spring Boot microservices. While user volume was modest, performance and operating costs were acceptable.

Initial architecture diagram (image omitted).

3. Performance Issues During User Surge

From late February, user registrations surged from ~5,000 per day to over 300,000, reaching nearly 3 million total users. The following problems emerged:

Static resource overload on a single Nginx node, causing HTTP 499 (client closed request) errors.

A large bundled Vue JS file (≈345 KB) downloaded on every page load, driving aggregate bandwidth to ~268 Mbps at peak.

The Elastic IP's 200 Mbps bandwidth cap was saturated, leaving requests queued and slow to complete.

Redis memory pressure and occasional OOM kills.

JVM GC pauses and high old‑generation memory usage, causing backend latency.

Scaling ECS instances required restarts, causing service interruptions.
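The bundle-size and bandwidth figures above are consistent with a simple back-of-envelope check, assuming the ≈345 KB bundle was fetched uncached on each page load (the ~97 requests/s rate below is inferred from the two published numbers, not stated in the original):

```shell
# 345 KB/request * 8,000 bits/KB * ~97 requests/s ≈ 268 Mbit/s
bundle_kb=345
reqs_per_sec=97
awk -v kb="$bundle_kb" -v rps="$reqs_per_sec" \
    'BEGIN { printf "%.0f Mbps\n", kb * 8000 * rps / 1000000 }'
```

At roughly a hundred uncached page loads per second, the single bundle alone exceeds the 200 Mbps Elastic IP cap, which is why CDN offloading (Section 4.3) mattered so much.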

4. Full Cloud Migration Architecture

Given the projected QPS growth (current >1,000, expected >2,000), the team decided to migrate all services to a Kubernetes cluster within one day.

4.1 DNS‑Based Smart Load Balancing with Dual Nginx

DNS smart resolution distributes traffic across two Nginx servers, achieving horizontal scaling without a dedicated SLB. An initial SLB trial (1 Gbps, QPS limit 1,000) was dropped in favor of DNS routing, saving costs and improving performance.
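In DNS terms, "smart resolution" here amounts to publishing multiple A records for the same name (round-robin), optionally with carrier- or region-aware routing on top. A minimal zone-file sketch, with hypothetical hostname and IPs:

```
; Two A records for the same name; resolvers rotate between them,
; so each Nginx node receives roughly half of new connections.
app.example.com.  60  IN  A  203.0.113.10   ; Nginx node 1
app.example.com.  60  IN  A  203.0.113.11   ; Nginx node 2
```

The short TTL (60 s) keeps failover fast: if one node is pulled from DNS, clients stop resolving to it within about a minute.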

4.2 Managed Kubernetes (ACK) Cluster

The Alibaba Cloud Container Service for Kubernetes (ACK) managed version was chosen. Workers are created by the user, while the control plane is fully managed, offering high availability and low operational overhead.

Cluster topology diagram (image omitted).

Allocate 4C16G or 4C32G ECS for Java containers due to high memory consumption.

Use dedicated VPC switches for POD/SVC CIDR isolation.

Prefer cloud enterprise network for cross‑VPC communication.

Avoid SLB unless QPS exceeds 10,000 to reduce hourly costs.

Consider self‑hosted EFK for log storage if cheaper than managed log service.
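The node-sizing guidance above translates directly into container resource requests and limits. A sketch of a Deployment for one Spring Boot service, assuming a 4C16G worker; the image name and exact numbers are illustrative, not from the original post:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: citizen-pass-api
spec:
  replicas: 3
  selector:
    matchLabels: { app: citizen-pass-api }
  template:
    metadata:
      labels: { app: citizen-pass-api }
    spec:
      containers:
        - name: api
          image: registry.example.com/citizen-pass/api:latest  # hypothetical image
          resources:
            requests: { cpu: "1", memory: 3Gi }   # scheduling baseline per pod
            limits:   { cpu: "2", memory: 4Gi }   # cap to protect the node from OOM
          env:
            - name: JAVA_OPTS
              value: "-Xms2g -Xmx2g"   # keep heap well under the memory limit
```

With ~3 Gi requested per pod, roughly three Java pods fit on a 4C16G node after system and kubelet reservations, which matches the "high memory consumption" sizing note.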

4.3 CDN for Static Content Acceleration

CDN "Full Site Acceleration" caches static assets at edge nodes, reducing origin load. Peak bandwidth reached ~400 Mbps, static QPS >800, and overall response times improved significantly.

CDN deployment steps (list omitted).

# Deployment environment variables
- env:
  - name: CDN_REPLACE_FILES
    value: mobile-vue/*/index.html
  - name: CDN_SERVER
    value: https://****.egova.com.cn
  - name: CDN_PREFIX
    value: EGOVA_CDN

# Script to replace CDN placeholders at container start
function replace_cdn() {
  if [ "${CDN_REPLACE_FILES}" == "" ]; then
       return 0
  fi
  # Fall back to relative paths when CDN_SERVER is not an http(s) URL
  [ "$(echo "${CDN_SERVER}" | grep -Ec '^https?://')" -eq 0 ] && CDN_SERVER="."
  echo CDN_SERVER=${CDN_SERVER}

  [ "${CDN_PREFIX}" == "" ] && CDN_PREFIX="EGOVA_CDN"
  echo CDN_PREFIX=${CDN_PREFIX}

  # Rewrite the placeholder to the real CDN origin in every target file
  for files in ${CDN_REPLACE_FILES}
  do
       sed -i "s#${CDN_PREFIX}#${CDN_SERVER}#g" /usr/share/nginx/html/${files}
  done
}
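The effect of the script is easiest to see on a toy file: every occurrence of the placeholder prefix in the built index.html is rewritten to the real CDN origin at container start. A self-contained demo, with illustrative paths and domain:

```shell
# Build a sample index.html containing the CDN placeholder
tmp=$(mktemp -d)
cat > "$tmp/index.html" <<'EOF'
<script src="EGOVA_CDN/static/js/app.js"></script>
<link rel="stylesheet" href="EGOVA_CDN/static/css/app.css">
EOF

# Same substitution the startup script performs
CDN_SERVER="https://cdn.example.com"
sed -i "s#EGOVA_CDN#${CDN_SERVER}#g" "$tmp/index.html"

# Asset URLs now point at the CDN origin
grep 'https://cdn.example.com/static/js/app.js' "$tmp/index.html"
```

Because the substitution happens at startup rather than at build time, the same image can be promoted across environments with different CDN origins.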

4.4 RDS and Cloud Redis Replacement

Local MySQL and Redis were migrated to Alibaba Cloud RDS and cloud Redis using mydumper and myloader, reducing operational overhead. An RDS MySQL 8.0 bug in the TempTable engine caused a temporary error, later worked around by switching the internal temporary table engine to MEMORY.

# Export from the self-managed DB (-G/-E/-R include triggers, events, and routines; 16 threads)
mydumper -u $user -p $password -h $host -G -E -R -B $dbname -o $dbname/ -v 3 -t 16

# Import into RDS (-q 25000 statements per transaction)
myloader -h ${pub_host} -u ${pub_user} -p ${pub_password} -B $dbname -q 25000 -d ./$dbname/ -v 3 -t 16

4.5 NAS (and OSS) for Media Storage

FastDFS was replaced with Alibaba Cloud NAS, providing high‑performance PV for Kubernetes. After CDN integration, media read pressure shifted to CDN. OSS offers lower cost (≈30% of NAS) and can also be mounted as a PV.
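Because NAS exposes a standard NFS endpoint, wiring it into Kubernetes can be as simple as a statically provisioned PV/PVC pair. A sketch, with placeholder mount-target address, path, and capacity (newer ACK clusters would typically use the NAS CSI driver instead of a raw NFS volume):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: media-nas-pv
spec:
  capacity: { storage: 500Gi }
  accessModes: [ ReadWriteMany ]        # many pods can read/write media concurrently
  nfs:                                  # Alibaba Cloud NAS exposes an NFS endpoint
    server: example.cn-beijing.nas.aliyuncs.com   # placeholder mount target
    path: /media
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: media-nas-pvc
spec:
  accessModes: [ ReadWriteMany ]
  resources: { requests: { storage: 500Gi } }
  volumeName: media-nas-pv              # bind explicitly to the PV above
  storageClassName: ""
```

ReadWriteMany is the key property here: unlike a cloud disk, every replica of the media service can mount the same share, which is what made dropping the self-managed FastDFS cluster feasible.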

5. Post‑Migration Performance

≈3 million total users.

Peak DAU ≈800,000.

Login page PV >2.5 million.

SMS verification requests up to 1.9 million per day.

Concurrent online users >40,000.

Peak QPS >3,500; stress-test QPS reached 13,000.

6. Lessons Learned

Plan for sudden user spikes (e.g., pandemic) with technical reserves for rapid feature rollout.

Select cloud products that match actual traffic; SLB is unnecessary until multi‑million user scale.

Choose billing models wisely (bandwidth vs. traffic) for IP, CDN, and other services.

Kubernetes provides efficient horizontal scaling and cost savings for million‑user applications.


Tags: Performance Optimization · Cloud Migration · Kubernetes · Redis · CDN · RDS

Written by the Zhengtong Technical Team

How do 700+ nationwide projects deliver quality service? What inspiring stories lie behind dozens of product lines? Where is the efficient solution for tens of thousands of customer needs each year? This is Zhengtong Digital's technical practice sharing—a bridge connecting engineers and customers!
