Load Balancing in High‑Concurrency Scenarios: Alibaba Double 11, 12306 Railway, WeChat Red Packets, and Douyin Spring Festival Gala
This article examines real‑world load‑balancing implementations for ultra‑high traffic cases such as Alibaba's Double 11 shopping festival, China's 12306 railway ticketing system, WeChat's red‑packet service, and Douyin's Spring Festival gala, highlighting architectural principles, DPDK usage, ECMP routing, session synchronization, SET‑based sharding, and service‑mesh techniques.
Alibaba Double 11 Load Balancing
The Double 11 shopping event generates massive, pulse‑like request spikes that test every service in Alibaba's ecosystem. The load balancer has to meet three requirements:
Excellent performance to handle pulse traffic.
High service stability to tolerate device and network jitter.
Business‑transparent upgrades and disaster‑recovery switches.
Implementation Principle
1) Performance relies on DPDK – Alibaba's next‑generation load balancer is built on DPDK, which processes packets in user space with poll‑mode drivers instead of going through the kernel network stack. That packet‑processing speed is what lets the system sustain Double 11 traffic.
2) Handling ECMP reselection‑induced connection interruptions – ECMP (Equal‑Cost Multi‑Path routing) spreads traffic across multiple paths of equal cost. Deploying the load‑balancer cluster horizontally, with every node advertising the same route, lets the switches form ECMP routes across the nodes, which provides high availability. Without session synchronization, however, a server failure triggers ECMP reselection: existing connections land on another server that has no record of them, and users see the connection drop. Alibaba's SLB synchronizes sessions across the cluster via multicast, so long‑lived connections survive upgrades and disaster‑recovery switches.
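The effect of session synchronization on ECMP reselection can be illustrated with a small simulation. This is only a sketch under stated assumptions: the `LBNode` class, the CRC32 flow hash, the round‑robin backend choice, and the direct write into peer tables (standing in for multicast) are all hypothetical and are not taken from Alibaba's SLB implementation.

```python
import zlib


def ecmp_pick(flow, nodes):
    """Mimic a switch: hash the flow 5-tuple over the currently live nodes."""
    key = "|".join(map(str, flow)).encode()
    return nodes[zlib.crc32(key) % len(nodes)]


class LBNode:
    """Hypothetical load-balancer node holding a per-connection session table."""

    def __init__(self, name, backends):
        self.name = name
        self.backends = backends
        self.sessions = {}   # flow -> backend chosen when the connection opened
        self.peers = []      # nodes that receive session updates (empty = no sync)
        self._next = 0       # round-robin cursor for new connections

    def handle(self, flow, syn=False):
        if flow in self.sessions:
            return self.sessions[flow]
        if not syn:
            # Mid-connection packet with no session entry: the connection breaks.
            return None
        backend = self.backends[self._next % len(self.backends)]
        self._next += 1
        self.sessions[flow] = backend
        for peer in self.peers:
            # Stand-in for multicast session synchronization.
            peer.sessions[flow] = backend
        return backend


def simulate(sync):
    backends = ["rs-1", "rs-2", "rs-3"]
    nodes = [LBNode(f"lb-{i}", backends) for i in range(3)]
    if sync:
        for node in nodes:
            node.peers = [p for p in nodes if p is not node]

    flow = ("10.0.0.7", 51234, "203.0.113.10", 443, "tcp")
    owner = ecmp_pick(flow, nodes)
    first = owner.handle(flow, syn=True)                 # connection opens

    survivors = [n for n in nodes if n is not owner]     # the owning node fails
    second = ecmp_pick(flow, survivors).handle(flow)     # ECMP reselects a node
    return first, second
```

Calling `simulate(sync=False)` returns `None` for the second packet, because the reselected node has never seen the flow. With `sync=True` the same backend is returned before and after the failure.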
12306 Railway Load Balancing
China’s 12306 ticketing platform faces dynamic inventory, strong transactional consistency, multi‑dimensional data consistency across online and offline channels, and massive traffic spikes during holidays.
This section focuses on the role of load balancing during traffic surges.
The architecture evolved in stages. The original system suffered from bottlenecks along the entire request path: concurrent queries overloaded the application servers (AS) and then the web tier, and the failures cascaded. The first optimization introduced a queuing system with per‑train queues and dynamic traffic control, which split the traffic and balanced the load. The second optimization scaled the system further to keep up with continued growth.
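A per‑train queue with adjustable admission might look roughly like the following. The source does not describe 12306's actual queuing design, so the `TrainQueues` class, its parameters, and the train numbers used here are illustrative assumptions only.

```python
from collections import defaultdict, deque


class TrainQueues:
    """Hypothetical per-train queuing with a tunable release rate."""

    def __init__(self, max_queue_len, batch_size):
        self.max_queue_len = max_queue_len   # cap on waiting requests per train
        self.batch_size = batch_size         # requests released per drain cycle
        self.queues = defaultdict(deque)     # train number -> waiting user IDs

    def enqueue(self, train_no, user_id):
        """Admit a request, or reject it when that train's queue is full."""
        queue = self.queues[train_no]
        if len(queue) >= self.max_queue_len:
            return False
        queue.append(user_id)
        return True

    def set_batch_size(self, batch_size):
        """Dynamic traffic control: tune how fast requests reach the backend."""
        self.batch_size = batch_size

    def drain(self, train_no):
        """Release up to batch_size requests for one train, in arrival order."""
        queue = self.queues[train_no]
        n = min(self.batch_size, len(queue))
        return [queue.popleft() for _ in range(n)]
```

Because each train has its own queue, a surge of demand for one popular train fills only that queue and does not delay requests for other trains.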
WeChat Red‑Packet Load Balancing
On Chinese New Year's Eve 2017, WeChat processed 14.2 billion red‑packet transactions, peaking at 760,000 requests per second.
Key characteristics
Each group red‑packet behaves like a flash‑sale, demanding extremely high concurrency.
Financial nature requires strict consistency and high security.
Vertical SET‑based sharding – All requests for the same red packet are routed to the same SET. This partitions the massive traffic by red‑packet ID into independent SETs, much like grouping by key in a reduce operation, and dramatically reduces resource contention between SETs.
Server‑layer request queuing – Requests are hashed by ID to a specific server, then queued locally, guaranteeing ordered arrival at the database and eliminating massive lock contention.
Dual‑dimensional database/table design – Besides sharding by red‑packet ID, tables are split by day (hot/cold data) to maintain DB performance, with a middleware handling routing.
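The three techniques above can be sketched together as a routing pipeline. Everything here is an assumption made for illustration: the shard counts, the CRC32 hash, the table‑naming scheme, and the `GrabQueue` class are hypothetical, and the sketch assumes the "ID" used for server hashing is the red‑packet ID.

```python
import zlib
from collections import defaultdict, deque
from datetime import date

NUM_SETS = 4          # assumed number of SETs
SERVERS_PER_SET = 3   # assumed servers inside each SET
NUM_DB_SHARDS = 8     # assumed number of database shards


def _hash(packet_id):
    return zlib.crc32(packet_id.encode())


def route_set(packet_id):
    """All requests for one red packet go to the same SET."""
    return _hash(packet_id) % NUM_SETS


def route_server(packet_id):
    """Within the SET, the same red packet always hits the same server."""
    return (_hash(packet_id) // NUM_SETS) % SERVERS_PER_SET


def route_table(packet_id, day):
    """Two dimensions: database shard by red-packet ID, table by day."""
    shard = _hash(packet_id) % NUM_DB_SHARDS
    return f"db_{shard:02d}.t_packet_{day.strftime('%Y%m%d')}"


class GrabQueue:
    """Hypothetical local FIFO queues, one per (SET, server) pair."""

    def __init__(self):
        self.queues = defaultdict(deque)

    def submit(self, packet_id, user_id):
        key = (route_set(packet_id), route_server(packet_id))
        self.queues[key].append((packet_id, user_id))
        return key

    def drain(self, key):
        """Handle one server's requests serially, in arrival order, so the
        database sees them one at a time instead of contending for row locks."""
        handled = []
        while self.queues[key]:
            handled.append(self.queues[key].popleft())
        return handled
```

Since the SET, the server, and the database shard are all derived from the red‑packet ID, every grab request for one packet follows a single path, and only the day component changes the target table.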
Douyin Spring Festival Gala Load Balancing
Douyin's Spring Festival gala case illustrates load balancing with Service Mesh, a newer generation of micro‑service technology.
What is Service Mesh? It abstracts away the complexities of distributed systems, allowing developers to focus on business logic.
Istio’s load balancing – Istio separates control and data planes; the data plane consists of Envoy sidecars that manage all network traffic, perform health checks, and apply load‑balancing policies.
Envoy acts as a proxy, providing security, privacy, and load‑balancing capabilities, while also discovering cluster members and routing requests based on health status.
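The idea of routing only to healthy cluster members can be shown with a minimal round‑robin sketch. This is not Envoy's code or API: the `Cluster` class and its methods are hypothetical, and round robin is used here simply as one of the policies a sidecar proxy can apply.

```python
class Cluster:
    """Hypothetical view of a cluster as seen by a sidecar proxy."""

    def __init__(self, endpoints):
        self.endpoints = list(endpoints)     # members found via service discovery
        self.healthy = set(self.endpoints)   # members currently passing health checks
        self._next = 0                       # round-robin cursor

    def report_health(self, endpoint, ok):
        """Record the result of a health check for one endpoint."""
        if ok:
            self.healthy.add(endpoint)
        else:
            self.healthy.discard(endpoint)

    def pick(self):
        """Round-robin over the healthy endpoints only."""
        candidates = [e for e in self.endpoints if e in self.healthy]
        if not candidates:
            raise RuntimeError("no healthy endpoint available")
        choice = candidates[self._next % len(candidates)]
        self._next += 1
        return choice
```

An endpoint that fails its health check stops receiving traffic on the next `pick()` call and rejoins the rotation once a later check succeeds.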
Conclusion
The article presents four typical cases—Alibaba Double 11, 12306 railway, WeChat red packets, and Douyin Spring Festival gala—to illustrate practical load‑balancing techniques across network, architecture, and micro‑service layers, aiming to help readers improve their own systems.