Vertical Performance Optimization: Load Balancing Architecture and Practices
This article explores the evolution of load‑balancing architectures from Alibaba’s early systems to modern micro‑service meshes, detailing DNS, hardware, and software solutions, common algorithms, and real‑world case studies such as Double‑11, China Railway 12306, WeChat red packets, and Douyin, highlighting performance, scalability, and reliability considerations.
Performance is king; usability and horizontal scalability must be built on solid performance foundations, especially for high‑concurrency systems.
Load Balancing in Alibaba's Architecture Evolution
The evolution of Taobao’s (instant‑messaging) architecture is illustrated in the following diagrams:
Higher‑level evolutions such as middle‑platform construction and cloud migration are omitted here.
The most critical nodes in the evolution chain are cluster deployment and load balancing:
Once local storage bottlenecks are solved, the next bottleneck is the throughput of a single web container, which leads to an nginx reverse proxy that balances load across multiple web containers.
When databases and Tomcat achieve horizontal scaling, the single nginx proxy becomes the new bottleneck, prompting the adoption of F5 or LVS to balance multiple nginx proxies.
When services expand across multiple regions and data centers, inter‑region latency becomes the bottleneck, so DNS is used for geographic load balancing.
Detailed Load Balancing Schemes
The common solutions can be summarized as:
DNS‑based load balancing
Hardware‑based load balancing (e.g., F5)
Software‑based load balancing (e.g., Nginx, Squid)
DNS Load Balancing
The DNS resolution process and load‑balancing principle are shown in the two diagrams above; DNS is simple to configure and cheap, requiring no extra development or maintenance.
However, its drawbacks are:
Multi‑level caching can delay propagation of changes.
Plain DNS round‑robin cannot weight servers by processing capability or react to their health; it simply rotates the record list.
Lowering the TTL to speed up propagation increases DNS query traffic.
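The trade-off between cache staleness and query volume can be shown with a small simulation. This is a sketch under stated assumptions, not a real DNS implementation: `AuthoritativeZone`, `CachingResolver`, and `upstream_queries` are hypothetical names, the clock is passed in explicitly, and the addresses are placeholders.

```python
class AuthoritativeZone:
    """Hypothetical authoritative server that rotates its A records."""

    def __init__(self, records):
        self.records = list(records)
        self._offset = 0

    def query(self):
        # Rotate the answer by one position per query (plain round-robin).
        n = len(self.records)
        answer = [self.records[(self._offset + i) % n] for i in range(n)]
        self._offset = (self._offset + 1) % n
        return answer


class CachingResolver:
    """Hypothetical recursive resolver that caches an answer for `ttl` seconds."""

    def __init__(self, zone, ttl):
        self.zone = zone
        self.ttl = ttl
        self._cached = None
        self._expires_at = 0.0
        self.upstream_queries = 0

    def resolve(self, now):
        # Only go back to the authoritative server once the cache has expired.
        if self._cached is None or now >= self._expires_at:
            self._cached = self.zone.query()
            self._expires_at = now + self.ttl
            self.upstream_queries += 1
        return self._cached[0]  # assume the client uses the first address


def upstream_queries(ttl, duration=600, step=10):
    """Count authoritative queries when a client resolves every `step` seconds."""
    resolver = CachingResolver(AuthoritativeZone(["10.0.0.1", "10.0.0.2"]), ttl)
    for now in range(0, duration, step):
        resolver.resolve(now)
    return resolver.upstream_queries


zone = AuthoritativeZone(["10.0.0.1", "10.0.0.2"])
resolver = CachingResolver(zone, ttl=60)
first = resolver.resolve(now=0)
zone.records = ["10.0.0.3"]        # operators replace the servers
stale = resolver.resolve(now=30)   # still the old address: the cache has not expired
fresh = resolver.resolve(now=60)   # the change becomes visible only after the TTL
```

In this model a 60-second TTL hides a record change for up to a minute, while a 5-second TTL sends every lookup in the sample workload back to the authoritative server.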
Hardware Load Balancing
F5 Networks BIG‑IP is a network appliance comparable to a high‑performance switch, capable of handling millions of TPS.
It offers excellent performance, rich features, and multiple algorithms, but is expensive and unsuitable for small companies.
Software Load Balancing
Software load balancers operate at various OSI layers (illustrated below):
According to the OSI model, load balancing can be classified as:
Layer‑2 (MAC‑address based)
Layer‑3 (IP‑address based)
Layer‑4 (IP + port)
Layer‑7 (URL or hostname based)
In practice, Layer‑4 and Layer‑7 are the most common.
Layer‑4 vs Layer‑7 Comparison
| Aspect | Layer‑4 | Layer‑7 |
| --- | --- | --- |
| Principle | IP + port | Virtual URL or host IP |
| Analyzed Content | IP/TCP/UDP layers | Application‑layer data such as HTTP URI or cookies |
| Complexity | Simple architecture, easy management | More complex |
| Flexibility | Only network‑layer forwarding | Can modify all request aspects |
| Security | Cannot directly mitigate attacks | Easier to defend against network attacks |
| Efficiency | High efficiency due to low‑level processing | Higher resource consumption |
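The difference in what each layer can see is easier to judge in code. The sketch below is illustrative only: the backend addresses, host names, route table, and the `session_id` cookie name are all assumptions, and a real Layer‑4 balancer forwards packets rather than calling a Python function.

```python
import zlib

# Hypothetical backend pool for the Layer-4 example.
BACKENDS = ["10.0.1.10:8080", "10.0.1.11:8080", "10.0.1.12:8080"]

# Hypothetical Layer-7 routes: (host, path prefix, backend pool), first match wins.
L7_ROUTES = [
    ("api.example.com", "/orders", ["10.0.2.10:8080", "10.0.2.11:8080"]),
    ("api.example.com", "/", ["10.0.3.10:8080"]),
    ("static.example.com", "/", ["10.0.4.10:8080"]),
]


def pick_l4(src_ip, src_port, dst_ip, dst_port):
    """Layer-4 style: only addresses and ports are available to the decision."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    return BACKENDS[zlib.crc32(key) % len(BACKENDS)]


def pick_l7(host, path, cookies=None):
    """Layer-7 style: the parsed HTTP request (host, URI, cookies) drives routing."""
    for route_host, prefix, pool in L7_ROUTES:
        if host == route_host and path.startswith(prefix):
            session = (cookies or {}).get("session_id")
            if session is not None:
                # Keep one session on one backend (cookie-based stickiness).
                return pool[zlib.crc32(session.encode()) % len(pool)]
            return pool[0]
    raise LookupError(f"no route for {host}{path}")
```

The Layer‑4 function never inspects the payload, which is why it is cheap; the Layer‑7 function has to parse the request first, which is where the extra flexibility and the extra cost both come from.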
Common Load‑Balancing Algorithms
| Algorithm | Advantages | Disadvantages |
| --- | --- | --- |
| Round Robin | Simple, efficient, balances all nodes | Performance limited by the slowest server |
| Random | Similar to round robin | - |
| Consistent Hash | Same source requests map to same node, useful for gray releases | Hotspots affect nodes; node failure impacts upstream |
| Weighted Round Robin | Considers server capacity, maximizes cluster performance | Hard to adjust weights dynamically in production |
| Dynamic connections / fastest response | Adapts in real time to node status | Increases complexity and resource usage |
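Minimal sketches of four of these algorithms follow. They are illustrative, not production code: the class names are hypothetical, the weighted variant is the "smooth" scheme (similar to the one nginx uses for weighted upstreams), and the consistent-hash ring assumes 100 virtual nodes per server hashed with MD5.

```python
import bisect
import hashlib
import itertools


class RoundRobin:
    def __init__(self, nodes):
        self._cycle = itertools.cycle(nodes)

    def pick(self):
        return next(self._cycle)


class SmoothWeightedRoundRobin:
    """Weighted round robin that interleaves picks instead of sending bursts."""

    def __init__(self, weights):
        self.weights = dict(weights)
        self.current = {node: 0 for node in self.weights}

    def pick(self):
        total = sum(self.weights.values())
        for node, weight in self.weights.items():
            self.current[node] += weight
        chosen = max(self.current, key=self.current.get)
        self.current[chosen] -= total
        return chosen


class LeastConnections:
    """Dynamic policy: send each request to the node with the fewest active ones."""

    def __init__(self, nodes):
        self.active = {node: 0 for node in nodes}

    def acquire(self):
        node = min(self.active, key=self.active.get)
        self.active[node] += 1
        return node

    def release(self, node):
        self.active[node] -= 1


class ConsistentHashRing:
    def __init__(self, nodes, vnodes=100):
        # Each node appears `vnodes` times on the ring to smooth the distribution.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._keys = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def pick(self, key):
        # First ring point clockwise from the key's hash, wrapping at the end.
        index = bisect.bisect(self._keys, self._hash(key)) % len(self._keys)
        return self._ring[index][1]


wrr = SmoothWeightedRoundRobin({"a": 5, "b": 1, "c": 1})
sequence = [wrr.pick() for _ in range(7)]  # a, a, b, a, c, a, a
```

With weights 5:1:1 the heavy node is never picked more than twice in a row within a cycle, which is the point of the smooth variant. On the ring, removing a node only remaps the keys that node owned, so the other nodes keep their keys.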
Generalized Load Balancing
Beyond service‑level load balancing, the concept also applies to RPC routing, middleware dispatch, elastic routing, and unit‑level routing, each with specialized algorithms such as locality‑preferred routing in RPC.
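Locality‑preferred routing can be sketched as "prefer healthy providers in the caller's own zone, otherwise fall back to any healthy provider". The `Provider` type, its fields, and `pick_provider` are hypothetical names for illustration; real RPC frameworks expose this through their own configuration.

```python
import random
from dataclasses import dataclass


@dataclass
class Provider:
    address: str
    zone: str
    healthy: bool = True


def pick_provider(providers, caller_zone, rng=random):
    """Prefer same-zone providers; fall back to other zones only when none is healthy."""
    healthy = [p for p in providers if p.healthy]
    if not healthy:
        raise RuntimeError("no healthy provider available")
    local = [p for p in healthy if p.zone == caller_zone]
    return rng.choice(local or healthy)


providers = [
    Provider("10.1.0.1:20880", zone="hangzhou"),
    Provider("10.1.0.2:20880", zone="hangzhou"),
    Provider("10.2.0.1:20880", zone="shanghai"),
]
```

Keeping calls inside one zone avoids cross-region latency in the normal case, while the fallback keeps the service reachable when the local zone is down.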
Case Studies
Alibaba Double‑11 Load Balancing
Double‑11 traffic is massive and bursty, demanding:
Excellent performance to handle spikes.
High availability to tolerate device/network jitter.
Seamless upgrades and disaster recovery.
Implementation Principles
1) Performance relies on DPDK – Alibaba’s next‑gen load balancer is built on DPDK, providing high‑throughput packet processing.
2) Handling ECMP‑induced connection drops – ECMP (Equal‑Cost Multi‑Path Routing) distributes traffic across multiple paths; session synchronization and multicast are used to avoid connection interruption during server or network failures.
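The ECMP problem can be reproduced in a toy model. The sketch assumes the router hashes the flow 5‑tuple modulo the number of load‑balancer nodes and that each node otherwise picks backends with its own round‑robin counter; a shared dictionary stands in for synchronized session state. All names and addresses are hypothetical, and this does not describe Alibaba's actual implementation.

```python
import zlib


def ecmp_next_hop(flow, next_hops):
    """Pick a next hop from a hash of the 5-tuple, as an ECMP router might."""
    key = "|".join(str(field) for field in flow).encode()
    return next_hops[zlib.crc32(key) % len(next_hops)]


class LbNode:
    def __init__(self, backends, sessions):
        self.backends = backends
        self.sessions = sessions  # flow -> backend; shared when state is synchronized
        self._next = 0

    def forward(self, flow):
        backend = self.sessions.get(flow)
        if backend is None:
            # Unknown flow: choose a backend with this node's own round-robin counter.
            backend = self.backends[self._next % len(self.backends)]
            self._next += 1
            self.sessions[flow] = backend
        return backend


def broken_connections(shared):
    """Count flows whose backend changes after one of four LB nodes fails."""
    backends = ["rs-1", "rs-2", "rs-3"]
    table = {}
    names = ["lb-1", "lb-2", "lb-3", "lb-4"]
    nodes = {name: LbNode(backends, table if shared else {}) for name in names}
    flows = [
        (f"198.51.100.{i % 250 + 1}", 10000 + i, "203.0.113.10", 443, "tcp")
        for i in range(1000)
    ]
    before = {f: nodes[ecmp_next_hop(f, names)].forward(f) for f in flows}
    survivors = names[:-1]  # lb-4 fails; ECMP rehashes over the remaining three
    after = {f: nodes[ecmp_next_hop(f, survivors)].forward(f) for f in flows}
    return sum(1 for f in flows if before[f] != after[f])
```

Without shared state, a flow that is rehashed to a different node may be sent to a different backend, which resets the connection. With the session table shared, the surviving nodes recognize the flow and keep its backend.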
Railway 12306 Load Balancing
12306 faces dynamic inventory, strong transactional consistency, multi‑dimensional data consistency, and holiday traffic spikes. Load balancing mitigates these by introducing queuing systems, dynamic flow control, and service splitting, enabling the platform to survive massive spring‑festival traffic.
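Queuing combined with dynamic flow control is often built from a token bucket in front of a bounded waiting queue. The sketch below is one common way to do it, not 12306's actual design; `TokenBucket`, `Gatekeeper`, and their parameters are hypothetical, and time is passed in explicitly so the behavior is deterministic.

```python
from collections import deque


class TokenBucket:
    def __init__(self, rate, capacity):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.updated_at = 0.0

    def _refill(self, now):
        elapsed = now - self.updated_at
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated_at = now

    def set_rate(self, rate, now):
        # Dynamic flow control: operators or a feedback loop adjust the rate at runtime.
        self._refill(now)
        self.rate = rate

    def try_acquire(self, now):
        self._refill(now)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class Gatekeeper:
    """Admit requests while tokens last, queue a bounded number, reject the rest."""

    def __init__(self, bucket, max_waiting):
        self.bucket = bucket
        self.max_waiting = max_waiting
        self.waiting = deque()

    def submit(self, request_id, now):
        if not self.waiting and self.bucket.try_acquire(now):
            return "admitted"
        if len(self.waiting) < self.max_waiting:
            self.waiting.append(request_id)
            return "queued"
        return "rejected"

    def drain(self, now):
        admitted = []
        while self.waiting and self.bucket.try_acquire(now):
            admitted.append(self.waiting.popleft())
        return admitted


gate = Gatekeeper(TokenBucket(rate=2, capacity=2), max_waiting=2)
results = [gate.submit(f"r{i}", now=0.0) for i in range(1, 6)]
later = gate.drain(now=1.0)
```

Requests that arrive during a spike wait in order instead of hitting the inventory service at once, and anything beyond the queue limit is rejected early rather than timing out deep in the stack.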
WeChat Red Packet Load Balancing
During the 2017 Chinese New Year, 14.2 billion red packets were sent, peaking at 760,000 per second. The architecture uses a three‑layer load‑balancing approach:
Entry layer: set‑based traffic splitting.
Server layer: ID‑hash routing and single‑machine queuing.
DB layer: Dual‑dimensional sharding (by ID and by day) with middleware routing.
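The three layers can be sketched as one routing function driven by the red‑packet ID. Everything concrete here is an assumption made for illustration: the set, server, and shard counts, the CRC32 hash, the table naming scheme, and the `route` and `enqueue` helpers are hypothetical and do not reproduce WeChat's real parameters.

```python
import zlib
from collections import defaultdict, deque
from datetime import date

SET_COUNT = 4        # hypothetical number of independent sets at the entry layer
SERVERS_PER_SET = 8  # hypothetical servers inside each set
DB_SHARDS = 16       # hypothetical number of database shards


def route(red_packet_id, created_on):
    """Map one red packet to a set, a server, a DB shard, and a per-day table."""
    h = zlib.crc32(red_packet_id.encode())
    return {
        "set": h % SET_COUNT,                          # entry layer: set-based splitting
        "server": (h // SET_COUNT) % SERVERS_PER_SET,  # server layer: ID-hash routing
        "db_shard": h % DB_SHARDS,                     # DB layer: shard by ID
        "table": f"red_packet_{created_on:%Y%m%d}",    # DB layer: shard by day
    }


# One FIFO queue per server, so operations on a red packet are applied serially.
queues = defaultdict(deque)


def enqueue(red_packet_id, created_on, operation):
    target = route(red_packet_id, created_on)
    queues[(target["set"], target["server"])].append((red_packet_id, operation))
    return target


day = date(2017, 1, 27)
t1 = enqueue("rp-10001", day, "grab:user-a")
t2 = enqueue("rp-10001", day, "grab:user-b")
```

Because every request for the same ID lands on the same server and the same queue, concurrent grabs of one red packet are serialized on a single machine instead of contending in the database.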
Douyin Spring Festival Red Packet Load Balancing
Douyin adopts Service Mesh (Istio) for next‑generation micro‑service load balancing. Service Mesh separates control and data planes; Envoy sidecars act as intelligent proxies, providing observability, security, and sophisticated load‑balancing policies.
Istio Load Balancing
Istio’s data plane consists of Envoy proxies that collect telemetry, perform health checks, and route requests based on configured load‑balancing strategies.
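What a sidecar does per request can be approximated in a few lines. This sketch is loosely modelled on two Envoy behaviors, least‑request selection using a power‑of‑two‑choices sample and ejection of endpoints after consecutive failures; the class names, the failure threshold of 3, and the absence of re‑admission after ejection are simplifying assumptions, not Envoy's or Istio's API.

```python
import random


class Endpoint:
    def __init__(self, address):
        self.address = address
        self.active_requests = 0
        self.consecutive_failures = 0
        self.ejected = False


class SidecarBalancer:
    def __init__(self, addresses, max_failures=3, rng=None):
        self.endpoints = [Endpoint(a) for a in addresses]
        self.max_failures = max_failures
        self.rng = rng or random.Random()

    def pick(self):
        healthy = [e for e in self.endpoints if not e.ejected]
        if not healthy:
            raise RuntimeError("no healthy endpoint")
        # Power of two choices: sample two endpoints, keep the less loaded one.
        candidates = self.rng.sample(healthy, min(2, len(healthy)))
        chosen = min(candidates, key=lambda e: e.active_requests)
        chosen.active_requests += 1
        return chosen

    def report(self, endpoint, ok):
        """Record the outcome of a request; eject after repeated failures."""
        endpoint.active_requests -= 1
        if ok:
            endpoint.consecutive_failures = 0
            return
        endpoint.consecutive_failures += 1
        if endpoint.consecutive_failures >= self.max_failures:
            endpoint.ejected = True
```

A caller would invoke `pick()` before sending a request and `report()` when the response arrives, so both the load signal and the health signal come from live traffic rather than from a central component.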
Conclusion
This article used four case studies (Double‑11, 12306, WeChat red packets, and Douyin), spanning the network layer, the architecture layer, and micro‑service evolution, to illustrate practical load‑balancing applications, in the hope of aiding readers in their work and studies.
Code Ape Tech Column
Former Ant Group P8 engineer, pure technologist, sharing full‑stack Java, job interview and career advice through a column. Site: java-family.cn
