How Alibaba Cloud SLB Achieves High Availability Across Four Layers
This article explains the architecture of Alibaba Cloud's Server Load Balancer (SLB) and its four-layer high-availability design: application processing, cluster forwarding, cross-zone disaster recovery, and cross-region disaster recovery. For each layer it covers both the product features and the user-side best practices.
SLB Overview
Server Load Balancer (SLB) distributes traffic across multiple ECS instances to increase service capacity. It serves as the entry point for critical business systems such as Taobao, Tmall, and Alibaba Cloud, including during traffic spikes such as Double 11.
SLB Architecture
Requests first hit an SLB listener (a port) on a public IP and are then forwarded to backend ECS servers. In East China 1, for example, SLB is deployed as a cluster across two availability zones. Each zone contains an LVS cluster that handles TCP/UDP/HTTP/HTTPS traffic and a Tengine cluster that handles HTTP/HTTPS processing.
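The request path described above can be sketched as a small lookup. This is only an illustration: the function name and the hop labels are made up for this example, and the ordering (HTTP/HTTPS passing through LVS before Tengine) is a reading of the description above rather than a documented interface.

```python
# Hypothetical model of the forwarding path per listener protocol.
LAYER4_PROTOCOLS = {"TCP", "UDP"}
LAYER7_PROTOCOLS = {"HTTP", "HTTPS"}


def forwarding_path(protocol: str) -> list:
    """Return the assumed sequence of hops for a listener protocol."""
    proto = protocol.upper()
    if proto in LAYER4_PROTOCOLS:
        # Layer-4 traffic is handled by the LVS cluster only.
        return ["listener", "LVS", "ECS"]
    if proto in LAYER7_PROTOCOLS:
        # Layer-7 traffic additionally goes through the Tengine cluster.
        return ["listener", "LVS", "Tengine", "ECS"]
    raise ValueError(f"unsupported listener protocol: {protocol}")
```

For instance, `forwarding_path("https")` yields the four-hop path that includes Tengine, while `forwarding_path("udp")` skips it.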
Four Layers of High Availability
1. Application Processing Layer
This layer runs the actual services on ECS instances. SLB keeps it available in two ways: health checks detect faulty ECS nodes so that traffic is no longer sent to them, and a single SLB instance can have multiple ECS instances attached, possibly in different zones.
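A minimal sketch of how threshold-based health checking can remove a faulty backend from rotation and bring it back later. The class name, the default thresholds of three consecutive probes, and the backend names are assumptions chosen for illustration; they are not SLB's actual parameters or API.

```python
from dataclasses import dataclass


@dataclass
class BackendHealth:
    """Tracks one backend's health from a stream of probe results."""
    unhealthy_threshold: int = 3  # assumed: failures needed to mark unhealthy
    healthy_threshold: int = 3    # assumed: successes needed to recover
    healthy: bool = True
    _streak: int = 0              # consecutive probes contradicting the state

    def record(self, probe_ok: bool) -> bool:
        """Record one probe result and return the resulting health state."""
        if probe_ok == self.healthy:
            # The probe agrees with the current state, so reset the streak.
            self._streak = 0
        else:
            self._streak += 1
            limit = self.healthy_threshold if probe_ok else self.unhealthy_threshold
            if self._streak >= limit:
                self.healthy = probe_ok
                self._streak = 0
        return self.healthy


def eligible_backends(states: dict) -> list:
    """Return the names of backends that may currently receive traffic."""
    return sorted(name for name, state in states.items() if state.healthy)
```

Requiring several consecutive results before changing state keeps a single dropped probe from taking a healthy ECS instance out of service.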
2. Cluster Forwarding Layer
The forwarding layer comprises the LVS and Tengine clusters. It avoids single points of failure by deploying each as a cluster, using ECMP routing to spread traffic across nodes, and synchronizing sessions across LVS nodes so that existing connections can continue when a node fails. Because some in-flight requests may still fail during a node failure, users should also implement retry logic in their applications.
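The user-side retry recommendation could look like the sketch below, which retries a failed call with exponential backoff. The function name, the default of three attempts, and the 0.1-second base delay are illustrative assumptions; the wrapped operation should be idempotent, since it may run more than once.

```python
import time


def call_with_retry(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation(), retrying on ConnectionError with exponential backoff.

    `sleep` is injectable so the backoff can be observed or skipped in tests.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                # Out of attempts: surface the last error to the caller.
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * (2 ** attempt))
```

A caller would wrap the request that goes through SLB, for example `call_with_retry(lambda: send_request(payload))`, where `send_request` is a hypothetical application function.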
3. Cross‑Availability‑Zone Disaster‑Recovery Layer
This layer ensures service continuity when an entire zone becomes unavailable. SLB deploys instances across multiple zones and uses routing priority and health checks to shift traffic to the backup zone automatically. The switch happens only under extreme failures that affect the whole primary zone. On the user side, distribute backend ECS instances across zones and, for critical services, provision at least two SLB instances whose primary zones differ.
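The primary/backup behaviour and the user-side recommendation can be sketched as follows. The zone names, the dictionary layout, and both function names are hypothetical; the real switch is driven by routing priority inside SLB, not by user code.

```python
def active_zone(primary, backup, zone_available):
    """Return the zone that should carry traffic, preferring the primary.

    zone_available maps a zone name to True/False; returns None if both
    the primary and the backup zone are unavailable.
    """
    for zone in (primary, backup):
        if zone_available.get(zone, False):
            return zone
    return None


def primaries_are_diverse(instances):
    """Check the recommendation that SLB instances use different primary zones."""
    return len({inst["primary"] for inst in instances}) >= 2


# Hypothetical deployment: two instances with swapped primary/backup zones.
instances = [
    {"name": "slb-1", "primary": "zone-a", "backup": "zone-b"},
    {"name": "slb-2", "primary": "zone-b", "backup": "zone-a"},
]
```

With swapped primary zones, the loss of one zone leaves one instance still serving from its primary while the other falls back to its backup.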
4. Cross‑Region Disaster‑Recovery Layer
When an entire region fails, traffic is redirected via global DNS (Cloud DNS) to SLB instances in another region. This layer relies on products beyond SLB itself: Cloud DNS for global load balancing, and high-speed inter-region links for data synchronization.
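As a rough illustration of DNS-level redirection, the sketch below answers a query with the SLB addresses of the most-preferred healthy region. The record layout, region names, priorities, and addresses (taken from documentation-only IP ranges) are all assumptions; this is not the Cloud DNS API.

```python
def dns_answer(records, region_healthy):
    """Return the addresses of the healthy records with the best priority.

    records: list of dicts with "region", "address", and "priority"
             (lower number means more preferred).
    region_healthy: maps a region name to True/False.
    """
    healthy = [r for r in records if region_healthy.get(r["region"], False)]
    if not healthy:
        return []
    best = min(r["priority"] for r in healthy)
    return [r["address"] for r in healthy if r["priority"] == best]


# Hypothetical records for one domain served from two regions.
records = [
    {"region": "region-1", "address": "203.0.113.10", "priority": 1},
    {"region": "region-2", "address": "198.51.100.20", "priority": 2},
]
```

While `region-1` is healthy the answer contains only its address; once it is marked unhealthy, clients resolving the name are pointed at `region-2` instead.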
21CTO
