Inside Alibaba’s Same‑City Active‑Active Architecture: A Complete Visual Guide

The article breaks down Alibaba’s same‑city active‑active high‑availability architecture, detailing its four design layers—traffic scheduling, stateless application services, data replication, and operational automation—while illustrating how each component ensures continuous service during data‑center failures.

Mike Chen's Internet Architecture

Distributed systems are the backbone of large‑scale architectures; this article explains Alibaba’s same‑city active‑active (dual‑active) design.

In Alibaba’s high‑availability framework, same‑city active‑active refers to two relatively independent data centers located in the same city that simultaneously handle production traffic. If either center fails, the other can quickly take over, minimizing service interruption. Unlike traditional active‑passive setups, both sites remain fully operational and share load.

Alibaba’s core services, such as Taobao, Tmall, and Alipay, evolved from single‑datacenter, primary‑backup disaster recovery to same‑city active‑active, and eventually to multi‑city multi‑active architectures.

Architecture overview image

The same‑city active‑active architecture is built around four layers: traffic scheduling, stateless application design, data synchronization, and fault‑tolerant operations.

1. Traffic Layer (Active‑Active)

User requests first pass through a global traffic scheduling system (DNS/GSLB) that directs them to the nearest or a designated data center. Both data centers can process external requests, and traffic is split based on health checks, load conditions, and business policies. This balances pressure and enables rapid traffic migration when a center experiences an anomaly.
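The routing decision described above can be sketched as a weighted choice over healthy data centers. This is a minimal illustration, not Alibaba's GSLB implementation: the `DataCenter` class, the `pick_data_center` function, the names `dc-a` and `dc-b`, and the 50/50 weights are all assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class DataCenter:
    name: str
    healthy: bool  # result of the most recent health check (hypothetical)
    weight: int    # traffic share assigned by load and business policy


def pick_data_center(centers, rng=random):
    """Pick a data center for one request, skipping unhealthy ones."""
    candidates = [c for c in centers if c.healthy and c.weight > 0]
    if not candidates:
        raise RuntimeError("no healthy data center available")
    weights = [c.weight for c in candidates]
    # Weighted random choice: a 50/50 split sends roughly half of the
    # requests to each site while both are healthy.
    return rng.choices(candidates, weights=weights, k=1)[0]


# Both sites take traffic in normal operation.
centers = [DataCenter("dc-a", True, 50), DataCenter("dc-b", True, 50)]
chosen = pick_data_center(centers)

# If dc-a fails its health check, every request goes to dc-b.
centers[0].healthy = False
fallback = pick_data_center(centers)
```

Because unhealthy sites are filtered out before the weighted choice, flipping a single health flag is enough to migrate all traffic to the surviving site.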

2. Application Layer (Active‑Active)

Application services are designed to be stateless or to have minimal state dependencies. Session data, caches, and configuration are externalized to unified storage or distributed components, ensuring that any center can handle a request without relying on local state.
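To make the idea of externalized state concrete, the sketch below keeps session data in a shared store so that an instance in either data center can serve the same user. `SharedSessionStore` is an in-memory stand-in for whatever unified storage is actually used, and `AppInstance`, `add_to_cart`, and the session layout are hypothetical names chosen for illustration.

```python
class SharedSessionStore:
    """In-memory stand-in for an external store reachable from both sites."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        # Return a copy so callers cannot mutate stored state by accident.
        return dict(self._data.get(session_id, {}))

    def put(self, session_id, session):
        self._data[session_id] = dict(session)


class AppInstance:
    """A stateless application instance: it holds no per-user data itself."""

    def __init__(self, data_center, store):
        self.data_center = data_center
        self.store = store

    def add_to_cart(self, session_id, item):
        session = self.store.get(session_id)
        cart = session.get("cart", []) + [item]
        session["cart"] = cart
        self.store.put(session_id, session)
        return {"served_by": self.data_center, "cart": cart}


store = SharedSessionStore()
app_a = AppInstance("dc-a", store)
app_b = AppInstance("dc-b", store)

# The same session is served by dc-a first and by dc-b second.
first = app_a.add_to_cart("session-1", "book")
second = app_b.add_to_cart("session-1", "pen")
```

The second request sees the item added by the first even though a different site handled it, which is what allows traffic to move between centers without users losing their session.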

3. Data Layer (Active‑Active)

This is the most challenging layer. Alibaba combines multi‑replica databases, distributed storage, message queues, and asynchronous replication to keep data in the two centers either strongly consistent or eventually consistent, depending on the workload. Scenarios that demand strong consistency use stricter transaction control and arbitration mechanisms, while workloads that tolerate eventual consistency prioritize availability and throughput.
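The trade-off between the two consistency modes can be illustrated with a toy model. This is a simplified sketch, not Alibaba's replication mechanism: the `Site` class, the `write` and `replicate` functions, and the example keys are assumptions, and conflict resolution, arbitration, and failure handling are deliberately left out.

```python
from collections import deque


class Site:
    """One data center's copy of the data plus its pending outbound changes."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.outbox = deque()  # changes not yet shipped to the peer


def write(local, peer, key, value, strong=False):
    """Apply a write at the local site and propagate it to the peer."""
    local.data[key] = value
    if strong:
        # Strong path: the write is acknowledged only after both sites
        # have it, which costs latency on every write.
        peer.data[key] = value
    else:
        # Eventual path: acknowledge immediately and ship the change later.
        local.outbox.append((key, value))


def replicate(local, peer):
    """Ship queued changes to the peer; returns how many were applied."""
    applied = 0
    while local.outbox:
        key, value = local.outbox.popleft()
        peer.data[key] = value
        applied += 1
    return applied


site_a, site_b = Site("dc-a"), Site("dc-b")
write(site_a, site_b, "balance:42", 100, strong=True)  # e.g. payment data
write(site_a, site_b, "views:7", 1)                    # e.g. a page counter

lagging = "views:7" not in site_b.data  # peer has not seen the async write yet
shipped = replicate(site_a, site_b)     # background replication catches up
```

Until `replicate` runs, the second site serves stale data for the asynchronously written key. That window is the price paid for lower write latency, and it is why data that cannot tolerate it takes the stricter path.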

4. Operations & Disaster‑Recovery Layer

Automation handles monitoring, alerting, disaster‑recovery drills, and failover. When an anomaly is detected, the system quickly locates the issue and redirects traffic, reducing manual intervention. Regular DR drills verify that the second site can actually take over, which is essential before relying on the design in production.
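Automated failover of this kind can be sketched as a controller that counts consecutive failed health checks and drains a site once a threshold is reached. The `FailoverController` class, the threshold of three failures, and the site names are assumptions made for illustration. Recovery, or shifting traffic back, is not modeled.

```python
class FailoverController:
    """Drains traffic from a site after repeated failed health checks."""

    def __init__(self, weights, failure_threshold=3):
        self.weights = dict(weights)  # site name -> traffic weight
        self.failure_threshold = failure_threshold
        self.failures = {name: 0 for name in weights}

    def report(self, name, ok):
        """Record one health-check result and return the current weights."""
        if ok:
            self.failures[name] = 0
        else:
            self.failures[name] += 1
            if self.failures[name] >= self.failure_threshold:
                self._drain(name)
        return dict(self.weights)

    def _drain(self, name):
        survivors = [n for n in self.weights
                     if n != name and self.weights[n] > 0]
        if not survivors:
            return  # never drain the last site that is still serving traffic
        moved = self.weights[name]
        self.weights[name] = 0
        # Spread the drained weight evenly across the surviving sites.
        share, remainder = divmod(moved, len(survivors))
        for i, n in enumerate(survivors):
            self.weights[n] += share + (1 if i < remainder else 0)


controller = FailoverController({"dc-a": 50, "dc-b": 50})
controller.report("dc-a", ok=False)
controller.report("dc-a", ok=False)                       # below threshold
after_failover = controller.report("dc-a", ok=False)      # third failure
```

Requiring several consecutive failures avoids moving traffic on a single transient error. A DR drill can exercise the same path by feeding the controller simulated failures.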

                 User
                   │
          DNS / GSLB scheduling
            ↙              ↘
    Data center A      Data center B
        APP                APP
        Redis              Redis
        MQ                 MQ
        DB                 DB
            ↔ data sync ↔

The architecture is not a simple copy of systems across two data centers; it is a coordinated design that ensures continuous service, balanced load, and data integrity through the four layers described above.
