How Do Big Internet Companies Achieve Cross‑Region Multi‑Active HA?
This article analyzes the evolution of high‑availability deployment—from cold backup to cross‑region multi‑active—explaining the trade‑offs of each solution, the challenges of stateful services, and real‑world architectures used by companies like Alibaba and Eleme.
Overview
Cross‑region multi‑active (异地多活) is a high‑availability deployment model adopted by large internet companies such as Alibaba, Tencent, Baidu, NetEase, and Sina. It evolves through cold backup → dual‑machine hot standby → same‑city active‑active → cross‑city active‑active → cross‑region multi‑active.
Stateful vs Stateless Services
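Stateless services can achieve HA simply by running several instances behind a load balancer (F5, etc.). Stateful services keep data on disk or in memory (e.g., MySQL, Redis), so every replica must also hold a consistent copy of that data, which requires more complex solutions.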
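To make the stateless case concrete, here is a minimal sketch of a round-robin balancer that skips unhealthy instances. The class name, the backend names, and the manual `mark_down` call are illustrative assumptions; a real load balancer such as F5 does its own health probing.

```python
class RoundRobinBalancer:
    """Toy load balancer: rotates over backends and skips unhealthy ones."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._index = 0

    def mark_down(self, backend):
        # In practice a health check would call this; here it is manual.
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Try each backend at most once per call, starting after the last pick.
        for _ in range(len(self.backends)):
            backend = self.backends[self._index]
            self._index = (self._index + 1) % len(self.backends)
            if backend in self.healthy:
                return backend
        raise RuntimeError("no healthy backend available")


lb = RoundRobinBalancer(["app-1", "app-2", "app-3"])  # hypothetical instances
lb.mark_down("app-2")                                 # simulate one failure
picks = [lb.pick() for _ in range(4)]
print(picks)
```

Because no instance holds data of its own, removing `app-2` from rotation loses nothing. That property is exactly what stateful services lack.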
Cold Backup
Cold backup copies data files while the service is stopped. Advantages: it is simple, backup and restore are fast, and data can be recovered to the point in time at which the backup was taken. Drawbacks: the service is down during the backup, data written between the backup and the restore is lost, every backup is a full copy, and partial backups cannot be customized.
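The trade-off can be sketched in a few lines. This is only an illustration under stated assumptions: `stop_service` and `start_service` are hypothetical callbacks standing in for whatever actually stops and starts the database, and the "database" is a single text file.

```python
import shutil
import tempfile
from pathlib import Path


def cold_backup(data_dir, backup_dir, stop_service, start_service):
    """Stop the service, copy every data file, then restart the service."""
    stop_service()                              # downtime begins here
    try:
        shutil.copytree(data_dir, backup_dir)   # always a full copy
    finally:
        start_service()                         # restart even if the copy failed


events = []
root = Path(tempfile.mkdtemp())
data = root / "data"
data.mkdir()
(data / "table.db").write_text("row-1\n")

cold_backup(data, root / "backup",
            stop_service=lambda: events.append("stopped"),
            start_service=lambda: events.append("started"))

# A write that arrives after the backup is not in the backup copy.
(data / "table.db").write_text("row-1\nrow-2\n")
restored = (root / "backup" / "table.db").read_text()
print(events, repr(restored))
```

Restoring from `backup` brings back `row-1` only, which is the data-loss window described above.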
Dual‑Machine Hot Standby
Active/Standby mode keeps one node serving traffic while the other replicates its data and takes over if the active node fails. Replication can be software‑based (MySQL master/slave, SQL Server transactional replication) or hardware‑based (disk mirroring). Variants include dual‑machine mutual backup, in which each machine is the primary for one service and the standby for the other.
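A minimal sketch of the takeover decision, assuming a heartbeat-timeout rule. The source does not say how failure is detected, so the class, the 3-second timeout, and the node names are all hypothetical. Timestamps are passed in explicitly to keep the example deterministic.

```python
class HotStandbyPair:
    """Tracks which of two nodes is active, based on primary heartbeats."""

    def __init__(self, primary, standby, heartbeat_timeout=3.0):
        self.primary = primary
        self.standby = standby
        self.heartbeat_timeout = heartbeat_timeout
        self.last_heartbeat = 0.0

    def record_heartbeat(self, now):
        self.last_heartbeat = now

    def check(self, now):
        """Promote the standby if the heartbeat is stale; return the active node."""
        if now - self.last_heartbeat > self.heartbeat_timeout:
            self.primary, self.standby = self.standby, self.primary
            self.last_heartbeat = now   # give the new primary a fresh window
        return self.primary


pair = HotStandbyPair("db-a", "db-b")   # hypothetical node names
pair.record_heartbeat(10.0)
before = pair.check(12.0)   # 2.0 s since last heartbeat: within the timeout
after = pair.check(14.5)    # 4.5 s since last heartbeat: standby is promoted
print(before, after)
```

The sketch ignores replication lag: if the standby has not yet received the primary's latest writes, promoting it loses them.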
Same‑City Active‑Active
Same‑city active‑active extends hot standby to two data centers in the same city, connected by dedicated lines for fast synchronization. It can support true active‑active reads and writes if conflict resolution is handled, but disaster recovery still depends on a nearby data center, so an incident that affects the whole city can take out both.
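The text does not say which conflict-resolution strategy is used. As one illustrative possibility, here is a last-write-wins (LWW) merge, in which each data center tags a value with a timestamp and its site name. The `Version` type, the keys, and the site names are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    value: str
    timestamp: float   # assumes clocks are reasonably synchronized
    site: str          # used only to break timestamp ties deterministically


def merge_lww(a, b):
    """Return the winning version: newest timestamp, site name breaks ties."""
    return max(a, b, key=lambda v: (v.timestamp, v.site))


def merge_stores(local, remote):
    """Merge two key -> Version maps so that both sites converge."""
    merged = dict(local)
    for key, version in remote.items():
        merged[key] = merge_lww(merged[key], version) if key in merged else version
    return merged


dc1 = {"cart:42": Version("2 items", 100.0, "dc1")}
dc2 = {"cart:42": Version("3 items", 101.5, "dc2"),
       "cart:7": Version("1 item", 90.0, "dc2")}

merged = merge_stores(dc1, dc2)
print(merged["cart:42"].value)
```

LWW converges regardless of merge order, but it silently discards the losing write. That is why designs needing stronger guarantees turn to the sharding or locking approaches described later.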
Cross‑City Active‑Active (Two‑Site)
Cross‑city active‑active deploys front‑end entry points and applications in two cities. When one city fails, traffic fails over to the other, possibly with degraded performance due to the added cross‑city latency.
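The failover behavior can be sketched as an entry-point router that prefers the user's nearest city and marks the request as degraded when it has to fall back. The city names, the preference table, and the health map are hypothetical.

```python
def route_request(user_city, city_health, preferences):
    """Return (serving_city, degraded) for a request from user_city."""
    for rank, city in enumerate(preferences[user_city]):
        if city_health.get(city, False):
            # rank > 0 means the request left the preferred city,
            # so it pays the cross-city latency.
            return city, rank > 0
    raise RuntimeError("no city available")


# Hypothetical two-city deployment: each city falls back to the other.
preferences = {"north": ["north", "south"], "south": ["south", "north"]}

normal = route_request("north", {"north": True, "south": True}, preferences)
failed_over = route_request("north", {"north": False, "south": True}, preferences)
print(normal, failed_over)
```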
Cross‑Region Multi‑Active Architecture
Each node has four inbound/outbound connections, so the failure of any single node does not affect service. However, cross‑region writes have higher latency and a higher probability of conflict, which makes the system more complex. Solutions include sharding by region, using distributed locks, or adopting a “Global Zone”, in which writes go to a master region and reads can be served locally.
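A minimal sketch of how sharding by region and a Global Zone might combine in one routing function. The region names, the city-to-region table, and the choice of which tables count as global are all assumptions for illustration; this is not the actual Eleme or Alibaba implementation.

```python
# Hypothetical topology and data classification.
CITY_TO_REGION = {"city-1": "region-a", "city-2": "region-a",
                  "city-3": "region-b", "city-4": "region-c"}
GLOBAL_ZONE_MASTER = "region-a"
GLOBAL_TABLES = {"merchant_catalog"}   # rarely written, read everywhere


def route(op, table, user_city, local_region):
    """Return the region that should handle this operation.

    op is "read" or "write"; local_region is where the request arrived.
    """
    if table in GLOBAL_TABLES:
        # Global Zone: a single master region takes writes, any region serves reads.
        return GLOBAL_ZONE_MASTER if op == "write" else local_region
    # Sharded data: reads and writes both go to the shard's home region,
    # so two regions never write the same row concurrently.
    return CITY_TO_REGION[user_city]


global_write = route("write", "merchant_catalog", "city-4", "region-c")
global_read = route("read", "merchant_catalog", "city-4", "region-c")
order_write = route("write", "orders", "city-3", "region-c")
print(global_write, global_read, order_write)
```

Sharding avoids conflicts by giving each row one writable home. The Global Zone accepts slower writes, and possibly stale local reads, for data that cannot be partitioned that way.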
Real‑World Examples
Eleme’s Global Zone, Alibaba’s three‑center model, and Taobao’s unit‑based multi‑active design illustrate different trade‑offs between consistency, throughput, and operational complexity.
Key Challenges
Increased latency for cross‑region writes.
Data conflicts requiring distributed transactions or sharding.
Higher operational overhead for testing, automation, and disaster‑recovery drills.
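The latency item can be bounded with simple arithmetic. Light in optical fiber travels at roughly 200,000 km/s, so distance alone sets a floor on round-trip time. The distances and round-trip counts below are hypothetical, and real links add routing, queuing, and processing delays on top.

```python
FIBER_SPEED_KM_PER_S = 200_000   # approximate speed of light in fiber


def min_round_trip_ms(distance_km):
    """Lower bound on one network round trip over the given fiber distance."""
    return 2 * distance_km * 1000 / FIBER_SPEED_KM_PER_S


def min_added_latency_ms(distance_km, sequential_round_trips):
    """Lower bound for a request that makes several round trips in sequence."""
    return min_round_trip_ms(distance_km) * sequential_round_trips


same_city = min_round_trip_ms(50)               # two data centers 50 km apart
cross_region = min_round_trip_ms(1000)          # two regions 1,000 km apart
chatty_request = min_added_latency_ms(1000, 5)  # 5 sequential remote writes
print(same_city, cross_region, chatty_request)
```

A request that makes several sequential cross-region calls multiplies this floor, which is why the designs above try to keep each request's writes inside a single region.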
Discussion Questions
How would you route a user located at the intersection of four cities to the correct shard?
Which business modules in your system can be made multi‑active and which cannot?
Is multi‑active required for all services or only for core business?
References:
Eleme “Cross‑Region Multi‑Active Technical Implementation (Part 1)” – https://zhuanlan.zhihu.com/p/32009822
Eleme Architecture Blog – https://zhuanlan.zhihu.com/eleme-arch
Alibaba “Cross‑Region Multi‑Active vs Same‑City Active‑Active Architecture Evolution” – https://www.sohu.com/a/158859741_444159
Alibaba Cloud “Database Cross‑Region Multi‑Active Solution” – https://help.aliyun.com/document_detail/72721.html
“Cross‑Region Multi‑Active Is Not That Hard” – https://wely.iteye.com/blog/2313293
IT Architects Alliance
Discussion and exchange on system, internet, large‑scale distributed, high‑availability, and high‑performance architectures, as well as big data, machine learning, AI, and architecture adjustments with internet technologies. Includes real‑world large‑scale architecture case studies. Open to architects who have ideas and enjoy sharing.
