How Ant Financial Scales to Hundreds of Thousands TPS with LDC, Unitization, and CAP Mastery
This article explains how Ant Financial’s LDC (Logical Data Center) architecture, unitized RZone/GZone/CZone design, OceanBase database, and CAP-aware strategies enable the payment platform to handle double‑11 traffic peaks of over 540,000 transactions per second while ensuring high availability, disaster recovery, and eventual consistency.
Background
Since 2008 Ant Financial’s Double‑11 payment traffic grew from 20 000 transactions per minute to a peak of 540 000 TPS in 2019 – a 1 360‑fold increase.
Problem Statement
The database layer became the bottleneck: a single DB instance cannot scale to the required TPS, and horizontal scaling of application servers alone is insufficient.
Logical Data Center (LDC) and Unitization
Ant introduced a logical‑data‑center (LDC) architecture that splits the whole system into independent units (called RZone , GZone and CZone ). Each unit contains its own application servers and a set of database shards, allowing linear capacity growth by adding more units.
Unit Types
RZone (Region Zone) : Handles user‑specific data. Data is sharded by user ID, so each RZone serves a distinct user segment.
GZone (Global Zone) : Stores globally shared data that cannot be sharded (e.g., system configuration, exchange rates). Only one logical copy exists across the LDC.
CZone (City Zone) : Optimizes data with a natural write‑read delay (e.g., user profile, membership). CZone provides local read replicas so that RZones can read without crossing regions.
Sharding Strategy
Data is partitioned horizontally by user ID (each user belongs to a specific shard) and vertically by business function (different tables are placed in different databases). This reduces the load on any single DB instance and enables independent scaling of each shard.
Traffic Routing (Spanner)
Requests first hit a custom global load balancer called Spanner . Spanner extracts the user ID, selects the target IDC, and then maps the request to the appropriate RZone. Inside the RZone the request is routed to the local DB partition that holds the user’s data.
Disaster Recovery (DR)
Three DR levels are defined:
Same‑room (intra‑machine)
Same‑city (intra‑IDC)
Cross‑region (inter‑IDC)
When a failure occurs, traffic is re‑routed to a backup unit. Before traffic switches, the DB‑partition mapping is updated so that the new unit owns the correct shards, guaranteeing data consistency.
CAP Analysis
Ant’s architecture is evaluated against the CAP theorem:
Partition Tolerance : OceanBase (the underlying DB) implements Paxos consensus. A write succeeds only when a quorum of (N/2)+1 nodes acknowledge it, allowing the system to survive network partitions.
Availability : Traffic is routed at the gateway level and each unit operates independently. Even if some partitions are down, other units continue to serve requests.
Consistency : OceanBase provides eventual consistency. During a partition only one write quorum can succeed; other nodes synchronize later.
OceanBase Database
OceanBase is a distributed relational database that:
Uses Paxos for consensus and quorum writes.
Supports dynamic role assignment – any node can become a writer or a reader.
Guarantees that a write is committed only after a quorum of replicas have persisted it, providing high availability and fault tolerance.
RZone–DB Mapping Example
The mapping between RZones and database partitions is configured in a static table. A typical configuration looks like:
RZ0* --> a
RZ1* --> b
RZ2* --> c
RZ3* --> dWhen a request arrives, Spanner uses this table to forward the request to the correct RZone, which then accesses its local partition (e.g., a for RZ0).
DR Re‑mapping Procedure
During a disaster‑recovery switch the system performs two steps:
Reassign DB‑partition ownership. For example, if RZ0 and RZ1 become unavailable, their partitions are transferred to RZ2 and RZ3 respectively:
RZ0* --> /
RZ1* --> /
RZ2* --> a
RZ2* --> c
RZ3* --> b
RZ3* --> dUpdate the user‑ID‑to‑RZone routing table so that traffic for the affected user ranges is directed to the new units. An example before the switch:
[00-24] --> RZ0A(50%),RZ0B(50%)
[25-49] --> RZ1A(50%),RZ1B(50%)
[50-74] --> RZ2A(50%),RZ2B(50%)
[75-99] --> RZ3A(50%),RZ3B(50%)After the switch the mapping becomes:
[00-24] --> RZ2A(50%),RZ2B(50%)
[25-49] --> RZ3A(50%),RZ3B(50%)
[50-74] --> RZ2A(50%),RZ2B(50%)
[75-99] --> RZ3A(50%),RZ3B(50%)This order (DB mapping first, traffic mapping second) ensures that no request is sent to a unit that does not yet own the required data, preserving stability during the cut‑over.
Key Takeaways
Unitized RZone design isolates user groups, enabling TPS to scale from tens of thousands to > 500 000 TPS.
Paxos‑based OceanBase prevents split‑brain scenarios during network partitions.
CZone local reads exploit the write‑read delay, reducing cross‑region latency for data that does not need immediate consistency.
Combined traffic routing, unitization, and multi‑level DR allow Ant’s payment system to sustain record‑breaking loads while maintaining high availability and disaster resilience.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
