
How Alipay Scaled Its Payment System for Double 11: Architecture, Capacity Planning, and Elastic Design

This article details how Alipay engineers tackled the massive traffic spikes of Double 11 by addressing external payment bottlenecks, implementing recharge‑based balances, building capacity‑planning platforms, adopting logical data‑center (LDC) and CRG zone architectures, deploying elastic scaling, and evolving their OceanBase database and service‑mesh infrastructure to sustain millions of transactions per second.


Starting with the 2011 Double 11, Alipay faced external payment bottlenecks caused by limited bank‑gateway capacity, leading to transaction failures at peak moments. A workaround in 2012 redirected payments through prepaid balances, reducing reliance on external systems.

To support ever‑growing traffic, Alipay introduced capacity‑planning platforms that estimate peak QPS, decompose it into subsystem requirements, and allocate resources accordingly. This evolved into an automated, fine‑grained capacity analysis system that models each link in a business flow.
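The top-down decomposition described above can be sketched as a small calculation: take a target peak payment rate, multiply it out along the call chain to get per-subsystem load, then size each fleet with headroom. All numbers, subsystem names, and fan-out factors below are illustrative assumptions, not Alipay's actual figures.

```python
# Minimal sketch of top-down capacity decomposition (hypothetical numbers).
# A target peak payment TPS is multiplied by per-subsystem fan-out factors,
# then divided by per-instance capacity to size each fleet.

PEAK_PAYMENT_TPS = 100_000

# Calls each subsystem receives per payment (assumed values).
CALL_MULTIPLIERS = {
    "cashier": 1.0,
    "account": 2.0,       # debit + credit legs
    "risk-control": 1.5,
    "accounting": 3.0,
}

# Sustainable QPS per instance of each subsystem (assumed values).
PER_INSTANCE_QPS = {
    "cashier": 500,
    "account": 400,
    "risk-control": 300,
    "accounting": 800,
}

HEADROOM = 1.3  # keep 30% spare capacity above the estimate

def plan_capacity(peak_tps):
    """Return required instance counts per subsystem for a given peak TPS."""
    plan = {}
    for system, multiplier in CALL_MULTIPLIERS.items():
        required_qps = peak_tps * multiplier * HEADROOM
        instances = -(-required_qps // PER_INSTANCE_QPS[system])  # ceiling division
        plan[system] = int(instances)
    return plan

print(plan_capacity(PEAK_PAYMENT_TPS))
# {'cashier': 260, 'account': 650, 'risk-control': 650, 'accounting': 488}
```

An automated platform extends this idea by discovering the fan-out factors from traced production traffic rather than maintaining them by hand.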

In 2013 Alipay launched the Logical Data Center (LDC) and later the CRG architecture, dividing the system into RZone (self‑contained), GZone (global shared services), and CZone (city‑level latency‑optimized services). This unit‑based design allowed horizontal scaling by adding more units without exhausting database connections.
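The zone split can be sketched as a routing decision: shardable traffic lands in an RZone keyed by user ID, globally shared services go to the GZone, and latency-sensitive shared reads go to the nearest CZone. The routing policy, zone count, and request types below are illustrative assumptions, not Alipay's actual rules.

```python
# Minimal sketch of unit-based (LDC/CRG) request routing, with an assumed
# zone layout. RZones are self-contained units sharded by user ID; the GZone
# hosts globally shared services; CZones serve city-level shared reads.

RZONE_COUNT = 4  # hypothetical number of self-contained units

def route(request_type, user_id=None, city=None):
    """Pick a logical zone for a request (illustrative policy)."""
    if request_type == "payment":        # shardable by user -> RZone
        return f"RZone-{user_id % RZONE_COUNT}"
    if request_type == "config":         # globally shared service -> GZone
        return "GZone"
    if request_type == "read-shared":    # latency-sensitive shared data -> CZone
        return f"CZone-{city}"
    raise ValueError(f"unknown request type: {request_type}")

print(route("payment", user_id=1234567))        # RZone-3
print(route("read-shared", city="Hangzhou"))    # CZone-Hangzhou
```

Because each RZone owns its users' data and database connections end to end, scaling out is a matter of raising `RZONE_COUNT` and migrating shards, rather than fanning every application server out to every database.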

From 2016 onward, an elastic architecture was built on top of LDC, enabling selective scaling of high‑traffic business chains, elastic storage for both streaming and state data, and middleware adaptations (routing, RPC, MQ). The elastic approach reduced idle resources and improved cost efficiency.
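Selective scaling of high-traffic chains can be sketched as a second routing layer: during the promotion window, requests on designated hot business chains are diverted to temporarily provisioned elastic units, while everything else stays on base capacity. The chain names and unit layout below are assumptions for illustration.

```python
# Minimal sketch of elastic routing on top of a unit architecture. During the
# promotion window, hot business chains spill onto temporarily provisioned
# elastic units; all other traffic stays on the base unit. Names are assumed.

ELASTIC_CHAINS = {"trade-create", "payment-confirm"}   # hot chains to offload
ELASTIC_UNITS = ["elastic-unit-0", "elastic-unit-1"]
BASE_UNIT = "base-unit"

def pick_unit(chain, user_id, promotion_active):
    """Route a request to base or elastic capacity (illustrative policy)."""
    if promotion_active and chain in ELASTIC_CHAINS:
        return ELASTIC_UNITS[user_id % len(ELASTIC_UNITS)]
    return BASE_UNIT

# Outside the promotion everything runs on base capacity:
assert pick_unit("trade-create", 42, promotion_active=False) == "base-unit"
# During the peak, hot chains move onto elastic units:
assert pick_unit("trade-create", 42, promotion_active=True) == "elastic-unit-0"
```

The cost win comes from the asymmetry: only the chains that actually spike pay for extra machines, and those machines exist only for the duration of the event.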

Database scaling remained a critical challenge. Alipay migrated to its own OceanBase distributed database, applying sharding, read/write splitting, and multi‑point writes. OceanBase 2.0 introduced a partition‑group design that achieved linear, lossless scaling, ultimately supporting millions of payments per second during Double 11.
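The sharding and read/write-splitting pattern mentioned above can be sketched as node selection: writes go to a shard's primary, reads are spread across its replicas. The topology below is a toy assumption; it illustrates the pattern, not OceanBase's internal replication or partition-group mechanics.

```python
# Minimal sketch of sharded read/write splitting over an assumed topology.
# Each shard has one primary for writes and replicas for reads.
import random

SHARDS = {
    0: {"primary": "db0-primary", "replicas": ["db0-r1", "db0-r2"]},
    1: {"primary": "db1-primary", "replicas": ["db1-r1", "db1-r2"]},
}

def pick_node(user_id, is_write):
    """Choose a database node for a user's query (illustrative policy)."""
    shard = SHARDS[user_id % len(SHARDS)]
    if is_write:
        return shard["primary"]            # single writer per shard
    return random.choice(shard["replicas"])  # spread reads across replicas

assert pick_node(7, is_write=True) == "db1-primary"
assert pick_node(7, is_write=False) in {"db1-r1", "db1-r2"}
```

In a distributed database like OceanBase this routing happens below the application, and the partition-group design in 2.0 let shards be added or moved without downtime, which is what the article means by linear, lossless scaling.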

Additional technical safeguards include full‑link stress testing, automated risk‑control inspections, a centralized promotion control system, and a service‑mesh deployment that now covers 100 % of the core payment chain.
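Full-link stress testing typically relies on traffic tagging: synthetic requests carry a flag that is propagated end to end, and writes from flagged traffic land in shadow tables so production data stays untouched. The header name and table convention below are assumptions for illustration, not Alipay's actual protocol.

```python
# Minimal sketch of full-link stress testing via traffic tagging (assumed
# convention). Requests carrying the shadow flag have their writes redirected
# to shadow tables, isolating the drill from production data.

SHADOW_FLAG = "x-stress-test"

def table_for(request_headers, base_table):
    """Resolve the storage table for a request (illustrative convention)."""
    if request_headers.get(SHADOW_FLAG) == "1":
        return f"shadow_{base_table}"  # isolated copy used only by the drill
    return base_table

assert table_for({SHADOW_FLAG: "1"}, "payments") == "shadow_payments"
assert table_for({}, "payments") == "payments"
```

The hard part in practice is propagating the flag through every hop (RPC, MQ, cache, database), which is why such drills must cover the full link rather than individual services.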

The combined efforts in architecture, capacity planning, elastic scaling, and database innovation have enabled Alipay to reliably handle the extreme traffic peaks of Double 11 and prepare for future growth.

Tags: distributed systems, Double 11, capacity planning, database scaling, elastic architecture, Alipay
Written by

AntTech

Technology is the core driver of Ant's future creation.
