Alipay’s Double 11 Architecture: Logical Data Centers, Distributed Transactions, and High‑Availability Strategies
The article details Alipay’s architecture for the Double 11 shopping festival: its three‑layer IaaS/PaaS/SaaS model, logical data‑center design, multi‑active disaster recovery, blue‑green deployment, distributed data sharding, distributed transaction processing, and the performance and risk‑control mechanisms of the Ant Credit Pay service.
Every year the "Double 11" shopping festival becomes a massive e‑commerce event, and for engineers it serves as a rigorous test of overall architecture, middleware, operations tools, and team capabilities.
The article first outlines Alipay’s three‑layer platform: an IaaS layer providing scalable infrastructure (network, storage, databases, virtualization), a PaaS layer offering elastic distributed transaction processing and middleware, and a SaaS layer delivering high‑availability payment services and an open development platform.
To handle the ever‑growing traffic, Alipay adopts a logical data‑center architecture that partitions the system into independent units, each with isolated real‑time data, controlled inter‑unit communication, and asynchronous messaging, enabling N+1 disaster‑recovery and horizontal scalability across regions.
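The unit idea can be illustrated with a minimal routing sketch. It assumes requests are pinned to a unit by a stable hash of the user ID; the unit names, the shard count, and the use of CRC32 are illustrative assumptions, not details from the article.

```python
import zlib

UNITS = ["unit-0", "unit-1", "unit-2"]  # hypothetical unit names
SHARDS = 100                            # assumed number of logical user shards


def shard_of(user_id: str) -> int:
    """Map a user ID to a stable logical shard (CRC32 is an illustrative choice)."""
    return zlib.crc32(user_id.encode("utf-8")) % SHARDS


def build_shard_map(units):
    """Assign every shard to exactly one unit, in contiguous ranges."""
    return {s: units[s * len(units) // SHARDS] for s in range(SHARDS)}


def route(user_id, shard_map):
    """Return the unit that owns this user's real-time data."""
    return shard_map[shard_of(user_id)]


def fail_over(shard_map, failed_unit, healthy_units):
    """Spread the failed unit's shards over the remaining units (N+1 style).

    Shards owned by healthy units are left where they are.
    """
    new_map = dict(shard_map)
    moved = [s for s, unit in shard_map.items() if unit == failed_unit]
    for i, shard in enumerate(moved):
        new_map[shard] = healthy_units[i % len(healthy_units)]
    return new_map
```

Because a user always maps to the same shard, most requests stay inside one unit, and a failure only requires remapping the shards that the failed unit owned.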
Key benefits of this design include reduced cross‑unit interaction, support for multi‑active (active‑active) deployments, high availability without single points of failure, and unified traffic control that facilitates online stress testing, flow control, and gray‑release mechanisms.
The logical data‑center framework has been in production since 2013, and by 2015 it supported a true multi‑active architecture in which each geographic IDC serves live traffic and can fail over to another IDC almost immediately.
Building on this foundation, Alipay implements blue‑green (gray) releases by splitting each logical data‑center into A and B sub‑centers, routing traffic probabilistically, and gradually shifting load from blue to green after verification.
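The probabilistic traffic shift described above might look like the following sketch. The function names, the weight schedule, and the `verify` callback are hypothetical; the article does not describe the actual release tooling.

```python
import random


def pick_group(green_weight: float, rng: random.Random) -> str:
    """Send a request to 'green' with probability green_weight, else 'blue'."""
    if not 0.0 <= green_weight <= 1.0:
        raise ValueError("green_weight must be between 0 and 1")
    return "green" if rng.random() < green_weight else "blue"


def rollout(steps, verify, rng, samples=1000):
    """Raise the green weight step by step, verifying after each step.

    `verify(weight, counts)` is a caller-supplied health check (assumed).
    Returns the final green weight, or 0.0 if a step fails and traffic
    is rolled back to blue.
    """
    for weight in steps:
        counts = {"blue": 0, "green": 0}
        for _ in range(samples):
            counts[pick_group(weight, rng)] += 1
        if not verify(weight, counts):
            return 0.0  # roll back: all traffic returns to blue
    return steps[-1]
```

A schedule such as `[0.01, 0.1, 0.5, 1.0]` moves load gradually, so a bad release is exposed to a small share of traffic before it reaches everyone.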
The distributed data architecture processes up to 85,900 transactions per second during Double 11, employing vertical sharding by business type, horizontal sharding by customer request, and read‑write separation with data replication.
Three primary database clusters support the transaction system: a main transaction cluster, a consumption‑record cluster, and a merchant‑query cluster, each horizontally partitioned and equipped with standby and fail‑over nodes for sub‑second recovery.
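A rough sketch of how the three sharding dimensions could combine when locating a database node. The shard counts, the CRC32 hash, the node naming scheme, and the assumption that the horizontal shard key is a user ID are all illustrative.

```python
import zlib

# Vertical split: one cluster per business type (shard counts are assumptions).
CLUSTERS = {
    "trade": 4,               # main transaction cluster
    "consumption_record": 4,  # consumption-record cluster
    "merchant_query": 2,      # merchant-query cluster
}


def locate(business: str, user_id: str, is_write: bool) -> str:
    """Return a hypothetical node name for one request."""
    shard_count = CLUSTERS[business]                            # vertical sharding
    shard = zlib.crc32(user_id.encode("utf-8")) % shard_count   # horizontal sharding
    role = "primary" if is_write else "replica"                 # read-write separation
    return f"{business}-{shard:02d}-{role}"
```

Reads and writes for the same user land on the same shard number; only the role suffix differs, which is what allows replicas to absorb query load.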
To preserve transactional (ACID) guarantees while achieving high availability, Alipay designed a flexible distributed transaction framework based on TCC (Try‑Confirm‑Cancel), an optimized variant of two‑phase commit. The framework shortens the prepare phase, improves fault tolerance, and relies on asynchronous reliable messaging to reach eventual consistency.
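As a minimal illustration of the Try, Confirm, and Cancel phases, consider an in-memory transfer between two accounts. The `Account` class and `transfer` function are hypothetical and leave out persistence, timeouts, and idempotency, all of which a production framework would need.

```python
class Account:
    def __init__(self, balance, is_open=True):
        self.balance = balance
        self.reserved_out = 0  # funds frozen by a Try, not yet debited
        self.reserved_in = 0   # incoming funds announced by a Try
        self.is_open = is_open

    # Try phase: check and reserve resources without changing the balance.
    def try_debit(self, amount):
        if not self.is_open or self.balance - self.reserved_out < amount:
            return False
        self.reserved_out += amount
        return True

    def try_credit(self, amount):
        if not self.is_open:
            return False
        self.reserved_in += amount
        return True

    # Confirm phase: turn the reservation into a real balance change.
    def confirm_debit(self, amount):
        self.reserved_out -= amount
        self.balance -= amount

    def confirm_credit(self, amount):
        self.reserved_in -= amount
        self.balance += amount

    # Cancel phase: release the reservation.
    def cancel_debit(self, amount):
        self.reserved_out -= amount

    def cancel_credit(self, amount):
        self.reserved_in -= amount


def transfer(src, dst, amount):
    """Coordinate one transfer: Try on both sides, then Confirm or Cancel."""
    if not src.try_debit(amount):
        return False
    if not dst.try_credit(amount):
        src.cancel_debit(amount)  # undo the only successful Try
        return False
    src.confirm_debit(amount)
    dst.confirm_credit(amount)
    return True
```

The Try step only reserves funds, so a failure on either side can be undone by releasing the reservation rather than by rolling back a long-held database lock.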
Critical components include asynchronous reliable message strategies, automatic retry and compensation mechanisms, and integration with the financial cloud’s fail‑over capabilities.
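The retry-then-compensate pattern can be sketched as follows. The `deliver` function, its callbacks, and the attempt limit are assumptions for illustration; a real system would also persist the message before sending and back off between attempts.

```python
def deliver(message, send, compensate, max_attempts=3):
    """Try to send a message; compensate if every attempt fails.

    `send` and `compensate` are caller-supplied callables (hypothetical).
    Returns a (status, attempts) tuple.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            send(message)
            return "delivered", attempt
        except Exception:
            # Treat any error as transient for this sketch and retry.
            pass
    compensate(message)  # undo the business action the message belonged to
    return "compensated", max_attempts
```

Because a retry can deliver the same message more than once, consumers in such a design generally need to handle messages idempotently.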
The article also showcases the Ant Credit Pay (蚂蚁花呗) service, which achieved a 99.99% success rate and 0.035 s average latency during Double 11, scaling from 10 TPS at launch to 21,000 TPS on the peak day, thanks to the same cloud‑native, multi‑active infrastructure.
Risk control for Ant Credit Pay runs parallel credit‑risk models within 20 ms, and funding is secured via an asset‑securitization platform that pools loan assets and issues tradable securities, supporting both consumer credit demand and financing for small enterprises.
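Running several risk models in parallel under a fixed time budget could be sketched like this. The scoring convention (higher means riskier), the threshold, and the choice to reject when a model misses the deadline are assumptions; the article only states that the models run in parallel within 20 ms.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def assess(models, request, budget_s=0.020, threshold=0.5):
    """Run all risk models concurrently and decide within the time budget.

    `models` is a non-empty list of callables returning a risk score in
    [0, 1]. A model that raises propagates its exception to the caller.
    """
    pool = ThreadPoolExecutor(max_workers=len(models))
    try:
        futures = [pool.submit(model, request) for model in models]
        done, not_done = wait(futures, timeout=budget_s)
        if not_done:
            return "reject"  # fail closed when any model misses the deadline
        worst = max(f.result() for f in done)
        return "approve" if worst < threshold else "reject"
    finally:
        pool.shutdown(wait=False)  # do not block the caller on slow models
```

With parallel execution the total latency is bounded by the slowest model rather than by the sum of all models, which is what makes a tight budget feasible.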
In conclusion, Alipay’s success stems from three pillars: strategic architecture design (the "plan"), robust middleware and infrastructure (the "tools"), and seasoned engineers (the "troops"), emphasizing that practical implementation and experienced operations outweigh theoretical design alone.
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.
