How Alibaba Scaled Its E‑Commerce Platform: From LAMP to Distributed Architecture
This article traces the technical evolution of Alibaba's e‑commerce platform: from early LAMP‑based monoliths, through Java‑centric 2.0 systems, to a multi‑IDC, service‑oriented distributed architecture built on HSF, Pandora, and database sharding middleware. It highlights the challenges and innovations that made this scale possible.
Part 1: Technical Architecture Evolution
Preface
Alibaba is a Java powerhouse; its e‑commerce platform has grown enormously, and its underlying technology is equally impressive. Interviewers often say that Alibaba's current technical path is the future for many internet companies.
1.1 Alibaba Business Overview
1.3 Middleware Technology Overview
2.1 Architecture Evolution History
1.0 → 2.0: LAMP to monolithic Java for performance.
2.0 → 3.0: Monolith to large‑scale distributed architecture for efficiency.
3.0 → 4.0: Single IDC to multi‑IDC for capacity and stability.
2.2 Early Taobao – 1.0 LAMP Architecture
2.3 Growing Taobao – Basic Java 2.0 Architecture
2.4 Traffic‑Induced Pain Points
2.5 New Architecture
2.6 High Development & Maintenance Cost
As the site grew, the team of roughly 500 engineers faced rising complexity: oversized WAR files, data silos, and limited elasticity that left the system exposed to single points of failure.
2.7 Database Issues
During Double‑11 traffic spikes, connection pools were exhausted, requiring 24‑hour DB monitoring and manual restarts. This drove the third architectural revolution: application splitting (3.0).
Part 2: Distributed Architecture
Preface
With traffic problems exposed, Alibaba engineers prepared a technical revolution to handle the massive Double‑11 load.
1 Application Splitting
1.1 System Specialization
QianDaoHu project: Transaction Center (TC), Category Attribute Center (Forest)
WuCaiShi project: Shop Center (SC), Product Center (IC), Review Center (RC)
New organizational structure supports these services.
1.2 Service Center Teams
User Center (UIC) launched in 2008
Middleware Team
Vertical Product Teams
2 Distributed Architecture Components
2.1 HSF
HSF handles remote calls between application clusters. To the caller, a remote call looks like a local method invocation.
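The idea of a remote call that looks local can be illustrated with a small proxy. This is a minimal sketch, not HSF's actual API: `InventoryService`, `InProcessTransport`, and `RemoteProxy` are hypothetical names, and the "network" is simulated in process with JSON so the example stays self-contained.

```python
import json


class InventoryService:
    """Provider-side implementation (hypothetical service for illustration)."""

    def reserve(self, item_id, quantity):
        return {"item_id": item_id, "reserved": quantity}


class InProcessTransport:
    """Stands in for the network: request bytes in, response bytes out."""

    def __init__(self, provider):
        self._provider = provider

    def send(self, payload: bytes) -> bytes:
        request = json.loads(payload.decode("utf-8"))
        # Look up the requested method on the provider and invoke it.
        method = getattr(self._provider, request["method"])
        result = method(*request["args"])
        return json.dumps({"result": result}).encode("utf-8")


class RemoteProxy:
    """Consumer-side stub: turns a method call into a serialized request."""

    def __init__(self, transport):
        self._transport = transport

    def __getattr__(self, method_name):
        # Only reached for names not found on the proxy itself,
        # i.e. the remote service's methods.
        def call(*args):
            payload = json.dumps({"method": method_name, "args": list(args)})
            response = self._transport.send(payload.encode("utf-8"))
            return json.loads(response.decode("utf-8"))["result"]

        return call


# The caller writes an ordinary method call; serialization and
# transport are hidden behind the proxy.
inventory = RemoteProxy(InProcessTransport(InventoryService()))
reservation = inventory.reserve(42, 3)
```

A real RPC framework would add service discovery, timeouts, and load balancing behind the same proxy boundary. The sketch only shows why the calling code does not need to change.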
2.2 Pandora
Pandora isolates dependencies among middleware components, and between middleware and applications, and provides lifecycle management.
2.3 Data Usage
Over 60,000 production nodes use HSF and Pandora, handling more than 1 trillion requests daily.
3 Database Splitting
3.1 Vertical Splitting
Data is partitioned at the business level (for example, Product Center and User Center), with the resulting databases gradually migrating to MySQL.
3.2 Horizontal Splitting
Data is sharded across different nodes according to fixed rules.
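A fixed sharding rule can be as simple as taking the key modulo the number of nodes. The sketch below assumes integer user IDs and a modulo rule; the source does not say which rules Alibaba uses, and `ShardedStore` is a hypothetical name with in-memory dictionaries standing in for database nodes.

```python
class ShardedStore:
    """Routes each row to one shard using a fixed rule (user_id modulo N)."""

    def __init__(self, shard_count):
        # Each dict stands in for a separate database node.
        self.shards = [dict() for _ in range(shard_count)]

    def shard_index(self, user_id):
        # The fixed rule: the same key always maps to the same shard.
        return user_id % len(self.shards)

    def put(self, user_id, row):
        self.shards[self.shard_index(user_id)][user_id] = row

    def get(self, user_id):
        return self.shards[self.shard_index(user_id)].get(user_id)


store = ShardedStore(shard_count=4)
store.put(10, {"name": "alice"})
store.put(11, {"name": "bob"})
```

With four shards, user 10 lands on shard 2 and user 11 on shard 3. A modulo rule is easy to reason about, but changing the shard count moves most keys, which is one reason resharding needs dedicated migration tooling.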
3.3 Read‑Write Separation
A master‑slave setup routes writes to the master and reads to the slaves, and also provides disaster recovery.
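The routing behind read‑write separation might look like the following sketch. It assumes writes go to a single master and reads rotate across replicas. `Node` and `ReadWriteRouter` are hypothetical names, and replication is simulated by an explicit `replicate()` call so that replication lag is visible.

```python
import itertools


class Node:
    """An in-memory stand-in for one database instance."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class ReadWriteRouter:
    """Sends writes to the master and spreads reads over the replicas."""

    def __init__(self, master, replicas):
        self.master = master
        self.replicas = replicas
        self._next_replica = itertools.cycle(replicas)

    def write(self, key, value):
        self.master.data[key] = value

    def read(self, key):
        # Round-robin across replicas; returns which replica served the read.
        replica = next(self._next_replica)
        return replica.name, replica.data.get(key)

    def replicate(self):
        # Simulated replication: copy the master's state to every replica.
        for replica in self.replicas:
            replica.data = dict(self.master.data)


router = ReadWriteRouter(Node("master"), [Node("replica-1"), Node("replica-2")])
router.write("order:1", "paid")
before = router.read("order:1")  # served by replica-1, not yet replicated
router.replicate()
after = router.read("order:1")   # served by replica-2, now up to date
```

The read before replication returns no value, which illustrates the staleness window that applications must tolerate when reads are served from replicas.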
4 Distributed Databases
4.1 TDDL (CORONA)
TDDL provides horizontal sharding, read‑write separation, and strong consistency.
4.2 Jingwei / Yugong
Jingwei and Yugong provide one‑to‑many data distribution and synchronization, and support smooth scaling of relational databases.
4.3 Statistics
Over 70,000 production nodes use TDDL, handling more than 1 trillion database calls daily; Jingwei processes over 100 billion incremental data items per day.
Jingwei synchronizes transaction data between buyers and sellers.
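One way to picture buyer‑to‑seller synchronization is a change feed with several subscribers. This is a rough sketch under stated assumptions, not Jingwei's actual design: it assumes orders are written to a buyer‑keyed store and that each change is then propagated to a seller‑keyed copy. `ChangeLog`, `write_order`, and `sync_to_seller_view` are hypothetical names.

```python
from collections import defaultdict


class ChangeLog:
    """Delivers every published change to all subscribers (one-to-many)."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, handler):
        self.subscribers.append(handler)

    def publish(self, change):
        for handler in self.subscribers:
            handler(change)


# Primary store keyed by buyer, derived copy keyed by seller (assumed layout).
orders_by_buyer = defaultdict(list)
orders_by_seller = defaultdict(list)


def write_order(log, order):
    # The write goes to the buyer-keyed store, then onto the change feed.
    orders_by_buyer[order["buyer_id"]].append(order)
    log.publish(order)


def sync_to_seller_view(order):
    # A subscriber that maintains the seller-keyed copy.
    orders_by_seller[order["seller_id"]].append(order)


log = ChangeLog()
log.subscribe(sync_to_seller_view)
write_order(log, {"order_id": 1, "buyer_id": 100, "seller_id": 7})
write_order(log, {"order_id": 2, "buyer_id": 101, "seller_id": 7})
```

After these two writes, seller 7 can list both orders without querying every buyer shard. Additional subscribers, such as a search index or a cache, could be attached to the same feed.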