From Single Server to Cloud Native: How Taobao Scaled to Millions of Requests
This article walks through the step-by-step evolution of a high-traffic e-commerce backend, from a single-machine setup through distributed caching, load balancing, database sharding, and microservices to cloud-native deployment, highlighting the key technologies and design principles at each stage.
Overview
The article uses Taobao as a concrete example to illustrate how a service architecture evolves from handling a few hundred requests to supporting tens of millions of concurrent users, summarizing the technologies encountered at each stage and concluding with a set of architectural design principles.
Note: The Taobao example is illustrative only and does not reflect the actual evolution path of Taobao.
Basic Concepts
Distributed system: modules deployed on different servers (e.g., Tomcat and database on separate machines).
High availability: the system continues to serve when some nodes fail.
Cluster: a group of servers providing a unified service, with automatic failover.
Load balancing: evenly distributing requests across multiple nodes.
Forward and reverse proxy: a forward proxy sits in front of internal clients and relays their outbound requests to external networks; a reverse proxy receives inbound requests from outside and forwards them to internal servers.
Architecture Evolution
1. Single‑machine architecture
Initially Tomcat and the database run on the same server. As traffic grows, resource contention makes this setup insufficient.
2. First evolution – Separate Tomcat and database
Tomcat and the database are deployed on different machines, so they no longer compete for the same CPU, memory, and disk, and each performs noticeably better. The new bottleneck becomes database read/write concurrency.
3. Second evolution – Add local and distributed cache
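Introduce a local cache (in the same JVM or on the same server as Tomcat, e.g., memcached) and an external distributed cache (e.g., Redis) to store hot product data and HTML pages, so that most reads are served before they reach the database. Caching brings its own problems: cache consistency, cache penetration (lookups for keys that exist nowhere), cache breakdown (a hot key expiring under heavy load), and cache avalanche (many keys expiring at the same time).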
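The read path described above is commonly implemented as a cache-aside lookup. The following is a minimal sketch, not the article's implementation: TTLCache is a hypothetical in-process stand-in for Redis or memcached, and get_product and load_from_db are illustrative names. It also shows one common mitigation for cache penetration, which is to cache a short-lived marker for keys the database does not contain.

```python
import time


class TTLCache:
    """Tiny in-process cache with per-entry expiry (hypothetical stand-in for Redis)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry has expired: drop it and report a miss.
            del self._data[key]
            return None
        return value

    def set(self, key, value, ttl):
        self._data[key] = (value, self._clock() + ttl)


# Sentinel cached for ids that are absent from the database.
_MISSING = object()


def get_product(product_id, cache, load_from_db, ttl=60.0, missing_ttl=5.0):
    """Cache-aside read: try the cache first, fall back to the database on a miss."""
    cached = cache.get(product_id)
    if cached is _MISSING:
        return None  # known-absent id; skip the database entirely
    if cached is not None:
        return cached
    row = load_from_db(product_id)
    if row is None:
        # Cache the absence briefly so repeated bogus ids do not hit the database.
        cache.set(product_id, _MISSING, missing_ttl)
        return None
    cache.set(product_id, row, ttl)
    return row
```

Adding a small random offset to ttl is a common way to keep many keys from expiring at the same instant, which is the avalanche case; that refinement is left out here for brevity.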
4. Third evolution – Reverse proxy for load balancing
Deploy multiple Tomcat instances behind a reverse‑proxy layer (Nginx or HAProxy). This raises the overall request capacity but shifts the bottleneck back to the database.
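As a rough illustration of what the reverse-proxy layer does when it spreads requests over several Tomcat instances, here is a simplified round-robin selector that skips backends marked as unhealthy. This is a sketch of the idea only and does not reflect how Nginx or HAProxy are implemented; RoundRobinBalancer and the backend names are hypothetical.

```python
class RoundRobinBalancer:
    """Hands out backends in rotation, skipping any that are marked down."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._down = set()
        self._index = 0

    def mark_down(self, backend):
        self._down.add(backend)

    def mark_up(self, backend):
        self._down.discard(backend)

    def next_backend(self):
        # Try each backend at most once per call.
        for _ in range(len(self._backends)):
            backend = self._backends[self._index]
            self._index = (self._index + 1) % len(self._backends)
            if backend not in self._down:
                return backend
        raise RuntimeError("no healthy backend available")
```

Plain rotation assumes the backends are roughly equal in capacity; weighted or least-connections strategies are the usual alternatives when they are not.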
5. Fourth evolution – Database read/write separation
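Split the database into a write master and multiple read replicas, using database middleware such as Mycat to route writes to the master and reads to the replicas. Replication between the master and the replicas raises data synchronization and consistency concerns.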
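The routing decision itself can be sketched in a few lines. The example below is an assumption-laden illustration and not Mycat's actual logic: ReadWriteRouter is a hypothetical class, the connection handles are plain strings, and statements are classified by their first keyword only.

```python
import itertools


class ReadWriteRouter:
    """Sends plain SELECTs to replicas in rotation and everything else to the primary."""

    def __init__(self, primary, replicas):
        self._primary = primary
        # Fall back to the primary when no replicas are configured.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql, in_transaction=False):
        parts = sql.split()
        verb = parts[0].upper() if parts else ""
        # Reads inside a transaction stay on the primary so they see its own writes.
        if in_transaction or verb != "SELECT" or self._replicas is None:
            return self._primary
        return next(self._replicas)
```

Because replicas apply changes with some delay, a read issued right after a write may return stale data; keeping such reads on the primary (as the in_transaction flag does here) is one simple way to handle it.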
6. Fifth evolution – Business‑level database sharding
Store different business data in separate databases to reduce contention. Cross‑business queries require additional solutions.
7. Sixth evolution – Split large tables
Hash‑based or time‑based sharding of large tables (e.g., comments, payment logs) enables horizontal scaling. The approach introduces operational complexity and leads to distributed‑database architectures (MPP).
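To make the two sharding strategies concrete, the sketch below maps a key to a table name. The table-name patterns, the shard count of 16, and the function names are all assumptions made for illustration. A stable hash (MD5 here) is used rather than Python's built-in hash(), because the latter is randomized per process for strings.

```python
import hashlib
from datetime import date


def shard_for(key, shard_count):
    """Map a key to a shard number in [0, shard_count) using a stable hash."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def comment_table(user_id, shard_count=16):
    """Hash-based sharding: all comments of one user land in the same table."""
    return f"comment_{shard_for(user_id, shard_count):02d}"


def payment_log_table(day):
    """Time-based sharding: one payment-log table per calendar month."""
    return f"payment_log_{day:%Y%m}"
```

For example, payment_log_table(date(2024, 3, 9)) yields "payment_log_202403". Note that a plain modulo scheme remaps most keys when shard_count changes, which is part of the operational complexity mentioned above.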
8. Seventh evolution – LVS/F5 for multi‑Nginx load balancing
Use Layer‑4 load balancers (LVS software or F5 hardware) to distribute traffic among multiple Nginx instances, with keepalived providing virtual IP failover.
9. Eighth evolution – DNS round‑robin across data centers
Configure DNS to return multiple IPs, each pointing to a different data‑center’s virtual IP, achieving inter‑data‑center load balancing.
10. Ninth evolution – Introduce NoSQL and search engines
When relational databases can no longer satisfy complex queries or massive data volumes, adopt technologies such as HDFS, HBase, Redis, Elasticsearch, Kylin, or Druid for specific workloads.
11. Tenth evolution – Split monolith into smaller applications
Separate the system by business domains, allowing independent development and deployment. Shared configuration can be managed with Zookeeper.
12. Eleventh evolution – Extract common functionality into microservices
Encapsulate shared services (user management, order, payment, authentication) as independent microservices using frameworks like Dubbo or Spring Cloud, adding service governance, rate limiting, and circuit breaking.
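Circuit breaking, one of the governance features mentioned above, can be illustrated with a small state machine. This is a simplified sketch under stated assumptions, not the API of Dubbo or Spring Cloud: CircuitBreaker and CircuitOpenError are hypothetical names, and the thresholds are arbitrary defaults.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; call rejected")
            # Timeout elapsed: half-open state, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A caller would wrap each remote invocation, for example breaker.call(fetch_order, order_id) where fetch_order is whatever function performs the remote call, and fall back to a default response when CircuitOpenError is raised. Failing fast in this way keeps one unhealthy service from tying up threads in every service that depends on it.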
13. Twelfth evolution – Introduce an Enterprise Service Bus (ESB)
Use an ESB to unify protocol conversion and reduce coupling, effectively implementing a SOA architecture that overlaps with microservices.
14. Thirteenth evolution – Containerization
Package services as Docker images and orchestrate them with Kubernetes, enabling rapid scaling, isolation, and simplified operations.
15. Fourteenth evolution – Move to a cloud platform
Deploy the system on public cloud (IaaS/PaaS/SaaS), leveraging elastic resources, managed services, and on‑demand scaling to handle peak traffic while reducing operational costs.
Architecture Design Summary
N+1 design to avoid single points of failure.
Rollback capability for safe upgrades.
Feature toggles for quick disabling of problematic components.
Built‑in monitoring from the design phase.
Multi‑active data‑center deployment for high availability.
Prefer mature, well‑supported technologies.
Resource isolation to prevent one business from monopolizing resources.
Horizontal scalability as a core requirement.
Purchase non‑core components when development cost is high.
Use commercial‑grade hardware for reliability.
Rapid iteration with small, testable features.
Stateless service interfaces.
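Of the principles above, feature toggles translate most directly into code. The sketch below reads toggles from environment variables; the FEATURE_ prefix, the function names, and the "recommendations" feature are assumptions made for illustration. In practice a configuration center such as Zookeeper could serve the same role so that a toggle can be flipped without a restart.

```python
import os


def feature_enabled(name, default=False, env=os.environ):
    """Return True if the toggle FEATURE_<NAME> is switched on in env."""
    raw = env.get(f"FEATURE_{name.upper()}")
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def render_product_page(product, env=os.environ):
    """Build the page payload, leaving out optional parts that are toggled off."""
    page = {"title": product["title"], "price": product["price"]}
    if feature_enabled("recommendations", default=True, env=env):
        page["recommendations"] = product.get("related", [])
    return page
```

The function keeps no state between calls and receives everything it needs as arguments, which also matches the stateless-interface principle: any instance can serve any request.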
Architects' Tech Alliance
Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.
