
How Taobao Scaled from 100 to Millions of Concurrent Users: A Step‑by‑Step Architecture Evolution

This article uses Taobao as a case study to illustrate how a web service evolves from a single‑machine setup to a cloud‑native, micro‑service architecture capable of handling tens of millions of concurrent requests, detailing each technical milestone and the principles behind the design choices.

21CTO

Jun 3, 2019


1. Overview

This article uses Taobao as an example to describe the evolution of server‑side architecture from a hundred concurrent users to tens of millions, listing the technologies encountered at each stage and summarizing architectural design principles.

2. Basic Concepts

Distributed: The modules of a system are deployed on different servers, e.g., Tomcat and the database run on separate machines.

High Availability: The system keeps serving requests when some of its nodes fail.

Cluster: A group of servers that together provide one unified service, typically with automatic failover when a node goes down.

Load Balancing: Incoming requests are distributed evenly across multiple nodes.

Forward and Reverse Proxy: A forward proxy lets internal systems reach external networks; a reverse proxy receives external requests and forwards them to internal servers.

3. Architecture Evolution

3.1 Single‑Machine Architecture

Initially, Tomcat and the database are deployed on the same server. As user numbers grow, the two compete for the same machine's resources, and a single server can no longer sustain the load.

3.2 First Evolution: Separate Tomcat and Database

Tomcat and the database each occupy dedicated servers, significantly improving performance, but database read/write becomes the bottleneck as traffic increases.

3.3 Second Evolution: Local and Distributed Caching

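Add a local cache on the Tomcat server itself (in the same JVM, or e.g. a memcached instance on the same host) and a distributed cache outside it (e.g., Redis) to hold hot items and rendered HTML pages, so that most reads never reach the database. Caching brings its own problems, including cache consistency, cache penetration, and cache avalanche, each of which has to be handled explicitly.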
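The lookup order described above (local cache, then distributed cache, then database) can be sketched as a cache-aside read. This is a minimal illustration rather than Taobao's implementation: TTLCache stands in for both memcached and Redis, and get_item, jittered, and the MISSING sentinel are hypothetical names. Caching a sentinel for absent keys is one common answer to cache penetration, and randomizing TTLs is one common answer to cache avalanche.

```python
import random
import time


class TTLCache:
    """Tiny in-memory cache with per-entry expiry (stand-in for a real cache)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value, ttl):
        self._data[key] = (value, self._clock() + ttl)


# Sentinel cached for keys that do not exist in the database, so repeated
# lookups of bogus keys stop at the cache (guards against cache penetration).
MISSING = "__missing__"


def jittered(ttl, spread=0.2):
    """Randomize a TTL by +/- spread so entries written together do not all
    expire at the same moment (guards against cache avalanche)."""
    return ttl * (1 + random.uniform(-spread, spread))


def get_item(key, local, shared, load_from_db, ttl=60.0):
    """Cache-aside read: local cache, then shared cache, then the database."""
    for cache in (local, shared):
        hit = cache.get(key)
        if hit is not None:
            if cache is shared:
                # Promote to the local cache with a shorter lifetime.
                local.set(key, hit, jittered(ttl / 2))
            return None if hit == MISSING else hit
    value = load_from_db(key)
    stored = MISSING if value is None else value
    shared.set(key, stored, jittered(ttl))
    local.set(key, stored, jittered(ttl / 2))
    return value
```

The local copy is given a shorter lifetime than the shared one on the assumption that stale data on a single Tomcat node is harder to invalidate than data in the shared cache; consistency on writes (for example, deleting the cached entry after a database update) is not covered by this sketch.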

3.4 Third Evolution: Reverse Proxy Load Balancing

Deploy multiple Tomcat instances behind a reverse proxy (Nginx or HAProxy). This raises the concurrent capacity dramatically, but the database becomes the new bottleneck.
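What the reverse proxy does when it picks a Tomcat instance can be sketched as round-robin selection with a health set. This only illustrates the idea; it is not how Nginx or HAProxy is implemented, and RoundRobinBalancer and the backend names are hypothetical.

```python
import itertools


class RoundRobinBalancer:
    """Hands out backends in turn, skipping any that are marked down."""

    def __init__(self, backends):
        self._backends = list(backends)
        self._healthy = set(self._backends)
        self._cycle = itertools.cycle(self._backends)

    def mark_down(self, backend):
        self._healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self._backends:
            self._healthy.add(backend)

    def next_backend(self):
        # Look at each backend at most once per call.
        for _ in range(len(self._backends)):
            candidate = next(self._cycle)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy backend available")
```

Round-robin only spreads load evenly if the Tomcat instances are stateless or share session data, which is why session state is usually moved into the distributed cache at this stage.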

3.5 Fourth Evolution: Database Read/Write Separation

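Split the database into a single write master and multiple read replicas. Database replication keeps the replicas in sync with the master, while middleware such as Mycat sits in front of the databases and routes writes to the master and reads to the replicas; the same middleware can later handle sharding.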
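The routing decision behind read/write separation can be sketched in a few lines. This is an illustrative stand-in, not Mycat's actual logic: ReadWriteRouter and the node names are hypothetical, and the statement classification is deliberately naive.

```python
import itertools


class ReadWriteRouter:
    """Sends plain SELECTs to read replicas and everything else to the master."""

    def __init__(self, master, replicas):
        self.master = master
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql, in_transaction=False):
        parts = sql.split()
        verb = parts[0].upper() if parts else ""
        # Reads inside a transaction stay on the master so they can see
        # the transaction's own uncommitted writes.
        if verb == "SELECT" and not in_transaction and self._replicas is not None:
            return next(self._replicas)
        return self.master
```

Because replication is usually asynchronous, a read issued right after a write may hit a replica that has not caught up yet. A real router would also need to send locking reads such as SELECT ... FOR UPDATE to the master, which this sketch does not do.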

3.6 Fifth Evolution: Business‑Based Database Sharding

Store different business data in separate databases to reduce contention; high‑traffic services can be allocated more servers.

3.7 Sixth Evolution: Splitting Large Tables

Hash‑based routing splits large tables (e.g., comments, payments) into many smaller tables, enabling horizontal scaling. This leads to a distributed database architecture often implemented with Mycat.
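Hash-based routing amounts to mapping a shard key to one of N physical tables. The sketch below assumes a hypothetical naming scheme (comments_0000, comments_0001, ...) and uses CRC32 as the hash; real middleware may use a different hash and a different scheme.

```python
import zlib


def shard_table(base_name, shard_key, shard_count):
    """Return the physical table that holds the rows for shard_key."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    # Use a stable hash: Python's built-in hash() is randomized per process
    # for strings, which would route the same key to different tables.
    digest = zlib.crc32(str(shard_key).encode("utf-8"))
    return f"{base_name}_{digest % shard_count:04d}"
```

Simple modulo routing means that changing shard_count moves most keys to a different table, so the number of tables is usually chosen generously up front or a consistent-hashing scheme is used instead.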

Open‑source distributed SQL databases such as Greenplum, TiDB, and PostgreSQL‑XC, and commercial ones like GBase, provide SQL‑compatible distributed query execution, many of them on an MPP (massively parallel processing) architecture.

3.8 Seventh Evolution: LVS/F5 for Multi‑Nginx Load Balancing

When Nginx becomes a bottleneck, layer‑4 load balancers like LVS (software) or F5 (hardware) distribute traffic across many Nginx instances, with keepalived providing high availability.

3.9 Eighth Evolution: DNS Round‑Robin Across Data Centers

Configure DNS to return multiple IPs, each pointing to a different data‑center, achieving data‑center‑level load balancing and horizontal scaling to tens of millions of concurrent users.
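DNS round-robin can be pictured as a name server that returns the same set of addresses in a rotated order on each query, so successive clients start with a different data center. The sketch below is a toy model, not a DNS server; RoundRobinZone, the domain, and the addresses (taken from documentation-reserved ranges) are hypothetical.

```python
from collections import deque


class RoundRobinZone:
    """Returns all addresses for a name, rotating their order on every query."""

    def __init__(self, records):
        self._records = {name: deque(ips) for name, ips in records.items()}

    def resolve(self, name):
        ips = self._records[name]
        answer = list(ips)
        ips.rotate(-1)  # the next query starts with the next address
        return answer
```

Clients and resolvers cache answers for the record's TTL, so the balance achieved this way is coarse, and a failed data center keeps receiving traffic until cached answers expire.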

3.10 Ninth Evolution: NoSQL and Search Engines

Introduce HDFS for file storage, HBase/Redis for key‑value data, Elasticsearch for full‑text search, and Kylin/Druid for multidimensional analysis to handle massive data and complex queries.

3.11 Tenth Evolution: Splitting Monolith into Small Applications

Divide the system by business domain so that each application can be deployed and scaled independently; configuration shared between applications can be managed centrally with ZooKeeper.

3.12 Eleventh Evolution: Extracting Reusable Functions as Microservices

Common functionalities (user management, order, payment, authentication) become independent services accessed via HTTP, TCP, or RPC, using frameworks like Dubbo or Spring Cloud for governance.
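One piece of the service governance mentioned above is service registration and discovery: instances announce themselves with heartbeats, and callers look up the live instances of a service. The sketch below shows that idea only; it does not reflect the Dubbo or Spring Cloud APIs, and ServiceRegistry, the service names, and the addresses are hypothetical.

```python
import time


class ServiceRegistry:
    """Tracks service instances and drops those whose heartbeat has lapsed."""

    def __init__(self, ttl=30.0, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._instances = {}  # service name -> {address: last heartbeat time}

    def heartbeat(self, service, address):
        """Register an instance, or refresh an existing registration."""
        self._instances.setdefault(service, {})[address] = self._clock()

    def lookup(self, service):
        """Return the addresses whose last heartbeat is within the TTL."""
        now = self._clock()
        live = {
            address: seen
            for address, seen in self._instances.get(service, {}).items()
            if now - seen < self._ttl
        }
        self._instances[service] = live
        return sorted(live)
```

A caller would combine lookup with a load-balancing policy to choose one instance per request; timeouts, retries, and circuit breaking are further governance concerns that this sketch leaves out.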

3.13 Twelfth Evolution: Enterprise Service Bus (ESB) for Unified Access

An ESB abstracts protocol differences so that applications and services communicate through one uniform interface. This is the SOA (service‑oriented architecture) style, which overlaps with microservices.

3.14 Thirteenth Evolution: Containerization

Docker packages applications into images; Kubernetes orchestrates dynamic deployment, enabling rapid scaling for peak events and isolation of runtime environments.
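How an orchestrator decides how many containers to run during a peak can be sketched with the scaling rule documented for the Kubernetes Horizontal Pod Autoscaler: desired = ceil(current replicas × current metric / target metric). The article does not say which scaling mechanism is used, so treat this as an assumption; the function below is a standalone model with hypothetical parameter names, not a call into any Kubernetes API.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=100, tolerance=0.1):
    """Replica count that brings the per-replica metric back to its target."""
    ratio = current_metric / target_metric
    # Skip scaling when the metric is already close enough to the target,
    # which avoids flapping on small fluctuations.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))
```

For example, 4 replicas averaging 90% CPU against a 60% target scale out to 6, while a reading of 62% falls within the tolerance and leaves the count at 4.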

3.15 Fourteenth Evolution: Cloud Platform Adoption

Deploy the system on public cloud (IaaS, PaaS, SaaS) to leverage elastic resources, reducing hardware costs and simplifying operations.

4. Architecture Design Summary

Architecture adjustments need not follow a fixed order; they should address the most pressing bottlenecks first.

Design depth depends on system goals: meet current performance targets while leaving room for future growth.

Server‑side architecture differs from big‑data architecture, which focuses on data ingestion, storage, and analysis.

Key design principles include N+1 redundancy, rollback capability, feature toggles, monitoring, multi‑active data centers, mature technology adoption, resource isolation, horizontal scalability, buying non‑core components, using commercial hardware, rapid iteration, and stateless services.
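Of the principles listed, feature toggles are the easiest to show in code: a feature is enabled for a percentage of users, and turning the percentage down to zero acts as a fast rollback. This is a minimal sketch under assumed names (FeatureToggles, the feature name); in practice the percentages would come from a shared configuration store rather than process memory.

```python
import zlib


class FeatureToggles:
    """Percentage-based feature switches with a stable per-user decision."""

    def __init__(self, rollout=None):
        self._rollout = dict(rollout or {})  # feature name -> percent (0-100)

    def set_rollout(self, feature, percent):
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self._rollout[feature] = percent

    def is_enabled(self, feature, user_id):
        percent = self._rollout.get(feature, 0)  # unknown features are off
        # Hash feature and user together so each user gets a stable bucket
        # that is independent across features.
        bucket = zlib.crc32(f"{feature}:{user_id}".encode("utf-8")) % 100
        return bucket < percent
```

Setting a feature to 0 disables it for everyone without a redeploy, which is what lets a toggle double as a rollback mechanism.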


