From Single Server to Cloud Native: How Taobao Scaled to Millions of Concurrent Users

This article uses Taobao as a case study to illustrate the step‑by‑step evolution of server‑side architecture from a single‑machine setup to a cloud‑native, highly available system capable of handling tens of millions of concurrent requests, highlighting the technologies and design principles involved at each stage.

Programmer DD

Apr 14, 2020

1. Overview

This article uses Taobao as an example to introduce the evolution of server‑side architecture from hundreds to tens of millions of concurrent users, listing the relevant technologies at each stage and summarizing architectural design principles at the end.

Special note: The example of Taobao is only for illustration and does not represent the actual technical evolution path of Taobao.

2. Basic Concepts

Before discussing architecture, the following basic concepts are introduced for readers who may be unfamiliar with them:

Distributed: Multiple modules of a system are deployed on different servers, e.g., Tomcat and the database on separate machines.

High Availability: When some nodes fail, other nodes take over and the system continues to provide service.

Cluster: A set of servers that together provide one unified service, such as a Zookeeper ensemble with one leader and several followers.

Load Balancing: Distributing incoming requests evenly across multiple nodes so that each node carries a similar share of the load.

Forward and Reverse Proxy: A forward proxy sits in front of internal clients and forwards their requests to external networks; a reverse proxy sits in front of internal servers and forwards external requests to them.

3. Architecture Evolution

3.1 Single‑machine Architecture

In the early stage, Tomcat and the database are deployed on the same server. When a user visits www.taobao.com, DNS resolves the domain name to an IP address, and the browser then connects to Tomcat at that address.

As the number of users grows, Tomcat and the database compete for resources, and a single machine can no longer support the business.

3.2 First Evolution: Separate Tomcat and Database

Tomcat and the database each occupy dedicated servers, significantly improving their individual performance.

With more users, concurrent reads/writes to the database become a bottleneck.

3.3 Second Evolution: Introduce Local and Distributed Cache

A local cache (e.g., memcached) is added on the same server as Tomcat or inside the JVM, and a distributed cache (e.g., Redis) is added externally to hold hot items or HTML pages, greatly reducing database load. Caching brings its own problems that must be handled: cache consistency, cache penetration, cache breakdown, cache avalanche, and the simultaneous expiration of hot data.
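As a minimal sketch of the read path described above, the following Python snippet puts a tiny in-process TTL cache in front of a stand-in database (cache-aside). The names TTLCache, DB, stats, load_from_db and get_item, the TTL values, and the 10% jitter are all assumptions made for illustration; a real deployment would use Redis or a similar store. It caches misses briefly, which is one common guard against penetration, and adds random jitter to expiry times, which is one common guard against avalanche.

```python
import random
import time


class TTLCache:
    """Tiny in-process cache with per-key expiry (a stand-in for Redis)."""

    def __init__(self):
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        """Return (hit, value); expired entries count as misses."""
        entry = self._store.get(key)
        if entry is None:
            return False, None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]
            return False, None
        return True, value

    def set(self, key, value, ttl):
        self._store[key] = (value, time.monotonic() + ttl)


# Hypothetical "database" and a counter of how often it is queried.
DB = {"item:1": "phone", "item:2": "laptop"}
stats = {"db_reads": 0}
cache = TTLCache()


def load_from_db(key):
    stats["db_reads"] += 1
    return DB.get(key)


def get_item(key, base_ttl=60.0, miss_ttl=5.0):
    """Cache-aside read: try the cache first, fall back to the database."""
    hit, value = cache.get(key)
    if hit:
        return value
    value = load_from_db(key)
    if value is None:
        # Cache the miss briefly so repeated lookups of a nonexistent key
        # do not all reach the database (penetration).
        cache.set(key, None, miss_ttl)
    else:
        # Jitter the TTL so many keys do not expire at the same instant
        # (avalanche).
        cache.set(key, value, base_ttl + random.uniform(0, base_ttl * 0.1))
    return value
```

Calling get_item("item:1") twice queries the stand-in database only once; the second call is served from the cache. Cache consistency on writes is not covered by this sketch.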

Cache handles most requests, but Tomcat becomes the new bottleneck as traffic continues to grow.

3.4 Third Evolution: Reverse Proxy for Load Balancing

Multiple Tomcat instances are deployed, and reverse-proxy software (Nginx) distributes requests evenly among them. Assuming each Tomcat handles at most 100 concurrent connections and Nginx at most 50 000, Nginx can in theory spread requests across 500 Tomcat instances (50 000 / 100), so the system can support 50 000 concurrent users.
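The capacity arithmetic and the even distribution can be sketched in a few lines of Python. This is an illustration only, not how Nginx is implemented; tomcats_needed, RoundRobinBalancer and the backend names are hypothetical.

```python
import itertools
import math


def tomcats_needed(target_concurrency, per_tomcat=100):
    """How many backends are needed if each handles `per_tomcat` connections."""
    return math.ceil(target_concurrency / per_tomcat)


class RoundRobinBalancer:
    """Hands out backends in a fixed rotating order."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(list(backends))

    def pick(self):
        return next(self._cycle)


# With the figures assumed in the text: 50 000 / 100 = 500 Tomcat instances.
required = tomcats_needed(50_000, per_tomcat=100)
```

Round-robin is only one possible policy; weighted or least-connections strategies follow the same interface with a different pick().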

Reverse proxy greatly increases application concurrency, but the database becomes the new bottleneck.

3.5 Fourth Evolution: Database Read/Write Separation

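The database is split into a write master and multiple read replicas, and the master's changes are replicated to the replicas. Database middleware such as Mycat sits between the application and the databases, routing writes to the master and reads to the replicas. Where a query must see the most recently written data, an extra copy can be written to the cache and read from there.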
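The routing decision such middleware makes can be sketched as follows. This is a deliberately simplified Python illustration, not Mycat's actual logic: ReadWriteRouter and the node names are hypothetical, and statements are classified only by their first keyword, so cases such as SELECT ... FOR UPDATE or reads inside a transaction are not handled.

```python
class ReadWriteRouter:
    """Sends SELECT statements to replicas and everything else to the master."""

    def __init__(self, master, replicas):
        self.master = master
        self.replicas = list(replicas)
        self._next = 0  # index of the replica to use for the next read

    def route(self, sql):
        parts = sql.split()
        verb = parts[0].upper() if parts else ""
        if verb == "SELECT" and self.replicas:
            # Rotate through the replicas to spread the read load.
            target = self.replicas[self._next % len(self.replicas)]
            self._next += 1
            return target
        # Writes (and anything unrecognized) go to the master.
        return self.master
```

Falling back to the master for anything that is not clearly a read is the conservative choice: a misrouted write would be lost or rejected, while a misrouted read only costs some master capacity.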

Different business traffic competes for the database, affecting overall performance.

3.6 Fifth Evolution: Sharding Databases per Business

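Each business's data is moved into its own database, and large tables are split into smaller ones, with rows routed by a hash of a key such as product ID, or by time, such as one table per hour. Middleware such as Mycat supports this routing along with access control. The approach makes operations harder and demands more of the DBA.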
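Both routing rules can be sketched in Python. The table names comment_N and payment_YYYYMMDDHH, the shard count of 4, and the function names are assumptions for illustration. CRC32 is used here because it gives the same result in every process, unlike Python's built-in hash() for strings.

```python
import zlib
from datetime import datetime


def shard_for_product(product_id, shard_count=4):
    """Map a product ID to a shard number in [0, shard_count)."""
    digest = zlib.crc32(str(product_id).encode("utf-8"))
    return digest % shard_count


def comment_table(product_id, shard_count=4):
    """Hash routing: all rows for one product land in the same small table."""
    return f"comment_{shard_for_product(product_id, shard_count)}"


def payment_table(ts):
    """Time routing: one table per hour, named after the hour it covers."""
    return f"payment_{ts:%Y%m%d%H}"
```

For example, payment_table(datetime(2020, 4, 14, 9, 30)) yields "payment_2020041409". Note that changing shard_count remaps most keys under simple modulo routing, which is one reason resharding is operationally expensive.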

Such sharding leads to a distributed database architecture, often implemented as MPP (massively parallel processing) systems. Open‑source examples include Greenplum, TiDB, PostgreSQL‑XC, HAWQ; commercial examples include GBase, Snowflake‑DB, Huawei LibrA.

Both the database and Tomcat can scale horizontally, but Nginx eventually becomes the bottleneck.

3.7 Sixth Evolution: LVS or F5 for Multi‑Nginx Load Balancing

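LVS (software, running in the operating-system kernel) and F5 (hardware) provide layer-4 load balancing in front of multiple Nginx instances. Because they forward at the transport layer, they support more protocols than HTTP-level reverse proxying and deliver much higher throughput. High availability is achieved with keepalived and a virtual IP: if the active LVS node fails, the virtual IP moves to a standby node.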

In practice, groups of Nginx may serve subsets of Tomcat instances, and keepalived ensures failover.

When concurrency reaches hundreds of thousands, LVS becomes a bottleneck, and geographic latency emerges.

3.8 Seventh Evolution: DNS Round‑Robin for Inter‑Datacenter Load Balancing

DNS is configured so that one domain name resolves to multiple IP addresses, each a virtual IP in a different data center. DNS round-robin (or another policy) directs users to different data centers, enabling horizontal scaling at the data-center level.
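A toy model of round-robin resolution, in Python, shows how successive lookups are spread across data centers. RoundRobinDNS is a hypothetical class, and the addresses come from ranges reserved for documentation; real DNS servers and resolver caches behave in more complicated ways.

```python
from collections import deque


class RoundRobinDNS:
    """Returns all addresses for a name, rotating their order on every query."""

    def __init__(self, records):
        self._records = {name: deque(ips) for name, ips in records.items()}

    def resolve(self, name):
        ips = self._records[name]
        answer = list(ips)
        # Rotate so the next query sees a different address first.
        ips.rotate(-1)
        return answer


dns = RoundRobinDNS({
    # One virtual IP per data center (documentation-only addresses).
    "www.example.com": ["192.0.2.10", "198.51.100.10", "203.0.113.10"],
})
```

Clients typically connect to the first address in the answer, so rotating the order spreads new connections across the data centers over time.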

As data and business become richer, a pure database can no longer satisfy all requirements.

3.9 Eighth Evolution: Introduce NoSQL and Search Engines

When data volume grows, traditional databases struggle with complex queries. Solutions include HDFS for file storage, HBase/Redis for key‑value, ElasticSearch for full‑text search, and Kylin/Druid for multidimensional analysis.

Adding these components meets richer requirements, but it also increases system complexity, raises data-consistency and operational challenges, and makes application upgrades harder.

3.10 Ninth Evolution: Split Large Application into Smaller Services

Code is divided by business domain, making each service clearer and independently upgradable. Shared configuration can be managed with Zookeeper.

Modules shared by several applications end up duplicated in each one, so upgrading common code means upgrading every application that contains it.

3.11 Tenth Evolution: Extract Reusable Functions as Microservices

Functions such as user management, order, payment, and authentication are extracted into independent services using frameworks like Dubbo or Spring Cloud, enabling separate governance, rate limiting, circuit breaking, and degradation.
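Of the governance features listed, circuit breaking is the easiest to show in isolation. The following is a minimal Python sketch of the general pattern, not the Dubbo or Spring Cloud implementation; CircuitBreaker, CircuitOpenError, and the thresholds are hypothetical. The clock is injectable so the behavior can be exercised without waiting.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is rejected immediately."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: allow one trial call (half-open state).
            # A single further failure re-opens the circuit.
            self._opened_at = None
            self._failures = self.max_failures - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()
            raise
        # Any success resets the failure count.
        self._failures = 0
        return result
```

A caller can catch CircuitOpenError and return a cached or default response instead, which is one simple form of the degradation mentioned above.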

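Different services expose their interfaces in different ways, so application code must adapt to several access methods, and services that call one another become tangled and tightly coupled.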

3.12 Eleventh Evolution: Introduce ESB to Hide Interface Differences

Enterprise Service Bus (ESB) performs protocol conversion, allowing applications and services to communicate uniformly, similar to SOA architecture.
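The protocol-conversion idea can be illustrated with a very small Python sketch in which payloads arriving as JSON or XML are decoded into one common dictionary form before reaching a service. ServiceBus, the decoder functions, and the create_order handler are hypothetical; a real ESB also handles routing, transformation rules, security, and monitoring.

```python
import json
import xml.etree.ElementTree as ET


def from_json(payload):
    """Decode a JSON object into a dict."""
    return json.loads(payload)


def from_xml(payload):
    """Decode a flat XML document into a dict of tag -> text."""
    root = ET.fromstring(payload)
    return {child.tag: child.text for child in root}


class ServiceBus:
    """Converts incoming payloads to a common format, then calls the service."""

    def __init__(self):
        self._decoders = {}
        self._services = {}

    def register_protocol(self, name, decoder):
        self._decoders[name] = decoder

    def register_service(self, name, handler):
        self._services[name] = handler

    def send(self, service, protocol, payload):
        message = self._decoders[protocol](payload)
        return self._services[service](message)


def create_order(message):
    # The service sees only the common dict form, whatever the caller sent.
    return f"order for {message['user']}: {message['item']}"


bus = ServiceBus()
bus.register_protocol("json", from_json)
bus.register_protocol("xml", from_xml)
bus.register_service("create_order", create_order)
```

The service is written once against the common format; supporting another protocol means registering one more decoder rather than changing every service.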

Growing services increase deployment complexity and operational difficulty.

3.13 Twelfth Evolution: Containerization for Isolation and Dynamic Management

Docker packages applications into images, and Kubernetes (K8s) orchestrates their deployment, enabling rapid scaling and simplified operations.
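To make "rapid scaling" concrete, the sketch below computes a replica count from a load metric using the ratio rule documented for the Kubernetes Horizontal Pod Autoscaler: desired = ceil(current replicas × current metric / target metric). It is written in Python for illustration; the function name and the min/max bounds are assumptions, and the real autoscaler adds tolerances and stabilization windows that are omitted here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Scale the replica count in proportion to metric / target, within bounds."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))
```

For example, 4 replicas averaging 90% CPU against a 60% target give ceil(4 × 90 / 60) = 6 replicas; when load drops, the same rule scales the service back down to the configured minimum.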

Containers solve scaling but still require owned hardware, leading to low resource utilization outside peak periods.

3.14 Thirteenth Evolution: Move System to Cloud Platform

The system can be deployed on public cloud, leveraging on‑demand resources for peak traffic and releasing them afterward, reducing cost and operational burden.

IaaS Infrastructure as a Service – unified hardware resources.

PaaS Platform as a Service – common technology components.

SaaS Software as a Service – ready‑made applications.

The discussion intentionally omits cross‑datacenter synchronization, distributed transactions, and other practical challenges.

4. Architecture Design Summary

Must architecture changes follow the path above? Not necessarily. The path addresses one bottleneck at a time, while real scenarios may require solving several problems simultaneously.

How far should the architecture be designed? Far enough to meet current performance goals while leaving room for future growth.

How does server-side architecture differ from big-data architecture? Server-side architecture concerns how applications are organized, while big-data architecture provides the underlying storage, processing, and analysis capabilities.

Design principles include:

N+1 design – no single point of failure.

Rollback design – ability to revert versions.

Feature toggle design – quickly disable problematic features.

Monitoring design – plan observability from the start.

Active‑active data center design for high availability.

Use mature technology to avoid hidden bugs.

Resource isolation – prevent one business from monopolizing resources.

Horizontal scalability – ensure the system can scale out.

Buy non‑core components when appropriate.

Use commercial hardware for reliability.

Rapid iteration – develop small features quickly for early validation.

Stateless design – services should not rely on previous request state.
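Of the principles above, the feature toggle is the simplest to show in code. The following Python sketch is an illustration only: FeatureToggles, the flag name new_discount_engine, and the 10% discount are hypothetical, and a real system would load flags from a shared configuration service so they can be changed without redeploying.

```python
class FeatureToggles:
    """In-memory flag store; unknown flags are treated as disabled."""

    def __init__(self, defaults=None):
        self._flags = dict(defaults or {})

    def is_enabled(self, name):
        return self._flags.get(name, False)

    def set(self, name, enabled):
        self._flags[name] = bool(enabled)


def checkout(cart_total, toggles):
    """Use the new code path only while its flag is on."""
    if toggles.is_enabled("new_discount_engine"):
        return round(cart_total * 0.9, 2)  # new behavior: 10% discount
    return cart_total  # old, known-good behavior
```

If the new path misbehaves in production, switching the flag off restores the old behavior immediately, without a rollback or redeployment.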

Written by

Programmer DD

A tinkering programmer and author of "Spring Cloud Microservices in Action"
