From Single Server to Cloud‑Native: Taobao’s 14‑Step Architecture Evolution
This article uses Taobao as a case study to trace the evolution of its server‑side architecture from a single‑machine setup to a cloud‑native, micro‑service ecosystem, detailing each scaling milestone, the technologies involved, and the design principles that guide high‑availability, high‑concurrency systems.
Overview
The article uses Taobao as an example to illustrate the evolution of server‑side architecture from handling a hundred concurrent users to tens of millions, listing the technologies encountered at each stage and summarizing key architectural design principles.
Basic Concepts
Distributed: multiple modules are deployed on different servers, e.g., Tomcat and the database on separate machines.
High availability: the system continues to provide service when some nodes fail.
Cluster: a group of servers exposed as a single service, with automatic failover when a node goes down.
Load balancing: requests are distributed evenly across multiple nodes.
Forward and reverse proxy: a forward proxy accesses external networks on behalf of internal clients; a reverse proxy forwards external requests to internal servers.
Architecture Evolution
3.1 Single‑Machine Architecture
Initially, Tomcat and the database run on the same server. Users access www.taobao.com, which resolves to an IP and reaches the Tomcat instance.
As user count grows, Tomcat and the database compete for resources, and a single machine cannot sustain the load.
3.2 First Evolution: Separate Tomcat and Database
Tomcat and the database are deployed on separate servers, significantly improving performance of each component.
Database read/write becomes the bottleneck as user numbers increase.
3.3 Second Evolution: Introduce Local and Distributed Caches
A local cache (e.g., memcached on the application server) and an external distributed cache (Redis) store hot product data and rendered HTML, sharply reducing database load. This stage also introduces the classic cache problems: consistency between cache and database, cache penetration, and cache avalanche.
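The cache-aside read path described here, together with the standard mitigations for penetration and avalanche, can be sketched as follows. This is a minimal illustration: the in-memory dict stands in for Redis, and the product data, key names, and TTL values are all hypothetical.

```python
import random
import time

# In-memory stand-in for a distributed cache such as Redis.
# Entries are (value, expiry_timestamp) pairs.
cache = {}

def db_lookup(product_id):
    # Stand-in for the real database query (illustrative data).
    products = {"p1": "Phone", "p2": "Laptop"}
    return products.get(product_id)

def get_product(product_id):
    """Cache-aside read with two common safeguards:
    - penetration: keys missing from the database are cached too
      (as None), so repeated lookups don't hammer the database;
    - avalanche: TTLs get random jitter so entries don't all
      expire at the same moment.
    """
    entry = cache.get(product_id)
    if entry is not None:
        value, expires_at = entry
        if time.time() < expires_at:
            return value  # may be the None sentinel for a known-missing key
    value = db_lookup(product_id)
    ttl = 300 + random.uniform(0, 60)  # jittered TTL against avalanche
    cache[product_id] = (value, time.time() + ttl)
    return value
```

Cache consistency is the part this sketch does not solve: on writes, the usual choice is to update the database first and then invalidate (not update) the cached entry.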
3.4 Third Evolution: Reverse Proxy for Load Balancing
Multiple Tomcat instances are deployed behind a reverse proxy such as Nginx, which distributes requests evenly among them. Typical choices are Nginx and HAProxy; once a request can land on any instance, session sharing and file uploads need special handling.
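A minimal reverse-proxy configuration for this stage might look like the following sketch. The upstream name, addresses, and domain are illustrative; a real setup would also address session sharing, e.g. via sticky sessions or an external session store.

```nginx
# Two Tomcat instances behind one Nginx entry point.
upstream tomcat_pool {
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
}

server {
    listen 80;
    server_name www.taobao.com;

    location / {
        proxy_pass http://tomcat_pool;   # round-robin by default
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```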
Reverse proxy greatly increases application concurrency, but the database becomes the next bottleneck.
3.5 Fourth Evolution: Database Read‑Write Separation
Separate read and write databases; read replicas synchronize from the primary. Middleware like Mycat manages read/write splitting and sharding, handling data consistency.
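The routing decision that middleware like Mycat performs transparently can be sketched in a few lines. Connection objects are plain strings here, and the routing rule is deliberately crude; a real router would also keep transactions pinned to the primary and account for replication lag.

```python
import itertools

class ReadWriteRouter:
    """Minimal sketch of client-side read/write splitting.
    Connection names are hypothetical placeholders."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = itertools.cycle(replicas)  # round-robin over read replicas

    def route(self, sql):
        # Reads go to a replica; writes (and anything ambiguous) go to the primary.
        if sql.lstrip().lower().startswith("select"):
            return next(self.replicas)
        return self.primary

router = ReadWriteRouter(primary="primary-db", replicas=["replica-1", "replica-2"])
```

The design point is that the application keeps issuing ordinary SQL while the routing layer decides which node serves it, which is exactly what makes read scaling transparent.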
Different business modules compete for database resources, affecting performance.
3.6 Fifth Evolution: Business‑Level Sharding
Data for different business domains is stored in separate databases, reducing resource contention. Large tables are split into smaller ones, enabling horizontal scaling.
Single‑machine write databases eventually hit performance limits.
3.7 Sixth Evolution: Split Large Tables
Large tables are partitioned (e.g., by hash or by time) into many small tables, allowing parallel processing. MPP databases such as Greenplum, TiDB, and Postgres-XC provide distributed query execution.
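Hash-based routing from a row key to one of the split tables can be sketched as follows; the table names and shard count are illustrative.

```python
import hashlib

NUM_SHARDS = 4  # illustrative: orders_0 .. orders_3

def shard_for(user_id: str) -> str:
    """Map a row key to its shard table by hashing.
    The same key always lands on the same table, so lookups
    for one user never need to scan the other shards."""
    digest = hashlib.md5(user_id.encode()).hexdigest()
    return f"orders_{int(digest, 16) % NUM_SHARDS}"
```

The trade-off this introduces is that queries which don't carry the shard key (e.g., range scans across all users) must fan out to every shard, which is the workload MPP engines are built to parallelize.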
Both Tomcat and database can scale horizontally, but Nginx eventually becomes the bottleneck.
3.8 Seventh Evolution: LVS/F5 for Multi‑Nginx Load Balancing
LVS (software) or F5 (hardware) load balancers operate at layer 4, distributing traffic among multiple Nginx instances for higher concurrency, with keepalived providing failover between load-balancer nodes for high availability.
When concurrency reaches hundreds of thousands, LVS itself becomes a bottleneck, and geographic latency becomes noticeable.
3.9 Eighth Evolution: DNS Round‑Robin Across Data Centers
DNS is configured with multiple IPs pointing to different data‑center virtual IPs, enabling load balancing at the data‑center level and supporting million‑plus concurrent users by adding more sites.
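The effect of DNS round-robin can be sketched as below. The VIPs are documentation addresses and the resolver is a toy: in reality the rotation happens on the authoritative name server, and resolver caching means clients see a sticky answer for the TTL of the record.

```python
import itertools

# Hypothetical virtual IPs, one per data center.
datacenter_vips = ["203.0.113.10", "198.51.100.10"]
rotation = itertools.cycle(datacenter_vips)

def resolve(hostname: str) -> str:
    """Toy resolver: each resolution hands out the next
    data-center VIP in the rotation, so traffic spreads
    across sites without any shared load balancer."""
    return next(rotation)
```

Adding capacity at this level means adding another data center and appending its VIP to the DNS record, which is why this stage scales past what a single LVS/F5 pair can carry.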
Database alone cannot satisfy increasingly rich analytical and search requirements.
3.10 Ninth Evolution: Introduce NoSQL and Search Engines
When relational databases cannot handle massive data or complex queries, solutions such as HDFS, HBase, Redis, Elasticsearch, Kylin, and Druid are adopted for storage, key‑value access, full‑text search, and multidimensional analysis.
Adding more components increases system complexity and operational overhead.
3.11 Tenth Evolution: Split Monolithic Application into Smaller Apps
Business modules are separated into independent applications, each with clear responsibilities, while shared configurations are managed via Zookeeper.
Shared modules duplicated across apps make coordinated upgrades difficult.
3.12 Eleventh Evolution: Extract Common Functions as Micro‑services
Common functionalities (user management, order, payment, authentication) are packaged as independent services accessed via HTTP, TCP, or RPC, using frameworks like Dubbo or Spring Cloud for governance.
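The core idea of this step, that many applications call one shared implementation through a named interface instead of duplicating it, can be shown with a toy in-process registry. Real systems would use Dubbo or Spring Cloud over RPC/HTTP; the service name and handler here are illustrative.

```python
# Toy service registry: applications call shared functions by name
# rather than embedding a copy of the logic in each codebase.
services = {}

def register(name, handler):
    """Publish a shared capability under a service name."""
    services[name] = handler

def call(name, **kwargs):
    """Invoke a service by name; in a real framework this would
    be a remote call routed through service discovery."""
    return services[name](**kwargs)

# The user-management team registers its service once...
register("user.get", lambda user_id: {"id": user_id, "name": "demo"})

# ...and any application can consume it without owning the code.
profile = call("user.get", user_id=7)
```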
Different service interfaces increase integration complexity and create tangled call chains.
3.13 Twelfth Evolution: Enterprise Service Bus (ESB) for Unified Access
The ESB provides protocol conversion and unified access between services, decoupling them from one another and forming an SOA architecture whose ideas overlap with, but are not identical to, those of microservices.
Growing number of services and applications makes deployment and scaling increasingly challenging.
3.14 Thirteenth Evolution: Containerization
Docker packages applications/services as images; Kubernetes orchestrates dynamic deployment, scaling, and resource isolation, simplifying operations especially during traffic spikes.
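A Kubernetes manifest of the kind this stage relies on might look like the following sketch; the service name, image path, replica count, and resource limits are all hypothetical.

```yaml
# Illustrative Deployment for one containerized service.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
spec:
  replicas: 3                 # raised before traffic spikes, lowered after
  selector:
    matchLabels:
      app: order-service
  template:
    metadata:
      labels:
        app: order-service
    spec:
      containers:
      - name: order-service
        image: registry.example.com/order-service:1.0
        resources:
          limits:             # resource isolation between co-located services
            cpu: "1"
            memory: 512Mi
```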
Containers solve dynamic scaling but still require on‑premise hardware, leading to low utilization outside peak periods.
3.15 Fourteenth Evolution: Cloud Platform Adoption
Deploy the system on public cloud (IaaS, PaaS, SaaS) to leverage elastic resources, pay‑as‑you‑go, and reduce operational costs.
IaaS : Infrastructure as a Service – raw compute, storage, network.
PaaS : Platform as a Service – ready‑to‑use middleware and frameworks.
SaaS : Software as a Service – complete applications delivered on demand.
Architecture Design Summary
Design should be guided by principles such as N+1 redundancy, rollback capability, feature toggles, built-in monitoring, multi-site active-active deployment, preferring mature technology, resource isolation, horizontal scalability, buying rather than building non-core components, using commodity hardware, rapid iteration, and stateless service design.
Architecture Talk
Rooted in the "Dao" of architecture, we provide pragmatic, implementation‑focused architecture content.