From Single Server to Cloud Native: How Taobao Scaled to Millions of Users

This article uses Taobao as a case study to trace the evolution of its server‑side architecture from a single‑machine setup to a cloud‑native, micro‑service ecosystem, detailing each technical milestone, the challenges addressed, and the core design principles for high‑concurrency systems.

Java Interview Crash Guide

Overview

This article uses Taobao as a case study to illustrate the evolution of server‑side architecture from a few concurrent users to tens of millions, listing the technologies encountered at each stage and summarizing architectural design principles.

Basic Concepts

Distributed: Multiple modules of a system are deployed on different servers, e.g., Tomcat and the database on separate machines.

High Availability: When some nodes fail, other nodes take over and continue providing service.

Cluster: A set of software instances running on multiple servers that together provide one service; clients can connect to any node, and failed nodes are automatically replaced.

Load Balancing: Distributing incoming requests evenly across multiple nodes so that each handles a similar load.

Forward Proxy and Reverse Proxy: A forward proxy forwards requests from internal clients to external networks, while a reverse proxy receives external requests and forwards them to internal servers.

Architecture Evolution

3.1 Single‑machine Architecture

At the very beginning, Tomcat and the database run on the same server. When a user visits www.taobao.com, DNS resolves the domain to an IP address and the browser connects directly to the Tomcat instance at that address.

As traffic grows, Tomcat and the database compete for resources, and a single machine can no longer sustain the load.

3.2 First Evolution: Separate Tomcat and Database

Tomcat and the database are deployed on separate servers, dramatically improving the performance of each.

Increasing traffic makes database read/write the new bottleneck.

3.3 Second Evolution: Introduce Local and Distributed Caches

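A local cache is added on the Tomcat server or inside the same JVM, and a distributed cache (e.g., Redis) is added externally. Both store hot product data or rendered HTML pages, intercepting most requests before they hit the database.

Technologies include an in-process cache or a Memcached instance on the same host for the local cache (Memcached itself is an out-of-process cache server, not an in-JVM library) and Redis for the distributed cache. Issues to handle include cache consistency, cache penetration, cache breakdown, cache avalanche, and the simultaneous expiration of hot data.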
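The lookup order described above (local cache, then distributed cache, then database) can be sketched as follows. This is a minimal illustration under stated assumptions, not Taobao's implementation: `TwoLevelCache` is a hypothetical class, a plain `Map` stands in for a Redis client, and a `Function` stands in for the database query. Missing rows are cached as empty values, which is one common way to blunt cache penetration.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical two-level read path: local (in-JVM) cache -> distributed cache -> database.
class TwoLevelCache {
    private final Map<String, Optional<String>> local = new ConcurrentHashMap<>();
    private final Map<String, Optional<String>> distributed; // stand-in for a Redis client
    private final Function<String, String> database;         // stand-in for a DB query; returns null if the row is missing

    TwoLevelCache(Map<String, Optional<String>> distributed, Function<String, String> database) {
        this.distributed = distributed;
        this.database = database;
    }

    String get(String key) {
        Optional<String> hit = local.get(key);
        if (hit == null) {
            hit = distributed.get(key);
            if (hit == null) {
                // Miss in both caches: query the database and cache the result,
                // including "not found", so repeated lookups of a missing key
                // do not reach the database again.
                hit = Optional.ofNullable(database.apply(key));
                distributed.put(key, hit);
            }
            local.put(key, hit);
        }
        return hit.orElse(null);
    }

    // Call after a write so this node does not keep serving the old value.
    void invalidate(String key) {
        local.remove(key);
        distributed.remove(key);
    }
}
```

The sketch omits expiry times and only invalidates the local cache of the current JVM; other application nodes would keep their local copies until they expire, which is exactly the consistency problem the text mentions.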

The cache absorbs most of the traffic, but as the number of users rises, the single Tomcat instance becomes the next bottleneck.

3.4 Third Evolution: Reverse Proxy for Load Balancing

Multiple Tomcat instances are deployed, and a reverse proxy such as Nginx or HAProxy distributes requests among them. Assuming each Tomcat instance handles 100 concurrent connections and one Nginx instance handles 50,000, a single Nginx can in theory front 500 Tomcat instances and support 50,000 concurrent users.
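As a rough illustration of what the reverse proxy does, the sketch below rotates through a list of backends and reproduces the capacity arithmetic from the paragraph above. `RoundRobinBalancer` and the backend names are hypothetical; Nginx and HAProxy implement this internally and offer other strategies as well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical round-robin selection over a fixed list of Tomcat backends.
class RoundRobinBalancer {
    private final List<String> backends;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinBalancer(List<String> backends) {
        if (backends.isEmpty()) {
            throw new IllegalArgumentException("at least one backend is required");
        }
        this.backends = new ArrayList<>(backends);
    }

    // Returns the backend for the next request; floorMod keeps the index
    // valid even after the counter overflows into negative values.
    String pick() {
        int index = Math.floorMod(next.getAndIncrement(), backends.size());
        return backends.get(index);
    }

    // Number of instances needed for a target concurrency (ceiling division).
    static int instancesNeeded(int targetConcurrency, int perInstanceConcurrency) {
        return (targetConcurrency + perInstanceConcurrency - 1) / perInstanceConcurrency;
    }
}
```

With the figures assumed in the text, `RoundRobinBalancer.instancesNeeded(50000, 100)` gives 500 Tomcat instances.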

Technologies: Nginx, HAProxy, session sharing, file upload/download.

The reverse proxy greatly increases the concurrency the application tier can handle, but more requests now reach the database, which becomes the new bottleneck.

3.5 Fourth Evolution: Database Read/Write Separation

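The database is split into a write master and multiple read replicas, and database middleware such as Mycat can route statements to the right node. Writes go to the master and reads are served by the replicas. Because replicas can lag behind the master, reads that must see just-written data can be served from a copy written to the cache at the same time.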
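The routing rule can be sketched in a few lines. This is a deliberately naive illustration, not how Mycat works internally: `ReadWriteRouter` and the data source names are hypothetical, and the statement type is guessed from the SQL text, whereas real middleware parses the statement and also accounts for transactions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical router: writes go to the master, plain reads rotate over replicas.
class ReadWriteRouter {
    private final String master;
    private final List<String> replicas;
    private final AtomicInteger next = new AtomicInteger();

    ReadWriteRouter(String master, List<String> replicas) {
        this.master = master;
        this.replicas = new ArrayList<>(replicas);
    }

    // Returns the name of the data source that should execute the statement.
    String route(String sql) {
        String text = sql.trim().toLowerCase(Locale.ROOT);
        // A locking read must run on the master, like any write.
        boolean isRead = text.startsWith("select") && !text.endsWith("for update");
        if (!isRead || replicas.isEmpty()) {
            return master;
        }
        return replicas.get(Math.floorMod(next.getAndIncrement(), replicas.size()));
    }
}
```

Falling back to the master when no replica is configured keeps the router usable during a replica outage, at the cost of extra load on the master.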

Different business modules compete for database resources, affecting performance.

3.6 Fifth Evolution: Business‑Level Sharding

Data for each business domain is stored in its own database, reducing contention between domains. High-traffic domains can be given more servers independently, though queries that span several domains can no longer be answered by a single database and must be solved by other means.

The write master eventually hits performance limits as traffic grows.

3.7 Sixth Evolution: Split Large Tables into Small Tables

Large tables (e.g., comments, payment records) are partitioned by hash or time, routing rows to many small tables, enabling horizontal scaling.
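Both routing styles can be sketched briefly. The table naming scheme, the choice of product ID as the hash key, and the monthly granularity are assumptions made for illustration; `ShardRouter` is a hypothetical helper, not part of any named middleware.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper that maps a row to the small table that stores it.
class ShardRouter {
    private static final DateTimeFormatter MONTH = DateTimeFormatter.ofPattern("yyyyMM");

    // Hash-style routing: comments are spread over tableCount tables by product ID.
    // floorMod keeps the index non-negative even for negative IDs.
    static String commentTable(long productId, int tableCount) {
        long index = Math.floorMod(productId, (long) tableCount);
        return "comment_" + index;
    }

    // Time-style routing: payment records go to one table per calendar month.
    static String paymentTable(LocalDate paidOn) {
        return "payment_" + paidOn.format(MONTH);
    }
}
```

Under these assumptions, `ShardRouter.commentTable(42, 16)` yields `comment_10`, and a payment made on 2024-11-11 lands in `payment_202411`. Plain modulo routing makes changing the table count expensive, because most rows would map to a different table afterward.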

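This approach raises the DBA's workload, since many small tables must be created, routed to, and maintained. Taken to its conclusion, it amounts to a distributed database in the MPP (massively parallel processing) style. Open-source solutions of this kind include Greenplum, TiDB, PostgreSQL-XC, and HAWQ; commercial options include GBase, SnowballDB, and Huawei LibrA.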

Both Tomcat and the database can now scale horizontally, but Nginx eventually becomes the bottleneck.

3.8 Seventh Evolution: LVS or F5 for Multi‑Nginx Load Balancing

LVS (software) or F5 (hardware) operates at layer 4 to balance traffic among multiple Nginx instances, supporting hundreds of thousands of concurrent connections. High availability is achieved with keepalived and virtual IPs.

When concurrency reaches hundreds of thousands of connections, the single LVS server becomes the bottleneck, and users far from the data center experience noticeable latency.

3.9 Eighth Evolution: DNS Round‑Robin Across Data Centers

DNS maps a domain to multiple IPs, each pointing to a virtual IP in a different data center, achieving data‑center‑level load balancing and horizontal scaling to tens of millions of concurrent users.

The database alone cannot satisfy increasingly rich analytical and search requirements.

3.10 Ninth Evolution: Introduce NoSQL and Search Engines

When relational databases cannot handle massive data or complex queries, solutions such as HDFS, HBase, Redis, Elasticsearch, Kylin, Druid, etc., are introduced.

Adding more components solves functional needs but makes the system harder to evolve.

3.11 Tenth Evolution: Split Large Application into Smaller Services

Code is divided by business domain, allowing independent deployment and upgrades. Shared configuration can be managed via Zookeeper.

Shared modules duplicated across applications increase maintenance effort.

3.12 Eleventh Evolution: Extract Reusable Functions as Microservices

Common functionalities (user management, order, payment, authentication) become independent services accessed via HTTP, TCP, or RPC. Frameworks like Dubbo, Spring Cloud provide service governance, rate limiting, circuit breaking, etc.
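Circuit breaking, one of the governance features mentioned above, can be illustrated with a small sketch. This is not the Dubbo or Spring Cloud API; `CircuitBreaker`, its thresholds, and the fallback mechanism are hypothetical and simplified (for brevity, the lock is held during the remote call, which a production implementation would avoid).

```java
import java.util.function.Supplier;

// Hypothetical circuit breaker: after too many consecutive failures,
// calls fail fast to the fallback until a cool-down period has passed.
class CircuitBreaker {
    private final int failureThreshold;
    private final long openMillis;
    private int consecutiveFailures = 0;
    private long openedAt = -1; // -1 means the breaker has not tripped

    CircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    synchronized <T> T call(Supplier<T> remoteCall, Supplier<T> fallback) {
        long now = System.currentTimeMillis();
        if (openedAt >= 0 && now - openedAt < openMillis) {
            return fallback.get(); // open: skip the remote service entirely
        }
        try {
            T result = remoteCall.get(); // closed, or a trial call after the cool-down
            consecutiveFailures = 0;
            openedAt = -1;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                openedAt = now; // trip (or re-trip) the breaker
            }
            return fallback.get();
        }
    }

    synchronized boolean isOpen() {
        return openedAt >= 0 && System.currentTimeMillis() - openedAt < openMillis;
    }
}
```

Failing fast protects callers from piling up threads on a service that is already down, which is how one slow dependency can otherwise drag down an entire call chain.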

Different service interfaces increase integration complexity and can lead to tangled call chains.

3.13 Twelfth Evolution: Enterprise Service Bus (ESB) to Hide Interface Differences

The ESB performs protocol conversion and mediates all calls between applications and services, reducing coupling between them. This resembles a service-oriented architecture (SOA), a concept that overlaps with microservices.

Growing number of services and deployments makes operations increasingly difficult.

3.14 Thirteenth Evolution: Containerization

Docker packages applications into images; Kubernetes orchestrates deployment, scaling, and management, simplifying operations and enabling rapid scaling for peak traffic.

Containers solve dynamic scaling but still require owned hardware, leading to under‑utilized resources outside peak periods.

3.15 Fourteenth Evolution: Cloud Platform

Deploy the system on public cloud (IaaS, PaaS, SaaS). Resources can be provisioned on demand for large promotions and released afterward, achieving cost efficiency and lower operational overhead.

Architecture Design Summary

Must the architecture follow the exact evolution path? No. The order shown addresses individual pain points; real projects may need to solve multiple issues simultaneously or prioritize different bottlenecks.

How detailed should the design be for an upcoming system? For a one-off project with a clear performance target, design enough to meet that target while leaving extension points. For a continuously evolving platform, design for the next growth stage and iterate.

What is the difference between backend architecture and big-data architecture? Big-data architecture focuses on data collection, storage, processing, and analysis (e.g., Hadoop, Spark, NoSQL). Backend architecture concerns how the application itself is organized; it often relies on big-data components for underlying capabilities.

Key architectural principles

N+1 design – no single point of failure.

Rollback capability – ensure forward compatibility and version rollback.

Feature toggle – ability to disable functions quickly.

Monitoring – design monitoring from the start.

Active‑active data centers for high availability.

Use mature, battle‑tested technologies.

Resource isolation – prevent one business from monopolizing resources.

Horizontal scalability – design for scaling out.

Buy non‑core components when appropriate.

Use commercial hardware for reliability.

Rapid iteration – develop small features quickly and validate.

Stateless design – services should not depend on previous request state.
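Of the principles listed above, the feature toggle is the easiest to show in code. The sketch below is a minimal in-memory version under stated assumptions: `FeatureToggles` and the feature name are hypothetical, and a real system would typically load the flags from a shared configuration service so that every node sees a change at once.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical in-memory feature toggle registry.
class FeatureToggles {
    private final Map<String, Boolean> flags = new ConcurrentHashMap<>();

    void set(String feature, boolean enabled) {
        flags.put(feature, enabled);
    }

    // Unknown features are treated as disabled.
    boolean isEnabled(String feature) {
        return flags.getOrDefault(feature, false);
    }

    // Runs the feature when it is enabled, otherwise the degraded path.
    <T> T run(String feature, Supplier<T> whenOn, Supplier<T> whenOff) {
        return isEnabled(feature) ? whenOn.get() : whenOff.get();
    }
}
```

During a traffic peak, an operator could call `set("recommendations", false)` and requests would immediately take the cheaper path without a redeployment.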

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
