
From Single Server to Cloud‑Native: Scaling Taobao’s Backend for Millions of Users

This article traces Taobao’s backend architecture evolution—from a single‑server setup to distributed caching, load‑balancing proxies, database sharding, microservices, containerization, and finally cloud‑native deployment—highlighting the technologies, challenges, and design principles that enable scaling from hundreds to tens of millions of concurrent users.

ITFLY8 Architecture Home

Jul 25, 2019


1. Overview

This article uses Taobao as an example to illustrate the evolution of server‑side architecture from a hundred concurrent users to tens of millions, listing the technologies encountered at each stage and summarizing architectural design principles at the end.

2. Basic Concepts

Before discussing architecture, the following fundamental concepts are introduced for readers unfamiliar with them:

Distributed: A system whose modules are deployed on different servers, e.g., Tomcat and the database on separate machines, or two identical Tomcat instances on different servers.

High Availability: When some nodes fail, other nodes can take over and continue providing service.

Cluster: Software deployed on multiple servers that together provide a single service, such as the leader and follower nodes of a ZooKeeper ensemble, which together form one configuration service. Clients can usually connect to any node, and if one node goes offline the others take over automatically, which is what makes a cluster highly available.

Load Balancing: Distributing incoming requests evenly across multiple nodes so that each node handles a comparable load.

Forward and Reverse Proxy: When internal systems access an external network, a forward proxy forwards the requests on their behalf, so the external network sees the proxy as the source. Conversely, a reverse proxy receives external requests and forwards them to internal servers, so the external client interacts only with the proxy.

3. Architecture Evolution

3.1 Single‑Machine Architecture

In the early stage of Taobao, both the number of applications and users were small, so Tomcat and the database could be deployed on the same server. When a browser requests www.taobao.com, DNS resolves the domain to an IP (e.g., 10.102.4.1), and the browser accesses Tomcat at that IP.

As user count grows, Tomcat and the database compete for resources, and a single machine can no longer sustain the workload.

3.2 First Evolution: Separate Tomcat and Database

Tomcat and the database are deployed on separate servers, significantly improving the performance of each component.

With further growth, concurrent reads/writes to the database become the bottleneck.

3.3 Second Evolution: Introduce Local and Distributed Caches

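A local cache is added on the Tomcat server or inside the Tomcat JVM, and a distributed cache (e.g., Redis) is placed externally to store hot product data or rendered HTML pages. Memcached can serve as the local cache when it runs on the same host as Tomcat, but it is a separate process, so a cache inside the JVM needs an in‑process library instead. Caching intercepts most requests before they reach the database, greatly reducing database load. This stage also raises the topics of cache consistency, cache penetration, breakdown, avalanche, and simultaneous expiration of hot data.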
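The lookup order described here (local cache, then distributed cache, then database) can be sketched as follows. This is a minimal illustration, not Taobao's implementation: plain dictionaries stand in for both caches, and the `TwoLevelCache` class, the `products` table, and the `db_hits` counter are hypothetical names introduced only for the example. Expiration and eviction are left out.

```python
from typing import Callable, Dict, Optional


class TwoLevelCache:
    """Look up the local cache, then the shared cache, then the database."""

    def __init__(self, load_from_db: Callable[[str], Optional[str]]) -> None:
        self.local: Dict[str, str] = {}    # stand-in for an in-process cache
        self.shared: Dict[str, str] = {}   # stand-in for a distributed cache
        self.load_from_db = load_from_db   # assumed database accessor
        self.db_hits = 0                   # counts requests reaching the database

    def get(self, key: str) -> Optional[str]:
        if key in self.local:
            return self.local[key]
        if key in self.shared:
            # Promote the value so the next lookup stays in-process.
            self.local[key] = self.shared[key]
            return self.local[key]
        self.db_hits += 1
        value = self.load_from_db(key)
        if value is not None:
            self.shared[key] = value
            self.local[key] = value
        return value


products = {"sku-1": "Red T-shirt"}   # hypothetical product table
cache = TwoLevelCache(products.get)
first = cache.get("sku-1")    # misses both caches, loads from the database
second = cache.get("sku-1")   # served from the local cache
```

Because missing keys are never stored, every lookup of a nonexistent key in this sketch still reaches the database, which is the cache penetration problem mentioned above.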

The cache handles most traffic, but as users increase, the remaining load falls on a single Tomcat, causing response times to degrade.

3.4 Third Evolution: Reverse Proxy for Load Balancing

Multiple Tomcat instances are deployed, and a reverse proxy (Nginx) distributes requests evenly among them. Assuming each Tomcat supports 100 concurrent connections and a single Nginx supports 50,000, one Nginx can in theory front 500 Tomcat instances (50,000 / 100) and so handle 50,000 concurrent users. Technologies involved include Nginx, HAProxy, session sharing, and file upload/download handling.
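The capacity arithmetic and the even distribution can be checked with a short sketch. The concurrency figures are the assumptions stated in the text; the backend names are hypothetical, and simple round‑robin is used here as one way to spread requests evenly.

```python
from collections import Counter
from itertools import cycle

TOMCAT_CONCURRENCY = 100      # per-instance figure assumed in the text
NGINX_CONCURRENCY = 50_000    # proxy figure assumed in the text

# How many Tomcat instances one proxy can keep busy, in theory.
max_backends = NGINX_CONCURRENCY // TOMCAT_CONCURRENCY

# Round-robin: hand each incoming request to the next backend in turn.
backends = [f"tomcat-{i}" for i in range(3)]   # hypothetical backend names
picker = cycle(backends)
assignments = Counter(next(picker) for _ in range(9))
```

With nine requests and three backends, each backend receives three requests. Real proxies also offer weighted and least‑connections strategies, which this sketch does not model.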

Reverse proxy dramatically increases the supported concurrency, but the database becomes the next bottleneck as more requests penetrate it.

3.5 Fourth Evolution: Database Read‑Write Splitting

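The database is divided into a write master and multiple read replicas. Writes go to the master, and reads are served by the replicas, which are synchronized from the master. Because replicas can lag behind the master, queries that must see newly written data can be served from an extra copy written to the cache. Mycat, a database middleware, can be used to organize read/write splitting and sharding, with clients accessing the underlying databases through it. Data synchronization and data consistency also have to be addressed at this stage.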
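The routing rule can be sketched with in‑memory dictionaries standing in for the master, the replicas, and the cache. The `ReadWriteRouter` class and its `replicate` method are hypothetical and only simulate replication lag; a middleware such as Mycat would do the routing in a real deployment.

```python
import random
from typing import Dict, List, Optional


class ReadWriteRouter:
    """Send writes to the master and reads to a randomly chosen replica."""

    def __init__(self, replica_count: int) -> None:
        self.master: Dict[str, str] = {}
        self.replicas: List[Dict[str, str]] = [{} for _ in range(replica_count)]
        self.fresh_cache: Dict[str, str] = {}   # extra copy of recent writes

    def write(self, key: str, value: str) -> None:
        self.master[key] = value
        self.fresh_cache[key] = value

    def replicate(self) -> None:
        """Simulate the replicas catching up with the master."""
        for replica in self.replicas:
            replica.update(self.master)
        self.fresh_cache.clear()

    def read(self, key: str, need_latest: bool = False) -> Optional[str]:
        if need_latest and key in self.fresh_cache:
            return self.fresh_cache[key]
        return random.choice(self.replicas).get(key)


router = ReadWriteRouter(replica_count=2)
router.write("order-1", "paid")
stale = router.read("order-1")                     # replicas have not caught up
fresh = router.read("order-1", need_latest=True)   # served from the cache copy
router.replicate()
synced = router.read("order-1")
```

Before `replicate()` runs, an ordinary read returns nothing because the replicas lag, while a read that asks for the latest data is answered from the cached copy.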

Different business lines compete for the same database, causing performance interference.

3.6 Fifth Evolution: Business‑Based Database Sharding

Data for each business line is stored in separate databases, reducing resource contention. High‑traffic businesses can be allocated more servers. Cross‑business joins become difficult and require additional solutions.
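One common workaround for the lost cross‑business joins is to join in application code: query each database separately and combine the rows in memory. The sketch below assumes two hypothetical per‑business stores, `PRODUCT_DB` and `COMMENT_DB`, modeled as dictionaries. It is one possible approach, not necessarily the one Taobao used.

```python
from typing import Dict, List

# Hypothetical per-business databases, modeled as rows keyed by ID.
PRODUCT_DB: Dict[str, Dict[str, str]] = {
    "p1": {"name": "Red T-shirt"},
}
COMMENT_DB: Dict[str, Dict[str, str]] = {
    "c1": {"product_id": "p1", "text": "Fits well"},
    "c2": {"product_id": "p9", "text": "Orphaned comment"},
}


def comments_with_product_names() -> List[Dict[str, str]]:
    """Join comments to products in application code instead of SQL."""
    comments = list(COMMENT_DB.values())
    # Collect the product IDs first so the product database is read in one batch.
    product_ids = {c["product_id"] for c in comments}
    products = {pid: PRODUCT_DB[pid] for pid in product_ids if pid in PRODUCT_DB}
    return [
        {"text": c["text"], "product": products[c["product_id"]]["name"]}
        for c in comments
        if c["product_id"] in products   # inner-join semantics
    ]
```

The comment that refers to a missing product is dropped, mirroring an inner join. The cost is extra round trips and memory in the application tier.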

The write master eventually hits performance limits as user volume grows.

3.7 Sixth Evolution: Split Large Tables into Small Tables

Large tables are split into many small ones, e.g., comments hashed by product ID and payment records split into one table per hour. If the small tables are spread across enough servers, the database can scale horizontally and performance improves. Mycat can also route access to tables that have been split this way.
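The two splitting rules mentioned here can be sketched as functions that map a row to its table name. The table count, the naming scheme, and the function names are assumptions made for illustration.

```python
import zlib
from datetime import datetime

COMMENT_TABLES = 16   # hypothetical number of comment tables


def comment_table(product_id: str) -> str:
    """Pick a comment table by hashing the product ID."""
    # crc32 gives the same result in every process, unlike built-in hash()
    # on strings, which is randomized per interpreter run.
    index = zlib.crc32(product_id.encode("utf-8")) % COMMENT_TABLES
    return f"comment_{index:02d}"


def payment_table(paid_at: datetime) -> str:
    """Pick a payment table by the hour in which the payment was made."""
    return f"payment_{paid_at:%Y%m%d%H}"


hourly = payment_table(datetime(2019, 7, 25, 14, 30))   # "payment_2019072514"
```

All comments for one product land in the same table, so a product page needs a single query. Changing `COMMENT_TABLES` later would move most rows, which is part of the operational burden of this approach.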

This approach raises DBA workload and turns the system into a logical distributed database, often implemented as an MPP (Massively Parallel Processing) architecture.

Popular open‑source MPP databases include Greenplum, TiDB, PostgreSQL‑XC, and HAWQ; commercial options include GBase, SnowballDB, and Huawei LibrA. They provide standard SQL, sharding, transactions, and replication, and can scale to hundreds of nodes.

Both the database and Tomcat can now scale horizontally, but Nginx becomes the next bottleneck.

3.8 Seventh Evolution: LVS or F5 for Multi‑Nginx Load Balancing

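Since Nginx becomes a bottleneck, a layer‑4 load balancer (LVS in software or F5 in hardware) is placed in front of multiple Nginx instances. LVS runs in kernel space and forwards traffic at the transport layer, so it works for TCP connections regardless of the application protocol carried on top and achieves much higher throughput than Nginx. Because a single LVS is itself a single point of failure, Keepalived can provide virtual IP failover between a primary and a standby LVS for high availability.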
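The failover idea can be sketched as a priority‑based election among healthy nodes, loosely modeled on the VRRP protocol that Keepalived implements. The node names, priorities, and the `vip_owner` function are hypothetical, and real failover also involves advertisement timers and preemption settings that are not modeled here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LoadBalancerNode:
    name: str
    priority: int          # the healthy node with the highest priority wins
    healthy: bool = True


def vip_owner(nodes: List[LoadBalancerNode]) -> Optional[str]:
    """Return the node that should currently hold the virtual IP."""
    candidates = [node for node in nodes if node.healthy]
    if not candidates:
        return None   # no healthy node: the virtual IP is unreachable
    return max(candidates, key=lambda node: node.priority).name


nodes = [LoadBalancerNode("lvs-a", 150), LoadBalancerNode("lvs-b", 100)]
before = vip_owner(nodes)    # the primary holds the virtual IP
nodes[0].healthy = False
after = vip_owner(nodes)     # the standby takes over
```

Clients keep using the same virtual IP throughout, so the switch from primary to standby needs no client‑side change.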

When concurrency reaches hundreds of thousands, LVS itself becomes a bottleneck, and geographic latency becomes noticeable.

3.9 Eighth Evolution: DNS Round‑Robin for Inter‑Data‑Center Balancing

DNS is configured with multiple IPs for a domain, each IP pointing to a virtual IP in a different data center. Users are directed to different data centers via DNS round‑robin, achieving data‑center‑level horizontal scaling.
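DNS round‑robin can be simulated with a resolver that rotates its answer list on every query. The `RoundRobinDns` class is hypothetical, and the addresses come from the ranges reserved for documentation. The sketch assumes that clients use the first address returned and ignores resolver caching.

```python
from collections import deque
from typing import List


class RoundRobinDns:
    """Return all records for a name, rotating the order on every query."""

    def __init__(self, records: List[str]) -> None:
        self._records = deque(records)

    def resolve(self) -> List[str]:
        answer = list(self._records)
        self._records.rotate(-1)   # the next query starts with the next record
        return answer


# One virtual IP per data center (documentation-only addresses).
dns = RoundRobinDns(["203.0.113.10", "198.51.100.10", "192.0.2.10"])
chosen = [dns.resolve()[0] for _ in range(4)]
```

Four successive clients are sent to data centers one, two, three, and then one again. In practice, caching in resolvers and clients makes the distribution less even than this idealized rotation.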

As data richness and business complexity increase, a single database can no longer satisfy all requirements.

3.10 Ninth Evolution: Introduce NoSQL and Search Engines

When data volume exceeds relational database capabilities, specialized solutions are adopted: HDFS for massive file storage, HBase/Redis for key‑value storage, Elasticsearch for full‑text search, and Kylin/Druid for multidimensional analysis. These additions increase system complexity, and keeping data consistent across the different stores becomes a concern.

Adding many components expands business capabilities but also makes code upgrades more difficult.

3.11 Tenth Evolution: Split Monolithic Application into Smaller Services

Applications are divided by business domain, making each service’s responsibilities clearer and allowing independent iteration. Shared configuration can be managed via a distributed configuration center such as ZooKeeper.

Duplicated common modules across applications increase maintenance effort.

3.12 Eleventh Evolution: Extract Reusable Functions as Microservices

Common functionalities (user management, order, payment, authentication) are extracted into independent services accessed via HTTP, TCP, or RPC. Frameworks like Dubbo or Spring Cloud provide service governance, rate limiting, circuit breaking, and degradation.
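Circuit breaking and degradation can be illustrated with a small state machine. This is a generic sketch of the pattern under assumed thresholds, not the Dubbo or Spring Cloud API; the `CircuitBreaker` class and its parameters are hypothetical.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Stop calling a failing service for a while and use a fallback instead."""

    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after      # seconds to stay open
        self.clock = clock                  # injectable for testing
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, func: Callable[[], T], fallback: Callable[[], T]) -> T:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()           # open: degrade without calling
            self.opened_at = None           # half-open: allow one trial call
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0                   # success closes the circuit
        return result
```

A caller wraps each remote call as `breaker.call(remote_call, cached_default)`. Once the failure threshold is reached, the remote service is left alone until `reset_after` seconds have passed, which gives it time to recover while callers receive the degraded result.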

Different services expose different interfaces, requiring client adaptation and creating complex call chains.

3.13 Twelfth Evolution: Enterprise Service Bus (ESB) to Hide Interface Differences

ESB performs protocol conversion so applications uniformly access backend services through the bus, reducing coupling. This architecture resembles SOA; microservices focus on independent deployment, while SOA emphasizes unified access via the bus.

The growing number of services and applications makes deployment and environment management increasingly complex.

3.14 Thirteenth Evolution: Containerization for Isolation and Dynamic Management

Docker packages applications/services into images; Kubernetes (K8s) orchestrates their deployment, scaling, and lifecycle. Containers encapsulate the runtime environment, enabling rapid provisioning and teardown, especially during high‑traffic events.

Although containers simplify scaling, the underlying machines still need to be owned and managed, leading to under‑utilized resources outside peak periods.

3.15 Fourteenth Evolution: Move to Cloud Platforms

The system is deployed on public cloud infrastructure, leveraging elastic resources for peak traffic (e.g., big sales events) and releasing them afterward, achieving pay‑as‑you‑go cost efficiency. Cloud services are categorized as IaaS (infrastructure), PaaS (platform), and SaaS (software), each providing different levels of abstraction and managed components.
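The cost argument for elastic resources can be made concrete with a little arithmetic. The hourly load figures below are invented for illustration, and the per‑instance capacity of 100 concurrent users reuses the Tomcat assumption from section 3.4.

```python
import math
from typing import List


def instances_needed(concurrent_users: int, per_instance: int, minimum: int = 2) -> int:
    """Instances required for a load, never dropping below a safety minimum."""
    return max(minimum, math.ceil(concurrent_users / per_instance))


hourly_load: List[int] = [800, 1_200, 45_000, 3_000]   # hypothetical users per hour
elastic = [instances_needed(users, per_instance=100) for users in hourly_load]

# Owning hardware means provisioning for the peak in every hour.
fixed_instance_hours = max(elastic) * len(hourly_load)
# Pay-as-you-go means paying only for what each hour needs.
elastic_instance_hours = sum(elastic)
```

With these numbers, peak provisioning costs 1,800 instance‑hours against 500 for elastic scaling. The gap grows as the peak becomes shorter and sharper, which is the case for big sales events.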

While cloud platforms solve hardware scaling, issues such as cross‑region data synchronization and distributed transactions remain.

4. Architecture Design Summary

Must the architecture follow the exact evolution path? No. The sequence presented addresses specific pain points; real‑world scenarios may require tackling multiple issues simultaneously or prioritizing different bottlenecks.

How detailed should the design be for a system about to be implemented? For a one‑off project with clear performance targets, design enough to meet those targets while leaving room for future expansion. For continuously evolving platforms (e.g., e‑commerce), design for the next growth stage and iterate.

What is the difference between server‑side architecture and big‑data architecture? “Big data” refers to solutions for massive data collection, storage, processing, and analysis (e.g., Flume, HDFS, HBase, Spark). Server‑side architecture focuses on application organization; it often relies on big‑data components for underlying capabilities.

Any architectural design principles?

N+1 design – no single point of failure.

Rollback design – ensure forward compatibility and provide rollback mechanisms.

Feature toggle – allow rapid disabling of problematic features.

Monitoring – embed observability from the start.

Multi‑active data centers – achieve high availability across regions.

Use mature technologies – newly developed or recently open‑sourced technology often carries undiscovered bugs.

Resource isolation – prevent a single business from monopolizing resources.

Horizontal scalability – design for scaling out.

Buy non‑core components – reduce development effort.

Commercial hardware – improve reliability.

Rapid iteration – develop small features quickly for early feedback.

Stateless design – services should not depend on previous requests.
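As an illustration of the feature toggle principle, the sketch below guards a new code path behind a flag that can be switched off at runtime. The `FeatureToggles` class, the flag name, and the discount logic are hypothetical. A production system would typically read flags from a configuration center rather than from memory.

```python
from typing import Dict


class FeatureToggles:
    """In-memory flag store; unknown flags are treated as disabled."""

    def __init__(self, flags: Dict[str, bool]) -> None:
        self._flags = dict(flags)

    def is_enabled(self, name: str) -> bool:
        return self._flags.get(name, False)

    def disable(self, name: str) -> None:
        self._flags[name] = False


def checkout_total(price_cents: int, toggles: FeatureToggles) -> int:
    """Apply the new discount only while its toggle is on."""
    if toggles.is_enabled("new_discount_engine"):
        return price_cents * 90 // 100   # hypothetical 10% discount
    return price_cents                   # old, known-good behavior


toggles = FeatureToggles({"new_discount_engine": True})
with_feature = checkout_total(10_000, toggles)
toggles.disable("new_discount_engine")   # switch off without redeploying
without_feature = checkout_total(10_000, toggles)
```

Keeping the old path intact behind the toggle is what makes the rapid disable possible, so the toggle should only be removed once the new path has proven stable.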


