From Single Server to Cloud: 14 Stages of Scaling a Large Website
This article walks through the step-by-step evolution of a high-traffic e-commerce site, from a single-machine setup to cloud-native microservices, detailing each architectural milestone, the technologies involved, and the design principles that guide scaling from hundreds to tens of millions of concurrent users.
1. Overview
Using Taobao as an example, the article describes the architectural evolution of a website as traffic grows from hundreds to tens of millions of concurrent users, listing the technologies encountered at each stage and summarizing design principles.
Note: The example is illustrative and does not reflect the actual Taobao architecture.
2. Basic Concepts
Distributed – modules deployed on different servers.
High Availability – system continues to serve when some nodes fail.
Cluster – a group of servers providing a unified service.
Load Balancing – evenly distributing requests across nodes.
Forward and Reverse Proxy – forward proxy handles outbound traffic from internal systems; reverse proxy receives inbound traffic and forwards it to internal servers.
3. Architecture Evolution
3.1 Single‑machine Architecture
Initially Tomcat and the database run on the same server; DNS resolves the domain to the server IP.
As user count grows, competition for resources makes the single‑machine approach insufficient.
3.2 First Evolution: Separate Tomcat and Database
Tomcat and the database are deployed on separate servers, improving the performance of each.
Concurrent database reads/writes become the new bottleneck.
3.3 Second Evolution: Local and Distributed Caching
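Introduce a local cache on the same server as Tomcat (e.g., memcached) and an external distributed cache (e.g., Redis) to hold hot items and rendered HTML pages, so that most reads never reach the database. The article also discusses the problems this introduces: cache consistency, cache penetration, cache breakdown, cache avalanche, and hot-spot invalidation.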
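The lookup order described in this stage (local cache, then distributed cache, then database) can be sketched as follows. This is a minimal illustration, not the article's implementation: the class name TwoTierCache, the load_from_db callback, and the use of a plain dict as a stand-in for a Redis client are all assumptions made for the example.

```python
import time


class TwoTierCache:
    """Hypothetical local cache in front of a shared cache and the database."""

    def __init__(self, shared_cache, load_from_db, local_ttl=5.0):
        self.local = {}                   # key -> (value, expires_at), per process
        self.shared = shared_cache        # dict used as a stand-in for Redis
        self.load_from_db = load_from_db  # assumed callback: key -> value or None
        self.local_ttl = local_ttl

    def get(self, key):
        entry = self.local.get(key)
        if entry is not None and entry[1] > time.monotonic():
            return entry[0]               # local hit: no network call at all
        if key in self.shared:
            value = self.shared[key]      # shared hit: database is not touched
        else:
            value = self.load_from_db(key)
            # Missing rows are stored as None too, so repeated lookups for a
            # nonexistent key do not keep hitting the database (penetration).
            self.shared[key] = value
        self.local[key] = (value, time.monotonic() + self.local_ttl)
        return value

    def invalidate(self, key):
        """Drop a key from both tiers after the underlying row changes."""
        self.local.pop(key, None)
        self.shared.pop(key, None)
```

The short local TTL is one simple way to bound how stale a per-server copy can get; other Tomcat instances still hold their own local copies until those expire, which is the consistency issue the article refers to.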
Cache handles most traffic, but Tomcat becomes the next bottleneck.
3.4 Third Evolution: Reverse Proxy Load Balancing
Deploy multiple Tomcat instances behind Nginx (or HAProxy) to distribute requests, dramatically increasing concurrent capacity.
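To make the idea of distributing requests concrete, here is a minimal round-robin sketch. It only illustrates the selection policy; it is not how Nginx or HAProxy are implemented, and the class name, the backend names, and the mark_down/mark_up helpers are hypothetical.

```python
import itertools


class RoundRobinBalancer:
    """Hypothetical round-robin selector over a fixed list of backends."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._cycle = itertools.cycle(self._backends)
        self._down = set()

    def mark_down(self, backend):
        self._down.add(backend)

    def mark_up(self, backend):
        self._down.discard(backend)

    def next_backend(self):
        # Try each backend at most once per call, skipping unhealthy ones.
        for _ in range(len(self._backends)):
            candidate = next(self._cycle)
            if candidate not in self._down:
                return candidate
        raise RuntimeError("no healthy backend available")
```

Because every request may land on a different Tomcat instance, session state has to live outside the instance (or requests have to be pinned), which is one reason the design principles later call for stateless interfaces.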
Database becomes the next bottleneck as request volume grows.
3.5 Fourth Evolution: Database Read/Write Separation
Separate read and write databases; Mycat can be used as middleware to manage read/write splitting and sharding.
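The routing decision behind read/write separation can be sketched in a few lines. This is an assumption-laden illustration, not Mycat's behavior: the ReadWriteRouter class, the string-based SQL inspection, and the node names are invented for the example, and real middleware parses SQL properly.

```python
import itertools


class ReadWriteRouter:
    """Hypothetical router: plain SELECTs go to replicas, the rest to primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql, in_transaction=False):
        words = sql.split()
        verb = words[0].upper() if words else ""
        # SELECT ... FOR UPDATE takes locks, so it is treated as a write.
        is_plain_read = verb == "SELECT" and "FOR UPDATE" not in sql.upper()
        if in_transaction or not is_plain_read or self._replicas is None:
            return self.primary
        return next(self._replicas)
```

Statements inside a transaction are sent to the primary in this sketch because replicas lag behind it; a read issued right after a write could otherwise miss the row that was just written.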
Different business workloads compete for database resources.
3.6 Fifth Evolution: Business‑Level Sharding
Store data for each business in separate databases, reducing contention but making cross‑business queries harder.
Write‑side database eventually hits performance limits.
3.7 Sixth Evolution: Splitting Large Tables
Hash‑based or time‑based partitioning creates many small tables, enabling horizontal scaling. The article mentions MPP databases such as Greenplum, TiDB, PostgreSQL‑XC, HAWQ, and commercial solutions, highlighting their suitability for OLTP or OLAP workloads.
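Hash-based and time-based partitioning both come down to a function that maps a row to a table name. The sketch below shows one possible form of each; the function names, the orders table, and the naming scheme are hypothetical. CRC32 is used instead of Python's built-in hash() because hash() of a string changes between processes, which would send the same key to different tables.

```python
import zlib
from datetime import date


def shard_table(base_name, shard_key, shard_count):
    """Map a key to one of shard_count tables using a stable hash."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    digest = zlib.crc32(str(shard_key).encode("utf-8"))
    return f"{base_name}_{digest % shard_count:04d}"


def month_table(base_name, when):
    """Map a date to a per-month table, e.g. orders_2024_05."""
    return f"{base_name}_{when:%Y_%m}"
```

With plain modulo routing, changing shard_count moves most keys to a different table, so the shard count is usually chosen generously up front or replaced by a scheme that limits data movement.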
Even with horizontal scaling, Nginx can become the bottleneck.
3.8 Seventh Evolution: LVS/F5 Load Balancing for Multiple Nginx
LVS (software) or F5 (hardware) operates at layer 4 to balance traffic among many Nginx instances; keepalived provides high availability by letting multiple LVS nodes share a virtual IP, which moves to a standby node when the active one fails.
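The virtual IP failover can be pictured as a priority election among the LVS nodes that are still alive. The following is a deliberately simplified model of that idea, not keepalived's actual protocol handling: the LvsNode type, the priority values, and the vip_holder function are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class LvsNode:
    name: str
    priority: int        # higher value wins the election
    healthy: bool = True


def vip_holder(nodes):
    """Return the node that should own the virtual IP, or None if all are down."""
    alive = [node for node in nodes if node.healthy]
    if not alive:
        return None
    return max(alive, key=lambda node: node.priority)
```

Clients only ever see the virtual IP, so a failover changes which machine answers without any change to DNS or client configuration.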
At massive scale, LVS itself becomes a bottleneck, and geographic latency appears.
3.9 Eighth Evolution: DNS Round‑Robin Across Data Centers
Configure DNS to return multiple IPs, each pointing to a different data‑center, achieving inter‑data‑center load balancing.
As data becomes richer and the business grows, query and analysis demands outgrow what the database alone can handle.
3.10 Ninth Evolution: Introducing NoSQL and Search Engines
Adopt HDFS for file storage, HBase/Redis for key‑value, Elasticsearch for full‑text search, and Kylin/Druid for multidimensional analysis to handle large‑scale data and diverse query needs.
Adding components raises system complexity and operational overhead.
3.11 Tenth Evolution: Splitting a Monolith into Smaller Applications
Divide code by business domain, using Zookeeper as a distributed configuration center.
Shared modules across applications cause duplication and upgrade challenges.
3.12 Eleventh Evolution: Extracting Reusable Functions as Microservices
Common functions (user management, order, payment, authentication) become independent services accessed via HTTP, TCP, or RPC; frameworks like Dubbo or Spring Cloud provide service governance, rate limiting, circuit breaking, and degradation.
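Circuit breaking and degradation can be illustrated with a small wrapper around a remote call. This is a generic sketch of the pattern, not the Dubbo or Spring Cloud API: the CircuitBreaker class, its parameters, and the fallback argument are hypothetical, and the code is not thread-safe.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock          # injectable so tests can control time
        self.failures = 0
        self.opened_at = None       # None means the circuit is closed

    def call(self, func, *args, fallback=None, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                if fallback is not None:
                    return fallback()        # degraded response
                raise CircuitOpenError("circuit is open")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()    # open, or re-open after a failed trial
            raise
        self.failures = 0
        self.opened_at = None                    # success closes the circuit
        return result
```

While the circuit is open, callers get the fallback immediately instead of waiting on a service that is already failing, which keeps one slow dependency from exhausting threads in every application that calls it.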
Service interfaces vary, increasing integration complexity.
3.13 Twelfth Evolution: Enterprise Service Bus (ESB) for Interface Unification
ESB abstracts protocol conversion, enabling a SOA‑style architecture where applications and services communicate through a unified bus, reducing coupling.
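The protocol conversion role of the bus can be sketched as a registry of adapters: callers always send a dictionary, and the bus translates it into whatever format each service expects. Everything here is hypothetical (the ServiceBus class, the two wire formats, and the adapter functions); a real ESB also handles routing rules, transport, and error handling.

```python
import json


class ServiceBus:
    """Hypothetical bus: one calling convention, per-service protocol adapters."""

    def __init__(self):
        self._routes = {}    # service name -> (adapter, handler)

    def register(self, name, handler, adapter):
        self._routes[name] = (adapter, handler)

    def send(self, name, message):
        adapter, handler = self._routes[name]
        return adapter(handler, message)


def json_adapter(handler, message):
    """For services that accept and return JSON strings."""
    return json.loads(handler(json.dumps(message)))


def kv_adapter(handler, message):
    """For services that accept and return 'key=value;key=value' strings."""
    raw = ";".join(f"{key}={value}" for key, value in message.items())
    reply = handler(raw)
    return dict(pair.split("=", 1) for pair in reply.split(";") if pair)
```

Applications depend only on the bus and the message shape, so a service can change its wire format by swapping its adapter without touching any caller.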
Growing number of services makes deployment and environment isolation harder.
3.14 Thirteenth Evolution: Containerization
Docker packages services into images; Kubernetes orchestrates containers, simplifying deployment, scaling, and isolation.
Containers solve scaling but still require on-premises hardware, which sits underused outside peak periods.
3.15 Fourteenth Evolution: Cloud Platform Adoption
Deploy to public cloud (IaaS, PaaS, SaaS) to leverage elastic resources, reduce operational cost, and use shared components such as Hadoop stacks or MPP databases.
The article omits challenges such as cross‑region data synchronization and distributed transaction implementation.
4. Architecture Design Summary
Is the evolution path mandatory? No; real projects may address multiple issues simultaneously.
How detailed should the design be? Sufficient to meet current performance goals while leaving room for future expansion.
Difference between service‑side and big‑data architecture? Service architecture focuses on application organization; big‑data architecture provides storage, computation, and analysis capabilities.
Design principles include N+1 redundancy, rollback capability, feature toggles, monitoring, multi-active data centers, mature technology adoption, resource isolation, horizontal scalability, buying non-core components, using commodity hardware, rapid iteration, and stateless interfaces.
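Of the principles listed, feature toggles are the easiest to show in code. The sketch below is a minimal in-memory version; the FeatureToggles class, the new_pricing flag, and the checkout function are invented for illustration, and a production system would typically read flags from a configuration service so they can change without a redeploy.

```python
class FeatureToggles:
    """Hypothetical in-memory flag store."""

    def __init__(self, flags=None):
        self._flags = dict(flags or {})

    def set(self, name, enabled):
        self._flags[name] = bool(enabled)

    def is_enabled(self, name, default=False):
        # Unknown flags fall back to the default, so new code ships disabled.
        return self._flags.get(name, default)


def checkout(cart_total, toggles):
    """Use the new pricing path only when its flag is on."""
    if toggles.is_enabled("new_pricing"):
        return round(cart_total * 0.95, 2)
    return cart_total
```

Turning the flag off restores the old behavior immediately, which complements rollback capability: a faulty feature can be disabled without redeploying the previous release.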
Open Source Linux
Focused on sharing Linux/Unix content, covering fundamentals, system development, network programming, automation/operations, cloud computing, and related professional knowledge.
