From Single Server to Cloud: 14 Stages of Scaling a Large Website

This article walks through the step-by-step evolution of a high-traffic e-commerce site, from a single-machine setup to a cloud-hosted microservice architecture. It covers each architectural milestone, the technologies involved, and the design principles that guide scaling from hundreds to tens of millions of concurrent users.

Open Source Linux

1. Overview

Using Taobao as an example, the article describes the architectural evolution of a website as traffic grows from hundreds to tens of millions of concurrent users, listing the technologies encountered at each stage and summarizing design principles.

Note: The example is illustrative and does not reflect the actual Taobao architecture.

2. Basic Concepts

Distributed – modules deployed on different servers.

High Availability – system continues to serve when some nodes fail.

Cluster – a group of servers providing a unified service.

Load Balancing – evenly distributing requests across nodes.

Forward and Reverse Proxy – forward proxy handles outbound traffic from internal systems; reverse proxy receives inbound traffic and forwards it to internal servers.

3. Architecture Evolution

3.1 Single‑machine Architecture

Initially Tomcat and the database run on the same server; DNS resolves the domain to the server IP.

As user count grows, competition for resources makes the single‑machine approach insufficient.

3.2 First Evolution: Separate Tomcat and Database

Tomcat and the database are deployed on separate servers, improving the performance of each.

Concurrent database reads/writes become the new bottleneck.

3.3 Second Evolution: Local and Distributed Caching

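Add a local cache, either inside the Tomcat JVM or on the same server (e.g., a memcached instance running locally), and an external distributed cache (e.g., Redis) to hold hot items and rendered HTML, so that most reads never reach the database. Caching brings its own problems: cache consistency, cache penetration, cache breakdown, cache avalanche, and many hot keys expiring at the same time.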
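The lookup order described above can be sketched as follows. This is a minimal illustration, not a production cache: plain dictionaries stand in for the local cache and for Redis, and the class name, the `load_from_db` callback, and the TTL value are all assumptions made for the example. Caching a sentinel for keys that do not exist is one common way to limit cache penetration.

```python
import time


class TwoTierCache:
    """Check local memory first, then a shared cache, then the database."""

    _MISSING = object()  # sentinel stored for keys the database does not have

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self.load_from_db = load_from_db  # hypothetical loader: key -> value or None
        self.ttl = ttl_seconds
        self.clock = clock
        self.local = {}   # stands in for an in-process cache
        self.shared = {}  # stands in for a distributed cache such as Redis

    def _fresh(self, store, key):
        """Return the (value, expires_at) entry if present and not expired."""
        entry = store.get(key)
        if entry is None:
            return None
        if self.clock() >= entry[1]:
            del store[key]
            return None
        return entry

    def get(self, key):
        for store in (self.local, self.shared):
            entry = self._fresh(store, key)
            if entry is not None:
                self.local[key] = entry  # promote shared hits to the local tier
                value = entry[0]
                return None if value is self._MISSING else value
        # Miss in both tiers: load once and remember the result, even if absent.
        value = self.load_from_db(key)
        stored = self._MISSING if value is None else value
        entry = (stored, self.clock() + self.ttl)
        self.local[key] = entry
        self.shared[key] = entry
        return value
```

With this sketch, repeated reads of the same key, including a key that does not exist, reach the database only once per TTL window.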

Cache handles most traffic, but Tomcat becomes the next bottleneck.

3.4 Third Evolution: Reverse Proxy Load Balancing

Deploy multiple Tomcat instances behind Nginx (or HAProxy) to distribute requests, dramatically increasing concurrent capacity.
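As a rough illustration of what the reverse proxy does when it distributes requests, the sketch below implements plain round-robin selection that skips backends marked as down. It is not how Nginx or HAProxy are implemented; the class and the backend names are hypothetical.

```python
class RoundRobinBalancer:
    """Hand out backends in rotation, skipping any marked as down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.down = set()
        self._next = 0

    def mark_down(self, backend):
        self.down.add(backend)

    def mark_up(self, backend):
        self.down.discard(backend)

    def pick(self):
        # Try each backend at most once per call.
        for _ in range(len(self.backends)):
            backend = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if backend not in self.down:
                return backend
        raise RuntimeError("no healthy backend available")


balancer = RoundRobinBalancer(["tomcat-1", "tomcat-2", "tomcat-3"])
first_four = [balancer.pick() for _ in range(4)]
```

Real proxies also offer weighted, least-connections, and hash-based strategies; round-robin is simply the easiest to reason about.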

Database becomes the next bottleneck as request volume grows.

3.5 Fourth Evolution: Database Read/Write Separation

Separate read and write databases; Mycat can be used as middleware to manage read/write splitting and sharding.
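The routing decision behind read/write splitting can be sketched in a few lines. This is a simplified illustration and not Mycat's actual logic: the class, the server names, and the prefix-based statement check are assumptions. Reads inside a transaction and `SELECT ... FOR UPDATE` statements are sent to the primary so they do not observe stale replica data.

```python
import itertools


class ReadWriteRouter:
    """Send plain reads to replicas in rotation and everything else to the primary."""

    READ_PREFIXES = ("select", "show")

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(list(replicas)) if replicas else None

    def route(self, sql, in_transaction=False):
        lowered = sql.lstrip().lower()
        is_read = lowered.startswith(self.READ_PREFIXES) and "for update" not in lowered
        if is_read and not in_transaction and self._replicas is not None:
            return next(self._replicas)
        return self.primary


router = ReadWriteRouter("primary", ["replica-1", "replica-2"])
```

Because replication is usually asynchronous, a read issued right after a write may still need to go to the primary; handling that case is left out of this sketch.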

Different business workloads compete for database resources.

3.6 Fifth Evolution: Business‑Level Sharding

Store data for each business in separate databases, reducing contention but making cross‑business queries harder.

Write‑side database eventually hits performance limits.

3.7 Sixth Evolution: Splitting Large Tables

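Splitting a large table by hash or by time range produces many small tables that can be spread across servers, which lets the database scale horizontally. Distributed and MPP databases such as Greenplum, TiDB, PostgreSQL-XC, and HAWQ, as well as commercial products, package this approach. Each targets a different workload, so check whether a given product is designed for OLTP or OLAP before adopting it.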
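Both splitting strategies come down to a function that maps a row to a table name. The sketch below shows one possible form of each; the table-naming scheme and function names are assumptions for illustration. `zlib.crc32` is used instead of the built-in `hash()` because Python randomizes string hashes per process, which would send the same key to different tables after a restart.

```python
import datetime
import zlib


def hash_shard(table, key, shard_count):
    """Map a key to one of shard_count tables using a stable hash."""
    index = zlib.crc32(str(key).encode("utf-8")) % shard_count
    return f"{table}_{index:02d}"


def month_shard(table, when):
    """Map a date to a per-month table, e.g. orders_202403."""
    return f"{table}_{when:%Y%m}"


march_table = month_shard("orders", datetime.date(2024, 3, 15))
```

Hash-based splitting spreads load evenly but makes range scans touch every table, while time-based splitting keeps recent data together at the cost of hot spots on the newest table.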

Even with horizontal scaling, Nginx can become the bottleneck.

3.8 Seventh Evolution: LVS/F5 Load Balancing for Multiple Nginx

LVS (software) and F5 (hardware) operate at layer 4 and balance traffic among many Nginx instances. keepalived provides high availability for LVS: several LVS nodes share a virtual IP, and a standby node takes over the address if the active node fails.
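keepalived implements VRRP, in which the healthy node with the highest priority holds the virtual IP. The function below is a heavily simplified model of that election, written only to show the failover idea; it ignores advertisement timers, preemption settings, and everything else a real deployment configures. The node names and priorities are hypothetical.

```python
def elect_vip_holder(nodes):
    """Return the healthy node with the highest priority, or None.

    nodes maps a node name to a (priority, is_healthy) tuple.
    """
    healthy = [
        (priority, name)
        for name, (priority, is_healthy) in nodes.items()
        if is_healthy
    ]
    if not healthy:
        return None
    return max(healthy)[1]


lvs_nodes = {"lvs-a": (150, True), "lvs-b": (100, True)}
holder_before = elect_vip_holder(lvs_nodes)

lvs_nodes["lvs-a"] = (150, False)  # the active node fails its health check
holder_after = elect_vip_holder(lvs_nodes)
```

Clients keep using the same virtual IP throughout, so the failover is invisible to them apart from a brief interruption.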

At massive scale, LVS itself becomes a bottleneck, and geographic latency appears.

3.9 Eighth Evolution: DNS Round‑Robin Across Data Centers

Configure DNS to return multiple IPs, each pointing to a different data‑center, achieving inter‑data‑center load balancing.
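The effect of DNS round-robin can be simulated by rotating the record list on every query, so that successive clients see a different address first. This toy resolver is an assumption-laden sketch, not a DNS server; the domain name and the IP addresses (taken from documentation-reserved ranges) are placeholders.

```python
from collections import deque


class RoundRobinDns:
    """Return all addresses for a name, rotating the order on each query."""

    def __init__(self, records):
        self._records = {name: deque(ips) for name, ips in records.items()}

    def resolve(self, name):
        ips = self._records[name]
        answer = list(ips)
        ips.rotate(-1)  # the next query sees the following address first
        return answer


dns = RoundRobinDns({"shop.example": ["203.0.113.10", "198.51.100.20"]})
first_answer = dns.resolve("shop.example")
second_answer = dns.resolve("shop.example")
```

In practice resolvers and clients cache answers, so the distribution is only approximately even, and many providers offer geography-aware answers instead of simple rotation.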

Data richness and business growth increase analysis demands beyond a single database.

3.10 Ninth Evolution: Introducing NoSQL and Search Engines

Adopt HDFS for file storage, HBase/Redis for key‑value, Elasticsearch for full‑text search, and Kylin/Druid for multidimensional analysis to handle large‑scale data and diverse query needs.
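Full-text search engines are fast because they look terms up in an inverted index rather than scanning rows. The sketch below builds a tiny one in memory to show the idea; it is not Elasticsearch's implementation, it splits on whitespace only, and the sample product names are made up.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


products = {1: "red cotton shirt", 2: "blue cotton jeans", 3: "red wool scarf"}
product_index = build_index(products)
```

A relational database answering the same query with `LIKE '%cotton%'` would have to scan every row, which is why search is moved to a dedicated component at this stage.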

Adding components raises system complexity and operational overhead.

3.11 Tenth Evolution: Splitting a Monolith into Smaller Applications

Divide code by business domain, using Zookeeper as a distributed configuration center.

Shared modules across applications cause duplication and upgrade challenges.

3.12 Eleventh Evolution: Extracting Reusable Functions as Microservices

Common functions (user management, order, payment, authentication) become independent services accessed via HTTP, TCP, or RPC; frameworks like Dubbo or Spring Cloud provide service governance, rate limiting, circuit breaking, and degradation.
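Circuit breaking and degradation, mentioned above, can be illustrated with a small state machine: after a number of consecutive failures the breaker opens and callers get a fallback instead of waiting on a broken service, and after a cool-down one trial call is allowed through. This is a generic sketch, not the API of Dubbo or Spring Cloud; the class name, thresholds, and the injectable clock are assumptions.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when the circuit is open and no fallback was supplied."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, fallback=None, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Open: degrade immediately instead of calling the service.
                if fallback is not None:
                    return fallback()
                raise CircuitOpenError("circuit is open")
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would wrap each remote invocation, for example `breaker.call(fetch_order, order_id, fallback=lambda: None)`, where `fetch_order` is whatever client function the application already has.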

Service interfaces vary, increasing integration complexity.

3.13 Twelfth Evolution: Enterprise Service Bus (ESB) for Interface Unification

ESB abstracts protocol conversion, enabling a SOA‑style architecture where applications and services communicate through a unified bus, reducing coupling.

Growing number of services makes deployment and environment isolation harder.

3.14 Thirteenth Evolution: Containerization

Docker packages services into images; Kubernetes orchestrates containers, simplifying deployment, scaling, and isolation.
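To make the scaling part concrete, the function below applies the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric). It is only a sketch of the arithmetic; the real autoscaler also applies a tolerance band, stabilization windows, and readiness checks, and the default bounds here are arbitrary.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Scale the replica count in proportion to how far the metric is from target."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Four replicas at 90% average CPU with a 60% target.
scaled_up = desired_replicas(4, 90, 60)
```

In this example the result is 6 replicas; when the load later drops to 30% the same formula brings the count back down to 2.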

Containers make scaling easier, but the company must still own enough on-premise hardware to cover peak load, so utilization is low outside peak periods.

3.15 Fourteenth Evolution: Cloud Platform Adoption

Deploy to public cloud (IaaS, PaaS, SaaS) to leverage elastic resources, reduce operational cost, and use shared components such as Hadoop stacks or MPP databases.

The article omits challenges such as cross‑region data synchronization and distributed transaction implementation.

4. Architecture Design Summary

Is the evolution path mandatory? No; real projects may address multiple issues simultaneously.

How detailed should the design be? Sufficient to meet current performance goals while leaving room for future expansion.

Difference between service‑side and big‑data architecture? Service architecture focuses on application organization; big‑data architecture provides storage, computation, and analysis capabilities.

Design principles include N+1 redundancy, rollback capability, feature toggles, monitoring, multi-active data centers, mature technology adoption, resource isolation, horizontal scalability, buying rather than building non-core components, using commodity hardware, rapid iteration, and stateless interfaces.
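As a small worked example of the N+1 principle, the function below sizes a cluster so that peak load is still covered after one node is lost. The traffic and per-node capacity figures are hypothetical and would come from load testing in practice.

```python
import math


def nodes_required(peak_rps, rps_per_node, spares=1):
    """Nodes needed to serve peak load, plus spare nodes for redundancy."""
    if rps_per_node <= 0:
        raise ValueError("rps_per_node must be positive")
    return math.ceil(peak_rps / rps_per_node) + spares


# 9,000 requests/s at peak, 2,000 requests/s per node: 5 nodes for load, 1 spare.
cluster_size = nodes_required(9000, 2000)
```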

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

Open Source Linux

Focused on sharing Linux/Unix content, covering fundamentals, system development, network programming, automation/operations, cloud computing, and related professional knowledge.
