
Evolution of Server Architecture for High Concurrency: From Single‑Machine to Cloud‑Native Solutions

This article uses Taobao as a case study to trace the step‑by‑step evolution of server‑side architecture from a single‑machine deployment to distributed clusters, caching, load‑balancing, database sharding, micro‑services, containerization and finally cloud platforms, summarizing the key technologies and design principles at each stage.

Architecture Digest

1. Overview

This article uses Taobao as an example to illustrate the evolution of server‑side architecture from a hundred concurrent users to tens of millions, enumerating the technologies encountered at each stage and providing a holistic view of architectural progression together with a set of design principles.

2. Basic Concepts

Before introducing the architecture, we briefly define several fundamental concepts to ensure all readers share a common understanding.

Distributed: Multiple modules deployed on different servers constitute a distributed system, e.g., Tomcat and the database running on separate machines.

High Availability: When some nodes fail, other nodes can take over and continue providing service.

Cluster: A group of servers offering a unified service, such as Zookeeper's master‑slave nodes, where clients can connect to any node and failover occurs automatically.

Load Balancing: Requests are evenly distributed across multiple nodes so that each node handles a comparable load.

Forward and Reverse Proxy: A forward proxy acts on behalf of internal systems to access external networks, while a reverse proxy receives external requests and forwards them to internal servers.

3. Architecture Evolution

3.1 Single‑Machine Architecture

In the early stage of Taobao, both the application (Tomcat) and the database were deployed on the same server because traffic was low. A user’s browser resolves www.taobao.com to an IP address (e.g., 10.102.4.1) and then contacts the Tomcat on that machine.

As the number of users grows, competition for resources between Tomcat and the database makes a single machine insufficient.

3.2 First Evolution: Separate Tomcat and Database

Tomcat and the database each occupy their own server, dramatically improving the performance of both components.

With further growth, concurrent reads/writes to the database become the bottleneck.

3.3 Second Evolution: Introduce Local and Distributed Caches

A local cache (e.g., Memcached on the same host, or an in‑process cache inside the Tomcat JVM) is added, and a distributed cache (e.g., Redis) is deployed externally to store hot product data or pre‑rendered HTML pages. The caches intercept the majority of requests before they reach the database, sharply reducing database pressure. New issues arise in turn: cache consistency, cache penetration, cache avalanche, and hot‑spot invalidation.
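The read path described above is the cache‑aside pattern. The sketch below (an illustration with made‑up keys and an in‑memory dict standing in for Redis and the database) also shows two of the mitigations the article alludes to: briefly caching "not found" results to blunt cache penetration, and adding TTL jitter so keys do not all expire at once (cache avalanche).

```python
import random
import time

# Hypothetical in-memory stand-ins for Redis and the database.
cache = {}          # key -> (value, expires_at)
database = {"item:1": "iPhone", "item:2": "MacBook"}

def get_product(key, base_ttl=300):
    """Cache-aside read: try the cache first, fall back to the database."""
    entry = cache.get(key)
    if entry and entry[1] > time.time():
        return entry[0]                      # cache hit
    value = database.get(key)                # cache miss: read the database
    # Cache even "not found" results briefly to blunt cache penetration,
    # and add TTL jitter so keys don't all expire at once (cache avalanche).
    ttl = (30 if value is None else base_ttl) + random.uniform(0, 60)
    cache[key] = (value, time.time() + ttl)
    return value
```

A real deployment would replace the dicts with a Redis client and the primary database, but the control flow is the same.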

The cache absorbs most traffic, but as users continue to increase, the remaining load falls on the single Tomcat, causing response times to degrade.

3.4 Third Evolution: Reverse Proxy for Load Balancing

Multiple Tomcat instances are deployed behind a reverse proxy (Nginx), which distributes incoming HTTP requests across the pool. Assuming each Tomcat handles 100 concurrent connections and Nginx can handle 50,000, 500 Tomcat instances could theoretically support 50,000 concurrent users. Technologies involved include Nginx, HAProxy, session sharing (since successive requests from one user may now land on different instances), and file upload/download handling.
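A minimal Nginx configuration for this stage might look like the following sketch (server addresses and the `least_conn` policy are illustrative assumptions, not values from the article):

```nginx
# Hypothetical upstream pool: Nginx spreads HTTP requests across Tomcats.
upstream tomcat_pool {
    least_conn;                    # send each request to the least-busy node
    server 10.102.4.11:8080;
    server 10.102.4.12:8080;
    server 10.102.4.13:8080 backup;   # used only if the others are down
}

server {
    listen 80;
    server_name www.taobao.com;

    location / {
        proxy_pass http://tomcat_pool;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```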

While reverse proxy greatly raises the concurrency capacity of the application servers, the database becomes the next bottleneck.

3.5 Fourth Evolution: Database Read‑Write Splitting

The database is split into a write master and multiple read replicas. Writes go to the master; reads are served by the replicas, which stay synchronized via replication. Middleware such as Mycat routes each SQL statement transparently, so the application sees a single logical database. This introduces data‑sync lag and read‑after‑write consistency challenges.
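The routing decision such middleware makes can be sketched in a few lines (a toy illustration, not Mycat's implementation; the node names are made up, and a real router would parse SQL rather than prefix‑match it):

```python
import itertools

class ReadWriteRouter:
    """Minimal read/write splitting: writes go to the master,
    reads rotate round-robin over the replicas."""

    def __init__(self, master, replicas):
        self.master = master
        self._replicas = itertools.cycle(replicas)

    def route(self, sql):
        # A real router parses SQL; prefix matching keeps the sketch short.
        if sql.lstrip().upper().startswith(("SELECT", "SHOW")):
            return next(self._replicas)
        return self.master

router = ReadWriteRouter("db-master", ["db-replica-1", "db-replica-2"])
```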

Different business lines now compete for database resources, leading to performance interference.

3.6 Fifth Evolution: Business‑Based Database Sharding

Data for each business domain is stored in separate databases, reducing cross‑business contention. However, cross‑business joins become difficult and require additional solutions.

As traffic grows, the single write‑master eventually hits its performance ceiling.

3.7 Sixth Evolution: Split Large Tables into Small Tables

Large tables (e.g., comments, payment records) are hashed or time‑partitioned into many smaller tables, enabling horizontal scaling. Mycat can manage routing and access control for these sharded tables.
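A hash‑based sharding rule of the kind Mycat evaluates when rewriting queries can be sketched as follows (the shard count and table naming are assumptions for illustration; note the shard count must stay fixed once data is written, or existing rows become unroutable):

```python
import hashlib

NUM_SHARDS = 16  # hypothetical; must stay fixed once data has been written

def comment_table_for(product_id: str) -> str:
    """Hash-route a comment row to one of NUM_SHARDS physical tables."""
    digest = hashlib.md5(product_id.encode()).hexdigest()
    shard = int(digest, 16) % NUM_SHARDS
    return f"t_comment_{shard:02d}"
```

The same key always maps to the same table, so both writes and point reads can be routed without a lookup table.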

This approach effectively turns the database into a distributed system, often realized as an MPP (Massively Parallel Processing) architecture. Open‑source MPP databases include Greenplum, TiDB, and Postgres‑XC; commercial options include GBase, SnowballDB, and Huawei LibrA.

Both the database and Tomcat can now scale horizontally, but the single Nginx instance becomes the next bottleneck.

3.8 Seventh Evolution: LVS or F5 for Multi‑Nginx Load Balancing

When Nginx becomes a bottleneck, a layer‑4 load balancer such as LVS (software) or F5 (hardware) is introduced in front of it. LVS runs in kernel space and can forward hundreds of thousands of concurrent TCP connections; F5 offers still higher performance at higher cost. High availability is achieved with keepalived and virtual IPs: two machines share one virtual IP, and if the active machine fails, the standby takes the IP over.
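The keepalived half of this setup is a short VRRP configuration. The sketch below uses assumed values (interface name, router ID, and the virtual IP) purely for illustration:

```
# Minimal keepalived sketch: two load-balancer hosts share virtual IP
# 10.102.4.100; if the MASTER dies, VRRP moves the VIP to the BACKUP,
# so clients keep a single stable entry address.
vrrp_instance VI_1 {
    state MASTER            # the peer machine uses "state BACKUP"
    interface eth0
    virtual_router_id 51
    priority 100            # peer uses a lower priority, e.g. 90
    advert_int 1
    virtual_ipaddress {
        10.102.4.100
    }
}
```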

Even LVS eventually reaches a limit when concurrency reaches hundreds of thousands, and geographic latency becomes a concern.

3.9 Eighth Evolution: DNS Round‑Robin for Inter‑Data‑Center Load Balancing

Multiple IP addresses are assigned to a single domain in DNS, each pointing to a different data‑center. DNS round‑robin (or other policies) distributes users across data‑centers, achieving data‑center‑level horizontal scaling.
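Conceptually, the authoritative DNS server rotates the order of A records on every query, so successive clients favor different data centers. A toy model (the IPs are made up):

```python
from collections import deque

class RoundRobinDNS:
    """Toy authoritative server: rotates the A-record list on each query,
    so successive clients connect to different data centers first."""

    def __init__(self, records):
        self._records = deque(records)

    def resolve(self, domain):
        answer = list(self._records)
        self._records.rotate(-1)     # next query sees a different first IP
        return answer

dns = RoundRobinDNS(["10.102.4.1", "121.14.88.2"])  # two data-center VIPs
```

In practice, resolver caching and geo‑aware policies (e.g., returning the nearest data center) complicate this picture, but the load‑spreading principle is the same.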

As data and business become richer, a pure database can no longer satisfy all requirements.

3.10 Ninth Evolution: Introduce NoSQL and Search Engines

When relational databases cannot handle massive data or complex queries, specialized solutions are added: HDFS for large file storage, HBase/Redis for key‑value stores, ElasticSearch for full‑text search, Kylin/Druid for multi‑dimensional analytics, etc.
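For instance, a product search that would be a slow `LIKE '%...%'` scan in a relational database becomes a ranked full‑text query in ElasticSearch. The request body below is a hedged sketch (the index name, fields, and values are assumptions), sent to something like `POST /products/_search`:

```json
{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "wireless headphones" } }
      ],
      "filter": [
        { "range": { "price": { "lte": 500 } } }
      ]
    }
  },
  "size": 10
}
```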

Adding more components increases system complexity and introduces consistency and operational challenges.

3.11 Tenth Evolution: Split a Monolithic Application into Smaller Services

Code is divided by business domain, allowing each service to evolve independently. Shared configuration can be managed via a distributed configuration center such as Zookeeper.

Duplicated common modules across services increase maintenance effort.

3.12 Eleventh Evolution: Extract Reusable Functions into Micro‑services

Common functionalities (user management, order, payment, authentication) are packaged as independent micro‑services accessed via HTTP, TCP, or RPC. Frameworks like Dubbo or Spring Cloud provide service governance, rate limiting, circuit breaking, etc.

Different services may require different access protocols, leading to a complex call chain.

3.13 Twelfth Evolution: Introduce an Enterprise Service Bus (ESB)

ESB unifies protocol conversion, allowing applications to access backend services through a single entry point and reducing coupling between services. This architecture resembles SOA and shares concepts with micro‑services.

As the number of services grows, deployment and operational complexity increase dramatically.

3.14 Thirteenth Evolution: Containerization for Isolation and Dynamic Management

Docker packages applications into images, while Kubernetes (K8s) orchestrates their deployment, scaling, and lifecycle management, simplifying operations and enabling rapid scaling during peak periods.
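A Deployment manifest is the unit K8s works with at this stage. The sketch below is a hypothetical example (service name, image registry, and resource limits are all assumed):

```yaml
# Hypothetical Deployment: three replicas of an order service image.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: order-service
  template:
    metadata:
      labels:
        app: order-service
    spec:
      containers:
        - name: order-service
          image: registry.example.com/order-service:1.0.0  # assumed image
          ports:
            - containerPort: 8080
          resources:
            limits:
              cpu: "1"
              memory: 512Mi
```

During a promotion peak, scaling out is one command, e.g. `kubectl scale deployment order-service --replicas=10`, and scaling back down afterward is equally simple.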

Even with containers, the underlying hardware still needs to be owned and managed, leading to under‑utilized resources during off‑peak times.

3.15 Fourteenth Evolution: Move to Cloud Platforms

The system is deployed on public cloud infrastructure, leveraging elastic resources to handle traffic spikes (e.g., during large‑scale promotions). Combined with Docker and K8s, resources can be provisioned on‑demand and released afterward, achieving cost‑effective scaling.

Cloud services are categorized as IaaS (infrastructure), PaaS (platform), and SaaS (software), each providing different levels of abstraction and managed capabilities.

While cloud platforms solve many hardware‑related problems, they also introduce new challenges such as cross‑region data synchronization and distributed transaction management.

4. Architecture Design Summary

Must the architecture follow the exact evolution path described? No. The sequence is illustrative; real‑world scenarios may require addressing multiple bottlenecks simultaneously or prioritizing different concerns.

How detailed should the architecture be for a given system? For a one‑off project with clear performance targets, design enough to meet those targets while leaving extension points. For continuously evolving platforms (e.g., e‑commerce), design for the next growth stage and iterate.

What is the difference between server‑side architecture and big‑data architecture? Big‑data architecture focuses on data collection, storage, processing, and analysis (HDFS, Spark, NoSQL, etc.), whereas server‑side architecture deals with application organization and relies on big‑data components for underlying capabilities.

Key design principles:

N+1 design: avoid single points of failure; every component has at least one redundant instance.
Rollback capability: ensure forward compatibility and the ability to roll back versions.
Feature toggles: allow quick disabling of problematic features.
Monitoring: embed observability from the design phase onward.
Multi‑active data centers: achieve high availability across locations.
Adopt mature technologies: reduce risk from immature open‑source projects.
Resource isolation: prevent one business from monopolizing shared resources.
Horizontal scalability: design for scale‑out to avoid bottlenecks.
Buy non‑core components: leverage commercial solutions when in‑house development cost is high.
Use commercial hardware: improve reliability and lower failure rates.
Rapid iteration: ship small features quickly to validate them and reduce delivery risk.
Stateless design: keep service interfaces stateless so any instance can serve any request.

Source: https://segmentfault.com/a/1190000018626163

Tags: distributed systems, architecture, cloud computing, microservices, scalability, load balancing, database sharding
Written by

Architecture Digest

Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.
