
Evolution of High‑Concurrency Backend Architecture: From Single‑Machine to Cloud‑Native Solutions

The article walks through Taobao's backend architecture evolution—from a single‑machine setup to distributed caching, load balancing, database sharding, microservices, containerization, and finally cloud deployment—explaining each stage's technologies, challenges, and design principles for building scalable, highly available systems.


1. Overview

This article uses Taobao as an example to introduce the evolution of server‑side architecture from handling a few hundred concurrent requests to tens of millions, listing the technologies encountered at each stage and summarizing architectural design principles.

2. Basic Concepts

Before discussing architecture, several fundamental concepts are introduced:

Distributed : Multiple modules deployed on different servers, e.g., Tomcat and the database on separate machines.

High Availability : The system continues to provide service when some nodes fail.

Cluster : A set of servers providing a unified service, such as Zookeeper's master‑slave ensemble.

Load Balancing : Evenly distributing incoming requests across multiple nodes.

Forward and Reverse Proxy : Forward proxy lets internal systems access external networks; reverse proxy forwards external requests to internal servers.

3. Architecture Evolution

3.1 Single‑Machine Architecture

Initially, Tomcat and the database are deployed on the same server. As user numbers grow, the two compete for CPU, memory, and disk I/O, and a single machine can no longer sustain the load.

3.2 First Evolution: Separate Tomcat and Database

Tomcat and the database each occupy dedicated servers, significantly improving performance.

As users increase, concurrent reads/writes to the database become the bottleneck.

3.3 Second Evolution: Introduce Local and Distributed Caches

A local cache is added close to the application (e.g., memcached on the same server as Tomcat, or an in‑process cache inside the JVM), and a distributed cache (e.g., Redis) is deployed externally to store hot items or pre‑rendered HTML pages, dramatically reducing database load.
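
The read path of such a cache is typically the cache‑aside pattern: check the cache first, and only fall through to the database on a miss. A minimal sketch, where plain `ConcurrentHashMap`s stand in for the real cache (Redis) and the real database, and all names are illustrative:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Cache-aside sketch: `cache` stands in for Redis, `database` for the real
// datastore. In production these would be network clients, not maps.
public class CacheAside {
    static final Map<String, String> cache = new ConcurrentHashMap<>();
    static final Map<String, String> database = new ConcurrentHashMap<>();

    static String get(String key) {
        String value = cache.get(key);
        if (value != null) {
            return value;              // cache hit: database is not touched
        }
        value = database.get(key);     // cache miss: read through to the DB
        if (value != null) {
            cache.put(key, value);     // populate cache for subsequent reads
        }
        return value;
    }

    public static void main(String[] args) {
        database.put("item:1", "iPhone");
        System.out.println(get("item:1")); // first read misses and fills the cache
        System.out.println(get("item:1")); // second read is served from the cache
    }
}
```

Writes would additionally have to invalidate or update the cached entry, which is where consistency questions (stale reads, cache stampedes) come from.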

Cache handles most requests, but Tomcat eventually becomes the performance limiter.

3.4 Third Evolution: Reverse Proxy for Load Balancing

Multiple Tomcat instances are deployed, and a reverse proxy (Nginx) distributes requests evenly among them. Assuming each Tomcat handles about 100 concurrent connections while Nginx can proxy roughly 50 000, putting several hundred Tomcat instances behind Nginx lets the system theoretically support 50 000 concurrent users.
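
Nginx's default distribution strategy is round robin: each request goes to the next backend in turn. The selection logic itself is simple enough to sketch in a few lines of Java (Nginx is of course configured, not coded this way; backend addresses here are placeholders):

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Round-robin backend selection, the default strategy of a reverse proxy
// like Nginx. An atomic counter makes selection safe under concurrency.
public class RoundRobin {
    private final List<String> backends;
    private final AtomicInteger counter = new AtomicInteger();

    public RoundRobin(List<String> backends) {
        this.backends = backends;
    }

    public String next() {
        // floorMod keeps the index non-negative even if the counter overflows
        int i = Math.floorMod(counter.getAndIncrement(), backends.size());
        return backends.get(i);
    }

    public static void main(String[] args) {
        RoundRobin lb = new RoundRobin(List.of("tomcat-1:8080", "tomcat-2:8080"));
        System.out.println(lb.next()); // tomcat-1:8080
        System.out.println(lb.next()); // tomcat-2:8080
        System.out.println(lb.next()); // tomcat-1:8080
    }
}
```

Real proxies extend this with weights, health checks, and session affinity (e.g., ip_hash) when backends are not fully stateless.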

Reverse proxy greatly increases application concurrency, but the database soon becomes the new bottleneck.

3.5 Fourth Evolution: Database Read/Write Splitting

The database is split into one write master and multiple read replicas, with middleware such as Mycat routing each statement to the right node. Writes go to the master and reads are served by replicas; because replication introduces lag, data that must be read immediately after a write can be served from the cache instead.
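
At its core, the routing decision the middleware makes is: read statements go to a replica, everything else goes to the master. A deliberately rough sketch, with placeholder connection names (real middleware like Mycat parses SQL properly rather than checking a prefix):

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Read/write splitting sketch: SELECTs round-robin over replicas,
// all other statements go to the master. Node names are placeholders.
public class ReadWriteRouter {
    private final String master;
    private final List<String> replicas;
    private final AtomicInteger idx = new AtomicInteger();

    public ReadWriteRouter(String master, List<String> replicas) {
        this.master = master;
        this.replicas = replicas;
    }

    public String route(String sql) {
        String s = sql.trim().toLowerCase();
        if (s.startsWith("select")) {
            // reads are spread across replicas
            return replicas.get(Math.floorMod(idx.getAndIncrement(), replicas.size()));
        }
        // writes (INSERT/UPDATE/DELETE/DDL) must hit the master
        return master;
    }
}
```

A production router also has to handle transactions (everything inside one goes to the master) and "read your own write" cases where replica lag would be visible.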

Different business modules compete for database resources, causing performance interference.

3.6 Fifth Evolution: Business‑Level Sharding

Data for each business is stored in separate databases, reducing contention. High‑traffic businesses can be scaled independently, though cross‑business joins become more complex.

The write master eventually hits performance limits as user count grows.

3.7 Sixth Evolution: Split Large Tables into Small Tables

Tables are horizontally partitioned (e.g., by product ID or hour) so that data is evenly distributed across many servers. This enables MPP‑style parallel processing using databases such as Greenplum, TiDB, or PostgreSQL‑XC.
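
Partitioning by product ID usually means hashing the ID to pick a physical table (or shard). A minimal sketch, with an illustrative shard count and table-name convention:

```java
// Horizontal sharding sketch: route a row to one of SHARDS physical tables
// by taking the product ID modulo the shard count. Names are illustrative.
public class TableSharding {
    static final int SHARDS = 4;

    static String tableFor(long productId) {
        int shard = (int) Math.floorMod(productId, SHARDS);
        return "t_product_" + shard;
    }

    public static void main(String[] args) {
        System.out.println(tableFor(10001L)); // t_product_1
        System.out.println(tableFor(10002L)); // t_product_2
    }
}
```

The catch this scheme illustrates: changing `SHARDS` remaps almost every key, which is why resharding is painful and why consistent hashing or range-based schemes are often preferred when shard counts must grow.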

Both the database and Tomcat can now scale horizontally, but Nginx becomes the next bottleneck.

3.8 Seventh Evolution: LVS/F5 for Multi‑Nginx Load Balancing

LVS (software) or F5 (hardware) operates at layer 4 to balance traffic among multiple Nginx instances, supporting hundreds of thousands of concurrent connections. Keepalived provides virtual IP failover for high availability.

When LVS reaches its limit, further scaling requires multi‑region deployment.

3.9 Eighth Evolution: DNS Round‑Robin for Inter‑Data‑Center Balancing

DNS is configured with multiple IPs, each pointing to a different data‑center. Users are directed to a data‑center via round‑robin or other policies, achieving data‑center‑level horizontal scaling.

Database alone cannot satisfy increasingly rich analytical and retrieval needs.

3.10 Ninth Evolution: Introduce NoSQL and Search Engines

For massive data, solutions such as HDFS, HBase, Redis, Elasticsearch, Kylin, or Druid are added to handle file storage, key‑value access, full‑text search, and multidimensional analytics.

Adding components increases system complexity and operational overhead.

3.11 Tenth Evolution: Split Large Application into Smaller Services

Code is divided by business domain, allowing independent deployment and upgrades. Shared configuration can be managed via Zookeeper.

Duplicated common modules across applications make coordinated upgrades difficult.

3.12 Eleventh Evolution: Extract Reusable Functions as Microservices

Common functionalities (user management, order, payment, authentication) are extracted into independent services accessed via HTTP, TCP, or RPC. Frameworks like Dubbo or Spring Cloud provide service governance, rate limiting, circuit breaking, and degradation.
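
Circuit breaking, one of the governance features mentioned, stops a failing downstream service from dragging its callers down: after repeated failures the breaker "opens" and rejects calls immediately instead of waiting on timeouts. A minimal sketch of the idea (frameworks like Dubbo or Spring Cloud provide far more complete implementations; the threshold and the manual reset here are illustrative simplifications):

```java
// Minimal circuit-breaker sketch: `threshold` consecutive failures open the
// breaker; calls are then rejected until reset. The half-open probing state
// that real breakers use to recover automatically is omitted for brevity.
public class CircuitBreaker {
    private final int threshold;
    private int failures = 0;
    private boolean open = false;

    public CircuitBreaker(int threshold) {
        this.threshold = threshold;
    }

    public boolean allowRequest() {
        return !open; // open breaker fails fast instead of waiting on timeouts
    }

    public void recordFailure() {
        failures++;
        if (failures >= threshold) {
            open = true; // trip: stop the failure from cascading upstream
        }
    }

    public void recordSuccess() {
        failures = 0; // a healthy call resets the consecutive-failure count
    }

    public void reset() {
        failures = 0;
        open = false;
    }
}
```

Rate limiting and degradation follow the same spirit: shed or simplify work under stress so the core path keeps serving.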

Different services use different access methods, increasing integration complexity.

3.13 Twelfth Evolution: Introduce Enterprise Service Bus (ESB)

ESB unifies protocol conversion and service invocation, reducing coupling between applications and services. This architecture resembles SOA and overlaps with microservices.

Growing number of services and deployments makes operations increasingly difficult.

3.14 Thirteenth Evolution: Containerization

Docker packages applications into images; Kubernetes orchestrates deployment, scaling, and management, simplifying operations and enabling rapid resource provisioning for traffic spikes.

Even with containers, idle machines are needed for peak periods, leading to high cost.

3.15 Fourteenth Evolution: Move to Cloud Platform

The system is deployed on public cloud (IaaS, PaaS, SaaS), allowing on‑demand resource allocation, cost‑effective scaling, and access to shared services such as Hadoop, MPP databases, and ready‑made applications.

4. Architecture Design Summary

Must the architecture follow the exact evolution path described? No. The order is illustrative; real projects may address multiple bottlenecks simultaneously or prioritize different concerns based on business needs.

How detailed should the design be for an upcoming system? For a one‑off project with clear performance targets, design enough to meet those targets while leaving extension points. For continuously evolving platforms (e.g., e‑commerce), design for the next growth stage and iterate.

Difference between backend architecture and big‑data architecture? Big‑data architecture focuses on data collection, storage, processing, and analysis (HDFS, Spark, NoSQL, etc.), whereas backend architecture deals with application organization and service delivery, often relying on big‑data components for underlying capabilities.

Key architectural principles:

N+1 design – no single point of failure.

Rollback capability – ensure forward compatibility and version rollback.

Feature toggle – ability to disable functions quickly during incidents.

Monitoring – design monitoring from the start.

Multi‑active data centers – maintain service availability across locations.

Use mature technologies – avoid untested or unsupported solutions.

Resource isolation – prevent a single business from monopolizing resources.

Horizontal scalability – design for scaling out to avoid bottlenecks.

Buy non‑core components – reduce development effort for ancillary features.

Commercial hardware – lower hardware failure rates.

Rapid iteration – develop small features quickly to validate and reduce risk.

Stateless design – services should not rely on previous request state.

Author: huashiou – source: segmentfault.com/a/1190000018626163
Tags: backend, distributed systems, microservices, scalability, load balancing, caching, cloud