
Evolution of Web Architecture: From Single‑Server Sites to Distributed Multi‑Machine Systems

The article traces the evolution of web architecture in stages: simple single-server Java/JSP sites, in-memory caching, load-balanced application logic across multiple machines, read/write separation with master-slave replication or message-queue synchronization, horizontal and vertical sharding, and finally multiple web servers behind load balancers for scalability and reliability.

Baidu Tech Salon

Apr 29, 2014


In 2005 I started building websites on Linux using JSP, later adding frameworks such as WebWork and Hibernate. After joining a company I switched to C/C++ for distributed computing and storage, which sparked the question: "What else can C be used for besides HelloWorld?"

Websites need different architectures depending on requirements, traffic, and business models. I will outline the stages I have experienced.

Stage 1 – Building Your Own Site

Early sites were simple Java applications using frameworks like Struts, Spring, and Hibernate for URL routing, MVC separation, and ORM. While these frameworks improve code reuse, they may introduce reflection overhead and reduce raw performance, which is acceptable when traffic is low.

Stage 3 – Reducing Disk Pressure

As traffic grows, disk I/O becomes a bottleneck, which shows up as high I/O wait in Linux tools such as vmstat and iostat. The classic solution is to introduce memory caching. Memory is volatile, so it holds only temporary copies of hot data, while the persistent data stays on disk.

Cache is usually key‑value based and often implemented with LRU eviction. Two common cache architectures are:

In‑process (penetration) cache – the application links directly to a cache library.

Out‑of‑process (bypass) cache – a separate service such as Memcached provides cache access.

By sizing the cache appropriately, hit rates can increase dramatically, often improving response time by an order of magnitude. However, cache failures can shift load back to the database, so cache reliability must be considered.
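The LRU eviction and hit-rate ideas above can be illustrated with a minimal in-process sketch. The `LRUCache` class and the `load_from_db` function are hypothetical names invented for this example; a real deployment would more likely use an existing cache library or a service such as Memcached.

```python
from collections import OrderedDict


class LRUCache:
    """Key-value cache that evicts the least recently used entry when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key, loader):
        if key in self.data:
            self.data.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self.data[key]
        # Cache miss: fall back to the slower persistent store.
        self.misses += 1
        value = loader(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used key
        return value

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def load_from_db(key):
    # Hypothetical stand-in for a disk-backed database query.
    return f"row-{key}"


cache = LRUCache(capacity=2)
for key in ["a", "b", "a", "c", "b"]:
    cache.get(key, load_from_db)
print(cache.hit_rate())
```

With a capacity of only two entries, most of these lookups miss and go to the loader. Raising `capacity` so the working set fits is what pushes the hit rate up, and every miss (or a cache outage) lands on the database instead.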

Stage 5 – Multi‑Machine Logic

When a single server reaches CPU limits, the application is split across multiple machines. Load balancing (random, round‑robin, consistent hashing) distributes requests, while session synchronization becomes a challenge. Three typical solutions are:

Session synchronization across nodes.

Centralized session service.

Sticky sessions using consistent hashing (the simplest option, but the sessions held on a node are lost if that node fails).
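As a rough illustration of the load-balancing strategies mentioned above, the sketch below shows round-robin selection and a consistent hash ring used for sticky sessions. The node names, the `ConsistentHashRing` class, and the choice of MD5 with 100 virtual nodes per server are all assumptions made for the example, not details from the original talk.

```python
import bisect
import hashlib
from itertools import cycle


def _hash(value):
    # Stable hash so every balancer instance agrees on placement.
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class ConsistentHashRing:
    def __init__(self, nodes, replicas=100):
        self.replicas = replicas  # virtual nodes per server
        self.ring = []            # sorted list of (hash, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.replicas):
            bisect.insort(self.ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self.ring = [(h, n) for h, n in self.ring if n != node]

    def node_for(self, session_id):
        # First virtual node clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self.ring, (_hash(session_id), ""))
        return self.ring[idx % len(self.ring)][1]


nodes = ["app-1", "app-2", "app-3"]

# Round-robin: each request simply goes to the next node in turn.
round_robin = cycle(nodes)
first_four = [next(round_robin) for _ in range(4)]

# Consistent hashing: the same session always maps to the same node.
ring = ConsistentHashRing(nodes)
sessions = [f"session-{i}" for i in range(1000)]
before = {s: ring.node_for(s) for s in sessions}
ring.remove("app-2")  # simulate losing a node
after = {s: ring.node_for(s) for s in sessions}
moved = [s for s in sessions if before[s] != after[s]]
print(len(moved), all(before[s] == "app-2" for s in moved))
```

Only the sessions that were pinned to the removed node get remapped, which is why this approach is simple to operate. Those remapped sessions still lose their in-memory state, which is the weakness noted above.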

Database locking and transaction isolation are used to handle concurrent writes, but excessive locking hurts performance.

Stage 6 – Read/Write Separation

Single‑node databases eventually become a bottleneck due to read‑write contention. Strategies include reducing read volume (higher cache hit rates, multi‑level caches) and reducing write volume (batching writes). Distributed caches and consistent‑hash sharding help balance load across nodes.
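Batching writes can be sketched as a small buffer that flushes several records in one transaction. This is only an illustration: the `BatchingWriter` class, the `page_views` table, and the batch size of 3 are hypothetical, and an in-memory SQLite database stands in for the real one.

```python
import sqlite3


class BatchingWriter:
    """Buffers records and hands them to flush_fn in groups."""

    def __init__(self, flush_fn, batch_size):
        self.flush_fn = flush_fn
        self.batch_size = batch_size
        self.pending = []

    def write(self, record):
        self.pending.append(record)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.pending:
            self.flush_fn(list(self.pending))
            self.pending.clear()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (url TEXT)")
batch_sizes = []


def flush_to_db(rows):
    batch_sizes.append(len(rows))
    with conn:  # one transaction (and one commit) per batch
        conn.executemany(
            "INSERT INTO page_views (url) VALUES (?)", [(r,) for r in rows]
        )


writer = BatchingWriter(flush_to_db, batch_size=3)
for i in range(7):
    writer.write(f"/post/{i}")
writer.flush()  # push out the final partial batch

row_count = conn.execute("SELECT COUNT(*) FROM page_views").fetchone()[0]
print(batch_sizes, row_count)
```

The trade-off is durability: records still sitting in `pending` are lost if the process crashes before the next flush, so this suits data such as counters or logs better than orders or payments.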

Data synchronization can be achieved via master‑slave replication or by using a message queue (MQ) to broadcast changes to multiple consumers. MQ‑based designs scale well, but readers only see a change after their consumer has applied it (eventual consistency), and the queue itself becomes a single point of failure unless it is made redundant.
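The queue-based synchronization described above can be modeled in a few lines. This is a toy, single-process sketch: `Master`, `Replica`, and the per-replica `inbox` are hypothetical stand-ins for a real database and message broker.

```python
from collections import deque


class Replica:
    """Read-only copy that applies changes received from the queue."""

    def __init__(self):
        self.data = {}
        self.inbox = deque()  # this consumer's pending messages

    def apply_pending(self):
        while self.inbox:
            key, value = self.inbox.popleft()
            self.data[key] = value


class Master:
    """Accepts all writes and broadcasts each change to every replica."""

    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas

    def write(self, key, value):
        self.data[key] = value
        for replica in self.replicas:
            replica.inbox.append((key, value))


replicas = [Replica(), Replica()]
master = Master(replicas)

master.write("post:1", "hello")
stale_read = replicas[0].data.get("post:1")  # change not applied yet

for replica in replicas:
    replica.apply_pending()
fresh_read = replicas[0].data.get("post:1")

print(stale_read, fresh_read)
```

The window between `write` and `apply_pending` is the eventual-consistency gap: a read routed to a replica in that window returns old data, so reads that must see the latest write are usually sent to the master.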

Stage 7 – Sharding (Splitting)

When data size grows, horizontal or vertical sharding is applied. Horizontal sharding splits rows across multiple tables/databases (e.g., by range or modulo), while vertical sharding separates columns into different tables. Both reduce per‑node load and improve performance.
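The two horizontal sharding rules mentioned above (modulo and range) can be sketched as simple routing functions. The shard count, the range boundaries, and the `blog_NN` table naming are assumptions chosen for illustration.

```python
import bisect


def shard_by_modulo(user_id, shard_count):
    # Spreads IDs evenly, but changing shard_count remaps most keys.
    return user_id % shard_count


# Hypothetical boundaries: shard 0 holds IDs below 1,000,000,
# shard 1 holds IDs below 2,000,000, and shard 2 holds the rest.
RANGE_UPPER_BOUNDS = [1_000_000, 2_000_000]


def shard_by_range(user_id, upper_bounds=RANGE_UPPER_BOUNDS):
    # Easy to extend by adding a new boundary, but recent IDs can
    # concentrate on the last shard.
    return bisect.bisect_right(upper_bounds, user_id)


def table_for(user_id, shard_count=4):
    return f"blog_{shard_by_modulo(user_id, shard_count):02d}"


print(shard_by_modulo(10, 4), shard_by_range(1_500_000), table_for(10))
```

Whichever rule is used, every query needs the sharding key (here `user_id`) so the application can pick the right table or database before it runs the statement.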

Practical example: a blog platform splits image, blog, and user services by domain (img.example.com, blog.example.com, user.example.com) and further shards the blog data horizontally.

Stage 8 – WebServer Multi‑Machine Deployment

To avoid a single point of failure, multiple WebServers are placed behind a load‑balancing layer. DNS‑based routing can spread traffic across several public IPs, while a virtual server (VS) layer such as LVS exposes one virtual IP and forwards requests to backend servers that need no public IP of their own. Hardware (e.g., F5) or software load balancers provide the necessary scalability and reliability.
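A load-balancing layer removes the single point of failure only if it stops sending traffic to dead backends. The sketch below shows that idea with a round-robin balancer that skips servers marked as down. The `LoadBalancer` class and backend names are hypothetical, and real balancers such as LVS or F5 detect failures with their own health checks rather than manual calls.

```python
class LoadBalancer:
    """Round-robin over backends, skipping any that are marked down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self.position = 0

    def mark_down(self, backend):
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Try each backend at most once per call.
        for _ in range(len(self.backends)):
            backend = self.backends[self.position % len(self.backends)]
            self.position += 1
            if backend in self.healthy:
                return backend
        raise RuntimeError("no healthy backends available")


lb = LoadBalancer(["web-1", "web-2", "web-3"])
first = lb.pick()
lb.mark_down("web-2")  # simulate a failed WebServer
after_failure = [lb.pick() for _ in range(3)]
print(first, after_failure)
```

Requests keep flowing to the remaining servers after `web-2` is marked down. The balancer itself then becomes the component that needs redundancy.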


Written by

Baidu Tech Salon

Baidu Tech Salon, organized by Baidu's Technology Management Department, is a monthly offline event that shares cutting‑edge tech trends from Baidu and the industry, providing a free platform for mid‑to‑senior engineers to exchange ideas.
