
How Large-Scale Websites Evolve Their Architecture for Performance and Scalability

This article traces the evolution of large‑scale website architectures—from single‑server setups to distributed services—detailing how separation of concerns, caching, load balancing, database sharding, CDN, reverse proxies, NoSQL, and service decomposition collectively improve performance, scalability, and reliability.

21CTO

Nov 21, 2015


Introduction

A mature large website's architecture evolves with user growth and business expansion, gradually adding performance, high availability, and security features.

1. Initial Architecture

At the beginning, the application, database, and files are deployed on a single server.

2. Separation of Application, Data, and Files

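As traffic grows, the application, database, and files are each moved to a dedicated server with hardware tuned for its role: for example, more CPU for the application server, more memory and faster disks for the database server, and larger disks for the file server.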

3. Using Caching to Improve Performance

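Caching is applied to hot data. Following the 80/20 rule, roughly 80% of accesses tend to hit about 20% of the data, so keeping that small hot set in fast storage reduces access latency and improves user experience.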

Common approaches include a local cache (in‑memory or file‑based, e.g., OSCache) and a distributed cache (e.g., Memcached, Redis). CDNs and reverse proxies are also forms of caching; they are covered in section 6.
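The local-cache approach can be sketched as a small in-process cache with least-recently-used eviction and a time-to-live, read through a cache-aside helper. This is an illustrative sketch only: `LocalCache`, `get_with_cache`, and the `load_from_db` callback are hypothetical names, not the API of OSCache, Memcached, or Redis.

```python
import time
from collections import OrderedDict


class LocalCache:
    """Tiny in-process LRU cache with a per-entry TTL (illustrative only)."""

    def __init__(self, capacity=1024, ttl_seconds=60.0, clock=time.monotonic):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self.clock = clock              # injectable so expiry can be tested
        self._data = OrderedDict()      # key -> (expires_at, value)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._data[key]         # expired: treat as a miss
            return None
        self._data.move_to_end(key)     # mark as most recently used
        return value

    def put(self, key, value):
        self._data[key] = (self.clock() + self.ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict least recently used


def get_with_cache(cache, key, load_from_db):
    """Cache-aside read: try the cache, fall back to the database, then populate."""
    value = cache.get(key)
    if value is None:
        value = load_from_db(key)       # hypothetical slow lookup
        cache.put(key, value)
    return value
```

Because a miss is signalled with `None`, this sketch cannot cache a value that is itself `None`; a production cache would also need an invalidation path for when the underlying data changes.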

4. Clustering Application Servers

Load balancers distribute incoming requests across multiple application servers.

Hardware load balancers (e.g., F5) and software solutions (LVS, Nginx, HAProxy) are common; LVS operates at layer 4, while Nginx and HAProxy support layer 7 features such as static‑dynamic separation.
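To make the idea concrete, here is a minimal round-robin balancer with a health flag per server, plus a layer-7 style routing function that sends static assets to a separate pool. All names (`RoundRobinBalancer`, `route`, the server labels) are hypothetical; this is not how LVS, Nginx, or HAProxy are configured or implemented.

```python
import itertools


class RoundRobinBalancer:
    """Hands out servers in rotation, skipping those marked as down."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._cycle = itertools.cycle(self._servers)

    def mark_down(self, server):
        self._healthy.discard(server)

    def mark_up(self, server):
        if server in self._servers:
            self._healthy.add(server)

    def pick(self):
        # Look at each server at most once per call.
        for _ in range(len(self._servers)):
            server = next(self._cycle)
            if server in self._healthy:
                return server
        raise RuntimeError("no healthy servers available")


STATIC_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif", ".ico")


def route(path, static_pool, dynamic_pool):
    """Layer-7 style decision: inspect the URL path to choose a pool."""
    pool = static_pool if path.lower().endswith(STATIC_SUFFIXES) else dynamic_pool
    return pool.pick()
```

A layer-4 balancer would see only addresses and ports, so a decision like the one in `route`, which reads the request path, is only possible at layer 7.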

5. Database Read/Write Separation and Sharding

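To relieve database bottlenecks, two techniques are combined. Read/write separation sends writes to a primary database and reads to replicas kept in sync through replication. Sharding splits the data itself: vertical sharding places different tables or business domains in different databases, while horizontal sharding spreads the rows of one large table across several databases.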
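A rough sketch of how a data-access layer might combine the two: hash a shard key to choose a shard, then send writes to that shard's primary and reads to one of its replicas. `ShardedRouter` and the node names are assumptions made for illustration, and CRC32-modulo is just one simple way to pick a shard.

```python
import itertools
import zlib


class ShardedRouter:
    """Picks a database node from a shard key and the kind of operation."""

    def __init__(self, shards):
        # shards: list of {"primary": str, "replicas": [str, ...]}
        self._shards = shards
        # If a shard has no replicas, reads fall back to its primary.
        self._read_cycles = [
            itertools.cycle(s["replicas"] or [s["primary"]]) for s in shards
        ]

    def shard_index(self, shard_key):
        # Horizontal sharding: a stable hash of the key, modulo shard count.
        return zlib.crc32(shard_key.encode("utf-8")) % len(self._shards)

    def node_for(self, shard_key, is_write):
        index = self.shard_index(shard_key)
        if is_write:
            return self._shards[index]["primary"]   # writes go to the primary
        return next(self._read_cycles[index])       # reads rotate over replicas
```

Two caveats apply to this sketch: replicas can lag behind the primary, so a read immediately after a write may need to go to the primary, and plain modulo hashing moves most keys when the number of shards changes.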

6. CDN and Reverse Proxy for Faster Content Delivery

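A CDN caches content in network providers' data centers close to users, so requests are served from a nearby node and network latency drops. A reverse proxy (e.g., Squid, Nginx) sits at the front of the site's own data center and returns cached responses before requests reach the application servers.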
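The reverse-proxy side of this can be sketched as a thin layer that answers repeated GET requests from its own store and forwards everything else to the origin. `ReverseProxyCache` and `fetch_origin` are hypothetical names, and the sketch deliberately ignores HTTP cache headers, expiry, and varying responses, all of which a real proxy such as Squid or Nginx has to honor.

```python
from typing import Callable, Dict, Tuple

Response = Tuple[int, bytes]  # (status code, body)


class ReverseProxyCache:
    """Caches successful GET responses in front of an origin server."""

    def __init__(self, fetch_origin: Callable[[str, str], Response]) -> None:
        self._fetch_origin = fetch_origin      # hypothetical call to the app server
        self._store: Dict[str, Response] = {}
        self.hits = 0
        self.misses = 0

    def handle(self, method: str, url: str) -> Response:
        if method != "GET":
            # Requests that may change state always go to the origin.
            return self._fetch_origin(method, url)
        cached = self._store.get(url)
        if cached is not None:
            self.hits += 1
            return cached
        self.misses += 1
        response = self._fetch_origin(method, url)
        if response[0] == 200:
            self._store[url] = response        # only cache successful responses
        return response
```

The `hits` and `misses` counters are there because the hit ratio is usually the first thing to check when judging whether such a cache is offloading the application servers.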

7. Distributed File Systems

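When file volume exceeds a single server's capacity, files are moved to shared or distributed storage spread over several machines. The original cites NFS as an example; NFS by itself is a network file-sharing protocol, so capacity beyond one server still depends on the storage behind it.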
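One common way to decide which storage node holds a given file is consistent hashing. The sketch below is an assumption added for illustration, not something the article or NFS prescribes; `HashRing` and the node names are hypothetical.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring that maps file paths to storage nodes."""

    def __init__(self, nodes, vnodes=64):
        # Each node gets several points on the ring to even out the load.
        self._ring = sorted(
            (self._hash(f"{node}#{v}"), node)
            for node in nodes
            for v in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value):
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, file_path):
        if not self._ring:
            raise LookupError("the ring has no nodes")
        # First ring point clockwise from the file's hash, wrapping around.
        index = bisect.bisect_right(self._points, self._hash(file_path))
        return self._ring[index % len(self._ring)][1]
```

The benefit over simple modulo placement is that removing or adding a node only relocates the files that mapped to that node's points; files on the other nodes stay where they are.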

8. NoSQL and Search Engines for Massive Data Queries

For large‑scale data retrieval and complex queries that a relational database handles poorly, NoSQL databases (e.g., MongoDB, Redis) and search engines (e.g., those built on Lucene) are used.
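The core data structure behind full-text search engines is the inverted index, which maps each term to the documents containing it. The toy version below shows the idea only; `InvertedIndex` is a hypothetical class and bears no relation to Lucene's actual API, which adds ranking, analyzers, and on-disk segments.

```python
import re
from collections import defaultdict


class InvertedIndex:
    """Maps each term to the set of document ids that contain it."""

    def __init__(self):
        self._postings = defaultdict(set)

    @staticmethod
    def _tokenize(text):
        # Lowercase and split on anything that is not a letter or digit.
        return re.findall(r"[a-z0-9]+", text.lower())

    def add(self, doc_id, text):
        for term in self._tokenize(text):
            self._postings[term].add(doc_id)

    def search(self, query):
        """Return ids of documents containing every term in the query."""
        terms = self._tokenize(query)
        if not terms:
            return set()
        result = None
        for term in terms:
            postings = self._postings.get(term, set())
            result = set(postings) if result is None else result & postings
        return result
```

A query then costs a few set lookups and intersections instead of a scan over every document, which is why this structure suits large text collections.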

9. Business‑Level Application Decomposition

When applications become too monolithic, they are split into independent business services (e.g., news, images) that communicate via messages or shared databases.
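Message-based communication between the split applications can be sketched with a small in-process bus: one application publishes an event and others subscribe to it, without calling each other directly. `MessageBus` and the topic name are hypothetical; a real deployment would use a separate message queue server rather than an in-memory object.

```python
from collections import defaultdict, deque


class MessageBus:
    """Minimal publish/subscribe queue shared by independent applications."""

    def __init__(self):
        self._subscribers = defaultdict(list)   # topic -> list of handlers
        self._queue = deque()                   # pending (topic, payload) pairs

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        # The publisher returns immediately; delivery happens later.
        self._queue.append((topic, payload))

    def deliver_pending(self):
        """Deliver queued messages and return how many handler calls were made."""
        delivered = 0
        while self._queue:
            topic, payload = self._queue.popleft()
            for handler in self._subscribers.get(topic, []):
                handler(payload)
                delivered += 1
        return delivered
```

For example, a news application could publish a `news.published` message and an image application could subscribe to it to generate thumbnails, so neither needs to know the other's address or be online at the same moment.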

10. Building Distributed Services

Common business services (user, order, payment, security) are extracted into a distributed service framework; Dubbo is a typical solution.
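The central idea of such a framework, that providers register themselves and consumers look them up by service name, can be sketched as follows. This is a conceptual illustration under stated assumptions, not Dubbo's API: `ServiceRegistry`, `ServiceClient`, the service name, and the addresses are all made up.

```python
import itertools


class ServiceRegistry:
    """Keeps track of which addresses currently provide each service."""

    def __init__(self):
        self._providers = {}    # service name -> list of addresses

    def register(self, service, address):
        addresses = self._providers.setdefault(service, [])
        if address not in addresses:
            addresses.append(address)

    def unregister(self, service, address):
        if address in self._providers.get(service, []):
            self._providers[service].remove(address)

    def lookup(self, service):
        return list(self._providers.get(service, []))


class ServiceClient:
    """Consumer side: resolves a provider on every call, in rotation."""

    def __init__(self, registry, service):
        self._registry = registry
        self._service = service
        self._counter = itertools.count()

    def next_provider(self):
        providers = self._registry.lookup(self._service)
        if not providers:
            raise LookupError(f"no providers for {self._service}")
        return providers[next(self._counter) % len(providers)]
```

Because the client resolves providers at call time, instances of a shared service such as the user service can be added or removed without reconfiguring the applications that depend on it.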

Conclusion

Large‑scale website architectures continuously evolve to meet business demands, employing a range of techniques such as component separation, caching, load balancing, sharding, CDN, reverse proxies, NoSQL, and service decomposition to achieve performance, scalability, and reliability.
