Evolution of Large-Scale Website Architecture: From Single Server to Distributed Services
This article outlines how the architecture of a large website evolves from a single server into a multi-tier system with separate application, data, and file tiers, caching, load balancing, database sharding, CDN, distributed file systems, NoSQL and search, business-level service decomposition, and a distributed service framework.
Large-scale websites such as Taobao or JD do not start out with a complete high-performance, high-availability architecture. The architecture evolves as user traffic and business functions grow, and each stage brings changes to the development model, the technology stack, and the design philosophy.
1. Initial architecture: The application, database, and files are all deployed on a single server.
2. Separation of application, data, and files: As load increases, the application, the database, and file storage each move to a dedicated server whose hardware is tuned for that role.
3. Caching for performance: Access tends to follow the 80/20 rule, with roughly 80% of requests touching 20% of the data. That hot data is held in a local cache (e.g., OSCache) or in a distributed cache such as Memcached or Redis; CDNs and reverse proxies add further caching layers in front of the application.
4. Server clustering: A load balancer, either hardware (F5) or software (LVS, Nginx, HAProxy), sits in front of multiple application servers and distributes requests among them, so capacity scales horizontally by adding servers.
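5. Database read/write splitting and sharding: Once the database becomes the bottleneck, writes go to a master and reads go to replicas kept in sync through master-slave replication. Beyond that, data is split vertically (different tables or business domains on different databases) or horizontally (rows of one table spread across several databases).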
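6. CDN and reverse proxy: Both are caches. A CDN serves static content from ISP nodes close to users, while a reverse proxy (e.g., Squid, Nginx) sits in front of the site's own servers and returns cached responses before requests reach the application servers, reducing origin load.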
7. Distributed file system: As file volume grows, a single file server is no longer enough, and storage must scale across multiple nodes using a shared file system such as NFS or a distributed file system.
8. NoSQL and search engines: For queries over massive data sets, NoSQL stores (MongoDB, Redis) and full-text search built on Lucene take load off the relational database and improve query performance.
9. Business-level service decomposition: As the single application grows too large to manage, it is split along business lines into independent applications (e.g., news, image, search) that communicate through messages or share data through common databases.
10. Distributed service framework: Services that many applications share (user, order, payment, security) are extracted into a distributed service layer that the applications call remotely; Dubbo is a typical framework choice.
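The caching in step 3 is commonly implemented with a cache-aside read path: look in the cache first, fall back to the database on a miss, and store the result for the next request. The sketch below is only an illustration under stated assumptions: the class name `CacheAside` is hypothetical, an in-process map stands in for Memcached or Redis, and the `loader` function stands in for the database query.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache-aside wrapper; the map stands in for Memcached or Redis.
class CacheAside<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // assumption: stands in for the database query
    private int loads = 0;               // number of trips to the backing store

    CacheAside(Function<K, V> loader) {
        this.loader = loader;
    }

    // Read path: return the cached value on a hit; on a miss, load it and cache it.
    synchronized V get(K key) {
        V value = cache.get(key);
        if (value == null) {
            value = loader.apply(key);
            loads++;
            if (value != null) {
                cache.put(key, value);
            }
        }
        return value;
    }

    // Write path: after the database row changes, drop the stale cache entry.
    synchronized void invalidate(K key) {
        cache.remove(key);
    }

    synchronized int loads() {
        return loads;
    }
}
```

With hot keys, repeated `get` calls reach the loader only once per key until `invalidate` is called, which is where the reduction in database load comes from. A real deployment would also need expiry and eviction, which this sketch omits.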
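For the server clustering in step 4, the simplest distribution policy is round-robin: each request goes to the next server in the list. The following is a minimal sketch of that policy only, not of how F5, LVS, Nginx, or HAProxy are implemented; the class name `RoundRobinBalancer` and the server names in the usage note are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical round-robin selector over a fixed list of application servers.
class RoundRobinBalancer {
    private final List<String> servers;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinBalancer(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
        this.servers = new ArrayList<>(servers); // defensive copy
    }

    // Returns the next server in rotation; floorMod keeps the index valid
    // even after the counter overflows to a negative value.
    String next() {
        int index = Math.floorMod(counter.getAndIncrement(), servers.size());
        return servers.get(index);
    }
}
```

Given servers `app-1`, `app-2`, and `app-3`, successive calls to `next()` return them in that order and then wrap around. Because the application servers are interchangeable, adding a server to the list is all it takes to add capacity, provided session state is not pinned to one machine.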
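The routing decisions behind step 5 can be sketched in a few lines: writes go to the master, reads rotate across replicas, and a shard is chosen from a key. This is an illustration under assumptions, not a production router: `DataSourceRouter`, the data source names, and the modulo sharding rule are all hypothetical, and real systems must also deal with replication lag and resharding.

```java
// Hypothetical router: names identify data sources rather than real connections.
class DataSourceRouter {
    private final String master;
    private final String[] replicas;
    private int nextReplica = 0;

    DataSourceRouter(String master, String... replicas) {
        this.master = master;
        this.replicas = replicas.clone();
    }

    // Read/write splitting: writes always hit the master; reads rotate
    // across replicas and fall back to the master if there are none.
    synchronized String route(boolean isWrite) {
        if (isWrite || replicas.length == 0) {
            return master;
        }
        String chosen = replicas[nextReplica];
        nextReplica = (nextReplica + 1) % replicas.length;
        return chosen;
    }

    // Horizontal sharding (assumed rule): shard index = key mod shard count.
    // floorMod keeps the result non-negative for negative keys.
    static int shardFor(long key, int shardCount) {
        return (int) Math.floorMod(key, (long) shardCount);
    }
}
```

With four shards, `shardFor(10L, 4)` selects shard 2. Modulo sharding is easy to reason about but moves most keys when the shard count changes, which is one reason sharding is usually treated as a later step than read/write splitting.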
Conclusion: The architecture of a large website matures continuously in response to business needs, drawing on a common set of techniques: tier separation, caching, load balancing, sharding, CDN, distributed storage, NoSQL, and service-oriented design.
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.