How Major E‑Commerce Sites Evolve Their Architecture for Scale and Performance
This article traces the step‑by‑step evolution of large‑scale website architectures—from single‑server setups to distributed services—highlighting key techniques such as server clustering, caching, load balancing, database sharding, CDN usage, and the adoption of NoSQL and micro‑service frameworks.
Introduction
Large‑scale websites like major e‑commerce platforms do not start with a fully optimized, high‑performance architecture; instead, their systems evolve as user traffic and business features grow, prompting changes in development models, technical stacks, and design philosophies.
1. Initial Architecture
Early deployments typically host the application, database, and file storage on a single server.
2. Separation of Application, Data, and Files
When a single server can no longer meet performance demands, the application, database, and file storage are moved to independent servers, each provisioned with hardware suited to its workload.
3. Caching for Performance
Because access patterns typically follow the 80/20 rule (a small fraction of hot data serves most requests), caching that hot data dramatically reduces access latency. Common approaches include local (in‑memory or file‑based) caches such as OSCache, and distributed caches like Memcached and Redis. CDNs and reverse proxies are also used.
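A minimal cache‑aside sketch of this pattern, assuming a plain dict stands in for Memcached/Redis and `load_user_from_db` simulates a slow relational query (all names here are illustrative):

```python
import time

# Cache maps key -> (insertion time, value); a real deployment would use
# Redis/Memcached with a server-side TTL instead of this in-process dict.
CACHE: dict[str, tuple[float, dict]] = {}
TTL_SECONDS = 60.0

def load_user_from_db(user_id: str) -> dict:
    # Placeholder for an expensive database query.
    return {"id": user_id, "name": f"user-{user_id}"}

def get_user(user_id: str) -> dict:
    entry = CACHE.get(user_id)
    if entry and time.time() - entry[0] < TTL_SECONDS:
        return entry[1]                       # cache hit: skip the database
    user = load_user_from_db(user_id)         # cache miss: fetch...
    CACHE[user_id] = (time.time(), user)      # ...and populate the cache
    return user
```

The first call for a key pays the database cost; subsequent calls within the TTL are served from memory, which is where the 80/20 skew pays off.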
4. Server Clustering and Load Balancing
Application servers are grouped into clusters behind a load balancer that distributes requests. Hardware solutions (e.g., F5) and software solutions (LVS, Nginx, HAProxy) are compared: LVS operates at layer 4 with higher raw performance, while Nginx and HAProxy provide layer‑7 routing and richer configuration options such as static‑dynamic content separation.
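The distribution step can be sketched as a weighted round‑robin rotation, comparable in spirit to Nginx upstream weights (backend addresses and weights below are made up):

```python
import itertools

# Each backend appears in the rotation in proportion to its weight,
# so a 3:1 weighting sends three quarters of requests to the first node.
BACKENDS = {"10.0.0.1:8080": 3, "10.0.0.2:8080": 1}

def rotation(backends: dict[str, int]):
    expanded = [addr for addr, weight in backends.items() for _ in range(weight)]
    return itertools.cycle(expanded)

lb = rotation(BACKENDS)
picks = [next(lb) for _ in range(8)]  # two full cycles: 6 vs 2 picks
```

Real balancers layer health checks, session affinity, and (for layer‑7 proxies like Nginx/HAProxy) content‑based routing on top of this basic rotation.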
5. Database Read/Write Splitting and Sharding
To alleviate database bottlenecks, read/write separation creates dedicated read replicas synchronized from a primary write node. Horizontal (sharding) and vertical (segregating tables by business domain) partitioning further distribute load, for example splitting a massive user table across multiple databases.
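The horizontal case can be sketched as hash‑based shard routing, assuming four hypothetical user databases:

```python
import hashlib

# Route each user row to one of N shards by hashing the user id.
# Shard names are illustrative stand-ins for real database instances.
SHARDS = ["user_db_0", "user_db_1", "user_db_2", "user_db_3"]

def shard_for(user_id: str) -> str:
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]
```

The same key always maps to the same shard, so reads and writes for one user stay together. Note that plain modulo routing forces a large re‑shuffle when the shard count changes; consistent hashing is the usual refinement that limits that movement.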
6. CDN and Reverse Proxy
Geographic latency is mitigated by CDNs that cache content in ISP data centers close to users, reducing round‑trip time. Reverse proxies (e.g., Squid, Nginx) sit in front of application servers, serving cached responses when possible.
7. Distributed File Systems
As file volume grows, a single file server becomes insufficient. Simple network shares such as NFS are an early stopgap, while purpose‑built distributed file systems (e.g., HDFS, FastDFS) spread storage and replication across many nodes and scale horizontally.
8. NoSQL and Search Engines
For massive data queries, combining NoSQL stores (MongoDB, Redis) with search engines (Lucene) yields better performance than relying solely on relational databases.
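A toy sketch of how the two pieces fit together: an inverted index (the core idea behind Lucene) maps terms to document ids, while full documents live in a key‑value store and are fetched only after the index lookup. The document store and its contents are invented for illustration:

```python
from collections import defaultdict

DOC_STORE = {  # stands in for a NoSQL store such as MongoDB or Redis
    1: "distributed cache with redis",
    2: "sharding a user database",
    3: "redis as a session store",
}

# Build the inverted index: term -> set of document ids containing it.
index: defaultdict[str, set[int]] = defaultdict(set)
for doc_id, text in DOC_STORE.items():
    for term in text.split():
        index[term].add(doc_id)

def search(term: str) -> list[str]:
    # Index lookup first, then fetch matching documents from the store.
    return [DOC_STORE[i] for i in sorted(index.get(term, set()))]
```

The relational database stops being the bottleneck because ad‑hoc text queries never touch it: the index answers "which documents", and the KV store answers "what do they contain".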
9. Business‑Level Application Splitting
When the monolithic application grows too large to develop and deploy as one unit, it is split into independent business services (e.g., news, images, search) that communicate via message queues or shared data stores.
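The messaging side of that split can be sketched with an in‑process queue standing in for a real broker such as RabbitMQ or Kafka (service and event names are illustrative):

```python
import json
import queue

# The broker decouples producer and consumer: the order service only
# publishes an event; it never calls the inventory service directly.
broker: queue.Queue = queue.Queue()

def order_service_place_order(order_id: int) -> None:
    broker.put(json.dumps({"event": "order_placed", "order_id": order_id}))

def inventory_service_poll() -> dict:
    # A real consumer would block or subscribe; here we poll once.
    return json.loads(broker.get_nowait())

order_service_place_order(1001)
event = inventory_service_poll()
```

Because neither service holds a reference to the other, each can be deployed, scaled, and failed independently, which is the point of the split.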
10. Building Distributed Services
Core services such as user, order, payment, and security are extracted into a distributed service framework; Dubbo is cited as a common solution in the Chinese e‑commerce context.
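At its core, a framework like Dubbo automates a register/lookup/invoke cycle. A toy in‑process sketch of that cycle, with local dispatch standing in for real network RPC and all names invented for illustration:

```python
# Registry maps service names to live service instances; a real framework
# would keep this in a coordination service (e.g., ZooKeeper) and proxy
# the invoke() call over the network.
REGISTRY: dict[str, object] = {}

def register(name: str, service: object) -> None:
    REGISTRY[name] = service

class UserService:
    def get_user(self, user_id: int) -> dict:
        return {"id": user_id, "name": f"user-{user_id}"}

register("UserService", UserService())

def invoke(service_name: str, method: str, *args):
    service = REGISTRY[service_name]          # lookup, like a registry query
    return getattr(service, method)(*args)    # dispatch, like a remote call
```

Callers depend only on the service name and method signature, not on where the implementation runs, which is what lets user, order, payment, and security services be extracted and scaled separately.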
Conclusion
Large‑scale website architectures continuously adapt to business needs, employing a suite of techniques—caching, clustering, sharding, CDN, distributed storage, NoSQL, and micro‑service frameworks—to achieve scalability, high availability, and performance.