Designing Scalable Website Architecture: 5 Essential Layers Explained
This article outlines a five‑layer website architecture—web cache, load balancer, web server, file server, and database—illustrated with high‑traffic e‑commerce, ad, and CDN portals, and provides practical recommendations on software choices, high‑availability setups, and performance optimization for each layer.
Website architecture is typically divided into five layers: web cache, load balancing, web, file server, and database. The discussion uses three high‑concurrency production environments as examples: an e‑commerce site (peak 2,900 concurrent, ~5 M daily PV), an ad site (peak 1,500 concurrent, ~1.5 M daily PV), and a large CDN portal ad site (peak 5,000 concurrent, ~50 M daily PV).
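To put the daily PV figures in perspective, a rough conversion to an average per-second rate can help. The sketch below assumes traffic is spread evenly over 24 hours, which real sites never see, so the results are a floor rather than a peak estimate. The function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_pv_per_second(daily_pv: int) -> float:
    """Average page views per second, assuming perfectly even traffic."""
    return daily_pv / SECONDS_PER_DAY


# Daily PV figures quoted above for the three example sites.
sites = {
    "e-commerce site": 5_000_000,
    "ad site": 1_500_000,
    "CDN portal ad site": 50_000_000,
}

for name, daily_pv in sites.items():
    print(f"{name}: {average_pv_per_second(daily_pv):.1f} PV/s on average")
```

Peak concurrency measures something different (requests in flight at one moment), so these averages are not directly comparable to the concurrent figures quoted for each site.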
Web Cache Layer
CDN rental services (e.g., Kuaiwang, Lanxun, Alibaba, Tencent) generally outperform self‑deployed Squid or Varnish in cost and coverage. Building a custom CDN is labor‑intensive and often ineffective, so the caching approach should be decided early in the site lifecycle. Open‑source options include the traditional Squid Cache and, increasingly, Nginx or Varnish. Nginx already provides web‑cache acceleration and utilizes multi‑core CPUs better than Squid, and many architects use it as both load balancer and cache server at once.
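What a cache layer does for the origin servers can be sketched in a few lines. This is a minimal in-memory illustration, not how Squid, Varnish, or Nginx are implemented; the class name, the fixed time-to-live, and the `fake_origin` function are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """In-memory response cache with a fixed time-to-live per entry."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested
        self._store: Dict[str, Tuple[float, bytes]] = {}

    def get(self, url: str, fetch_from_origin: Callable[[str], bytes]) -> bytes:
        now = self.clock()
        entry = self._store.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: the origin is not contacted
        body = fetch_from_origin(url)  # miss or expired: go to the origin
        self._store[url] = (now + self.ttl, body)
        return body


origin_hits = []


def fake_origin(url: str) -> bytes:
    """Hypothetical origin server that records every request it receives."""
    origin_hits.append(url)
    return f"<html>{url}</html>".encode()


cache = TTLCache(ttl_seconds=60)
cache.get("/index.html", fake_origin)
cache.get("/index.html", fake_origin)  # served from cache
print(len(origin_hits))  # 1
```

Every hit is a request the web and database layers never see, which is why the cache sits in front of everything else.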
Load Balancing Layer
Common solutions are F5 (hardware) and LVS, HAProxy, and Nginx (software). F5 and LVS are widely deployed worldwide; Taobao has migrated from F5 to LVS. HAProxy + Keepalived delivers strong throughput and stability in production, and Taobao also promotes HAProxy usage. Nginx + Keepalived has proven stable in many production environments; when traffic spikes, it can serve as a layer‑7 proxy behind a front‑end F5 or LVS.
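The core job shared by these balancers, spreading requests across backends and skipping ones that are down, can be sketched as follows. This is a simplified round-robin illustration under assumed names (`RoundRobinBalancer`, `web1` to `web3`); real products add weighted algorithms and active health checks, and Keepalived's role of failing over the balancer itself is not modeled here.

```python
from typing import List, Set


class RoundRobinBalancer:
    """Rotate through backends in order, skipping any marked as down."""

    def __init__(self, backends: List[str]):
        if not backends:
            raise ValueError("at least one backend is required")
        self.backends = list(backends)
        self.down: Set[str] = set()
        self._next = 0

    def mark_down(self, backend: str) -> None:
        self.down.add(backend)

    def mark_up(self, backend: str) -> None:
        self.down.discard(backend)

    def pick(self) -> str:
        # Try each backend at most once per call.
        for _ in range(len(self.backends)):
            candidate = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if candidate not in self.down:
                return candidate
        raise RuntimeError("no healthy backends available")


lb = RoundRobinBalancer(["web1", "web2", "web3"])
first_pass = [lb.pick() for _ in range(4)]
print(first_pass)  # ['web1', 'web2', 'web3', 'web1']

lb.mark_down("web2")  # e.g. a health check failed
second_pass = [lb.pick() for _ in range(3)]
print(second_pass)  # ['web3', 'web1', 'web3']
```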
Web Layer
Most high‑traffic sites now use Nginx as the web application server because of its superior concurrency handling. In one portal site, a single Nginx instance sustained over 10,000 concurrent requests during peak periods. Linux clusters provide excellent scalability: adding more Nginx nodes easily accommodates traffic beyond 10,000 concurrent users.
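A back-of-the-envelope node count follows from the per-node figure above. In this sketch only the 10,000 concurrent requests per Nginx instance comes from the text; the 70% target utilization, the single spare node, and the 25,000 peak are assumptions chosen for illustration.

```python
def nodes_needed(peak_concurrent: int, per_node_capacity: int,
                 target_utilization_pct: int = 70, spare_nodes: int = 1) -> int:
    """Estimate how many web nodes a cluster needs for a given peak load."""
    # Plan to run each node below its measured limit (assumed 70%).
    usable = per_node_capacity * target_utilization_pct // 100
    if usable <= 0:
        raise ValueError("usable capacity per node must be positive")
    serving = -(-peak_concurrent // usable)  # ceiling division on integers
    return serving + spare_nodes  # keep a spare for failures or maintenance


# Hypothetical peak of 25,000 concurrent requests, 10,000 per node.
print(nodes_needed(25_000, 10_000))  # 5
```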
File Server Layer
Four typical deployment options are:
Single NFS with backup NFS – easy to maintain but suffers a single point of failure.
DRBD + HeartBeat + NFS high‑availability – eliminates single points of failure but may become pressured as traffic grows.
Distributed file systems such as MFS (MooseFS) or GlusterFS – MFS is user‑friendly, stable, and efficient for massive numbers of small files, and newer versions resolve the master‑server single‑point issue.
Custom‑built distributed file systems for ultra‑large enterprises (e.g., Taobao, Tencent).
Database Layer
Database performance becomes the bottleneck under high PV. Large CDN ad sites often use Oracle RAC for high availability, albeit at a high cost. For MySQL‑based sites, recommended practices include:
Introduce a memcached layer to cache frequently accessed data.
Use RAID10 disks or high‑performance SSDs to mitigate I/O bottlenecks.
Adopt a master‑slave architecture with read/write separation; LVS is preferred over HAProxy for scaling beyond ten MySQL nodes.
Split databases by business domain (e.g., separate servers for Web, BBS, and Blog data) so that each domain's load lands on its own database.
Collaborate with DBAs to fine‑tune MySQL parameters, optimize SQL statements, and partition data.
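Two of the practices above, the memcached layer and read/write separation, combine naturally in the data-access code. The sketch below uses plain dictionaries as stand-ins for memcached and for MySQL servers, so every class and method name is hypothetical; it also copies writes to replicas immediately, whereas real MySQL replication is asynchronous and can lag.

```python
import itertools
from typing import Dict, List, Optional


class FakeDB:
    """Stand-in for one MySQL server: rows keyed by primary key."""

    def __init__(self, name: str):
        self.name = name
        self.rows: Dict[int, str] = {}
        self.reads = 0


class DataAccess:
    """Cache-aside reads from replicas; all writes go to the master."""

    def __init__(self, master: FakeDB, replicas: List[FakeDB]):
        if not replicas:
            raise ValueError("at least one replica is required")
        self.master = master
        self.replicas = list(replicas)
        self._rotation = itertools.cycle(self.replicas)
        self.cache: Dict[int, str] = {}  # stand-in for memcached

    def read(self, key: int) -> Optional[str]:
        if key in self.cache:
            return self.cache[key]  # cache hit: no database touched
        replica = next(self._rotation)  # spread misses across replicas
        replica.reads += 1
        value = replica.rows.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key: int, value: str) -> None:
        self.master.rows[key] = value
        for replica in self.replicas:  # simplification: instant replication
            replica.rows[key] = value
        self.cache.pop(key, None)  # invalidate so the next read is fresh


master, r1, r2 = FakeDB("master"), FakeDB("replica1"), FakeDB("replica2")
dal = DataAccess(master, [r1, r2])
dal.write(1, "alice")
dal.read(1)  # cache miss, served by replica1
dal.read(1)  # cache hit
dal.write(1, "bob")  # invalidates the cached row
print(dal.read(1), r1.reads, r2.reads)  # bob 1 1
```

Invalidating on write rather than updating the cache keeps the logic simple, at the cost of one extra replica read after each change.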
Future posts will cover MySQL hardware selection, installation methods, configuration tuning, SQL optimization, status monitoring, slow‑query handling, table optimization, and high‑availability extensions.
Website Architecture Focus Summary
Deploy robust DDoS/CC protection and select cost‑effective firewalls when hosting in an IDC.
Design business logic and code architecture efficiently; poor design can nullify hardware advantages.
While Apache can handle significant concurrency, Nginx generally offers better performance for high‑traffic sites; in practice, 2,000 concurrent requests is already a substantial load.
DRBD + HeartBeat + NFS works for early stages, but as traffic grows, consider Nginx as a middle proxy, Squid clusters, or CDN acceleration.
Use Nginx as a middle proxy for static content (HTML, JPG, PNG, CSS) and forward dynamic requests to backend PHP/Tomcat clusters, achieving static/dynamic separation.
MySQL remains the most pressure‑prone component; continuous optimization of hardware, configuration, and queries is essential.
System and website construction, operation, and debugging require coordinated effort among developers, system engineers, DBAs, and testers.
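The static/dynamic separation recommended in the summary comes down to a routing decision per request. In a real deployment this rule would live in the Nginx configuration; the Python sketch below only illustrates the decision, using the four file types named above, and the function name and return labels are assumptions.

```python
import posixpath
from urllib.parse import urlsplit

# File types the summary lists as static content.
STATIC_EXTENSIONS = {".html", ".jpg", ".png", ".css"}


def route(url: str) -> str:
    """Return 'static' for files the proxy serves itself, else 'dynamic'."""
    path = urlsplit(url).path  # drop any query string
    extension = posixpath.splitext(path)[1].lower()
    return "static" if extension in STATIC_EXTENSIONS else "dynamic"


print(route("/img/logo.PNG?v=3"))    # static
print(route("/cart/checkout.php"))   # dynamic
print(route("/api/orders"))          # dynamic
```

Requests labeled dynamic would be forwarded to the backend PHP/Tomcat cluster, so only work that actually needs application code reaches it.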
21CTO
