Scaling Web Systems to 100M Daily Visits: Load Balancing and Caching
This article explains how a web system can evolve from handling 100,000 daily visits to over 100 million. It covers multi-level caching; load-balancing techniques including HTTP redirects, reverse proxies, IP-level balancing, DNS, and GSLB; and MySQL optimization through indexing, thread caching, sharding, and replication, together with memory caches such as Redis.
When a web system grows from 100,000 daily visits to 100 million, the load on it increases by three orders of magnitude, and each stage of growth calls for additional caching layers and architectural upgrades.
Web Load Balancing
Load balancing distributes incoming requests across a server cluster so that no single backend server is overwhelmed.
1. HTTP Redirection
The server answers with a 302 response whose Location header points the client at a nearer URL; download mirrors for PHP source packages are an example. The method is easy to implement, but every request costs an extra round trip, and the redirecting server itself becomes a bottleneck under high traffic.
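The redirect step can be sketched with Python's standard http.server module. This is a minimal illustration, not a production setup: the mirror hostnames, the MIRRORS table, and the X-Region request header are all assumptions made for the example.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical mirror table; the hostnames are placeholders, not real mirrors.
MIRRORS = {
    "cn": "http://cn.mirror.example/php.tar.gz",
    "us": "http://us.mirror.example/php.tar.gz",
}
DEFAULT_REGION = "us"


def pick_mirror(region: str) -> str:
    """Return the mirror URL for a region, falling back to the default."""
    return MIRRORS.get(region, MIRRORS[DEFAULT_REGION])


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        # Assumption: the client's region arrives in a custom X-Region header.
        region = self.headers.get("X-Region", DEFAULT_REGION)
        self.send_response(302)
        self.send_header("Location", pick_mirror(region))
        self.end_headers()
```

The handler could be served with `HTTPServer(("", 8000), RedirectHandler).serve_forever()`. The client must then issue a second request to the mirror, which is where the added latency comes from.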
2. Reverse‑Proxy Load Balancing
Reverse proxies (e.g., Nginx) operate at the application layer (layer 7) and forward HTTP requests to backend servers. Session affinity can be achieved by routing the same user to the same backend, or sessions can be stored in an external service such as Redis or Memcached so that any backend can serve any user. Reverse proxies can also cache content. Because all traffic passes through the proxy, it is a single point of failure unless it is deployed redundantly.
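One way to get session affinity is to hash a session identifier onto the backend list. The sketch below shows only that selection step, not a real proxy; the backend addresses and the function name pick_backend are hypothetical.

```python
import hashlib

# Hypothetical backend pool for illustration.
BACKENDS = ["10.0.0.11:8080", "10.0.0.12:8080", "10.0.0.13:8080"]


def pick_backend(session_id: str, backends: list[str]) -> str:
    """Map a session ID to a backend so the same user keeps the same server."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]
```

With plain modulo hashing, adding or removing a backend remaps most sessions, which is one reason an external session store can be the more robust choice.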
3. IP Load Balancing
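IP-level (layer 4) load balancing rewrites packet destination addresses without parsing HTTP, so it performs better than a layer-7 proxy. A common implementation is LVS (Linux Virtual Server), built on the kernel's IPVS module. Its modes LVS-NAT, LVS-DR (direct routing), and LVS-TUN (IP tunneling) differ in how packets are forwarded to the real servers and how replies return to the client.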
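The NAT-style idea can be modeled in a few lines. This is a toy model of destination rewriting with a connection table, not how IPVS is implemented; the Packet and NatBalancer names and all addresses are assumptions for the example.

```python
from dataclasses import dataclass, replace
from itertools import cycle


@dataclass(frozen=True)
class Packet:
    src: tuple[str, int]  # client (ip, port)
    dst: tuple[str, int]  # virtual or real server (ip, port)


class NatBalancer:
    """Toy model: rewrite the destination and remember the choice per client."""

    def __init__(self, real_servers):
        self._next_server = cycle(real_servers)  # round-robin scheduling
        self._connections = {}  # client (ip, port) -> chosen real server

    def forward(self, packet: Packet) -> Packet:
        server = self._connections.get(packet.src)
        if server is None:
            server = next(self._next_server)
            self._connections[packet.src] = server
        # Only the destination changes; the payload is never inspected.
        return replace(packet, dst=server)
```

Packets from the same client address and port always reach the same real server, while new connections are spread round-robin.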
4. DNS Load Balancing
DNS can map a single domain to multiple IP addresses, providing simple, high-performance distribution. Its drawbacks are limited rule flexibility and slow reaction to changes, because resolvers cache records until their TTL expires.
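Round-robin rotation of address records can be simulated as follows. The RoundRobinZone class is a hypothetical stand-in for a name server and uses documentation-range addresses; it does not perform real DNS queries.

```python
from collections import deque


class RoundRobinZone:
    """Simulated zone that rotates the address list on every query."""

    def __init__(self, records: dict[str, list[str]]):
        self._records = {name: deque(ips) for name, ips in records.items()}

    def resolve(self, name: str) -> list[str]:
        ips = self._records[name]
        answer = list(ips)
        ips.rotate(-1)  # the next query sees a different first address
        return answer
```

Clients usually connect to the first address returned, so rotating the order spreads new visitors across the servers.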
5. DNS/GSLB Load Balancing
CDN‑style Global Server Load Balancing (GSLB) returns IPs based on geographic proximity, reducing routing hops for users.
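A proximity-based choice can be sketched with great-circle distance. The node table, its coordinates and addresses, and the assumption that the client's coordinates are already known are all illustrative; a real GSLB typically infers location from the resolver's or client's IP address.

```python
from math import asin, cos, radians, sin, sqrt

# Hypothetical nodes: name -> (latitude, longitude, IP address).
NODES = {
    "beijing": (39.90, 116.40, "192.0.2.10"),
    "shanghai": (31.23, 121.47, "192.0.2.20"),
    "shenzhen": (22.54, 114.06, "192.0.2.30"),
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def nearest_node_ip(client_lat, client_lon):
    """Return the IP of the node geographically closest to the client."""
    name = min(
        NODES,
        key=lambda n: haversine_km(client_lat, client_lon, NODES[n][0], NODES[n][1]),
    )
    return NODES[name][2]
```

Geographic distance is only a proxy for network distance; production systems often combine it with measured latency and node health.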
Building and Optimizing Web System Caches
Beyond external load balancing, internal caching is essential because requests are heavily skewed: as a rule of thumb, about 80% of requests target about 20% of the data (the hot data).
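Because a small hot set receives most requests, a bounded cache that keeps recently used entries can absorb much of the load. The following least-recently-used (LRU) cache is one common eviction policy, shown here as an illustration; the article does not prescribe a specific policy.

```python
from collections import OrderedDict


class LRUCache:
    """Bounded cache that evicts the least recently used entry when full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None  # cache miss: the caller would query the database
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry
```

Hot keys are touched often and therefore stay resident, while rarely requested keys are evicted first.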
1. MySQL Internal Caching
Proper indexing to speed up SELECTs.
Thread cache (thread_cache_size) to reuse server threads for new connections instead of creating a thread per connection.
InnoDB buffer pool (innodb_buffer_pool_size) typically set to ~80% of RAM on dedicated MySQL servers.
Sharding, partitioning, or splitting tables when data exceeds millions of rows.
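Table splitting needs a routing rule that maps each row to its table. A minimal sketch, assuming rows are keyed by an integer user ID and spread over a fixed number of tables; the shard count and the user_N naming scheme are hypothetical.

```python
SHARD_COUNT = 4  # hypothetical number of split tables


def shard_for(user_id: int, shard_count: int = SHARD_COUNT) -> str:
    """Return the table name that holds the given user's rows."""
    return f"user_{user_id % shard_count}"
```

For example, `shard_for(10)` selects `user_2`. Modulo routing is simple, but changing the shard count later requires moving data, so the count is usually chosen with growth in mind.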
2. MySQL Multi‑Server Deployment
Master‑slave replication for failover.
Read‑write separation: writes to master, reads from slaves.
Master‑master (mutual backup) for load distribution and redundancy.
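Read-write separation amounts to a routing decision per statement. The router below is a deliberately simplified sketch: the class name and server labels are assumptions, and treating every statement that starts with SELECT as a read ignores cases such as SELECT ... FOR UPDATE and reads that must not see replication lag.

```python
import random


class ReadWriteRouter:
    """Send writes to the master and spread reads across the replicas."""

    def __init__(self, master, replicas, rng=None):
        self.master = master
        self.replicas = list(replicas)
        self._rng = rng or random.Random()

    def route(self, sql: str) -> str:
        # Simplification: only plain SELECT statements count as reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self.replicas:
            return self._rng.choice(self.replicas)
        return self.master
```

When no replica is available, reads fall back to the master, so the application keeps working at reduced capacity.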
3. MySQL Data Synchronization
MySQL 5.6+ multi‑threaded binlog replication (per‑database granularity).
Custom binlog parsers for multi‑threaded writes on a per‑table basis.
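The per-database parallelism can be pictured as assigning each replicated event to a worker by database name, so that events for one database stay in order while different databases proceed in parallel. This sketch models only the assignment step; the event format and the assign_workers function are assumptions and do not reflect MySQL's internal scheduler.

```python
import zlib
from collections import defaultdict


def assign_workers(events, worker_count):
    """Group (database, statement) events into per-worker queues.

    Events of the same database always land in the same queue, in their
    original order, so each worker can apply its queue independently.
    """
    queues = defaultdict(list)
    for database, statement in events:
        worker = zlib.crc32(database.encode("utf-8")) % worker_count
        queues[worker].append((database, statement))
    return dict(queues)
```

A per-table variant would use the database and table name together as the grouping key, which gives finer parallelism when most traffic hits a single database.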
4. Caching Between Web Servers and Databases
Page staticization: cache generated HTML on disk.
Single‑node memory cache (Redis or Memcached) for hot data.
Memory cache clusters (e.g., Redis Cluster) to avoid single‑point failures.
Batch write‑back: queue modifications in cache, flush to DB periodically.
Adjust MySQL redo-log flush settings (innodb_flush_log_at_trx_commit), trading some durability for write throughput, and use faster storage (SSD, RAID).
Introduce NoSQL (Redis) for high‑frequency read/write data.
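Batch write-back from the list above can be sketched as a buffer that collects modifications and hands them to the database in one batch. The WriteBackBuffer class and its flush_to_db callback are hypothetical; here a flush is triggered by size, and a timer could call flush() to get the periodic behavior described.

```python
class WriteBackBuffer:
    """Collect writes in memory and flush them to the database in batches."""

    def __init__(self, flush_to_db, max_pending=100):
        self._flush_to_db = flush_to_db  # callable taking a dict of key -> value
        self._max_pending = max_pending
        self._pending = {}

    def write(self, key, value):
        # Later writes to the same key overwrite earlier ones before the flush.
        self._pending[key] = value
        if len(self._pending) >= self._max_pending:
            self.flush()

    def flush(self):
        if self._pending:
            batch, self._pending = self._pending, {}
            self._flush_to_db(batch)
```

Repeated updates to one key collapse into a single database write, which is where the savings come from. The cost is that buffered writes are lost if the process dies before a flush.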
5. Empty‑Node Query Mitigation
Cache a mapping of existing records to filter out requests for non‑existent data early, reducing unnecessary DB lookups.
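One compact way to hold such an existence mapping is a Bloom filter, used here as an illustrative substitute for a full key map. It never reports an existing key as missing, but may occasionally report a missing key as present; the sizes and hash count below are arbitrary assumptions.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: no false negatives, occasional false positives."""

    def __init__(self, size_bits=8192, hash_count=4):
        self.size_bits = size_bits
        self.hash_count = hash_count
        self._bits = bytearray(size_bits // 8 + 1)

    def _positions(self, key: str):
        # Derive several bit positions by salting the key with an index.
        for i in range(self.hash_count):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        return all(
            self._bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )
```

A request handler would call might_contain() first and return "not found" immediately on False, querying the database only when the filter says the record may exist.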
Geographic Distributed Deployment
Core services remain centralized while nodes are dispersed across regions to reduce latency.
1. Core‑Central, Node‑Distributed
Critical data stays in a central location; peripheral services are replicated in multiple cities (e.g., Shanghai core, Beijing, Shenzhen, Wuhan nodes).
2. Node Disaster Recovery and Overload Protection
Failover to nearby regional nodes when a node fails.
Overload protection: reject new connections or divert traffic to other nodes.
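Both rules can be combined in one routing function: try nodes from nearest to farthest, skip any that are down or full, and reject the connection when none can take it. The Node class, its connection limit, and the node names are assumptions for the example.

```python
class Node:
    """A regional node with a health flag and a connection limit."""

    def __init__(self, name, max_connections, healthy=True):
        self.name = name
        self.max_connections = max_connections
        self.active = 0
        self.healthy = healthy

    def can_accept(self) -> bool:
        return self.healthy and self.active < self.max_connections


def route_connection(preferred_order):
    """Pick the first usable node from a nearest-first list of nodes."""
    for node in preferred_order:
        if node.can_accept():
            node.active += 1
            return node.name
    return None  # every node is down or full: reject the connection
```

Rejecting early when every node is saturated keeps the nodes that are still serving traffic from being dragged down with the rest.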
Conclusion
Web systems grow from a single server to massive clusters, and each growth stage introduces new challenges that are solved by adding layers of load balancing, caching, database optimization, and geographic distribution. Optimization is an ongoing process as technology evolves.
ITFLY8 Architecture Home
