Scaling a Web System to 100M Daily Visits: Load Balancing, Caching, and Architecture
This article explains how a web system can grow from 100,000 to 100 million daily visits by introducing multi‑level caching, various load‑balancing strategies, MySQL performance tuning, distributed database setups, and geographic deployment to maintain stability and performance under massive traffic.
Web Load Balancing
Load balancing distributes work across a server cluster, protecting backend servers. Common strategies include:
HTTP redirection (302) – simple, but every request pays for an extra round trip, and the redirecting server itself becomes a bottleneck at large scale.
Reverse‑proxy load balancing (e.g., Nginx) – operates at layer 7, can handle session affinity via cookies or external session stores like Redis or Memcached.
IP load balancing (LVS) – works at layer 4, offers higher performance but is more complex to configure.
DNS load balancing – maps a domain to multiple IPs; easy to set up but lacks fine-grained control, and changes take effect slowly because resolvers cache records until their TTL expires.
DNS/GSLB (global server load balancing) – used by CDNs to direct users to the nearest IP based on geography.
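As a rough illustration of what a layer-7 balancer decides per request, the sketch below shows round-robin selection that skips unhealthy backends, plus an IP-hash function for session affinity. The class and function names are hypothetical, and real balancers such as Nginx or LVS implement these policies internally rather than through code like this.

```python
import hashlib
from itertools import cycle


class RoundRobinBalancer:
    """Hand out backends in rotation, skipping any marked unhealthy."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._ring = cycle(self.backends)

    def mark_down(self, backend):
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # At most one full rotation is needed to find a healthy backend.
        for _ in range(len(self.backends)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backend available")


def pick_by_client_ip(client_ip, backends):
    """Session affinity: the same client IP always maps to the same backend."""
    digest = hashlib.md5(client_ip.encode("utf-8")).hexdigest()
    return backends[int(digest, 16) % len(backends)]
```

IP hashing keeps a user on one server without shared session storage, but it reshuffles most users whenever the backend list changes, which is one reason an external session store is often preferred.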
Web System Cache Mechanisms
To handle growing traffic, caching must be introduced at multiple layers:
Page staticization – generate static HTML once and serve it directly, reducing dynamic processing.
In-memory cache on a single server – e.g., PHP APC; fast, but the cache lives inside one web server, so it is lost when that server fails and is not shared with the other servers.
Dedicated memory cache services – Redis or Memcached provide fast key/value storage; Redis offers richer features.
Cache clusters – use master‑slave or Redis Cluster to avoid single‑point failures and scale cache capacity.
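The common read path behind these cache layers is often called cache-aside: check the cache, fall back to the expensive path on a miss, then store the result with an expiry. The sketch below uses an in-process dictionary as a stand-in for Redis or Memcached; `TTLCache`, `get_page`, and the `render` callback are hypothetical names chosen for illustration.

```python
import time


class TTLCache:
    """In-process stand-in for a key/value cache with per-entry expiry."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, self._clock() + ttl_seconds)


def get_page(cache, page_id, render, ttl_seconds=60):
    """Cache-aside read: try the cache, fall back to rendering, then store."""
    key = f"page:{page_id}"
    html = cache.get(key)
    if html is None:
        html = render(page_id)  # the expensive dynamic path
        cache.set(key, html, ttl_seconds)
    return html
```

The TTL bounds how stale a cached page can get; a shorter TTL means fresher content at the cost of more renders.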
MySQL Internal Cache and Optimization
Key MySQL tuning steps:
Create appropriate indexes to speed up SELECTs while balancing storage and write overhead.
Enable the thread cache (thread_cache_size) so the server reuses threads for new connections instead of creating one per connection.
Consider persistent connections (pconnect) to avoid reconnecting on every request, but pair them with a connection pool that caps the total, so the web tier cannot exhaust max_connections.
Set innodb_buffer_pool_size to a large share of memory (up to about 80% on a dedicated database server) so InnoDB can cache data and indexes in RAM.
When data volume exceeds millions of rows, apply sharding, partitioning, or separate tables to maintain performance.
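One simple way to split data across databases and tables is modulo routing on a stable key. The sketch below assumes a numeric user ID, and the naming scheme (`user_db_N`, `user_NN`) and the default counts are hypothetical.

```python
def shard_for(user_id, db_count=4, tables_per_db=8):
    """Map a user ID to a (database, table) pair by modulo routing."""
    slot = user_id % (db_count * tables_per_db)
    db_index = slot // tables_per_db
    table_index = slot % tables_per_db
    return f"user_db_{db_index}", f"user_{table_index:02d}"
```

Modulo routing is easy to reason about, but changing the shard count later moves most keys, so the counts are usually chosen with headroom.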
MySQL Multi‑Server Architecture
Single‑node MySQL is a single point of failure. Scaling options include:
Master‑slave replication for backup and read‑only queries.
Read‑write separation – writes go to the master, reads to slaves.
Master‑master (mutual backup) – each node acts as both master and slave, providing load distribution and fault tolerance.
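Read-write separation is typically enforced in the application's data-access layer or in a proxy. A minimal sketch of the routing decision follows, assuming connections are represented by plain names; `ReadWriteRouter` is a hypothetical class, and inspecting the SQL verb is a simplification.

```python
import itertools


class ReadWriteRouter:
    """Send plain SELECTs to replicas in rotation, everything else to the master."""

    def __init__(self, master, replicas):
        self.master = master
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql):
        parts = sql.split(None, 1)
        verb = parts[0].upper() if parts else ""
        # Locking reads must see the latest data, so they go to the master.
        is_read = verb == "SELECT" and "FOR UPDATE" not in sql.upper()
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self.master
```

Because replicas can lag, a read issued right after a write may return stale data; applications that need read-your-writes behavior usually pin those reads to the master.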
Data Synchronization Between MySQL Nodes
High‑traffic environments may experience replication lag. Solutions:
MySQL 5.6 and later support multi-threaded replication, parallelized per database, so it only helps when writes are spread across multiple databases.
Custom binlog parsers that apply multi‑threaded writes per table, suitable when tables are independent.
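The per-table approach can be sketched as a dispatcher that sends every event for a given table to the same worker, which keeps per-table order intact while different tables are applied concurrently. This is only an illustration of the idea, not a binlog parser: events are assumed to be already decoded into `(table, statement)` pairs, and `apply_in_parallel` and the `apply` callback are hypothetical.

```python
import queue
import threading
import zlib


def apply_in_parallel(events, apply, worker_count=4):
    """Apply (table, statement) events using one ordered queue per worker.

    All events for a given table land on the same worker, so per-table order
    is preserved while different tables proceed concurrently.
    """
    queues = [queue.Queue() for _ in range(worker_count)]

    def worker(q):
        while True:
            item = q.get()
            if item is None:  # sentinel: no more events
                return
            apply(*item)

    threads = [threading.Thread(target=worker, args=(q,)) for q in queues]
    for t in threads:
        t.start()
    for table, statement in events:
        # crc32 gives a hash that is stable across processes and runs.
        index = zlib.crc32(table.encode("utf-8")) % worker_count
        queues[index].put((table, statement))
    for q in queues:
        q.put(None)
    for t in threads:
        t.join()
```

This ordering guarantee holds only per table, which is why the approach suits workloads whose tables are independent of each other.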
Cache Between Web Servers and Databases
Beyond database caching, a cache layer between web servers and DB reduces read pressure:
Static content caching on disk.
Single‑node in‑memory cache (Redis/Memcached).
Cache clusters (Redis Cluster) to avoid single‑point failures and increase hit rates.
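Write-back (write-behind) strategy: buffer write operations in the cache, batch them, and apply them to the DB periodically. Unlike write-through, which updates the DB synchronously on every write, write-back trades durability for throughput: buffered writes are lost if the cache fails before they are flushed.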
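The batching of buffered writes can be sketched with a counter that accumulates increments in memory and hands them to the database in one batch. `WriteBackCounter` and the `flush_to_db` callback are hypothetical; a real deployment would hold the pending values in Redis or a similar store and also flush on a timer.

```python
from collections import defaultdict


class WriteBackCounter:
    """Accumulate increments in memory and write them out in batches."""

    def __init__(self, flush_to_db, max_pending=100):
        self._pending = defaultdict(int)
        self._flush_to_db = flush_to_db
        self._max_pending = max_pending

    def incr(self, key, amount=1):
        self._pending[key] += amount
        # Flush once enough distinct keys are waiting.
        if len(self._pending) >= self._max_pending:
            self.flush()

    def flush(self):
        if not self._pending:
            return
        batch, self._pending = dict(self._pending), defaultdict(int)
        self._flush_to_db(batch)  # one batched write instead of many
```

Many increments to the same key collapse into a single database update, which is where most of the saving comes from for hot counters such as page views.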
NoSQL Storage
For extremely hot data, offload to a NoSQL key‑value store such as Redis, which can also persist to disk, further relieving MySQL.
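Handling Queries for Non-Existent Data
Requests for data that does not exist miss the cache every time and fall through to the database, wasting resources. Store a mapping of existing keys in the cache and check it first, so these empty queries are filtered out before they reach the cache lookup or the DB.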
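A minimal sketch of this filter follows, using an in-process set as a stand-in for the mapping of existing keys. `ExistingKeys`, `load_item`, and the `query_db` callback are hypothetical names, and the cache is a plain dictionary for brevity.

```python
class ExistingKeys:
    """In-process stand-in for the mapping of existing keys kept in cache."""

    def __init__(self, keys=()):
        self._keys = set(keys)

    def add(self, key):
        self._keys.add(key)

    def __contains__(self, key):
        return key in self._keys


def load_item(item_id, existing, cache, query_db):
    """Return the item, or None without touching cache or DB if it cannot exist."""
    if item_id not in existing:
        return None  # filtered early: no cache lookup, no DB query
    if item_id in cache:
        return cache[item_id]
    row = query_db(item_id)
    if row is not None:
        cache[item_id] = row
    return row
```

The mapping has to be updated whenever rows are created or deleted, otherwise new items would be wrongly filtered out.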
Geographic Deployment (Distributed Nodes)
To reduce latency for distant users, adopt a core‑centered, node‑distributed architecture:
Core services remain centralized in a strategically located data center.
Regional nodes host replicated services closer to users.
Implement node disaster recovery by failing over to nearby nodes.
Apply overload protection by rejecting excess connections or diverting traffic to less‑loaded nodes.
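Failover and overload protection can be combined into a single node-selection step: choose the nearest node that is healthy and still has capacity, and reject the request if none qualifies. The sketch below is illustrative only; the `Node` fields, including the distance and capacity figures, are assumptions rather than details from the article.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    distance_km: float  # distance from the user; assumed known to the router
    healthy: bool = True
    active: int = 0  # current connections
    capacity: int = 1000  # hypothetical connection limit


def choose_node(nodes):
    """Pick the nearest healthy node with spare capacity, or None to reject."""
    candidates = [n for n in nodes if n.healthy and n.active < n.capacity]
    if not candidates:
        return None  # overload protection: refuse rather than pile on
    return min(candidates, key=lambda n: n.distance_km)
```

A user whose nearest node is down or full is diverted to the next-nearest one, at the cost of higher latency for that user.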
Author: 徐汉彬 (@Hansion徐汉彬)
21CTO
