Why Modern Web Apps Need Caching: From Basics to Distributed Strategies
This article explains the fundamentals of caching, why it is essential for high‑traffic web services, compares cache components such as Memcached and Redis, and details design patterns, scalability, high‑availability, and operational practices for building robust distributed cache systems.
Cache Overview
A cache is a hardware or software component that stores data so that later requests for the same data can be served faster. Typical examples include CPU cache, disk cache, and, in web applications, cache components such as Memcached and Redis.
Why Introduce Cache
As traffic grows, a single database can no longer meet performance and cost requirements. Adding a cache layer improves read speed and system scalability and reduces storage cost. Because roughly 80% of requests typically target about 20% of the data (the hot data), caching that hot data dramatically increases overall capacity.
Figure 1: Operation latency comparison (memory vs. disk).
Figure 2: Performance, capacity, and price of different storage media.
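The capacity argument above can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names and the sample numbers (100,000 requests per second, an 80% hit rate, 1 ms cache reads, 10 ms database reads) are assumptions, not measurements from this article.

```python
def backend_qps(total_qps: float, hit_rate: float) -> float:
    """Requests per second that miss the cache and reach the database."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return total_qps * (1.0 - hit_rate)


def mean_latency_ms(hit_rate: float, cache_ms: float, db_ms: float) -> float:
    """Average read latency when a miss pays for the cache lookup and the database read."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_ms + (1.0 - hit_rate) * (cache_ms + db_ms)


if __name__ == "__main__":
    # Hypothetical workload: 100k reads/s, 80% of them served from cache.
    print(round(backend_qps(100_000, 0.8)))          # database load in reads/s
    print(round(mean_latency_ms(0.8, 1.0, 10.0), 2))  # average latency in ms
```

Under these assumed numbers the database sees about a fifth of the read traffic, which is why even a modest hit rate changes how much database capacity is needed.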
Common Cache Types
Caches can be divided into two categories: in‑process caches (e.g., simple maps, EHCache) and external cache components (e.g., Memcached, Redis).
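As a rough illustration of the in-process category, the sketch below builds a small bounded cache on top of Python's `collections.OrderedDict`, evicting the least recently used entry when full. The class name `LRUCache` and its methods are hypothetical and are not taken from EHCache or any other library.

```python
from collections import OrderedDict
from typing import Any, Hashable, Optional


class LRUCache:
    """A bounded in-process cache that evicts the least recently used entry."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable, default: Optional[Any] = None) -> Any:
        if key not in self._items:
            return default
        # A read counts as a use, so move the key to the "most recent" end.
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key: Hashable, value: Any) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self._capacity:
            # Drop the entry at the "least recent" end.
            self._items.popitem(last=False)


if __name__ == "__main__":
    cache = LRUCache(capacity=2)
    cache.put("a", 1)
    cache.put("b", 2)
    cache.get("a")      # "a" is now more recent than "b"
    cache.put("c", 3)   # evicts "b"
    print(cache.get("a"), cache.get("b"), cache.get("c"))
```

An in-process cache like this avoids a network hop but is private to one process, which is the main reason external components such as Memcached and Redis are used once several application servers need to share cached data.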
Memcached Overview
Memcached is an open‑source, high‑performance, distributed key‑value store designed to accelerate websites and reduce database load.
High‑performance key‑value storage
Simple text and binary protocols
Data expiration support
LRU eviction algorithm
Multithreaded architecture
Slab memory management
Client‑side distributed implementation
Memory Management – Slab Allocator
Memcached uses a slab allocator that pre‑divides memory into fixed‑size chunks to reduce fragmentation.
Figure 4: Slab allocator mechanism.
Each slab class groups chunks of a specific size; when allocating, Memcached picks the class that best fits the data size.
Figure 5: Slab class selection.
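The selection rule can be sketched in a few lines. This is a simplified model rather than Memcached's actual code: the starting chunk size (96 bytes), the growth factor (1.25), the 8-byte alignment, and the 1 MB upper bound are assumed parameters chosen for illustration, and both function names are hypothetical.

```python
import bisect
from typing import List


def build_chunk_sizes(min_chunk: int = 96, growth_factor: float = 1.25,
                      max_chunk: int = 1024 * 1024, align: int = 8) -> List[int]:
    """Return the chunk size of each slab class, smallest first."""
    sizes = []
    size = min_chunk
    while size < max_chunk:
        sizes.append(size)
        grown = int(size * growth_factor)
        grown = (grown + align - 1) // align * align  # round up to the alignment
        size = max(grown, size + align)               # always make progress
    sizes.append(max_chunk)
    return sizes


def pick_chunk_size(sizes: List[int], item_size: int) -> int:
    """Pick the smallest chunk size that can hold an item of item_size bytes."""
    index = bisect.bisect_left(sizes, item_size)
    if index == len(sizes):
        raise ValueError("item is larger than the largest chunk")
    return sizes[index]


if __name__ == "__main__":
    sizes = build_chunk_sizes()
    item = 100
    chunk = pick_chunk_size(sizes, item)
    # The gap between chunk and item is memory wasted inside the chunk.
    print(chunk, chunk - item)
```

The unused bytes inside a chunk are the price of avoiding fragmentation: a smaller growth factor produces more classes and less per-item waste, at the cost of spreading memory across more classes.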
Slab Waste and Eviction
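Waste can occur at the chunk, page, or slab level. Memcached evicts items using an LRU algorithm confined to each slab class, which can lead to "slab calcification": memory already assigned to one class stays there even after the distribution of item sizes has shifted, so other classes evict items while that memory sits underused.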
Redis Overview
Redis is an open‑source, high‑performance in‑memory data store that is widely used as a distributed cache. It supports multiple data structures (string, list, hash, set, sorted set, HyperLogLog) and offers features such as data expiration, several eviction policies (including LRU variants), a choice of memory allocators (tcmalloc, jemalloc), persistence (RDB, AOF), and master‑slave replication. Commands are executed on a single thread.
Distributed Cache Implementation
Data Sharding
Data sharding distributes data across multiple instances using range, hash, or slot partitioning. Hash sharding can be static (modulo) or consistent. Static hashing is simple, but adding or removing a node remaps most keys, so rebalancing is expensive. Consistent hashing limits remapping to a small share of keys when nodes are added or removed, but may cause temporary inconsistency while those keys move.
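The difference in rebalancing cost can be checked with a small experiment. The sketch below is a minimal illustration, not the routing code of any particular client or proxy: the node names, the use of SHA-256 as the hash, the 100 virtual nodes per server, and all function and class names are assumptions.

```python
import bisect
import hashlib
from typing import Callable, List


def stable_hash(value: str) -> int:
    """Hash a string to a 64-bit integer that is stable across processes."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def modulo_node(key: str, nodes: List[str]) -> str:
    """Static hashing: node index is hash(key) mod node count."""
    return nodes[stable_hash(key) % len(nodes)]


class HashRing:
    """Consistent hashing with virtual nodes placed on a sorted ring."""

    def __init__(self, nodes: List[str], vnodes: int = 100) -> None:
        self._ring = sorted(
            (stable_hash(f"{node}#{i}"), node)
            for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        # First ring point clockwise from the key's hash, wrapping at the end.
        index = bisect.bisect_right(self._points, stable_hash(key)) % len(self._ring)
        return self._ring[index][1]


def moved_fraction(keys: List[str], before: Callable[[str], str],
                   after: Callable[[str], str]) -> float:
    """Share of keys whose owning node changes between two routing functions."""
    return sum(1 for key in keys if before(key) != after(key)) / len(keys)


if __name__ == "__main__":
    keys = [f"user:{i}" for i in range(10_000)]
    old_nodes = ["cache-a", "cache-b", "cache-c", "cache-d"]
    new_nodes = old_nodes + ["cache-e"]  # scale out by one node

    modulo_moved = moved_fraction(
        keys, lambda k: modulo_node(k, old_nodes), lambda k: modulo_node(k, new_nodes))
    old_ring, new_ring = HashRing(old_nodes), HashRing(new_nodes)
    ring_moved = moved_fraction(keys, old_ring.node_for, new_ring.node_for)
    print(f"modulo: {modulo_moved:.0%} moved, ring: {ring_moved:.0%} moved")
```

With a uniform hash, growing from four to five nodes should remap roughly four in five keys under modulo hashing, while the ring should remap only the share that the new node takes over. Every remapped key is a cache miss until it is repopulated, which is where the rebalancing cost comes from.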
Sharding Implementation Models
Client‑side sharding (e.g., Memcached, Redis 2.x)
Proxy‑side sharding (e.g., twemproxy, Codis, internal CacheService)
Server‑side sharding (e.g., Redis 3.x, Cassandra)
High Availability
To avoid cache‑miss storms when a node fails, a master‑slave architecture is used so that the system remains available even if a master goes down.
Figure 6: Master/Slave cache model.
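One way to picture the master/slave model is a client that writes to both copies and falls back to the slave when the master is unreachable. The sketch below assumes client-side double writes, which the article does not specify. `InMemoryNode` is a hypothetical stand-in for a real Memcached or Redis client, and `CacheNodeDown` is a made-up exception.

```python
from typing import Dict, Optional


class CacheNodeDown(Exception):
    """Raised by the stand-in node when it is marked as unreachable."""


class InMemoryNode:
    """Hypothetical stand-in for one cache server; not a real client API."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.alive = True

    def get(self, key: str) -> Optional[str]:
        if not self.alive:
            raise CacheNodeDown(key)
        return self.data.get(key)

    def set(self, key: str, value: str) -> None:
        if not self.alive:
            raise CacheNodeDown(key)
        self.data[key] = value


class MasterSlaveCache:
    """Reads try the master first, then the slave; writes go to both."""

    def __init__(self, master: InMemoryNode, slave: InMemoryNode) -> None:
        self._nodes = (master, slave)

    def set(self, key: str, value: str) -> None:
        for node in self._nodes:
            try:
                node.set(key, value)
            except CacheNodeDown:
                continue  # keep serving from the surviving copy

    def get(self, key: str) -> Optional[str]:
        for node in self._nodes:
            try:
                value = node.get(key)
            except CacheNodeDown:
                continue  # node failed: try the next copy
            if value is not None:
                return value
        return None  # a genuine miss; the caller goes to the database


if __name__ == "__main__":
    master, slave = InMemoryNode(), InMemoryNode()
    cache = MasterSlaveCache(master, slave)
    cache.set("user:1", "alice")
    master.alive = False          # simulate a master failure
    print(cache.get("user:1"))    # still served, from the slave
```

Because the slave already holds the hot data, losing the master does not turn every read into a database query.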
Scalability – Multi‑Level Cache
To handle sudden traffic spikes, an L1 cache layer (smaller, hotter) is added in front of the master/slave layer. Multiple L1 groups can be scaled horizontally.
Figure 7: L1 cache architecture.
Figure 8: Master as an L1 cache group.
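A possible read path through these layers is sketched below. It is an illustration under stated assumptions rather than a description of a specific system: plain dictionaries stand in for cache pools, an L1 group is picked at random per request, hits in a lower layer are copied back into the layers above, and the names `LayeredCache` and `load_from_db` are hypothetical.

```python
import random
from typing import Callable, Dict, List, Optional

Tier = Dict[str, str]  # a plain dict stands in for one cache pool


class LayeredCache:
    """Read path: one L1 group, then master, then slave, then the database."""

    def __init__(self, l1_groups: List[Tier], master: Tier, slave: Tier,
                 load_from_db: Callable[[str], Optional[str]]) -> None:
        self.l1_groups = l1_groups
        self.master = master
        self.slave = slave
        self.load_from_db = load_from_db

    def get(self, key: str) -> Optional[str]:
        # Spreading requests over several L1 groups divides hot-key traffic.
        l1 = random.choice(self.l1_groups)
        tiers = [l1, self.master, self.slave]
        for depth, tier in enumerate(tiers):
            value = tier.get(key)
            if value is not None:
                # Back-fill the layers that missed so the next read stops earlier.
                for upper in tiers[:depth]:
                    upper[key] = value
                return value
        value = self.load_from_db(key)
        if value is not None:
            for tier in tiers:
                tier[key] = value
        return value


if __name__ == "__main__":
    groups: List[Tier] = [{}, {}, {}]
    cache = LayeredCache(groups, master={"hot": "v"}, slave={},
                         load_from_db=lambda key: "row-" + key)
    print(cache.get("hot"), cache.get("cold"))
```

Adding another L1 group only means appending one more pool to `l1_groups`, which is what makes this layer easy to scale horizontally in the sketch.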
Cache Design Practices
Multiget Hole: When many nodes are involved, multiget latency is limited by the slowest node; limit node count to 4‑8 or use replication.
Reverse Cache: Store a null value for non‑existent keys to prevent DB penetration.
Fail‑Fast: Mark failing nodes as unavailable after N consecutive timeouts and return errors quickly.
No Expiration (Cache‑as‑Storage): Keep all data in memory for extremely hot workloads, accepting higher memory cost.
Dog‑Pile Effect: Prevent massive DB load when a hot key expires by ensuring only one request repopulates the cache.
Hotspot Handling: Deploy multiple small L1 caches to absorb sudden spikes.
Avalanche Prevention: Ensure cache high availability, apply degradation and flow control, and monitor resource capacity.
Data Consistency: Adopt eventual consistency between master and replicas, cache and storage, and across business dimensions.
Capacity Planning: Consider request volume, hit rate, network bandwidth, storage size, and connection limits.
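Two of the practices above, the reverse cache and dog‑pile protection, can be combined in one read path. The sketch below is a single-process illustration using a per-key lock; a distributed deployment would need a shared lock or a similar mechanism, and cached null markers would normally carry a short expiry, both of which are left out here. The class name `ProtectedCache` and the `load_from_db` callback are hypothetical.

```python
import threading
from typing import Any, Callable, Dict, Hashable, Optional

_NOT_CACHED = object()  # nothing stored for this key yet
_NOT_IN_DB = object()   # null marker: the database has no such key


class ProtectedCache:
    """Cache-aside reads with null markers and one loader call per key."""

    def __init__(self, load_from_db: Callable[[Hashable], Optional[Any]]) -> None:
        self._load = load_from_db
        self._data: Dict[Hashable, Any] = {}
        self._data_lock = threading.Lock()
        # One lock per key; this grows with the key space in this sketch.
        self._key_locks: Dict[Hashable, threading.Lock] = {}

    def _peek(self, key: Hashable) -> Any:
        with self._data_lock:
            return self._data.get(key, _NOT_CACHED)

    def _lock_for(self, key: Hashable) -> threading.Lock:
        with self._data_lock:
            return self._key_locks.setdefault(key, threading.Lock())

    def get(self, key: Hashable) -> Optional[Any]:
        value = self._peek(key)
        if value is _NOT_CACHED:
            # Only one thread per key passes this lock at a time.
            with self._lock_for(key):
                value = self._peek(key)  # another thread may have filled it
                if value is _NOT_CACHED:
                    loaded = self._load(key)
                    # Cache a null marker so repeated lookups skip the database.
                    value = _NOT_IN_DB if loaded is None else loaded
                    with self._data_lock:
                        self._data[key] = value
        return None if value is _NOT_IN_DB else value


if __name__ == "__main__":
    rows = {"user:1": "alice"}
    cache = ProtectedCache(rows.get)
    print(cache.get("user:1"), cache.get("ghost"))
```

The second `_peek` inside the lock is what keeps waiting requests from repeating the database query once the first request has repopulated the entry.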
Weibo Cache Service (CacheService)
Weibo built an internal cache‑service platform to address resource management complexity, high‑availability reuse, operational difficulty, and SLA monitoring. The architecture includes a stateless proxy layer, a resource layer (initially Memcached, later Redis and SSDCache), client configuration, a ConfigServer for dynamic settings, and a ClusterManager for lifecycle and SLA management.
Figure 9: CacheService architecture.
The system continues to evolve, aiming to improve cold‑hot data tiering, reduce service cost, and simplify cluster management.
ITFLY8 Architecture Home
