Avoid Common Pitfalls in Distributed Caching with Redis and Memcached
This article analyzes the characteristics of Redis and Memcached, explains typical design mistakes and failure modes such as naive modulo hashing, cache avalanche, penetration, and hot-key issues, and provides practical solutions like consistent hashing, binlog-driven cache invalidation, key versioning, and distributed locking to improve cache reliability and performance.
Preface
In high‑concurrency systems we often use Redis or Memcached as distributed caches to reduce database pressure, but improper cache design can cause serious problems. This article collects common pitfalls and corresponding solutions.
1. Server Characteristics of the Two Caches
1.1 Memcached
Memcached (referred to here as mc) has no built-in clustering; the client handles data distribution. The example uses xmemcached, which supports multiple hash strategies. By default it picks an instance by taking the key's hash modulo the instance count, so adding or removing a node remaps most keys at once and can trigger a cache avalanche.
To mitigate this, the client enables consistent hashing (Ketama) for data sharding:
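import net.rubyeye.xmemcached.XMemcachedClientBuilder;
import net.rubyeye.xmemcached.impl.KetamaMemcachedSessionLocator;
import net.rubyeye.xmemcached.utils.AddrUtil;
// servers, the timeouts, transcoder, pool size, and keyProvider come from application config
XMemcachedClientBuilder builder = new XMemcachedClientBuilder(AddrUtil.getAddresses(servers));
builder.setOpTimeout(opTimeout);
builder.setConnectTimeout(connectTimeout);
builder.setTranscoder(transcoder);
builder.setConnectionPoolSize(connectPoolSize);
builder.setKeyProvider(keyProvider);
builder.setSessionLocator(new KetamaMemcachedSessionLocator()); // enable Ketama consistent hashing
Consistent hashing limits the impact of node changes to a small subset of data, and virtual nodes improve data distribution.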
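To make the difference concrete, the following is a minimal sketch (not xmemcached's Ketama implementation) that compares modulo placement with a small hash ring when a fifth node joins. The node names, the MD5-based hash, the 160 virtual nodes per server, and the key set are all assumptions chosen for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class HashRingDemo {
    // First 8 bytes of MD5 as a long; any well-mixed hash would do here.
    static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) h = (h << 8) | (d[i] & 0xFF);
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Naive placement: hash modulo node count.
    static String moduloNode(String key, List<String> nodes) {
        return nodes.get((int) Long.remainderUnsigned(hash(key), nodes.size()));
    }

    // Each physical node is placed on the ring many times (virtual nodes).
    static TreeMap<Long, String> buildRing(List<String> nodes, int virtualNodes) {
        TreeMap<Long, String> ring = new TreeMap<>();
        for (String node : nodes)
            for (int v = 0; v < virtualNodes; v++) ring.put(hash(node + "#" + v), node);
        return ring;
    }

    // A key belongs to the first ring position at or after its hash, wrapping around.
    static String ringNode(String key, TreeMap<Long, String> ring) {
        Map.Entry<Long, String> e = ring.ceilingEntry(hash(key));
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    // Fraction of 10,000 sample keys whose node changes when "mc5" joins.
    static double movedFraction(boolean useRing) {
        List<String> before = List.of("mc1", "mc2", "mc3", "mc4");
        List<String> after = List.of("mc1", "mc2", "mc3", "mc4", "mc5");
        TreeMap<Long, String> ringBefore = buildRing(before, 160);
        TreeMap<Long, String> ringAfter = buildRing(after, 160);
        int total = 10_000, moved = 0;
        for (int i = 0; i < total; i++) {
            String key = "key-" + i;
            String a = useRing ? ringNode(key, ringBefore) : moduloNode(key, before);
            String b = useRing ? ringNode(key, ringAfter) : moduloNode(key, after);
            if (!a.equals(b)) moved++;
        }
        return (double) moved / total;
    }

    public static void main(String[] args) {
        System.out.printf("modulo: %.2f of keys moved%n", movedFraction(false));
        System.out.printf("ring:   %.2f of keys moved%n", movedFraction(true));
    }
}
```

With a well-mixed hash, going from four to five nodes should move about four fifths of the keys under modulo placement and about one fifth on the ring, which is why the ring is the safer choice when scaling.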
Memcached is multithreaded; by default each value can be at most 1 MB. Expired key-value pairs are removed lazily on their next access, so a key that is never accessed again stays in memory until LRU eviction reclaims it.
1.2 Redis
Redis offers a cluster mode where the server handles key routing. It provides master‑slave high availability and uses hash slots (16384 slots) for sharding instead of consistent hashing. The slot for a key is calculated by CRC16(key) % 16384. Adding or removing nodes triggers slot reallocation.
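As a rough illustration of the slot formula, the sketch below computes CRC16 (the CCITT/XMODEM variant that Redis Cluster specifies) and maps a key to one of the 16384 slots. It deliberately ignores hash tags (the {...} syntax), which a real client also has to handle; the class name and sample key are made up for this example.

```java
import java.nio.charset.StandardCharsets;

class RedisSlot {
    // CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0x0000, no reflection.
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = (crc & 0x8000) != 0 ? (crc << 1) ^ 0x1021 : crc << 1;
                crc &= 0xFFFF; // keep the register at 16 bits
            }
        }
        return crc;
    }

    // Slot = CRC16(key) mod 16384 (hash tags not handled in this sketch).
    static int slot(String key) {
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % 16384;
    }

    public static void main(String[] args) {
        System.out.println("slot for user:1001 = " + slot("user:1001"));
    }
}
```

Because the slot depends only on the key, every client and every node computes the same answer without coordination; only the slot-to-node mapping changes when the cluster is resized.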
When a node is added, the slot ranges of the existing nodes shrink and the displaced slots are migrated to the new node. The keys in each migrating slot are moved in batches.
2. Cache Structure Selection
Memcached provides simple key-value storage with a 1 MB limit per value, which suits plain serialized payloads. Redis offers rich data structures (hashes, sorted sets, etc.) that are better suited to queries, sorting, and pagination. The two can be combined: Redis holds the index, and Memcached holds the detailed payload.
A typical pattern is to store the primary identifier (e.g., id) in Redis, retrieve the full object from Memcached, and let Redis handle ranking, pagination, and sorting.
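The pattern can be sketched with in-memory stand-ins: a TreeMap plays the role of a Redis sorted set and a HashMap plays the role of Memcached. All names here are hypothetical, and the TreeMap stand-in assumes unique scores, which a real sorted set does not require.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class IndexPlusPayload {
    // Stand-in for a Redis sorted set: score -> id, highest score first.
    final TreeMap<Double, Long> rankIndex = new TreeMap<>(Comparator.reverseOrder());
    // Stand-in for Memcached: id -> serialized object.
    final Map<Long, String> payloads = new HashMap<>();

    void save(long id, double score, String payload) {
        rankIndex.put(score, id);
        payloads.put(id, payload);
    }

    // Step 1: page through ids in the index. Step 2: fetch each payload by id.
    List<String> page(int pageNo, int pageSize) {
        List<String> result = new ArrayList<>();
        rankIndex.values().stream()
                .skip((long) pageNo * pageSize)
                .limit(pageSize)
                .forEach(id -> result.add(payloads.get(id)));
        return result;
    }
}
```

In a real deployment a payload lookup can miss even though the id is in the index, so the caller would still need a fallback to the database for those ids.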
3. Redis Large‑Index Backfill Issue
When a cached index expires, rebuilding it from a large data set can be slow. Using a message queue to rebuild the index page‑by‑page reduces the impact on the database.
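One way to picture the page-by-page rebuild is the sketch below, where an in-memory queue stands in for the message queue and a list stands in for the table. The page size, the queryPage helper, and the "enqueue the next page after finishing the current one" protocol are assumptions for illustration, not the article's exact design.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class PagedIndexRebuild {
    static final int PAGE_SIZE = 100; // assumed page size

    // Hypothetical DB query: one page of ids, bounded so a single call stays cheap.
    static List<Long> queryPage(List<Long> table, int page) {
        int from = Math.min(page * PAGE_SIZE, table.size());
        int to = Math.min(from + PAGE_SIZE, table.size());
        return table.subList(from, to);
    }

    // Each message carries one page number; handling it loads that page only.
    static List<Long> rebuild(List<Long> table) {
        Queue<Integer> queue = new ArrayDeque<>(); // stand-in for the message queue
        List<Long> index = new ArrayList<>();      // stand-in for the Redis index
        queue.add(0);
        while (!queue.isEmpty()) {
            int page = queue.poll();
            List<Long> rows = queryPage(table, page);
            index.addAll(rows);
            if (rows.size() == PAGE_SIZE) queue.add(page + 1); // a full page means more may follow
        }
        return index;
    }
}
```

Since every message touches at most one page, the database sees a series of small queries that the consumer can pace, rather than one large scan.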
4. Consistency Problems
Typical cache‑DB consistency flow: after a DB write, the service deletes the related cache key; a read‑only service then experiences a cache miss, reads from the DB, and writes the fresh value back to the cache.
Concurrent reads and writes can cause stale data to be written back to the cache, leading to inconsistency.
4.1 Concurrent Read/Write Inconsistency
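A reader and a writer can interleave badly: the reader misses the cache and loads the old value from the DB, the writer then updates the DB and deletes the cache key, and finally the reader writes its now stale value back to the cache. The cache then serves old data until the key expires or is invalidated again.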
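The problematic interleaving can be replayed deterministically. The sketch below uses two plain maps as stand-ins for the DB and the cache and runs the reader's and writer's steps in the unlucky order; the key and values are made up.

```java
import java.util.HashMap;
import java.util.Map;

class StaleWriteBackDemo {
    // Replays one bad interleaving and returns what the cache holds afterwards.
    static String run() {
        Map<String, String> db = new HashMap<>();
        Map<String, String> cache = new HashMap<>();
        db.put("user:1", "v1");

        // Reader, step 1: cache miss, so it loads the current row from the DB.
        String loadedByReader = cache.containsKey("user:1") ? cache.get("user:1") : db.get("user:1");

        // Writer: updates the DB, then deletes the cache key.
        db.put("user:1", "v2");
        cache.remove("user:1");

        // Reader, step 2: writes back the value it loaded before the update.
        cache.put("user:1", loadedByReader);

        return cache.get("user:1"); // "v1", although the DB now holds "v2"
    }

    public static void main(String[] args) {
        System.out.println("cache holds: " + run());
    }
}
```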
4.2 Master‑Slave Sync Delay
If the read side uses a replica that lags behind the master, a cache miss that happens right after an invalidation may load the old row from the replica and write it back to the cache.
4.3 Cache Pollution
Changing the cache schema (e.g., adding a new field) without versioning can cause pre‑release or gray‑release data to pollute production cache.
5. How to Deal with Consistency Issues
5.1 Binlog + Message Queue + Consumer Delete Cache
Each table change generates a binlog event, which is captured by Canal and pushed to a message queue. Consumers then delete the corresponding cache keys, so invalidation follows the order of the binlog as long as the queue and the consumers preserve that order.
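The consumer side of this pipeline can be sketched as follows. This is not Canal's API: RowChange is a hypothetical, already-decoded event, the queue is an in-memory stand-in for the message queue, and the table:id key format is an assumption.

```java
import java.util.Map;
import java.util.Queue;

class BinlogInvalidation {
    // Hypothetical decoded binlog event: which table and which primary key changed.
    static final class RowChange {
        final String table;
        final long id;
        RowChange(String table, long id) {
            this.table = table;
            this.id = id;
        }
    }

    // Assumed key convention; it must match whatever the read path uses.
    static String cacheKey(RowChange change) {
        return change.table + ":" + change.id;
    }

    // Drains the queue in arrival order and deletes the matching cache entries.
    static void consume(Queue<RowChange> queue, Map<String, String> cache) {
        RowChange change;
        while ((change = queue.poll()) != null) {
            cache.remove(cacheKey(change));
        }
    }
}
```

Because deletion is idempotent, redelivered messages are harmless, which makes this consumer tolerant of at-least-once queues.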
5.2 Consume the Slave's Binlog
Listening to the slave's binlog means the cache is invalidated only after the slave has applied the change, so a read that follows the invalidation sees fresh data on both master and slave. Multi-slave setups require careful handling of multiple binlog streams.
5.3 Key Version Upgrade
When a cache structure changes, append a version suffix (e.g., _v2) to the key to avoid mixing old and new formats.
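A small helper keeps the version in one place so that every reader and writer builds the same key. The key prefix and the constant name below are assumptions for illustration.

```java
class CacheKeys {
    // Bump this whenever the cached structure of a user changes.
    static final String USER_SCHEMA_VERSION = "v2";

    static String userKey(long id) {
        return "user:" + id + "_" + USER_SCHEMA_VERSION;
    }

    public static void main(String[] args) {
        System.out.println(userKey(42)); // user:42_v2
    }
}
```

Old-format entries under the previous suffix are simply never read again and age out through their TTL.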
6. Hit‑Rate Problems
Frequent data changes generate many binlog messages, causing rapid cache invalidation and a drop in hit rate. One mitigation is to let the binlog consumer directly refresh the cache instead of deleting the key, ensuring that the latest data is always available.
To preserve order in a multithreaded consumer, group messages by a stable identifier (e.g., id) and process each group in a single thread.
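One possible way to implement this grouping is to route each message to one of several single-threaded executors based on its id. The class and method names below are hypothetical; the only library pieces used are the standard java.util.concurrent executors.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class OrderedConsumer {
    private final ExecutorService[] workers;

    OrderedConsumer(int workerCount) {
        workers = new ExecutorService[workerCount];
        for (int i = 0; i < workerCount; i++) workers[i] = Executors.newSingleThreadExecutor();
    }

    // Messages with the same id always land on the same single-threaded worker,
    // so they run in submission order; different ids can run in parallel.
    void submit(long id, Runnable handler) {
        int slot = (int) Math.floorMod(id, (long) workers.length);
        workers[slot].execute(handler);
    }

    void shutdownAndWait() {
        for (ExecutorService w : workers) w.shutdown();
        try {
            for (ExecutorService w : workers) w.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Ordering is only per id; two different ids that share a worker are also serialized, which costs some parallelism but never breaks correctness.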
7. Cache Penetration
7.1 What Is It?
Requests with parameters that do not exist in the database repeatedly hit the DB because the cache returns a miss each time.
7.2 Mitigation
Cache empty results with a short TTL, and optionally filter obviously invalid IDs (e.g., IDs far beyond the expected range) before querying the cache.
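The two mitigations can be sketched together with an in-memory cache. The TTL values, the "ids must be positive" rule, and the class itself are assumptions for illustration; the current time is passed in explicitly so the behavior is easy to follow.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class NullCachingLoader {
    static final long VALUE_TTL_MS = 600_000; // assumed TTL for rows that exist
    static final long EMPTY_TTL_MS = 30_000;  // assumed short TTL for "not found"

    private static final class Entry {
        final Optional<String> value;
        final long expiresAt;
        Entry(Optional<String> value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<Long, Entry> cache = new ConcurrentHashMap<>();
    private final Function<Long, Optional<String>> db; // stand-in for the DB lookup
    int dbHits = 0; // demo counter only, not thread-safe

    NullCachingLoader(Function<Long, Optional<String>> db) {
        this.db = db;
    }

    Optional<String> get(long id, long nowMs) {
        if (id <= 0) return Optional.empty(); // obviously invalid: skip cache and DB
        Entry e = cache.get(id);
        if (e != null && e.expiresAt > nowMs) return e.value; // may be a cached "not found"
        dbHits++;
        Optional<String> value = db.apply(id);
        long ttl = value.isPresent() ? VALUE_TTL_MS : EMPTY_TTL_MS;
        cache.put(id, new Entry(value, nowMs + ttl));
        return value;
    }
}
```

The short TTL on empty results bounds how long a row created later stays invisible, at the cost of one extra DB lookup per TTL window for ids that truly do not exist.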
8. Cache Breakdown (Stampede)
8.1 What Is It?
When a hot key expires, a flood of concurrent requests bypass the cache and hit the DB, overwhelming it.
8.2 Solutions
Add a mutex inside the backfill method to allow only one thread to rebuild the cache.
Use a distributed lock for the same purpose across multiple service instances, acknowledging the added complexity.
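The in-process variant can be sketched with a per-key lock and a second cache check inside the lock. The class name, the loader function, and the demo helper are hypothetical; the loader is assumed to return a non-null value.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class MutexBackfillCache {
    private final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, Object> locks = new ConcurrentHashMap<>();
    private final Function<String, String> loader; // stand-in for the DB query
    final AtomicInteger loads = new AtomicInteger();

    MutexBackfillCache(Function<String, String> loader) {
        this.loader = loader;
    }

    String get(String key) {
        String value = cache.get(key);
        if (value != null) return value;
        Object lock = locks.computeIfAbsent(key, k -> new Object());
        synchronized (lock) {
            value = cache.get(key); // re-check: another thread may have rebuilt it
            if (value == null) {
                loads.incrementAndGet();
                value = loader.apply(key);
                cache.put(key, value);
            }
        }
        return value;
    }

    // Fires concurrent reads at one cold key and reports how many reached the loader.
    static int demo(int threads) {
        MutexBackfillCache c = new MutexBackfillCache(k -> "value-of-" + k);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) pool.execute(() -> c.get("hot"));
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return c.loads.get();
    }
}
```

This only protects a single JVM. Across several service instances the same shape applies with a distributed lock in place of the synchronized block, for example one acquired with Redis SET using the NX and PX options, plus a timeout so a crashed holder cannot block rebuilds forever.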
9. Cache Avalanche
9.1 What Is It?
A large number of keys expire simultaneously, or cache nodes become unavailable, so most requests fall through to the DB at once and overload it.
9.2 Prevention
Configure high‑availability for both Memcached and Redis (e.g., Redis Cluster with master‑slave).
Use consistent hashing with virtual nodes to reduce the impact of scaling events.
Apply back‑source rate limiting so that when the cache is down, DB traffic is throttled.
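Back-source rate limiting can be as simple as capping how many DB loads run at once. The sketch below uses a java.util.concurrent.Semaphore; the class name and the choice to return an empty Optional when rejected are assumptions, and a real service would decide what degraded response to serve in that case.

```java
import java.util.Optional;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

class BackSourceLimiter {
    private final Semaphore permits;

    BackSourceLimiter(int maxConcurrentDbLoads) {
        permits = new Semaphore(maxConcurrentDbLoads);
    }

    // Runs the DB query only if a permit is free; otherwise returns empty
    // immediately so the caller can degrade instead of piling onto the DB.
    <T> Optional<T> tryLoad(Supplier<T> dbQuery) {
        if (!permits.tryAcquire()) return Optional.empty();
        try {
            return Optional.ofNullable(dbQuery.get());
        } finally {
            permits.release();
        }
    }
}
```

Note that in this sketch an empty result is ambiguous between "rejected" and "query returned null"; a production version would likely distinguish the two.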
10. Hot‑Key Problem
10.1 What Is It?
A single hot key concentrates traffic on one cache node, potentially causing a node failure.
10.2 Mitigation
Maintain multiple replicas of the hot key across different cache nodes and route requests randomly among them.
Introduce a short‑lived local cache layer for the hot key to absorb spikes.
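The replica approach can be sketched as a key-naming scheme: the hot key is stored under several suffixed copies so that the copies hash to different nodes, and each read picks one at random. The replica count and the # suffix format are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

class HotKeyReplicas {
    static final int REPLICAS = 8; // assumed number of copies

    static String replicaKey(String hotKey, int replica) {
        return hotKey + "#" + replica;
    }

    // Reads spread across the copies, and therefore across cache nodes.
    static String readKey(String hotKey) {
        return replicaKey(hotKey, ThreadLocalRandom.current().nextInt(REPLICAS));
    }

    // Writers and invalidators must touch every copy to keep them in sync.
    static List<String> allReplicaKeys(String hotKey) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < REPLICAS; i++) keys.add(replicaKey(hotKey, i));
        return keys;
    }
}
```

The trade-off is write amplification: every update or invalidation of the hot key now costs one operation per copy.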
Code Ape Tech Column
Former Ant Group P8 engineer, pure technologist, sharing full‑stack Java, job interview and career advice through a column. Site: java-family.cn
