Avoid Common Pitfalls in Distributed Caching with Redis and Memcached

This article analyzes the characteristics of Redis and Memcached, explains typical design mistakes and failure modes such as naive modulo hashing, cache avalanche, penetration, and hot-key issues, and provides practical solutions like consistent hashing, binlog-driven cache invalidation, key versioning, and distributed locking to improve cache reliability and performance.

Code Ape Tech Column

Preface

In high‑concurrency systems we often use Redis or Memcached as distributed caches to reduce database pressure, but improper cache design can cause serious problems. This article collects common pitfalls and corresponding solutions.

1. Server Characteristics of the Two Caches

1.1 Memcached

Memcached has no built-in clustering; the client decides which instance stores each key. The example uses the xmemcached client, which supports multiple hash strategies. Its default strategy takes the key's hash modulo the number of instances, so adding or removing a node remaps most keys to a different instance. The resulting wave of cache misses can cause a cache avalanche.

To mitigate this, the client enables consistent hashing (Ketama) for data sharding:

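import net.rubyeye.xmemcached.MemcachedClient;
import net.rubyeye.xmemcached.XMemcachedClientBuilder;
import net.rubyeye.xmemcached.impl.KetamaMemcachedSessionLocator;
import net.rubyeye.xmemcached.utils.AddrUtil;

XMemcachedClientBuilder builder = new XMemcachedClientBuilder(AddrUtil.getAddresses(servers)); // servers: "host1:port1 host2:port2"
builder.setOpTimeout(opTimeout);                 // per-operation timeout, in milliseconds
builder.setConnectTimeout(connectTimeout);       // connection timeout, in milliseconds
builder.setTranscoder(transcoder);               // value serialization
builder.setConnectionPoolSize(connectPoolSize);  // connections per server
builder.setKeyProvider(keyProvider);             // key pre-processing, e.g. adding a namespace prefix
builder.setSessionLocator(new KetamaMemcachedSessionLocator()); // enable Ketama consistent hashing
MemcachedClient client = builder.build();        // throws IOException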

Consistent hashing limits the impact of node changes to a small subset of data, and virtual nodes improve data distribution.
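To make the effect of virtual nodes concrete, here is a minimal sketch of a hash ring. It is an illustration only, not xmemcached's internal Ketama implementation; the class name `HashRing`, the `node#i` virtual-node naming, and the choice of the first 8 bytes of MD5 as the ring position are all assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative hash ring with virtual nodes (hypothetical class, not a library API).
class HashRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    HashRing(List<String> nodes, int virtualNodes) {
        this.virtualNodes = virtualNodes;
        for (String node : nodes) addNode(node);
    }

    // Each physical node is placed on the ring many times to even out the distribution.
    void addNode(String node) {
        for (int i = 0; i < virtualNodes; i++) ring.put(hash(node + "#" + i), node);
    }

    void removeNode(String node) {
        for (int i = 0; i < virtualNodes; i++) ring.remove(hash(node + "#" + i));
    }

    // Walk clockwise to the first virtual node at or after the key's position,
    // wrapping around to the start of the ring. Assumes the ring is not empty.
    String nodeFor(String key) {
        Map.Entry<Long, String> e = ring.ceilingEntry(hash(key));
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    // Ring position: the first 8 bytes of the MD5 digest, read as a long.
    private static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) h = (h << 8) | (d[i] & 0xFF);
            return h;
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException(ex);
        }
    }
}
```

With this structure, adding a node only takes over the ring segments that its virtual nodes land on, so a key either stays where it was or moves to the new node. With modulo hashing, most keys would move.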

Memcached is multithreaded, and each value is limited to 1 MB by default. Expiration is lazy: an expired k-v pair is removed the next time it is accessed, and a key that is never accessed again stays in memory until LRU eviction reclaims it.

1.2 Redis

Redis offers a cluster mode where the server handles key routing. It provides master‑slave high availability and uses hash slots (16384 slots) for sharding instead of consistent hashing. The slot for a key is calculated by CRC16(key) % 16384. Adding or removing nodes triggers slot reallocation.
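The slot calculation can be sketched in a few lines. Redis Cluster uses the XMODEM variant of CRC16 (polynomial 0x1021, initial value 0). The class name `RedisSlot` is hypothetical, and this sketch deliberately ignores hash tags (the `{...}` syntax that forces related keys into the same slot), so treat it as an illustration of the formula rather than a full client-side router.

```java
import java.nio.charset.StandardCharsets;

// Illustrative slot calculation (hypothetical class; hash tags are not handled).
class RedisSlot {
    static final int SLOT_COUNT = 16384;

    // CRC16, XMODEM variant: polynomial 0x1021, initial value 0, no bit reflection.
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF; // keep the register at 16 bits
            }
        }
        return crc;
    }

    static int slotFor(String key) {
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % SLOT_COUNT;
    }
}
```

Because the key-to-slot mapping never changes, scaling the cluster only changes which node owns a slot.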

When a node is added, the slot ranges of the existing nodes shrink, and the slots they give up are migrated to the new node. Migration moves the keys of each slot in batches.

Redis slot migration diagram

2. Cache Structure Selection

Memcached provides simple k-v storage with a 1 MB limit per value, which suits plain serialized payloads. Redis offers rich data structures (hashes, sorted sets, etc.) that are better suited to queries, sorting, and pagination. The two can be combined: Redis holds the index, and Memcached holds the detailed payload.

A typical pattern is to store the primary identifier (e.g., id) in Redis, retrieve the full object from Memcached, and let Redis handle ranking, pagination, and sorting.
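The split between index and payload might look like the following sketch. To stay self-contained it uses in-memory stand-ins: a `TreeSet` plays the role of the Redis sorted set and a `HashMap` plays the role of Memcached. The class name, the `item:<id>` key format, and the method names are assumptions for illustration; the comments note the Redis and Memcached operations each step would correspond to.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// In-memory sketch of "ids ranked in Redis, payloads in Memcached" (hypothetical names).
class LeaderboardSketch {
    // Stand-in for a Redis sorted set: {score, id} pairs, highest score first.
    private final TreeSet<long[]> index = new TreeSet<>(
            Comparator.<long[]>comparingLong(e -> e[0]).reversed()
                      .thenComparingLong(e -> e[1]));
    private final Map<Long, Long> scores = new HashMap<>();
    // Stand-in for Memcached: key -> serialized object.
    private final Map<String, String> details = new HashMap<>();

    void add(long id, long score, String payload) {
        Long old = scores.put(id, score);
        if (old != null) index.remove(new long[] {old, id}); // drop the previous ranking entry
        index.add(new long[] {score, id});                   // Redis: ZADD board <score> <id>
        details.put("item:" + id, payload);                  // Memcached: set item:<id> <payload>
    }

    // Returns one page of payloads in ranking order. A null element would mean the
    // payload is missing from the detail cache and has to be reloaded from the DB.
    List<String> page(int pageNo, int pageSize) {
        List<String> result = new ArrayList<>();
        index.stream()
             .skip((long) pageNo * pageSize)                 // Redis: ZREVRANGE board <start> <stop>
             .limit(pageSize)
             .forEach(e -> result.add(details.get("item:" + e[1]))); // Memcached: multi-get by id
        return result;
    }
}
```

The index stays small because it holds only ids and scores, while each payload is cached once and can be shared by several indexes.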

Redis leaderboard structure

3. Redis Large‑Index Backfill Issue

When a cached index expires, rebuilding it from a large data set can be slow. Using a message queue to rebuild the index page‑by‑page reduces the impact on the database.
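One possible shape for page-by-page backfill is sketched below. An in-memory queue stands in for the message queue and a map stands in for the Redis index; `PageTask`, `loadPage`, and the other names are hypothetical. Each message asks the consumer to load exactly one page, and the consumer enqueues the next page only if the current one was full, so the database never sees one huge query.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.function.BiFunction;

// Sketch of rebuilding a large index one page per message (hypothetical names).
class IndexBackfill {
    // One queue message: "load this page of ids into the index".
    static final class PageTask {
        final String indexKey;
        final int page;
        final int pageSize;
        PageTask(String indexKey, int page, int pageSize) {
            this.indexKey = indexKey; this.page = page; this.pageSize = pageSize;
        }
    }

    private final Queue<PageTask> queue = new ArrayDeque<>();            // stand-in for the MQ
    private final Map<String, List<Long>> index = new HashMap<>();        // stand-in for the Redis index
    private final BiFunction<Integer, Integer, List<Long>> loadPage;      // DB query: (offset, limit) -> ids

    IndexBackfill(BiFunction<Integer, Integer, List<Long>> loadPage) {
        this.loadPage = loadPage;
    }

    // Called when the index is found to be missing or expired.
    void requestRebuild(String indexKey, int pageSize) {
        index.remove(indexKey);
        queue.add(new PageTask(indexKey, 0, pageSize));
    }

    // Consumer: handles one message, which costs the DB a single bounded query.
    boolean consumeOne() {
        PageTask t = queue.poll();
        if (t == null) return false;
        List<Long> ids = loadPage.apply(t.page * t.pageSize, t.pageSize);
        index.computeIfAbsent(t.indexKey, k -> new ArrayList<>()).addAll(ids);
        if (ids.size() == t.pageSize) {
            queue.add(new PageTask(t.indexKey, t.page + 1, t.pageSize)); // a full page means more may follow
        }
        return true;
    }

    List<Long> indexFor(String indexKey) {
        return index.getOrDefault(indexKey, List.of());
    }
}
```

In a real deployment the consumer rate would also be bounded, which is what keeps the rebuild from competing with normal traffic.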

Backfill via message queue

4. Consistency Problems

Typical cache‑DB consistency flow: after a DB write, the service deletes the related cache key; a read‑only service then experiences a cache miss, reads from the DB, and writes the fresh value back to the cache.
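The flow just described can be written down as a small sketch. Two in-memory maps stand in for the database and the cache, and the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the delete-on-write, backfill-on-miss flow (hypothetical names).
class CacheAside {
    final Map<String, String> db = new ConcurrentHashMap<>();    // stand-in for the database
    final Map<String, String> cache = new ConcurrentHashMap<>(); // stand-in for Redis or Memcached

    // Write path: update the DB first, then delete the cache key.
    void write(String key, String value) {
        db.put(key, value);
        cache.remove(key);
    }

    // Read path: on a miss, read from the DB and write the value back to the cache.
    String read(String key) {
        String value = cache.get(key);
        if (value != null) return value;
        value = db.get(key);
        if (value != null) cache.put(key, value);
        return value;
    }
}
```

The gap between reading from the DB and writing back to the cache in `read` is not protected by anything, and that gap is where the consistency problems in this section come from.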

Cache invalidation flow

Concurrent reads and writes can cause stale data to be written back to the cache, leading to inconsistency.

4.1 Concurrent Read/Write Inconsistency

Concurrent read/write diagram

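The typical interleaving is as follows: a reader misses the cache and reads the old value from the DB; before it writes that value back, a writer updates the DB and deletes the cache key; the reader then writes the old value into the cache. The cache now holds stale data until the key is deleted again or expires.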
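The race is easier to see when replayed step by step in a single thread. The sketch below uses two plain maps as stand-ins for the DB and the cache; the key `user:1` and the values are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Deterministic replay of the reader/writer interleaving (illustrative only).
class StaleWriteBack {
    static String replay() {
        Map<String, String> db = new HashMap<>();
        Map<String, String> cache = new HashMap<>();
        db.put("user:1", "v1");

        // Reader: cache miss, so it reads the current value from the DB.
        String readerCopy = cache.containsKey("user:1") ? cache.get("user:1") : db.get("user:1");

        // Writer: updates the DB, then deletes the cache key (which is already absent).
        db.put("user:1", "v2");
        cache.remove("user:1");

        // Reader: resumes and writes back the value it read earlier.
        cache.put("user:1", readerCopy);

        return "db=" + db.get("user:1") + ", cache=" + cache.get("user:1");
    }
}
```

`StaleWriteBack.replay()` returns `db=v2, cache=v1`: the delete happened, but it happened before the stale write-back, so it had no effect.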

4.2 Master‑Slave Sync Delay

Master‑slave delay diagram

If reads go to a replica that lags behind the master, a cache miss shortly after a write may read the old value from the replica and write it back to the cache, even though the cache key was deleted correctly.

4.3 Cache Pollution

Cache pollution flow

Changing the cache schema (e.g., adding a new field) without versioning can cause pre‑release or gray‑release data to pollute production cache.

5. How to Deal with Consistency Issues

5.1 Binlog + Message Queue + Consumer Delete Cache

Binlog‑based cache invalidation

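Each table change generates a binlog event, which Canal captures and pushes to a message queue. Consumers then delete the corresponding cache keys. Because binlog events follow commit order, invalidation happens in the same order as the writes, provided the queue and its consumers preserve that order.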
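The consumer side can be quite small. The sketch below assumes a simplified, hypothetical `RowChange` event carrying only the table name and primary key (real Canal messages carry much more), and a `table:id` cache key convention that is also an assumption of this example.

```java
import java.util.function.Consumer;

// Sketch of a binlog-driven cache invalidator (hypothetical event type and key format).
class BinlogInvalidator {
    // Simplified change event: which table and which row changed.
    static final class RowChange {
        final String table;
        final long id;
        RowChange(String table, long id) { this.table = table; this.id = id; }
    }

    private final Consumer<String> deleteFromCache; // e.g. a wrapper around the cache client's delete

    BinlogInvalidator(Consumer<String> deleteFromCache) {
        this.deleteFromCache = deleteFromCache;
    }

    // Maps a changed row to the cache key that holds it.
    static String cacheKey(RowChange change) {
        return change.table + ":" + change.id;
    }

    // Called once per message taken from the queue.
    void onMessage(RowChange change) {
        deleteFromCache.accept(cacheKey(change));
    }
}
```

Since deleting a key is idempotent, redelivered messages are harmless, which makes this consumer easy to retry.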

5.2 Subscribing to the Slave's Binlog

Slave‑side binlog

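Subscribing to the slave's binlog means the cache key is deleted only after the slave has applied the change, so a cache miss that reads from that slave no longer backfills stale data. With multiple slaves, each has its own binlog stream and its own replication lag, so this setup requires careful handling.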

5.3 Key Version Upgrade

When a cache structure changes, append a version suffix (e.g., _v2) to the key to avoid mixing old and new formats.
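A minimal sketch of key versioning follows. The class name, the `user:` prefix, and the `v2` constant are illustrative assumptions; the point is that the version lives in one place and every key is built through the same helper.

```java
// Sketch of versioned cache keys (hypothetical names and key format).
class CacheKeys {
    // Bump this whenever the cached structure for users changes shape.
    static final String USER_SCHEMA_VERSION = "v2";

    static String userKey(long id) {
        return "user:" + id + "_" + USER_SCHEMA_VERSION;
    }
}
```

Code that still runs the old release keeps reading and writing the old keys, so the two formats never share an entry. The old keys then age out through their TTL.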

6. Hit‑Rate Problems

Frequent data changes generate many binlog messages, causing rapid cache invalidation and a drop in hit rate. One mitigation is to let the binlog consumer directly refresh the cache instead of deleting the key, ensuring that the latest data is always available.

To preserve order in a multithreaded consumer, group messages by a stable identifier (e.g., id) and process each group in a single thread.
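One way to implement this grouping is a set of single-threaded lanes, with each id always routed to the same lane. The sketch below uses the JDK's `ExecutorService`; the class name `KeyedExecutor` and its methods are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch: per-id ordering on top of a multithreaded consumer (hypothetical class).
class KeyedExecutor {
    private final ExecutorService[] lanes;

    KeyedExecutor(int laneCount) {
        lanes = new ExecutorService[laneCount];
        for (int i = 0; i < laneCount; i++) lanes[i] = Executors.newSingleThreadExecutor();
    }

    // The same id always maps to the same single-threaded lane, so tasks for
    // one id run in submission order while different ids still run in parallel.
    void submit(long id, Runnable task) {
        int lane = (int) Math.floorMod(id, (long) lanes.length);
        lanes[lane].execute(task);
    }

    // Stops accepting tasks and waits for the queued ones to finish.
    void shutdownAndWait() {
        for (ExecutorService lane : lanes) lane.shutdown();
        try {
            for (ExecutorService lane : lanes) lane.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

This only preserves the order in which messages reach the consumer, so the queue itself must also deliver messages for one id in order, for example by using the id as the partition key.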

7. Cache Penetration

7.1 What Is It?

Requests with parameters that do not exist in the database repeatedly hit the DB because the cache returns a miss each time.

7.2 Mitigation

Cache empty results with a short TTL, and optionally filter obviously invalid IDs (e.g., IDs far beyond the expected range) before querying the cache.
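Both mitigations fit in one read path, sketched below with an in-memory map as the cache. The names, the `maxValidId` bound, and the explicit `nowMs` parameter (passed in so the behavior is easy to test) are assumptions of this example.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongFunction;

// Sketch: cache "not found" results briefly and reject obviously invalid ids (hypothetical names).
class NullCachingReader {
    private static final class Entry {
        final String value;     // null means "known to be absent from the DB"
        final long expiresAtMs;
        Entry(String value, long expiresAtMs) { this.value = value; this.expiresAtMs = expiresAtMs; }
    }

    private final Map<Long, Entry> cache = new ConcurrentHashMap<>();
    private final LongFunction<String> loadFromDb; // returns null when the row does not exist
    private final long maxValidId;
    private final long ttlMs;
    private final long emptyTtlMs;

    NullCachingReader(LongFunction<String> loadFromDb, long maxValidId, long ttlMs, long emptyTtlMs) {
        this.loadFromDb = loadFromDb;
        this.maxValidId = maxValidId;
        this.ttlMs = ttlMs;
        this.emptyTtlMs = emptyTtlMs;
    }

    Optional<String> get(long id, long nowMs) {
        // Obviously invalid ids never reach the cache or the DB.
        if (id <= 0 || id > maxValidId) return Optional.empty();

        Entry e = cache.get(id);
        if (e != null && e.expiresAtMs > nowMs) return Optional.ofNullable(e.value);

        String value = loadFromDb.apply(id);
        // Empty results get a much shorter TTL so a row created later shows up quickly.
        long ttl = (value == null) ? emptyTtlMs : ttlMs;
        cache.put(id, new Entry(value, nowMs + ttl));
        return Optional.ofNullable(value);
    }
}
```

The short TTL is a trade-off: a longer one protects the DB better, but delays the visibility of rows created after the miss was cached.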

8. Cache Breakdown (Stampede)

8.1 What Is It?

When a hot key expires, a flood of concurrent requests bypass the cache and hit the DB, overwhelming it.

8.2 Solutions

Add a mutex inside the backfill method to allow only one thread to rebuild the cache.

Use a distributed lock for the same purpose across multiple service instances, acknowledging the added complexity.
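For the single-instance case, the mutex can be a per-key lock with a second cache check inside it, as in the sketch below. The class and field names are hypothetical, and the in-memory map stands in for the remote cache.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch: only one thread per key rebuilds the cache entry (hypothetical names).
class SingleRebuildCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Map<String, Object> locks = new ConcurrentHashMap<>();
    private final Function<String, String> loadFromDb;

    SingleRebuildCache(Function<String, String> loadFromDb) {
        this.loadFromDb = loadFromDb;
    }

    String get(String key) {
        String value = cache.get(key);
        if (value != null) return value;

        // One lock object per key, so rebuilding one key does not block other keys.
        // Note: this map is never cleaned up here; a real version would bound it.
        Object lock = locks.computeIfAbsent(key, k -> new Object());
        synchronized (lock) {
            value = cache.get(key);            // re-check: another thread may have rebuilt it
            if (value == null) {
                value = loadFromDb.apply(key); // only the first thread reaches the DB
                if (value != null) cache.put(key, value);
            }
        }
        return value;
    }
}
```

Across multiple service instances, the same double-check structure applies, with the `synchronized` block replaced by a distributed lock, for example one acquired with Redis `SET key value NX PX <ttl>`. Threads that fail to get the lock can wait briefly and re-read the cache.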

9. Cache Avalanche

9.1 What Is It?

A large number of keys expire at the same time, or cache nodes become unavailable, so requests fall through to the DB in bulk and overload it.

9.2 Prevention

Configure high‑availability for both Memcached and Redis (e.g., Redis Cluster with master‑slave).

Use consistent hashing with virtual nodes to reduce the impact of scaling events.

Apply back‑source rate limiting so that when the cache is down, DB traffic is throttled.
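A simple form of back-source throttling caps how many DB loads may run at once. The sketch below does this with a JDK `Semaphore`; strictly speaking it limits concurrency rather than requests per second, for which a token bucket would be the usual choice. The class and method names are hypothetical.

```java
import java.util.Optional;
import java.util.concurrent.Semaphore;
import java.util.function.Function;

// Sketch: cap concurrent cache-miss loads against the DB (hypothetical names).
class BackSourceLimiter {
    private final Semaphore permits;
    private final Function<String, String> loadFromDb;

    BackSourceLimiter(int maxConcurrentLoads, Function<String, String> loadFromDb) {
        this.permits = new Semaphore(maxConcurrentLoads);
        this.loadFromDb = loadFromDb;
    }

    // Returns empty when the limit is reached, so the caller can degrade
    // (serve a default, a stale copy, or an error) instead of queuing on the DB.
    Optional<String> load(String key) {
        if (!permits.tryAcquire()) return Optional.empty();
        try {
            return Optional.ofNullable(loadFromDb.apply(key));
        } finally {
            permits.release();
        }
    }
}
```

Rejecting immediately instead of blocking is a deliberate choice here: when the cache is down, queued requests would only pile up and time out.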

10. Hot‑Key Problem

10.1 What Is It?

A single hot key concentrates traffic on one cache node, potentially causing a node failure.

10.2 Mitigation

Maintain multiple replicas of the hot key across different cache nodes and route requests randomly among them.

Introduce a short‑lived local cache layer for the hot key to absorb spikes.
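The replica approach can be sketched as a pair of key helpers. The `#<n>` suffix and the names are assumptions of this example, and the local cache layer is not shown. Whether the suffixed keys actually land on different nodes depends on the hashing scheme in use; with Redis Cluster, for instance, the suffix must not sit outside a shared hash tag, or all copies would map to the same slot.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Sketch: spread one hot key over several suffixed copies (hypothetical key format).
class HotKeyReplicas {
    // Write path: the value must be written to (or deleted from) every copy.
    static List<String> replicaKeys(String key, int replicas) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < replicas; i++) keys.add(key + "#" + i);
        return keys;
    }

    // Read path: each request picks one copy at random.
    static String pickReplica(String key, int replicas) {
        return key + "#" + ThreadLocalRandom.current().nextInt(replicas);
    }
}
```

The cost is on the write side: every update or invalidation has to touch all copies, so this is worth doing only for keys that are known to be hot.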

Hot‑key mitigation