How to Prevent Redis Cache Avalanche and Penetration Failures
The article explains what cache avalanche and cache penetration are in Redis, why they can crash databases, and provides practical strategies—including random expiration, high‑availability setups, local caches, rate limiting, Bloom filters, and empty‑object caching—to mitigate these issues and maintain cache‑database consistency.
Cache Avalanche
A cache avalanche occurs when a large number of cached keys expire at the same time or the Redis service becomes unavailable, so that all incoming requests fall through to the underlying database. The sudden surge of traffic can overwhelm the database and cause a service outage.
Mitigation techniques:
Assign each key a random offset (e.g., expireTime = baseTTL + random(0, maxJitter)) so that expirations are staggered.
Deploy Redis in a high‑availability configuration before a failure occurs:
Master‑slave replication with Sentinel for automatic failover.
Redis Cluster for sharding and fault tolerance.
When Redis is down, serve reads from a local in‑process cache (e.g., Ehcache) and apply rate limiting or circuit breaking (e.g., Hystrix) so that only a bounded number of requests reach the database.
Enable Redis persistence (RDB/AOF). After a restart, Redis reloads data from disk, allowing rapid cache recovery.
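The random-offset idea from the first technique can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation: the constants BASE_TTL_SECONDS and MAX_JITTER_SECONDS and the helper name jittered_ttl are assumptions chosen for the example.

```python
import random

# Assumed values for illustration; tune them to your own workload.
BASE_TTL_SECONDS = 600
MAX_JITTER_SECONDS = 120


def jittered_ttl(base_ttl=BASE_TTL_SECONDS, max_jitter=MAX_JITTER_SECONDS):
    """Return base_ttl plus a random integer offset in [0, max_jitter]."""
    return base_ttl + random.randint(0, max_jitter)


# Keys loaded in the same batch get expirations spread over a
# two-minute window instead of all expiring at the same instant.
ttls = [jittered_ttl() for _ in range(1000)]
```

With the redis-py client, the result would typically be passed as the expiry argument, for example r.set(key, value, ex=jittered_ttl()), where r is an assumed, already connected client.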
Cache Penetration
Cache penetration happens when requests query data that exists in neither the cache nor the database. Because the lookup always misses and the application does not cache the “null” result, every such request hits the database; under heavy or malicious traffic this overloads the database much as an avalanche does.
Mitigation techniques:
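Place a probabilistic filter such as a Bloom filter in front of the cache, populated with all valid keys. A Bloom filter produces no false negatives, so a key it reports as absent definitely does not exist and can be rejected before it reaches Redis or the database; a small fraction of nonexistent keys may still pass as false positives.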
When a database lookup returns no record, write a short‑lived placeholder (e.g., an empty object) into Redis so that subsequent identical requests are served from the cache. The placeholder’s TTL should be relatively small (seconds to minutes) to avoid long‑term staleness.
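Both techniques can be combined in one read path. The following is a rough sketch under stated assumptions: plain dictionaries stand in for Redis and the database, and BloomFilter, NULL_PLACEHOLDER, NULL_TTL, DEFAULT_TTL, and stats are hypothetical names introduced only for this example. A production system would more likely use an existing Bloom filter implementation than this hand-rolled one.

```python
import hashlib
import time


class BloomFilter:
    """Tiny Bloom filter: no false negatives, rare false positives."""

    def __init__(self, size_bits=1 << 16, num_hashes=4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


NULL_PLACEHOLDER = "__NULL__"  # assumed marker for "no such record"
NULL_TTL = 60                  # short TTL for placeholders, in seconds
DEFAULT_TTL = 600              # TTL for real values, in seconds

cache = {}                          # stand-in for Redis: key -> (value, expires_at)
database = {"user:1": "alice"}      # stand-in for the real database
stats = {"db_queries": 0}           # counts how often the database is hit

bloom = BloomFilter()
bloom.add("user:1")
bloom.add("user:2")  # e.g. a record that was valid once but has been deleted


def get(key, now=None):
    now = time.time() if now is None else now
    # 1. Reject keys the Bloom filter has definitely never seen.
    if not bloom.might_contain(key):
        return None
    # 2. Serve from the cache, translating the placeholder back to None.
    entry = cache.get(key)
    if entry is not None and entry[1] > now:
        return None if entry[0] == NULL_PLACEHOLDER else entry[0]
    # 3. Fall back to the database and cache the outcome, even if empty.
    stats["db_queries"] += 1
    value = database.get(key)
    if value is None:
        cache[key] = (NULL_PLACEHOLDER, now + NULL_TTL)
    else:
        cache[key] = (value, now + DEFAULT_TTL)
    return value
```

In this sketch, repeated lookups of "user:2" reach the database only once per NULL_TTL window, while keys that were never added to the filter are normally rejected before touching the cache at all.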
Cache‑Database Dual‑Write Consistency
Read flow:
function get(key):
    value = redis.get(key)
    if value is not None:
        return value
    value = db.query(key)
    if value is not None:
        redis.set(key, value, ttl=DEFAULT_TTL)
    return value
Consistency problem: When the application updates data, the write may be applied to the database but not to the cache (or vice‑versa), leading to stale reads.
Strategies to reduce inconsistency windows:
Use write‑through (each write updates the cache and is synchronously propagated to the database) or write‑behind (writes go to the cache and are flushed to the database asynchronously, at the risk of losing unflushed writes if the cache fails).
Explicitly invalidate or delete the cache entry immediately after a successful database update.
Attach a version number or timestamp to cached objects; on read, compare the version with the database and refresh the cache if they differ.
Set an appropriate TTL for cached entries so that even if an invalidation is missed, the stale entry expires and is refreshed from the database.
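The invalidate-after-update strategy, together with a version counter on each record, might look like the sketch below. Dictionaries stand in for Redis and the database, and the names cache_store, db_rows, read, and update are assumptions made for illustration rather than part of any real client API.

```python
cache_store = {}  # stand-in for Redis
db_rows = {"item:1": {"value": "old", "version": 1}}  # stand-in for the database


def read(key):
    """Cache-aside read: try the cache, then the database."""
    cached = cache_store.get(key)
    if cached is not None:
        return cached
    row = db_rows.get(key)
    if row is not None:
        # Store a copy so later database changes do not leak into the cache.
        cache_store[key] = dict(row)
        return cache_store[key]
    return None


def update(key, new_value):
    """Write to the database first, then delete the cached entry."""
    current = db_rows.get(key, {"value": None, "version": 0})
    db_rows[key] = {"value": new_value, "version": current["version"] + 1}
    # Invalidate only after the database write has succeeded; the next
    # read repopulates the cache with the new version.
    cache_store.pop(key, None)
```

Deleting the entry rather than rewriting it keeps the update path simple. A real deployment would still set a TTL on cached entries, so that a missed invalidation eventually expires on its own.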
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
