How to Prevent Cache Penetration, Avalanche, and Breakdown in High‑Traffic Systems
This article explains the three critical cache failure scenarios—cache penetration, cache avalanche, and cache breakdown—detailing their causes, real‑world impacts on high‑concurrency systems, and practical mitigation techniques using Redis and general backend strategies.
Background
Cache is a core component in modern software architectures. It is used to accelerate response time, increase concurrency, and reduce database load.
This article focuses on three failure scenarios that can invalidate a cache: cache penetration, cache avalanche, and cache breakdown (stampede).
The Core Premise
The goal of caching is to serve requests from the fast in‑memory layer. When the cache cannot fulfil this role, traffic floods the database, potentially crashing the system under high concurrency.
Cache Penetration
Cache penetration occurs when a request queries a key that exists in neither the cache nor the underlying database. Because the missing key is never cached, every such request passes straight through to the database, creating excessive load and opening the door to denial‑of‑service attacks.
Typical situations:
Data was deleted from both cache and database, but upstream services still request it.
Malicious actors repeatedly request non‑existent keys.
Solution 1: Cache Null or Default Values
When a database query returns no rows, store a placeholder (e.g., null or a sentinel value) in the cache with a short TTL (commonly 2–5 minutes). Refresh the placeholder when the real data appears to avoid stale nulls.
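A minimal sketch of this pattern, using an in‑memory dict to stand in for a cache such as Redis and another dict for the database; the sentinel value, TTLs, and `get_user` helper are illustrative names, not from the article:

```python
import time

NULL_SENTINEL = "__NULL__"   # placeholder marking "known missing"
NULL_TTL = 120               # short TTL for cached misses (seconds)
DATA_TTL = 3600              # normal TTL for real data

cache = {}                   # key -> (value, expires_at)
db = {"user:1": {"name": "Alice"}}

def cache_get(key):
    entry = cache.get(key)
    if entry and entry[1] > time.time():
        return entry[0]
    return None

def cache_set(key, value, ttl):
    cache[key] = (value, time.time() + ttl)

def get_user(key):
    cached = cache_get(key)
    if cached is not None:
        # A cached sentinel means "known missing": skip the database entirely.
        return None if cached == NULL_SENTINEL else cached
    row = db.get(key)
    if row is None:
        cache_set(key, NULL_SENTINEL, NULL_TTL)   # cache the miss briefly
        return None
    cache_set(key, row, DATA_TTL)
    return row
```

The short TTL on the sentinel bounds how long a stale "missing" answer can survive after the real data appears.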
Solution 2: Pre‑validation in Business Logic
Validate request parameters early (e.g., reject negative ages, malformed IDs) so illegal requests never reach the cache layer. This reduces unnecessary cache lookups and database queries.
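As a sketch, a hypothetical validator that rejects obviously malformed IDs before any cache or database lookup; the pattern and function name are illustrative:

```python
import re

# Accept only positive integer IDs up to 10 digits; everything else is
# rejected before it can trigger a cache lookup or database query.
ID_PATTERN = re.compile(r"^[1-9]\d{0,9}$")

def is_valid_user_id(raw: str) -> bool:
    return bool(ID_PATTERN.match(raw))
```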
Solution 3: Bloom Filter Whitelist
Maintain a Bloom filter that contains all valid primary keys. Before querying the cache, check the filter; if the key is definitely absent, return an empty response without touching the database. Choose a false‑positive rate (e.g., 1%) and allocate enough bits to keep the filter size reasonable.
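A minimal Bloom filter sketch to illustrate the idea; the bit count and hash count here are illustrative, not tuned to a target false‑positive rate (in production you would size them from the expected key count and use a library or Redis module):

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter sketch backed by a Python int as a bit set."""

    def __init__(self, size_bits=8192, num_hashes=4):
        self.size = size_bits
        self.k = num_hashes
        self.bits = 0

    def _positions(self, key: str):
        # Double hashing: derive k bit positions from two independent digests.
        h1 = int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")
        h2 = int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")
        return [(h1 + i * h2) % self.size for i in range(self.k)]

    def add(self, key: str):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key: str) -> bool:
        # False means definitely absent; True may be a false positive.
        return all(self.bits & (1 << pos) for pos in self._positions(key))
```

A "definitely absent" answer lets the service return empty immediately; a "might contain" answer proceeds to the normal cache‑then‑database path.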
Solution 4: User or IP Blacklist
Detect abusive patterns (high request rate for non‑existent keys) and temporarily block the offending user, IP, or client identifier. Integrate with rate‑limiting middleware to automate the blacklist update.
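One way to sketch the detection side is a sliding‑window counter of misses per client; the thresholds and function names below are assumptions for illustration:

```python
import time
from collections import defaultdict, deque

MISS_LIMIT = 5        # misses allowed per window before blocking
WINDOW_SECONDS = 60   # sliding window length
BLOCK_SECONDS = 300   # how long an offender stays blacklisted

miss_log = defaultdict(deque)   # client_id -> timestamps of recent misses
blocked_until = {}              # client_id -> time the block expires

def record_miss(client_id, now=None):
    now = now if now is not None else time.time()
    q = miss_log[client_id]
    q.append(now)
    # Drop timestamps that fell out of the sliding window.
    while q and q[0] < now - WINDOW_SECONDS:
        q.popleft()
    if len(q) >= MISS_LIMIT:
        blocked_until[client_id] = now + BLOCK_SECONDS

def is_blocked(client_id, now=None):
    now = now if now is not None else time.time()
    return blocked_until.get(client_id, 0) > now
```

In practice the counters would live in Redis (e.g., per‑client keys with TTLs) so all application instances share the blacklist.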
Cache Avalanche
Cache avalanche happens when a large number of hot keys expire at the same moment (or the cache service crashes), causing a sudden surge of database queries that can overwhelm the database.
Typical scenarios:
Many hot keys share the same expiration timestamp.
The cache server becomes unavailable or unresponsive.
Mitigation strategies:
Add a random jitter to each key’s TTL (e.g., TTL = base + random(0, 300) seconds) so expirations are spread over time.
Serialize cache writes using a queue or a lightweight lock when updating hot keys; be aware this may reduce write throughput.
Keep truly hot data permanently cached and update it asynchronously when strict consistency is not required.
Employ a dual‑key pattern: a primary key with a short TTL and a secondary backup key that never expires; when the primary expires, serve the backup while the primary is refreshed.
Deploy a high‑availability cache cluster (e.g., Redis Sentinel or Cluster) to survive node failures.
When an avalanche is detected, trigger circuit breaking, rate limiting, and graceful degradation to protect the database.
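The first mitigation above, TTL jitter, is trivial to implement; this sketch uses illustrative base and jitter values matching the example in the list:

```python
import random

BASE_TTL = 3600    # base expiration in seconds
MAX_JITTER = 300   # random spread added on top

def jittered_ttl(base=BASE_TTL, max_jitter=MAX_JITTER):
    # Spreading expirations over a window prevents many hot keys,
    # all written at the same time, from expiring at the same moment.
    return base + random.randint(0, max_jitter)
```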
Cache Breakdown (Stampede)
Cache breakdown, also known as cache stampede, is a special case of avalanche where a single hot key expires. A burst of concurrent requests then all hit the database, risking overload.
Mitigation techniques:
Use a distributed mutex (e.g., Redis SETNX with an expiration) so only one request rebuilds the cache while others wait or read stale data.
Do not set an expiration for hot data; instead refresh it in the background (lazy or proactive refresh).
Implement a “pre‑emptive” lock: store an internal short‑lived marker inside the cached value. When the marker is near expiry, a background thread extends the marker and reloads the data from the database, preventing a sudden miss.
Conclusion
Cache penetration, avalanche, and breakdown share the root cause of cache unavailability, which forces traffic onto the database and can cause overload. Selecting the appropriate mitigation—null caching, request validation, Bloom filters, distributed locks, staggered TTLs, high‑availability clusters, and service‑level protections such as rate limiting and circuit breaking—helps maintain system stability under high concurrency.
Senior Brother's Insights
A public account focused on workplace, career growth, team management, and self-improvement. The author is the writer of books including 'SpringBoot Technology Insider' and 'Drools 8 Rule Engine: Core Technology and Practice'.