Cache Penetration, Cache Breakdown, and Cache Avalanche: Causes and Solutions
This article explains cache penetration, cache breakdown, and cache avalanche, describing their causes, differences, and practical mitigation strategies such as parameter validation, caching null keys, Bloom filters, rate limiting, hot‑key pre‑warming, and distributed locking, with Java code examples.
Cache Penetration
What is cache penetration?
Cache penetration occurs when a large number of requests use keys that are invalid – they exist neither in the cache nor in the database. These requests bypass the cache layer and hit the database directly, putting heavy pressure on it and potentially causing a crash.
Typical scenario: an attacker generates many illegal keys, causing all requests to miss both cache and database.
How to solve it?
1) Validate parameters early; reject illegal requests (e.g., negative IDs, malformed email formats) with an error response.
2) Cache null keys: if a key is missing from both the cache and the database, store a placeholder in Redis with a short TTL, e.g. SET key null EX 60. This works well for keys that change infrequently. Against malicious attacks that generate many distinct keys, keep the TTL short (e.g., 1 minute) so the placeholders do not pollute the cache.
Typical key format: table:column:primaryKeyName:primaryKeyValue
Java example:
public Object getObjectInclNullById(Integer id) {
    // Build the cache key using the format shown above
    String key = "table:user:id:" + id;
    // Get data from cache
    Object cacheValue = cache.get(key);
    // Cache miss
    if (cacheValue == null) {
        // Get from database
        Object storageValue = storage.get(id);
        // Cache the result (including null)
        cache.set(key, storageValue);
        // If the database returned null, set a short expiration to limit attack impact
        if (storageValue == null) {
            cache.expire(key, 60 * 5);
        }
        return storageValue;
    }
    return cacheValue;
}
3) Bloom Filter
A Bloom filter is a space‑efficient probabilistic data structure that can quickly test whether an element possibly exists in a large set. It uses a bit array and multiple hash functions. While it may produce false positives, it never yields false negatives.
By storing all legitimate keys in a Bloom filter, a request can be rejected early if the key is definitely not present, preventing unnecessary database hits.
Implementation steps:
Populate the Bloom filter with all valid keys.
When a request arrives, check the filter; if the key is absent, return an error immediately.
If the key is possibly present, proceed with normal cache‑database flow.
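The steps above can be sketched with a minimal Bloom filter built on java.util.BitSet. This is a hand-rolled illustration, not a specific library's API; the class name SimpleBloomFilter, the bit-array size, and the hash-probe count are all illustrative choices:

```java
import java.util.BitSet;

// Minimal Bloom filter: k hash probes over an m-bit array.
// May report false positives, never false negatives.
public class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    public SimpleBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Derive the i-th probe position from two base hashes
    private int position(String key, int i) {
        int h1 = key.hashCode();
        int h2 = Integer.rotateLeft(h1, 16) ^ 0x9E3779B9;
        return Math.floorMod(h1 + i * h2, size);
    }

    public void add(String key) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(position(key, i));
        }
    }

    // false => key is definitely absent (reject the request early);
    // true  => key is possibly present (fall through to cache/database)
    public boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(position(key, i))) {
                return false;
            }
        }
        return true;
    }
}
```

In production one would typically reach for a library implementation (e.g. Guava's BloomFilter or a Redis module) sized from the expected key count and acceptable false-positive rate.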
4) Interface Rate Limiting
Apply rate limiting per user or IP address, and optionally maintain a blacklist for abusive IPs.
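As a sketch of per-key rate limiting, here is a simple token-bucket limiter keyed by user ID or IP address. The class and method names are illustrative, not a specific library's API (Guava's RateLimiter or a gateway-level limiter would be the usual production choice):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Token bucket per key: each key may burst up to `capacity` requests,
// then is throttled to `refillPerSecond` sustained requests.
public class PerKeyRateLimiter {
    private static class Bucket {
        double tokens;
        long lastRefillNanos;
        Bucket(double tokens, long now) { this.tokens = tokens; this.lastRefillNanos = now; }
    }

    private final double capacity;
    private final double refillPerSecond;
    private final Map<String, Bucket> buckets = new ConcurrentHashMap<>();

    public PerKeyRateLimiter(double capacity, double refillPerSecond) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
    }

    public boolean tryAcquire(String key) {
        long now = System.nanoTime();
        Bucket b = buckets.computeIfAbsent(key, k -> new Bucket(capacity, now));
        synchronized (b) {
            // Refill tokens for the time elapsed since the last call
            double elapsedSec = (now - b.lastRefillNanos) / 1e9;
            b.tokens = Math.min(capacity, b.tokens + elapsedSec * refillPerSecond);
            b.lastRefillNanos = now;
            if (b.tokens >= 1.0) {
                b.tokens -= 1.0;
                return true;  // request allowed
            }
            return false;     // over the limit: reject, or add the IP to a blacklist
        }
    }
}
```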
Cache Breakdown (Cache Stampede)
What is cache breakdown?
Cache breakdown happens when a hot key (frequently accessed data) expires in the cache. The key still exists in the database, but because it is missing from the cache, a sudden surge of requests hits the database simultaneously, overwhelming it.
Solutions
Set hot data to never expire or give it a long TTL.
Pre‑warm hot data by loading it into the cache before it expires (e.g., during a flash‑sale).
Use a mutex/lock: before querying the database and repopulating the cache, acquire a lock so only one request performs the expensive operation.
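The mutex approach can be sketched with double-checked locking. A ConcurrentHashMap stands in for the cache and `dbLoader` is a hypothetical stand-in for the expensive database query; a real deployment would use per-key locks or a distributed lock (e.g. Redis SET NX) rather than one global lock:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

// Only one thread rebuilds an expired hot key; the others wait on the
// lock and then re-read the value the winner wrote.
public class MutexCache {
    private final Map<String, Object> cache = new ConcurrentHashMap<>();
    private final ReentrantLock lock = new ReentrantLock();

    // Assumes dbLoader returns a non-null value; null results would need
    // the null-key caching technique described earlier.
    public Object get(String key, Function<String, Object> dbLoader) {
        Object value = cache.get(key);
        if (value != null) {
            return value; // cache hit, no lock needed
        }
        lock.lock();
        try {
            // Re-check: another thread may have repopulated the key
            // while we were waiting for the lock.
            value = cache.get(key);
            if (value == null) {
                value = dbLoader.apply(key); // expensive database query
                cache.put(key, value);
            }
            return value;
        } finally {
            lock.unlock();
        }
    }
}
```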
Cache Avalanche
What is cache avalanche?
A cache avalanche occurs when a large portion or all of the cached data expires at the same time, causing a massive wave of requests to hit the database simultaneously, similar to an avalanche of snow.
It can also be triggered by a cache service outage.
Solutions
For Redis service unavailability:
Deploy a Redis cluster to avoid single‑node failures.
Apply rate limiting to smooth traffic spikes.
Use multi‑level caching (e.g., local cache + Redis) so that a fallback cache is available.
For hot‑key expiration:
Assign varied or random TTLs to keys.
Avoid setting keys to never expire unless absolutely necessary.
Pre‑warm caches by loading hot data at startup or via scheduled jobs.
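The random-TTL idea fits in a few lines; the one-hour base TTL and five-minute jitter window below are illustrative values:

```java
import java.util.concurrent.ThreadLocalRandom;

// Spread expirations so that keys written in the same batch do not
// all expire at the same moment: base TTL plus a random jitter.
public class TtlJitter {
    public static int ttlWithJitter(int baseSeconds, int maxJitterSeconds) {
        return baseSeconds + ThreadLocalRandom.current().nextInt(maxJitterSeconds + 1);
    }
}
```

Usage would look like `cache.expire(key, TtlJitter.ttlWithJitter(3600, 300))`, yielding a TTL somewhere in [3600, 3900] seconds.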
Common pre‑warming methods:
Scheduled tasks (e.g., xxl‑job) that query hot data from the database and write it to the cache.
Message queues (e.g., Kafka) that push primary keys of hot data; cache services consume the messages and refresh the cache.
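The scheduled-task variant can be sketched with a ScheduledExecutorService standing in for a job framework like xxl-job. The hot-key list, the `loadFromDb` stub, and the 30-minute period are all illustrative assumptions:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Periodically reloads hot keys into the cache before they expire.
public class CachePreWarmer {
    private final Map<String, Object> cache = new ConcurrentHashMap<>();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    // Reload every hot key from the database into the cache.
    public void refresh(List<String> hotKeys) {
        for (String key : hotKeys) {
            cache.put(key, loadFromDb(key));
        }
    }

    // Run a refresh immediately at startup, then every 30 minutes.
    public void start(List<String> hotKeys) {
        scheduler.scheduleAtFixedRate(() -> refresh(hotKeys), 0, 30, TimeUnit.MINUTES);
    }

    // Stand-in for a real database query.
    private Object loadFromDb(String key) {
        return "db-value-for-" + key;
    }

    public Object get(String key) {
        return cache.get(key);
    }
}
```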
Key differences:
Cache penetration: key missing in both cache and database.
Cache breakdown: key exists in database but is missing in cache (hot data).
Cache avalanche: massive or total cache expiration causing a flood of database requests.
IT Services Circle