How to Stop Cache Penetration: 4 Effective Solutions with Java Example
This article explains what cache penetration is, why it occurs, and presents four practical mitigation techniques—including storing empty values, using Bloom filters, setting appropriate expiration times, and preloading hot data—complete with Java code examples.
What Is Cache Penetration
Cache penetration occurs when malicious or erroneous queries repeatedly request data that does not exist in the cache, forcing the system to query the underlying database each time and potentially causing overload or downtime.
Causes of Cache Penetration
Typical reasons include:
Malicious queries : Attackers deliberately request non‑existent keys to waste resources.
High query frequency : A sudden burst of requests for rarely accessed data can bypass the cache.
Rapid data changes : Long cache TTL combined with frequent updates leads to stale entries that must be refreshed.
Random key queries : Systems that generate random keys may often miss cached entries.
Solutions to Cache Penetration
1. Cache empty values or error markers
Store a placeholder (e.g., a special string) for missing data so subsequent requests hit the cache and avoid database lookups.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
public class CacheManager {
private Map<String, String> cache = new ConcurrentHashMap<>();
public String getUserInfo(String userId) {
String cachedData = cache.get(userId);
if (cachedData != null) {
if (cachedData.equals("null")) {
// Return empty value, avoid repeated query
return null;
} else {
return cachedData;
}
}
// Data not in cache, query database
String userData = queryDatabase(userId);
if (userData == null) {
// User does not exist, cache empty marker with short TTL
cache.put(userId, "null");
} else {
cache.put(userId, userData);
}
return userData;
}
private String queryDatabase(String userId) {
// Simulate database query
return null; // null indicates user does not exist
}
}2. Use a Bloom filter
A Bloom filter quickly checks whether a key might exist in the cache, filtering out impossible queries before they reach the database.
The filter works in three steps:
Three hash functions map an element to bits in a bit array, setting those bits to 1.
When checking an element, if all three bits are 1, the filter reports the element may exist.
If any bit is 0, the filter reports the element definitely does not exist.
3. Set appropriate cache expiration times
Adjust TTL based on data access patterns: short TTL for hot data, longer TTL for cold data, to reduce stale entries that cause penetration.
4. Preload hot data
Load frequently accessed data into the cache ahead of time, using periodic refresh or cache warm‑up techniques, to lower database load.
Combining these methods effectively mitigates cache penetration, improves system performance, and reduces database pressure.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Mike Chen's Internet Architecture
Over ten years of BAT architecture experience, shared generously!
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
