Understanding Cache Penetration, Breakdown, and Avalanche and Their Solutions with Redis and Bloom Filters
The article explains why high‑traffic scenarios cause cache penetration, breakdown, and avalanche, and demonstrates practical solutions using Redis caching, empty‑object handling, Bloom filters, and locking techniques with detailed Java code examples.
In everyday development, databases handle data storage and work fine under normal load. But when a large amount of data is accessed in a short time—during flash sales or sudden spikes in homepage traffic—the disk-oriented read/write speed becomes a serious bottleneck and can cause database crashes and service downtime.
To mitigate this, projects often introduce NoSQL technologies like in‑memory databases with persistence, among which Redis is a popular choice that greatly improves query performance.
However, caching introduces consistency challenges; if strict data consistency is required, caching may not be suitable.
Typical cache problems include cache penetration, cache breakdown, and cache avalanche. The article provides practical code-based solutions for each.
Cache Penetration
Cache penetration occurs when a request queries data that is absent in both the cache and the database, causing repeated database hits. Two solutions are presented:
Cache empty objects – simple but less effective.
Use a Bloom filter – more complex to implement but highly effective.
Cache Empty Object
When a request finds no data, the system stores a placeholder (empty object) in the cache to avoid subsequent database queries. The implementation code is:
public class UserServiceImpl {

    @Autowired
    UserDAO userDAO;
    @Autowired
    RedisCache redisCache;

    public User findUser(Integer id) {
        Object object = redisCache.get(Integer.toString(id));
        // Cache hit
        if (object != null) {
            if (object instanceof NullValueResultDO) {
                // A cached placeholder means the DB has no such record
                return null;
            }
            return (User) object;
        } else {
            // Cache miss: query the DB
            User user = userDAO.getUser(id);
            if (user != null) {
                redisCache.put(Integer.toString(id), user);
            } else {
                // Store an empty object with a short TTL (60 seconds)
                redisCache.put(Integer.toString(id), new NullValueResultDO(), 60);
            }
            return user;
        }
    }
}

Bloom Filter
A Bloom filter is a probabilistic data structure that quickly determines whether an element may exist in a set. It offers fast queries and low memory usage, at the cost of a false-positive rate and the inability to delete elements easily.
The core ideas are a large binary bit array and multiple hash functions: each hash function maps an element to a bit position, and all of those bits are set to 1. The filter can never produce a false negative, but it may produce false positives.
Implementation code for a simple Bloom filter:
import java.util.BitSet;

public class MyBloomFilter {

    private static final int SIZE = 2 << 10;                   // bit array size (2048 bits)
    private static final int[] num = {5, 19, 23, 31, 47, 71};  // hash seeds

    private BitSet bits = new BitSet(SIZE);
    private MyHash[] function = new MyHash[num.length];

    public MyBloomFilter() {
        for (int i = 0; i < num.length; i++) {
            function[i] = new MyHash(SIZE, num[i]);
        }
    }

    public void add(String value) {
        // Set the bit chosen by every hash function
        for (MyHash f : function) {
            bits.set(f.hash(value), true);
        }
    }

    public boolean contains(String value) {
        if (value == null) return false;
        // The element may exist only if every hashed bit is set
        boolean result = true;
        for (MyHash f : function) {
            result = result && bits.get(f.hash(value));
        }
        return result;
    }
}

class MyHash {

    private int cap;
    private int seed;

    public MyHash(int cap, int seed) { this.cap = cap; this.seed = seed; }

    public int hash(String value) {
        int result = 0;
        for (int i = 0; i < value.length(); i++) {
            result = seed * result + value.charAt(i);
        }
        // cap must be a power of two for this mask to act as a modulo
        return (cap - 1) & result;
    }
}

In production, libraries such as Google Guava provide ready-made Bloom filter implementations:
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>27.0.1-jre</version>
</dependency>

Cache Breakdown
Cache breakdown happens when a hot key expires and a massive number of concurrent requests directly hit the database, overwhelming it. The common solution is to add a lock around the cache‑miss path so that only the first request loads data from the DB and populates the cache.
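To make the pattern concrete, here is a minimal, self-contained sketch of the lock-plus-double-check rebuild. A ConcurrentHashMap stands in for Redis and a counter simulates the expensive database query; the class and method names are illustrative, not from the article's codebase:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class HotKeyLoader {

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final AtomicInteger dbCalls = new AtomicInteger();

    private String loadFromDb(String key) {
        dbCalls.incrementAndGet();           // simulate the expensive DB query
        return "value-for-" + key;
    }

    public String get(String key) {
        String value = cache.get(key);
        if (value != null) return value;     // fast path: cache hit
        synchronized (this) {
            value = cache.get(key);          // double check: another thread may
            if (value != null) return value; // have rebuilt the entry meanwhile
            value = loadFromDb(key);
            cache.put(key, value);
            return value;
        }
    }

    public int dbCallCount() { return dbCalls.get(); }

    public static void main(String[] args) throws InterruptedException {
        HotKeyLoader loader = new HotKeyLoader();
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> loader.get("hot-key"));
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        // Only one thread reaches the "database", no matter how many race
        System.out.println("DB calls: " + loader.dbCallCount()); // prints: DB calls: 1
    }
}
```

The double check after acquiring the lock is what prevents every waiting thread from re-querying the database once the first one has repopulated the cache.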
Example of a single-machine lock using synchronized:
public String getProduceNum(String key) {
    try {
        synchronized (this) {
            int num = Integer.parseInt(redisTemplate.opsForValue().get(key));
            if (num > 0) {
                // Decrement the stock inside the critical section
                redisTemplate.opsForValue().set(key, (num - 1) + "");
                System.out.println("Remaining stock: " + (num - 1));
            } else {
                System.out.println("Stock is 0");
            }
        }
    } catch (NumberFormatException e) {
        e.printStackTrace();
    }
    return "OK";
}

For distributed environments, a distributed lock (e.g., Redisson RLock) is used:
public String getProduceNum(String key) {
    RLock lock = redissonClient.getLock(key);
    lock.lock();
    try {
        // The read must happen after the lock is acquired; otherwise
        // two clients can observe the same stock value and both decrement it
        int num = Integer.parseInt(redisTemplate.opsForValue().get(key));
        if (num > 0) {
            redisTemplate.opsForValue().set(key, (num - 1) + "");
            System.out.println("Remaining stock: " + (num - 1));
        } else {
            System.out.println("Stock already 0");
        }
    } catch (NumberFormatException e) {
        e.printStackTrace();
    } finally {
        lock.unlock();
    }
    return "OK";
}

Cache Avalanche
A cache avalanche occurs when a large portion of cached data becomes unavailable at once, sending a sudden surge of traffic straight to the database. Typical causes are a Redis server failure or a mass of keys expiring at the same moment.
Typical mitigation strategies are:
Deploy a highly available Redis cluster to avoid single‑node failures.
Assign different TTLs to keys so they do not expire at the same moment.
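The second strategy can be sketched as a small helper that adds random jitter to a base TTL; the constants and names here are illustrative, not prescribed values:

```java
import java.util.concurrent.ThreadLocalRandom;

// Staggered-expiration sketch: every key gets the base TTL plus a
// random offset, so entries written in the same batch expire at
// different moments instead of all at once.
public class TtlJitter {

    static final int BASE_TTL_SECONDS = 3600;  // one hour base lifetime
    static final int MAX_JITTER_SECONDS = 300; // up to five extra minutes

    static int ttlWithJitter() {
        return BASE_TTL_SECONDS
                + ThreadLocalRandom.current().nextInt(MAX_JITTER_SECONDS);
    }
}
```

With Spring's RedisTemplate this would be applied when writing, e.g. `redisTemplate.opsForValue().set(key, value, TtlJitter.ttlWithJitter(), TimeUnit.SECONDS)`.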
In practice, you also need to consider LRU eviction, RDB/AOF persistence, and other cache‑related concerns such as capacity and data loss.
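Those eviction and persistence concerns map onto a handful of redis.conf directives; a minimal sketch, where the specific values are illustrative rather than recommendations:

```
maxmemory 2gb                    # cap memory so eviction can kick in
maxmemory-policy allkeys-lru     # evict least-recently-used keys first
save 900 1                       # RDB snapshot after 900 s if >= 1 change
appendonly yes                   # enable AOF persistence
appendfsync everysec             # fsync the AOF once per second
```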
The article concludes that cache problems must be analyzed per business scenario, and the presented techniques—empty‑object caching, Bloom filters, locking, and staggered TTLs—provide a solid foundation for handling high‑concurrency cache challenges.