Why a Simple Redis Mutex Lock Isn’t Enough for Cache Breakdown – When to Use Never‑Expire or Logical Expiration
The article analyzes why a basic Redis mutex lock can cause thread blocking, latency spikes, and service collapse under high concurrency, and compares it with logical expiration and never‑expire with proactive updates, explaining their trade‑offs and how to choose the right cache‑breakdown mitigation strategy.
Mutex lock implementation and its problems
A typical Redis‑based mutex lock for cache breakdown:
public Object getData(String key) throws InterruptedException {
    Object value = redis.opsForValue().get(key);
    if (value != null) {
        return value;
    }
    // cache miss, try to acquire lock
    String lockKey = "lock:" + key;
    // setIfAbsent returns a Boolean that can be null, so compare explicitly
    boolean locked = Boolean.TRUE.equals(redis.opsForValue()
            .setIfAbsent(lockKey, "1", 10, TimeUnit.SECONDS));
    if (locked) {
        try {
            // only the lock holder queries the DB and rebuilds the cache
            value = db.query(key);
            redis.opsForValue().set(key, value, 30, TimeUnit.MINUTES);
        } finally {
            redis.delete(lockKey);
        }
    } else {
        // other threads sleep briefly and retry
        Thread.sleep(50);
        return getData(key);
    }
    return value;
}
When a hot key expires under high concurrency (e.g., 5,000 QPS), only one request obtains the lock while the rest sleep and retry. In a typical Spring MVC + Tomcat deployment the worker thread pool defaults to 200 threads, so the waiting requests quickly occupy all of them, leaving no threads for other endpoints and causing timeouts.
The approach assumes cache rebuilding takes only a few tens of milliseconds. In reality a product‑detail cache may require 5–6 DB queries or cross‑service calls, taking 500 ms – 2 s. During this window each waiting thread retries every 50 ms, resulting in up to 40 retries per request and additional load on Redis. If the DB is already under pressure, the waiting queue grows, the thread pool is exhausted, upstream calls time out, and retry storms can trigger a cascade failure.
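The retry figures above can be checked with a little arithmetic. The sketch below is only a back-of-the-envelope estimate: the class and method names are made up for illustration, it assumes each retry costs two Redis commands (the GET plus the SET NX attempt), and it ignores network round-trip time, so real numbers would be somewhat lower.

```java
// Rough estimate of the polling load created by threads waiting on the mutex.
final class RetryLoadEstimate {

    /** Retries one waiting request performs while the rebuild is in progress. */
    static long retriesPerRequest(long rebuildMillis, long sleepMillis) {
        return rebuildMillis / sleepMillis;
    }

    /** Extra Redis commands per second issued by all waiting threads combined. */
    static long redisCommandsPerSecond(int waitingThreads, long sleepMillis, int commandsPerRetry) {
        long retriesPerSecond = 1000 / sleepMillis;
        return waitingThreads * retriesPerSecond * commandsPerRetry;
    }

    public static void main(String[] args) {
        // 2 s rebuild with a 50 ms sleep: 40 retries per request
        System.out.println(retriesPerRequest(2000, 50));
        // 200 blocked Tomcat threads, 2 commands per retry: 8000 commands/s
        System.out.println(redisCommandsPerSecond(200, 50, 2));
    }
}
```

Under these assumptions, a fully blocked 200-thread pool adds roughly 8,000 Redis commands per second that do no useful work.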
Using Redis SETNX as a distributed lock adds further risk. If the node holding the lock crashes before releasing it, other requests wait until the lock's TTL expires. If the lock's TTL is shorter than the rebuild time, the lock expires while the rebuild is still running and another node starts rebuilding, causing duplicate work; the first holder's finally block can then delete a lock it no longer owns. A full-featured solution such as Redisson's watchdog can mitigate these issues, but it introduces the complexity of a distributed-lock system just to solve cache breakdown.
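One common way to keep a late holder from deleting someone else's lock is to store a unique token as the lock value and delete only when the token still matches. The sketch below illustrates the idea with an in-memory map standing in for Redis; `OwnedLock` and its methods are hypothetical names, not part of any Redis client. Against a real Redis the compare-and-delete has to be atomic, which is usually done with a small Lua script.

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// In-memory stand-in for the Redis lock keys (assumption: not a real Redis client).
final class OwnedLock {
    private final ConcurrentMap<String, String> locks = new ConcurrentHashMap<>();

    /** Mirrors SET lockKey token NX: succeeds only if nobody holds the lock. */
    boolean tryLock(String lockKey, String token) {
        return locks.putIfAbsent(lockKey, token) == null;
    }

    /** Deletes the lock only if it still carries the caller's token. */
    boolean unlock(String lockKey, String token) {
        return locks.remove(lockKey, token);
    }

    /** Simulates the lock TTL firing while the holder is still rebuilding. */
    void expire(String lockKey) {
        locks.remove(lockKey);
    }

    boolean isHeld(String lockKey) {
        return locks.containsKey(lockKey);
    }

    static String newToken() {
        return UUID.randomUUID().toString();
    }
}
```

If node A's lock expires and node B acquires it, A's later `unlock` call returns false and leaves B's lock in place. This prevents the accidental release but not the duplicate rebuild itself.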
Logical expiration (stale‑while‑revalidate)
Instead of a physical TTL, store a logical expiration timestamp inside the cached value:
@Data
public class CacheData {
    private Object data;
    private long expireTime; // logical expiration timestamp
}
Reading the cache checks the logical timestamp. If it has not expired, the fresh data is returned. If it has expired, the stale data is returned immediately and an asynchronous task rebuilds the cache.
private static final ExecutorService REBUILD_EXECUTOR = Executors.newFixedThreadPool(10);

public Object getData(String key) {
    String json = redis.opsForValue().get(key);
    if (json == null) {
        // cold start: query DB directly
        return rebuildAndReturn(key);
    }
    CacheData cacheData = JSON.parseObject(json, CacheData.class);
    if (cacheData.getExpireTime() > System.currentTimeMillis()) {
        return cacheData.getData(); // still fresh
    }
    // logical expiration: return stale data and trigger async rebuild
    String lockKey = "lock:" + key;
    // setIfAbsent returns a Boolean that can be null, so compare explicitly
    boolean locked = Boolean.TRUE.equals(redis.opsForValue()
            .setIfAbsent(lockKey, "1", 10, TimeUnit.SECONDS));
    if (locked) {
        REBUILD_EXECUTOR.submit(() -> {
            try {
                Object newData = db.query(key);
                CacheData newCache = new CacheData();
                newCache.setData(newData);
                newCache.setExpireTime(System.currentTimeMillis() + TimeUnit.MINUTES.toMillis(30));
                // no physical TTL: the key itself never expires in Redis
                redis.opsForValue().set(key, JSON.toJSONString(newCache));
            } finally {
                redis.delete(lockKey);
            }
        });
    }
    return cacheData.getData(); // always return stale data regardless of lock acquisition
}
All threads return within milliseconds, avoiding thread-pool exhaustion. The trade-off is that during the rebuild window (hundreds of milliseconds to a few seconds) some users receive outdated data.
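To see this behavior without a Redis instance, the following sketch reproduces the same flow in memory. Everything here is an assumption made for illustration: `LogicalExpiryCache`, the `loader` function standing in for `db.query`, and the explicit `now` parameter (passed in so the example is deterministic) are not from the original code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class LogicalExpiryCache {
    /** Cached value plus its logical expiration timestamp (epoch millis). */
    static final class Entry {
        final String data;
        final long expireAt;
        Entry(String data, long expireAt) { this.data = data; this.expireAt = expireAt; }
    }

    private final ConcurrentHashMap<String, Entry> cache = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, Boolean> rebuilding = new ConcurrentHashMap<>();
    // Daemon threads so the pool never keeps the JVM alive.
    private final ExecutorService executor = Executors.newFixedThreadPool(2, r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });
    private final Function<String, String> loader; // hypothetical stand-in for db.query
    private final long ttlMillis;
    final AtomicInteger rebuildCount = new AtomicInteger();
    private volatile Future<?> lastRebuild;

    LogicalExpiryCache(Function<String, String> loader, long ttlMillis) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
    }

    void put(String key, String data, long expireAt) {
        cache.put(key, new Entry(data, expireAt));
    }

    String get(String key, long now) {
        Entry entry = cache.get(key);
        if (entry == null) {
            // cold start: load synchronously, as in the original snippet
            String value = loader.apply(key);
            cache.put(key, new Entry(value, now + ttlMillis));
            return value;
        }
        if (entry.expireAt > now) {
            return entry.data; // still fresh
        }
        // logically expired: at most one caller schedules the rebuild
        if (rebuilding.putIfAbsent(key, Boolean.TRUE) == null) {
            lastRebuild = executor.submit(() -> {
                try {
                    rebuildCount.incrementAndGet();
                    cache.put(key, new Entry(loader.apply(key), now + ttlMillis));
                } finally {
                    rebuilding.remove(key);
                }
            });
        }
        return entry.data; // stale data, returned without waiting
    }

    /** Blocks until the most recently scheduled rebuild finishes; false if none ran cleanly. */
    boolean awaitRebuild() {
        Future<?> pending = lastRebuild;
        if (pending == null) return false;
        try {
            pending.get();
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            return false;
        }
    }

    void shutdown() { executor.shutdown(); }
}
```

While a rebuild is in flight, every call to `get` on the expired key returns the old value and only the first one schedules work; once `awaitRebuild()` returns, later reads see the refreshed value.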
Never‑expire + proactive update
Another strategy is to never set a TTL on the key and rely on explicit update events to keep the cache fresh.
// Update cache when underlying data changes
@EventListener
public void onProductUpdate(ProductUpdateEvent event) {
    Product product = event.getProduct();
    redis.opsForValue().set("product:" + product.getId(), JSON.toJSONString(product));
}
As a fallback, a periodic task can refresh hot items:
@Scheduled(fixedRate = 60000)
public void refreshHotProductCache() {
    List<String> hotProductIds = getHotProductIds();
    for (String id : hotProductIds) {
        Product product = productMapper.selectById(id);
        redis.opsForValue().set("product:" + id, JSON.toJSONString(product));
    }
}
Because the key never expires, cache breakdown cannot occur. This approach fits data that changes infrequently and has clear update triggers (e.g., configuration, category trees, city lists). The downside is that if the update path fails, the cache can become permanently stale; a full-refresh fallback task is therefore recommended.
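The failure mode and its safeguard can be shown with a small in-memory model. This is only a sketch under stated assumptions: two maps stand in for the database and for Redis, and `NeverExpireCache`, `update`, and `fullRefresh` are hypothetical names chosen for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class NeverExpireCache {
    private final Map<String, String> db = new ConcurrentHashMap<>();    // stand-in for the database
    private final Map<String, String> cache = new ConcurrentHashMap<>(); // stand-in for Redis, no TTL

    /** Writes the database and, only if the update event is delivered, the cache too. */
    void update(String id, String value, boolean eventDelivered) {
        db.put(id, value);
        if (eventDelivered) {
            cache.put("product:" + id, value);
        }
    }

    /** Fallback task: overwrite the cache from the database; returns how many entries changed. */
    int fullRefresh() {
        int repaired = 0;
        for (Map.Entry<String, String> row : db.entrySet()) {
            String previous = cache.put("product:" + row.getKey(), row.getValue());
            if (!row.getValue().equals(previous)) {
                repaired++;
            }
        }
        return repaired;
    }

    String read(String id) {
        return cache.get("product:" + id);
    }
}
```

If an update event is lost, `read` keeps serving the old value indefinitely because nothing ever evicts it; the next `fullRefresh` run is what brings the entry back in line with the database.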
How to choose a solution
Mutex lock: Use when absolute data correctness is required and returning stale data is unacceptable (e.g., account balance, inventory, payment status). The lock limits rebuilding to a single request while it is held, but it can cause thread pile-up and added latency under high concurrency.
Logical expiration: Suitable when a few seconds of staleness is tolerable but low latency is critical (e.g., product detail pages, search results, recommendation feeds). Threads always return immediately; stale data is only visible during the rebuild window.
Never-expire + proactive update: Ideal for rarely changing data with explicit update moments (e.g., system configuration, category hierarchy, data dictionaries). The cache never expires, eliminating breakdown risk, but it requires reliable update mechanisms and possibly a periodic full-refresh safeguard.
In practice systems often combine these patterns: configuration data uses never‑expire, product details use logical expiration, and inventory or financial data uses a mutex lock.
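The selection rules above can be condensed into a small helper. This is a simplification for illustration only; `BreakdownStrategy` and its two boolean inputs are hypothetical, and real decisions will usually weigh more factors than these.

```java
final class BreakdownStrategy {
    enum Strategy { MUTEX_LOCK, LOGICAL_EXPIRATION, NEVER_EXPIRE }

    /**
     * @param staleReadsAcceptable          whether briefly outdated data may be served
     * @param rarelyChangesWithUpdateEvents whether the data changes rarely and has explicit update triggers
     */
    static Strategy choose(boolean staleReadsAcceptable, boolean rarelyChangesWithUpdateEvents) {
        if (!staleReadsAcceptable) {
            return Strategy.MUTEX_LOCK;         // e.g., inventory, payment status
        }
        if (rarelyChangesWithUpdateEvents) {
            return Strategy.NEVER_EXPIRE;       // e.g., configuration, category trees
        }
        return Strategy.LOGICAL_EXPIRATION;     // e.g., product detail pages
    }
}
```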
Programmer XiaoFu
xiaofucode.com – a programmer learning guide driven by the pursuit of profit
