Cache Consistency: Pitfalls of Delayed Double Delete and Lease/Versioning Solutions with Redis
This article examines why many large‑scale systems avoid the traditional delayed double‑delete cache‑invalidation strategy, explains its critical drawbacks, and presents alternative lease‑based and version‑based approaches with Lua scripts and Java wrappers for Redis to achieve stronger consistency.
Introduction
Most articles on cache‑database consistency recommend the Cache‑aside pattern with a delete‑on‑write strategy, often combined with a delayed double‑delete to reduce inconsistency windows. However, many core services at large internet companies rarely use this approach, raising questions about its hidden flaws and why alternatives are preferred.
When a primary‑replica database record changes, applications typically delete or update the corresponding cache entry. Even with deletion, brief periods of inconsistency can occur, as illustrated by the accompanying diagram.
Critical Flaws of Delayed Double Delete
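The main issue is the extra cache misses the strategy creates: every write deletes the key twice in quick succession, and each deletion sends the readers of that key back to the database until the entry is repopulated. For a hot key on a high‑traffic system, the resulting surge of database reads can overwhelm the database. The spike is tolerable for low‑load services but unacceptable for large‑scale applications.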
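For reference, the strategy under discussion can be sketched as follows. This is a minimal in-memory illustration, not production code: the class name `DelayedDoubleDelete`, the two maps standing in for Redis and the database, and the `dbReads` counter are all assumptions made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory model: the maps stand in for Redis and the database.
final class DelayedDoubleDelete {
    final Map<String, String> cache = new ConcurrentHashMap<>();
    final Map<String, String> db = new ConcurrentHashMap<>();
    final AtomicInteger dbReads = new AtomicInteger();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    // Write path: delete, update the database, then delete again after a delay.
    void write(String key, String value, long delayMillis) {
        cache.remove(key);                                  // first delete
        db.put(key, value);                                 // database update
        scheduler.schedule(() -> { cache.remove(key); },    // second, delayed delete
                delayMillis, TimeUnit.MILLISECONDS);
    }

    // Read path (cache-aside): every miss falls through to the database.
    String read(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        dbReads.incrementAndGet();
        String fromDb = db.get(key);
        if (fromDb != null) {
            cache.put(key, fromDb);
        }
        return fromDb;
    }

    void shutdown() {
        scheduler.shutdown();
    }
}
```

In this model a single write forces at least two database reads for the key, one after each deletion. With many concurrent readers, each of those miss windows can let several requests through at once.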
Facebook (Meta) Solution
Meta’s 2013 paper "Scaling Memcache at Facebook" introduced a lease (lock‑like) mechanism to prevent concurrent writes from causing inconsistency.
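When a request misses the cache, the cache hands the client a 64‑bit token (the lease) bound to that key. The client must present this token when it writes the value back; the cache validates the token before storing the data, and a delete of the key invalidates any outstanding lease, so a stale value cannot be written after an update. While a lease is outstanding, other requests that miss on the same key are told to wait and retry shortly instead of being issued a new lease.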
The lease mechanism is visualized in the following diagram.
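The rules above can be modelled in a few lines. The sketch below is a simplified in-memory illustration, assuming a hypothetical `LeaseCacheSketch` class; it is not memcached's implementation and it omits lease expiry.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-memory model of a lease-issuing cache (lease expiry omitted).
final class LeaseCacheSketch {
    private final Map<String, String> data = new ConcurrentHashMap<>();
    private final Map<String, Long> leases = new ConcurrentHashMap<>();
    private final AtomicLong nextToken = new AtomicLong(1);

    String get(String key) {
        return data.get(key);
    }

    // On a miss, hand out a lease only if none is outstanding for the key.
    synchronized Optional<Long> tryAcquireLease(String key) {
        if (data.containsKey(key) || leases.containsKey(key)) {
            return Optional.empty();
        }
        long token = nextToken.getAndIncrement();
        leases.put(key, token);
        return Optional.of(token);
    }

    // Store the value only if the caller presents the current lease token.
    synchronized boolean setWithLease(String key, String value, long token) {
        Long current = leases.get(key);
        if (current == null || current.longValue() != token) {
            return false;
        }
        leases.remove(key);
        data.put(key, value);
        return true;
    }

    // A database write deletes the value and invalidates any outstanding lease.
    synchronized void invalidate(String key) {
        data.remove(key);
        leases.remove(key);
    }
}
```

A reader that acquired its lease before `invalidate` was called can no longer write its (possibly stale) value, which is the property the lease is meant to provide.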
Simple Reference Implementation (Redis, Java)
The implementation focuses on three key aspects:
Override Redis GET to set a lease when the key is missing.
Override Redis SET to verify the lease before writing.
When the database updates, delete both the data key and its lease key.
Lua scripts ensure atomicity for these operations.
Redis GET Operation
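-- KEYS[1]: data key; ARGV[1]: lease token generated by the client
local key = KEYS[1]
local token = ARGV[1]
local value = redis.call('get', key)
if not value then
  -- Enables effects replication on older Redis; deprecated as of Redis 7.0
  redis.replicate_commands()
  -- Cache miss: record the caller's token as the lease for this key
  local lease_key = 'lease:'..key
  redis.call('set', lease_key, token)
  return {false, false}
else
  return {value, true}
end
Redis SET Operation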
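-- KEYS[1]: data key; ARGV[1]: lease token; ARGV[2]: value to cache
local key = KEYS[1]
local token = ARGV[1]
local value = ARGV[2]
local lease_key = 'lease:'..key
local lease_value = redis.call('get', lease_key)
if lease_value == token then
  redis.replicate_commands()
  -- Token matches the outstanding lease: store the value
  redis.call('set', key, value)
  return {value, true}
else
  -- Lease missing or replaced by another request: reject the write
  return {false, false}
end
Redis DELETE Operation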
-- KEYS[1]: data key. Removes the cached value together with any outstanding lease.
local key = KEYS[1]
local lease_key = 'lease:'..key
redis.call('del', key, lease_key)
Application‑level impacts:
All cache interactions must use EVAL instead of raw Redis commands.
Results are returned as arrays and must be parsed.
Clients must generate and manage lease tokens, handling success/failure based on the returned effect flag.
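As an illustration of the parsing step, the sketch below shows one way a small result holder could decode the two-element reply. The article's wrappers use a class called `EvalResult` without showing its source, so this version is a guess at its shape rather than the original. It relies on Redis converting a Lua `true` to the integer 1 and a Lua `false` to a nil reply, which a Java client surfaces as `null`.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper: decodes the {value, flag} reply returned by the Lua scripts.
final class EvalResult {
    private final String value;
    private final boolean effect;

    EvalResult(List<?> reply) {
        Object first = reply.size() > 0 ? reply.get(0) : null;
        Object second = reply.size() > 1 ? reply.get(1) : null;
        this.value = decode(first);
        // Lua true arrives as integer 1; Lua false arrives as a nil reply (null).
        this.effect = second instanceof Number && ((Number) second).longValue() == 1L;
    }

    // Clients may hand back bulk strings as String or as raw bytes.
    private static String decode(Object raw) {
        if (raw == null) {
            return null;
        }
        if (raw instanceof byte[]) {
            return new String((byte[]) raw, StandardCharsets.UTF_8);
        }
        return raw.toString();
    }

    String value() {
        return value;
    }

    boolean effect() {
        return effect;
    }
}
```

Keeping the flag separate from the value lets callers distinguish "no effect" from a cached value that happens to be empty.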
Java wrapper example:
import java.util.List;
import java.util.UUID;
import redis.clients.jedis.Jedis;

// CacheCommands, TokenGenerator, LuaScripts and EvalResult are the article's helper types (not shown).
public class LeaseWrapper extends Jedis implements CacheCommands {
    private final Jedis jedis;
    private final TokenGenerator tokenGenerator;
    // Carries the lease token from get() to the following set() on the same thread
    private final ThreadLocal<String> tokenHolder;

    public LeaseWrapper(Jedis jedis) {
        this.jedis = jedis;
        this.tokenHolder = new ThreadLocal<>();
        this.tokenGenerator = () -> UUID.randomUUID().toString();
    }

    @Override
    public String get(String key) {
        String token = this.tokenGenerator.get();
        tokenHolder.set(token);
        Object result = this.jedis.eval(LuaScripts.leaseGet(), List.of(key), List.of(token));
        EvalResult er = new EvalResult((List<?>) result);
        return er.effect() ? er.value() : null;
    }

    @Override
    public String set(String key, String value) {
        String token = tokenHolder.get();
        tokenHolder.remove();
        if (token == null) {
            return null; // no preceding get() on this thread, so there is no lease to present
        }
        Object result = this.jedis.eval(LuaScripts.leaseSet(), List.of(key), List.of(token, value));
        EvalResult er = new EvalResult((List<?>) result);
        return er.effect() ? er.value() : null;
    }
}
Supplement
To prevent other requests from acquiring a lease before the current one expires (mitigating thundering‑herd), the GET script can be extended:
local key = KEYS[1]
local token = ARGV[1]
local value = redis.call('get', key)
if not value then
  redis.replicate_commands()
  local lease_key = 'lease:'..key
  local current_token = redis.call('get', lease_key)
  if not current_token or token == current_token then
    -- No lease outstanding (or the caller already holds it): grant the lease
    redis.call('set', lease_key, token)
    return {token, false}
  else
    -- Another request holds the lease: return its token so the caller can tell
    return {current_token, false}
  end
else
  return {value, true}
end
Setting a short TTL on the lease key, so that a crashed lease holder cannot block the key indefinitely, and optionally having waiting clients retry with back‑off can further reduce database pressure.
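The waiting side of this idea might look like the sketch below: a client that did not obtain the lease polls the cache a bounded number of times, doubling its delay up to a cap. The class `LeaseBackoff`, the method name, and all parameters are hypothetical; the cache read is passed in as a `Supplier` so the example does not depend on a Redis client.

```java
import java.util.function.Supplier;

// Hypothetical helper for callers that lost the race for the lease.
final class LeaseBackoff {
    // Polls cacheRead until it yields a value or the attempts run out.
    // Returns null when the value never appears, so the caller can choose a fallback.
    static String awaitValue(Supplier<String> cacheRead, int maxAttempts,
                             long initialDelayMillis, long maxDelayMillis) {
        long delay = initialDelayMillis;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String value = cacheRead.get();
            if (value != null) {
                return value;
            }
            if (attempt == maxAttempts) {
                break; // do not sleep after the final attempt
            }
            try {
                Thread.sleep(delay);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
            delay = Math.min(delay * 2, maxDelayMillis); // exponential back-off with a cap
        }
        return null;
    }
}
```

The total wait should stay below the lease TTL chosen on the Redis side (for example via the `PX` option of `SET`), so that a waiter gives up at roughly the point where the lease itself would lapse.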
Uber Solution
Uber’s 2023 blog "How Uber Serves Over 40 Million Reads Per Second from Online Storage Using an Integrated Cache" describes a version‑comparison mechanism that writes only newer data to the cache.
The database row’s timestamp is used as a version number. A Lua script executed via EVAL stores both the version key and the data key atomically; on SET, the script compares the incoming version with the stored one and writes only if the incoming version is newer.
The key‑value layout is shown in the diagram.
Simple Reference Implementation (Redis, Java)
Two core steps are required:
Override Redis SET to verify the version before writing.
When the version check passes, store both the version key and the data key together.
Redis SET Operation (Version)
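-- KEYS[1]: data key; ARGV[1]: value; ARGV[2]: incoming version (row timestamp)
local key = KEYS[1]
local value = ARGV[1]
local current_version = ARGV[2]
local version_key = 'version:'..key
local version_value = redis.call('get', version_key)
-- Compare numerically: both sides are strings, and as strings "9" would rank above "10"
if version_value == false or tonumber(version_value) < tonumber(current_version) then
  redis.call('mset', version_key, current_version, key, value)
  return {value, true}
else
  return {false, false}
end
Application code must extract the timestamp from the entity and pass it as a numeric version argument (for example, epoch milliseconds).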
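The same write-only-if-newer rule, together with the step of deriving the version from the entity, can be illustrated without Redis. The sketch below is an in-memory model under stated assumptions: `VersionedCacheSketch`, `UserRow`, and the `user:` key prefix are invented for the example, and the row timestamp is converted to epoch milliseconds so versions compare numerically.

```java
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory model of a version-checked cache write.
final class VersionedCacheSketch {
    // Hypothetical entity; updatedAt plays the role of the database row timestamp.
    static final class UserRow {
        final String id;
        final String payload;
        final Instant updatedAt;

        UserRow(String id, String payload, Instant updatedAt) {
            this.id = id;
            this.payload = payload;
            this.updatedAt = updatedAt;
        }
    }

    private static final class Entry {
        final long version;
        final String value;

        Entry(long version, String value) {
            this.version = version;
            this.value = value;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();

    // Applies the write only when the incoming version is strictly newer.
    boolean setIfNewer(String key, String value, long version) {
        Entry incoming = new Entry(version, value);
        Entry stored = entries.merge(key, incoming,
                (current, candidate) -> candidate.version > current.version ? candidate : current);
        return stored == incoming;
    }

    // Extracts the version from the entity, as the application code would.
    boolean cache(UserRow row) {
        return setIfNewer("user:" + row.id, row.payload, row.updatedAt.toEpochMilli());
    }

    String get(String key) {
        Entry entry = entries.get(key);
        return entry == null ? null : entry.value;
    }
}
```

Because the check and the write happen in one atomic `merge`, a late-arriving older value cannot overwrite a newer one, which mirrors what the Lua script achieves by running inside a single `EVAL`.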
VersionWrapper Java Class
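import java.util.List;
import redis.clients.jedis.Jedis;

// CacheCommands, LuaScripts and EvalResult are the article's helper types (not shown).
public class VersionWrapper extends Jedis implements CacheCommands {
    private final Jedis jedis;

    public VersionWrapper(Jedis jedis) {
        this.jedis = jedis;
    }

    // Returns the stored value when the write was applied, or null when the version was not newer
    @Override
    public String set(String key, String value, String version) {
        Object result = this.jedis.eval(LuaScripts.versionSet(), List.of(key), List.of(value, version));
        EvalResult er = new EvalResult((List<?>) result);
        return er.effect() ? er.value() : null;
    }
}
Supplement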
Uber likely uses an update‑cache strategy rather than delete‑then‑write, avoiding the stale‑data window illustrated in the earlier diagram. Their asynchronous Flux component applies changes to the cache with a delay on the order of seconds, which is acceptable for most workloads.
Conclusion
The delayed double‑delete is not universally bad, but its drawbacks become pronounced in high‑traffic, large‑scale services where the temporary surge of database reads can be catastrophic. Teams must evaluate their traffic patterns, infrastructure maturity, and operational budgets to choose the most suitable cache‑consistency strategy.
Rare Earth Juejin Tech Community