Redis Distributed Lock Failure Analysis and Safer Lock Solutions for High‑Concurrency Seckill
This article analyzes a real‑world overselling incident caused by an unsafe Redis distributed lock in a high‑traffic flash‑sale service, explains the root causes, and presents safer lock implementations, atomic stock checks, and architectural improvements to prevent similar failures.
Introduction
Using Redis for distributed locks is common, but this article examines a real incident where the lock caused an overselling disaster during a limited‑stock flash‑sale of a rare product.
Incident Scene
The flash‑sale of 100 bottles of a scarce liquor resulted in overselling, triggering a P0‑level incident and performance penalties for the whole team.
    public SeckillActivityRequestVO seckillHandle(SeckillActivityRequestVO request) {
        SeckillActivityRequestVO response = null;
        String key = "key:" + request.getSeckillId();
        try {
            Boolean lockFlag = redisTemplate.opsForValue().setIfAbsent(key, "val", 10, TimeUnit.SECONDS);
            if (lockFlag) {
                // user validation, activity validation
                // stock validation
                Object stock = redisTemplate.opsForHash().get(key + ":info", "stock");
                assert stock != null;
                if (Integer.parseInt(stock.toString()) <= 0) {
                    // business exception
                } else {
                    redisTemplate.opsForHash().increment(key + ":info", "stock", -1);
                    // generate order, publish success event, build response VO
                }
            }
        } finally {
            // release lock (runs unconditionally, whoever currently owns the key)
            stringRedisTemplate.delete(key);
        }
        return response;
    }
The code acquires a lock with a 10‑second TTL, checks stock, and releases the lock in a finally block, so it appears safe at first glance.
Root Causes
1. No system‑level fault tolerance: The user service was overloaded, which slowed responses through the gateway. Requests stalled long enough for the 10‑second lock to expire while the business logic was still running.
2. The distributed lock is not truly safe: If thread A holds the lock longer than its TTL, thread B can acquire the lock after expiration. When A finally releases the lock, it deletes B's lock rather than its own, which lets yet another thread in while B is still working.
3. Non‑atomic stock check: Reading the stock and then decrementing it are two separate operations. Under high concurrency, multiple threads can all see sufficient stock and each decrement it, causing overselling.
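The interleaving behind root cause 2 can be replayed deterministically without Redis. The sketch below is only an illustration: `ExpiringLockDemo`, its hand‑advanced clock, and the method names are hypothetical stand‑ins for `SET ... NX EX` and `DEL`, not part of the incident code.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for a Redis key with a TTL; time is advanced by hand. */
final class ExpiringLockDemo {
    private final Map<String, Long> expiresAt = new HashMap<>();
    private long now = 0; // fake clock, in seconds

    /** Mimics SET key val NX EX ttl: succeeds only if the key is absent or expired. */
    boolean setIfAbsent(String key, long ttlSeconds) {
        Long deadline = expiresAt.get(key);
        if (deadline != null && deadline > now) {
            return false;
        }
        expiresAt.put(key, now + ttlSeconds);
        return true;
    }

    /** Mimics DEL key: removes the key no matter who created it. */
    void delete(String key) {
        expiresAt.remove(key);
    }

    void advance(long seconds) {
        now += seconds;
    }

    /** Replays the incident; returns how many threads hold the lock at the same time. */
    static int replay() {
        ExpiringLockDemo redis = new ExpiringLockDemo();
        String key = "key:1";
        int concurrentHolders = 0;

        boolean a = redis.setIfAbsent(key, 10); // A acquires the lock
        redis.advance(11);                      // A stalls past the 10-second TTL
        boolean b = redis.setIfAbsent(key, 10); // B acquires the expired lock
        if (a && b) {
            concurrentHolders++;                // B is now working
        }
        redis.delete(key);                      // A's finally block deletes B's lock
        boolean c = redis.setIfAbsent(key, 10); // C gets in while B is still working
        if (c) {
            concurrentHolders++;
        }
        return concurrentHolders;
    }

    public static void main(String[] args) {
        System.out.println("concurrent holders: " + replay()); // prints "concurrent holders: 2"
    }
}
```

B and C both believe they own the lock, so the stock check that the lock was supposed to serialize runs concurrently.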
Analysis
The combination of an overloaded user service, lock expiration, and non‑atomic stock verification creates a vicious cycle that leads to duplicate order creation.
Solutions
Implement a Safer Distributed Lock
Use a unique value for the lock and release it only when the stored value matches, typically via a Lua script to guarantee atomicity.
    public void safedUnLock(String key, String val) {
        // Delete the key only if it still holds this caller's value; GET and DEL run atomically inside the script.
        String luaScript = "local expected = ARGV[1] local curr = redis.call('get', KEYS[1]) if expected == curr then redis.call('del', KEYS[1]) end return 'OK'";
        RedisScript<String> redisScript = RedisScript.of(luaScript, String.class);
        redisTemplate.execute(redisScript, Collections.singletonList(key), val);
    }
Implement Atomic Stock Check
Leverage Redis's atomic increment operation to decrement stock safely without a separate lock.
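The reason decrement‑then‑check is safe can be seen in a Redis‑free sketch. Here an `AtomicInteger` stands in for the Redis hash field, and the class name, pool size, and buyer count are assumptions chosen for illustration: each caller decides based on the value returned by its own atomic decrement, so no two callers can act on the same stock reading.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class AtomicStockDemo {
    /** Runs `buyers` concurrent purchase attempts against `initialStock`; returns orders created. */
    static int run(int initialStock, int buyers) {
        AtomicInteger stock = new AtomicInteger(initialStock); // stands in for the Redis "stock" field
        AtomicInteger orders = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(16);
        for (int i = 0; i < buyers; i++) {
            pool.execute(() -> {
                // Decrement first, then inspect the value returned by that same atomic step.
                int remaining = stock.decrementAndGet();
                if (remaining >= 0) {
                    orders.incrementAndGet(); // this caller owns exactly one unit of stock
                }
                // A negative result means sold out; the counter is allowed to go below zero here.
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("buyers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return orders.get();
    }

    public static void main(String[] args) {
        System.out.println("orders: " + run(100, 1000)); // prints "orders: 100"
    }
}
```

With 100 units and 1000 buyers, exactly 100 callers see a non‑negative result, regardless of how the threads interleave.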
    // The increment is atomic, and Redis returns the value after the operation
    Long currStock = redisTemplate.opsForHash().increment("key", "stock", -1);
Refactored Business Logic
Introduce a dedicated DistributedLocker class and use the safe lock/unlock methods together with atomic stock decrement.
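The article does not show the DistributedLocker class itself, so the following is one possible shape, not the author's implementation. The `LockStore` interface and `InMemoryStore` are hypothetical: in production the two operations would map to `SET key val NX` with an expiry and to the compare‑and‑delete Lua script, while the in‑memory version exists only so the behavior can be tried locally.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

final class DistributedLockerSketch {
    /** Hypothetical abstraction over the two Redis operations the locker needs. */
    interface LockStore {
        boolean setIfAbsent(String key, String val, long ttl, TimeUnit unit); // SET key val NX + expiry
        boolean deleteIfEquals(String key, String val);                       // atomic compare-and-delete
    }

    static final class DistributedLocker {
        private final LockStore store;

        DistributedLocker(LockStore store) {
            this.store = store;
        }

        /** Tries to take the lock; `val` is the caller's unique token (for example a UUID). */
        boolean lock(String key, String val, long ttl, TimeUnit unit) {
            return store.setIfAbsent(key, val, ttl, unit);
        }

        /** Releases the lock only if it is still owned by `val`; otherwise does nothing. */
        void safedUnLock(String key, String val) {
            store.deleteIfEquals(key, val);
        }
    }

    /** In-memory store for local experiments; the TTL is ignored and expiry is triggered by hand. */
    static final class InMemoryStore implements LockStore {
        private final ConcurrentHashMap<String, String> data = new ConcurrentHashMap<>();

        @Override
        public boolean setIfAbsent(String key, String val, long ttl, TimeUnit unit) {
            return data.putIfAbsent(key, val) == null;
        }

        @Override
        public boolean deleteIfEquals(String key, String val) {
            return data.remove(key, val); // removes only when the stored value matches
        }

        void expire(String key) {
            data.remove(key); // simulates the TTL running out
        }
    }

    /** Returns true when a stale owner's unlock leaves the new owner's lock in place. */
    static boolean staleOwnerCannotRelease() {
        InMemoryStore store = new InMemoryStore();
        DistributedLocker locker = new DistributedLocker(store);
        locker.lock("key:1", "token-A", 10, TimeUnit.SECONDS);
        store.expire("key:1");                                  // A's lock times out
        locker.lock("key:1", "token-B", 10, TimeUnit.SECONDS);  // B acquires
        locker.safedUnLock("key:1", "token-A");                 // A's late release is a no-op
        return !locker.lock("key:1", "token-C", 10, TimeUnit.SECONDS); // C is still shut out
    }

    public static void main(String[] args) {
        System.out.println("B's lock survives A's release: " + staleOwnerCannotRelease());
    }
}
```

Keeping the store behind an interface is a design choice made here for testability. The safe release still does not stop A from running past its TTL; it only prevents A from removing a lock it no longer owns.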
    public SeckillActivityRequestVO seckillHandle(SeckillActivityRequestVO request) {
        SeckillActivityRequestVO response = null;
        String key = "key:" + request.getSeckillId();
        String val = UUID.randomUUID().toString(); // unique token identifying this lock holder
        try {
            Boolean lockFlag = distributedLocker.lock(key, val, 10, TimeUnit.SECONDS);
            if (!lockFlag) {
                // business exception (must throw here so the code below never runs without the lock)
            }
            // user & activity validation omitted for brevity
            // decrement first, then check the value Redis returns for this call
            Long currStock = stringRedisTemplate.opsForHash().increment(key + ":info", "stock", -1);
            if (currStock < 0) {
                log.error("[Seckill] No stock");
                // business exception
            } else {
                // generate order, publish event, build response
            }
        } finally {
            // deletes the key only if it still holds val
            distributedLocker.safedUnLock(key, val);
        }
        return response;
    }
Deep Reflections
Is a Distributed Lock Necessary?
Even with Redis's atomic stock decrement, a lock can reduce pressure on downstream services by short‑circuiting requests that would otherwise perform full business logic.
Lock Selection
RedLock offers higher reliability at the cost of performance; for this scenario, the simpler lock with safe release is more cost‑effective.
Further Optimizations
By sharding stock across cluster nodes and routing requests based on user‑ID hashing, the system can avoid Redis entirely, using in‑memory structures like ConcurrentHashMap and AtomicInteger for ultra‑low latency.
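The routing and stock split described here might look like the sketch below. The class and method names, the node count, and the use of `Long.hashCode` are assumptions made for illustration; the article does not specify how stock is divided or which hash is used.

```java
final class StockSharding {
    /** Picks the node that serves a user; floorMod keeps the index in [0, nodeCount). */
    static int nodeFor(long userId, int nodeCount) {
        return Math.floorMod(Long.hashCode(userId), nodeCount);
    }

    /** Splits total stock as evenly as possible; the first (total % nodes) nodes get one extra unit. */
    static int[] splitStock(int totalStock, int nodeCount) {
        int[] shares = new int[nodeCount];
        for (int i = 0; i < nodeCount; i++) {
            shares[i] = totalStock / nodeCount + (i < totalStock % nodeCount ? 1 : 0);
        }
        return shares;
    }

    public static void main(String[] args) {
        int[] shares = splitStock(100, 3);
        // prints "34 33 33"
        System.out.println(shares[0] + " " + shares[1] + " " + shares[2]);
        System.out.println("user 42 -> node " + nodeFor(42L, 3));
    }
}
```

Because a given user always lands on the same node, that node can decide from its local counter alone. One trade‑off to keep in mind is that a shard can sell out while others still hold stock.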
    // Example of in‑memory stock handling
    private static final ConcurrentHashMap<Long, Boolean> SECKILL_FLAG_MAP = new ConcurrentHashMap<>();
    private static final Map<Long, AtomicInteger> SECKILL_STOCK_MAP = new HashMap<>();

    public SeckillActivityRequestVO seckillHandle(SeckillActivityRequestVO request) {
        SeckillActivityRequestVO response = null;
        Long seckillId = request.getSeckillId();
        // fast path: reject immediately once the activity is flagged as sold out
        if (!SECKILL_FLAG_MAP.get(seckillId)) {
            // business exception
        }
        // atomic decrement; a negative result means this node's stock is exhausted
        if (SECKILL_STOCK_MAP.get(seckillId).decrementAndGet() < 0) {
            SECKILL_FLAG_MAP.put(seckillId, false);
            // business exception
        }
        // generate order, publish event, build response
        return response;
    }
Conclusion
Overselling scarce items can cause severe business and reputational damage. This case demonstrates that even seemingly correct lock code can become a fatal flaw under high concurrency, and that careful design, atomic operations, and proper fault tolerance are essential for reliable systems.
Continuous learning and thorough architectural review are the only ways to avoid such hidden pitfalls.
Java Captain