Why Your Redis Distributed Lock May Fail and How to Fix It
This article examines common failures of Redis-based distributed locks, explains the limitations of simple implementations, introduces the Redlock algorithm, and offers practical mitigations for high concurrency, node failures, TTL overruns, and system clock drift.
Introduction
Redis‑based distributed locks are common, but they can fail. This article shares practical experience and pitfalls.
Do You Really Need a Distributed Lock?
Locks are used when multiple processes access the same resource, typically to improve efficiency or guarantee correctness.
Efficiency – avoid duplicate work; occasional failures are acceptable.
Correctness – failures are not acceptable.
Before adding a lock, consider whether the problem can be solved without one.
Simple Redis Lock Implementation
A classic implementation uses SET with NX and EX options and a Lua script for unlocking.
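// Assumes java.util.Collections is imported and `jedis` is a shared Jedis client
// (the string-flag overload of set() below comes from Jedis 2.x).
public static boolean tryLock(String key, String uniqueId, int seconds) {
    // SET key value NX EX seconds: succeeds only if the key does not exist yet
    return "OK".equals(jedis.set(key, uniqueId, "NX", "EX", seconds));
}

public static boolean releaseLock(String key, String uniqueId) {
    // Delete the key only if it still holds our unique ID; Lua makes the check-and-delete atomic
    String luaScript = "if redis.call('get', KEYS[1]) == ARGV[1] then " +
            "return redis.call('del', KEYS[1]) else return 0 end";
    return jedis.eval(luaScript,
            Collections.singletonList(key),
            Collections.singletonList(uniqueId)).equals(1L);
}
Is It Reliable?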
The simple lock has two main problems:
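Single-point failure – replication is asynchronous, so if the master node crashes after the lock is set but before the key reaches a replica, the promoted replica does not know about the lock and other clients may acquire it.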
Lock expiration – if the task runs longer than the TTL, the lock expires and another client may start the same work.
Redlock Algorithm
Redlock mitigates the single‑point issue by requiring locks on a majority of independent Redis masters.
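Get the current time.
Try to acquire the lock on all N nodes with the same key and unique value, using a short per-node timeout so an unreachable node does not stall the attempt.
Compute the time spent acquiring. Consider the lock acquired only if at least N/2 + 1 nodes granted it and the remaining validity (TTL minus the time already spent) is still positive.
If acquisition fails, or once the work is done, release the lock on all nodes, including those that did not report success.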
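The steps above can be sketched roughly as follows. This is a minimal illustration, not a production client: `LockNode`, `InMemoryNode`, and `RedlockSketch` are hypothetical names, the in-memory node stands in for a real Redis master and ignores expiry, and the `driftMillis` allowance subtracted from the validity is an added safety margin assumed here.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical abstraction over one independent Redis master. */
interface LockNode {
    boolean tryLock(String key, String token, long ttlMillis); // e.g. SET key token NX PX ttl
    void unlock(String key, String token);                      // compare-and-delete
}

/** In-memory stand-in so the sketch runs without Redis; key expiry is not modeled. */
class InMemoryNode implements LockNode {
    private final Map<String, String> store = new ConcurrentHashMap<>();
    private final boolean available;

    InMemoryNode(boolean available) { this.available = available; }

    @Override
    public boolean tryLock(String key, String token, long ttlMillis) {
        return available && store.putIfAbsent(key, token) == null;
    }

    @Override
    public void unlock(String key, String token) {
        store.remove(key, token); // removes the key only if the token matches
    }
}

class RedlockSketch {
    /** Majority of n nodes: N/2 + 1 with integer division. */
    static int quorum(int n) { return n / 2 + 1; }

    /** Returns the remaining validity in milliseconds, or -1 if the lock was not acquired. */
    static long tryAcquire(List<LockNode> nodes, String key, String token,
                           long ttlMillis, long driftMillis) {
        long start = System.nanoTime();                 // step 1: note the start time
        int granted = 0;
        for (LockNode node : nodes) {                   // step 2: try every node
            if (node.tryLock(key, token, ttlMillis)) granted++;
        }
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        long validity = ttlMillis - elapsedMillis - driftMillis;
        if (granted >= quorum(nodes.size()) && validity > 0) {
            return validity;                            // step 3: majority and time left
        }
        release(nodes, key, token);                     // step 4: undo partial acquisition
        return -1;
    }

    static void release(List<LockNode> nodes, String key, String token) {
        for (LockNode node : nodes) node.unlock(key, token);
    }
}
```

The caller should treat the returned validity, not the original TTL, as the time budget for the protected work.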
Common Pitfalls
High‑Concurrency Issues
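Performance can suffer if the client contacts the masters one after another, because the acquisition latency is then the sum of all round trips; sending the requests in parallel or asynchronously reduces it to roughly the slowest single round trip. A coarse lock that covers a large resource also limits concurrency; splitting the resource into segments, each with its own lock, or sharding can help.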
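As a rough illustration of the parallel approach, the sketch below fires all per-node attempts at once and counts the grants. `ParallelAcquire` and the `Supplier<Boolean>` attempts are hypothetical placeholders for real per-node lock calls; a real client would also need a per-node timeout, which is left out here.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.function.Supplier;
import java.util.stream.Collectors;

class ParallelAcquire {
    /** Runs every per-node attempt concurrently and counts how many succeeded. */
    static int countGranted(List<Supplier<Boolean>> attempts, ExecutorService pool) {
        // Start all attempts first so they overlap instead of running one by one.
        List<CompletableFuture<Boolean>> futures = attempts.stream()
                .map(attempt -> CompletableFuture.supplyAsync(attempt, pool)
                        .exceptionally(error -> false)) // a failing node counts as "not granted"
                .collect(Collectors.toList());

        int granted = 0;
        for (CompletableFuture<Boolean> future : futures) {
            if (future.join()) granted++; // wait for each result
        }
        return granted;
    }
}
```

The resulting count can then be compared against the N/2 + 1 majority exactly as in the sequential case.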
Retry Storms
Clients may repeatedly collide when retrying; adding a random back‑off mitigates the problem.
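A minimal sketch of a randomized retry loop is shown below. `RetryWithJitter` and its parameters are hypothetical, and the exponentially growing ceiling is an added design choice assumed here; the key point is only that each client sleeps for a random time so retries stop lining up.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.BooleanSupplier;

class RetryWithJitter {
    /** Random delay, uniform in [0, min(capMillis, baseMillis * 2^attempt)]. Assumes positive inputs. */
    static long backoffMillis(int attempt, long baseMillis, long capMillis) {
        long exponential = baseMillis << Math.min(attempt, 20); // bounded shift avoids overflow
        long ceiling = Math.min(capMillis, exponential);
        return ThreadLocalRandom.current().nextLong(ceiling + 1);
    }

    /** Calls tryLock up to maxAttempts times, sleeping a random delay between failures. */
    static boolean acquireWithRetry(BooleanSupplier tryLock, int maxAttempts,
                                    long baseMillis, long capMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (tryLock.getAsBoolean()) return true;
            if (attempt == maxAttempts - 1) break;      // no point sleeping after the last try
            try {
                Thread.sleep(backoffMillis(attempt, baseMillis, capMillis));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();     // preserve the interrupt and give up
                return false;
            }
        }
        return false;
    }
}
```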
Node Failures
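If a master crashes after a client has acquired a majority and then restarts without the lock key (for example because the key was never persisted), another client may also reach a majority, breaking safety. Mitigations include enabling persistence, keeping a crashed node out of service until every lock it might have held has expired (delayed reintegration), or increasing the number of masters.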
Task Longer Than TTL
Network delays or GC pauses can cause tasks to exceed the lock’s TTL, leading to duplicate execution. A watchdog that automatically renews the lock (as in Redisson) is a common remedy.
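The watchdog idea can be sketched as a background task that extends the TTL while the owner is still working. This is an illustration only, not Redisson's implementation: `LockWatchdog` is a hypothetical class, the `renew` callback is assumed to run something like `RENEW_SCRIPT` against Redis and report whether the lock is still owned, and renewing every third of the TTL is an assumed interval.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

class LockWatchdog implements AutoCloseable {
    // Lua for the renewal: extend the TTL only if the key still holds our unique ID.
    static final String RENEW_SCRIPT =
            "if redis.call('get', KEYS[1]) == ARGV[1] then " +
            "return redis.call('pexpire', KEYS[1], ARGV[2]) else return 0 end";

    private final BooleanSupplier renew;
    private final ScheduledExecutorService scheduler;
    private final ScheduledFuture<?> task;
    private volatile boolean lockLost;

    LockWatchdog(BooleanSupplier renew, long ttlMillis) {
        this.renew = renew;
        this.scheduler = Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread thread = new Thread(runnable, "lock-watchdog");
            thread.setDaemon(true); // never keep the JVM alive just for renewals
            return thread;
        });
        long period = Math.max(1, ttlMillis / 3); // assumed interval: a third of the TTL
        this.task = scheduler.scheduleAtFixedRate(this::tick, period, period, TimeUnit.MILLISECONDS);
    }

    /** One renewal attempt; once ownership is lost, no further renewals are tried. */
    void tick() {
        if (lockLost) return;
        try {
            if (!renew.getAsBoolean()) lockLost = true;
        } catch (RuntimeException e) {
            lockLost = true; // conservative: ownership can no longer be confirmed
        }
    }

    /** The protected task should check this and stop if the lock is gone. */
    boolean isLockLost() { return lockLost; }

    @Override
    public void close() {
        task.cancel(false);
        scheduler.shutdownNow();
    }
}
```

A watchdog only helps while the process is running: if the whole process is paused (for example by a long GC), the renewals pause with it, so the task should still check `isLockLost()` before committing results.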
System Clock Drift
Redis uses wall‑clock time for expirations; clock adjustments or drift can cause premature expiration. Using monotonic clocks or avoiding reliance on absolute time can reduce risk.
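On the client side, one way to avoid depending on absolute time is to track the lock's remaining validity with `System.nanoTime()`, which is meant for measuring elapsed time and is not affected by wall-clock adjustments. The sketch below uses a hypothetical `LockDeadline` class; it only protects the client's own bookkeeping and does not change how the Redis server expires keys.

```java
import java.util.concurrent.TimeUnit;

class LockDeadline {
    private final long acquiredAtNanos;
    private final long validityNanos;

    /** Starts the validity window now, using the monotonic clock. */
    LockDeadline(long validityMillis) {
        this(validityMillis, System.nanoTime());
    }

    /** Variant with an explicit start reading, useful for tests. */
    LockDeadline(long validityMillis, long acquiredAtNanos) {
        this.acquiredAtNanos = acquiredAtNanos;
        this.validityNanos = TimeUnit.MILLISECONDS.toNanos(validityMillis);
    }

    /** Compares by difference, which stays correct even if nanoTime wraps around. */
    boolean isStillValid(long nowNanos) {
        return nowNanos - acquiredAtNanos < validityNanos;
    }

    boolean isStillValid() {
        return isStillValid(System.nanoTime());
    }
}
```

Using `System.currentTimeMillis()` for the same check would make the result jump whenever the system clock is adjusted.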
Conclusion
The article walks from a basic Redis lock to the Redlock algorithm, enumerates real‑world pitfalls, and offers practical mitigation strategies.
Source: https://www.toutiao.com/i6802512308175110668
