Is Redis Distributed Lock Really Safe? A Deep Dive into Redlock, Pitfalls, and Alternatives
This article thoroughly examines the safety of Redis‑based distributed locks, explains basic SETNX locking, explores deadlock and lock‑release problems, presents robust solutions such as atomic SET with expiration, Lua scripts, and unique tokens, and critically compares Redlock with Zookeeper while summarizing expert debates and best‑practice recommendations.
Why a Distributed Lock Is Needed
When multiple processes need to modify a shared resource—such as a row in MySQL—across different machines, a single‑process mutex is insufficient, so a distributed lock provided by an external system like Redis or Zookeeper is required.
Basic Redis Lock Using SETNX
Two clients attempt to acquire a lock with SETNX lock 1. The first client succeeds, the second fails. The holder performs its critical section and releases the lock with DEL lock.
127.0.0.1:6379> SETNX lock 1
(integer) 1 // client 1 acquires lock
127.0.0.1:6379> SETNX lock 1
(integer) 0 // client 2 fails
After the work is done, the lock is released:
127.0.0.1:6379> DEL lock // release lock
(integer) 1
Problems with the Simple Approach
If the client crashes or the process exits before releasing the lock, the lock remains held (deadlock).
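Even once a timeout is added, a client that holds the lock longer than that timeout loses it while still working, so another client can acquire the lock and both run the critical section at the same time.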
Adding an Expiration Time
Set an expiration when acquiring the lock:
127.0.0.1:6379> SETNX lock 1
(integer) 1
127.0.0.1:6379> EXPIRE lock 10 // auto‑expire after 10 s
(integer) 1
This mitigates deadlock but introduces a new gap: SETNX and EXPIRE are two separate commands, so if the client crashes or loses its connection between them, the expiration is never set and the lock is held forever.
Atomic Lock with SET Options (Redis 2.6.12+)
127.0.0.1:6379> SET lock 1 EX 10 NX
OK
The single command guarantees that the key is created only if it does not exist and that the expiration is set atomically.
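From application code, the same atomic acquire might look like the sketch below. It assumes a client object exposing a redis-py style `set(name, value, nx=..., ex=...)` method. `FakeRedis`, `try_acquire`, and the controllable clock are hypothetical stand-ins, not part of the original article, included only so the snippet runs without a server.

```python
import time


class FakeRedis:
    """Tiny in-memory stand-in (hypothetical) modelling only SET with NX/EX."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at or None)

    def set(self, name, value, nx=False, ex=None):
        now = self._clock()
        entry = self._data.get(name)
        if entry is not None and entry[1] is not None and entry[1] <= now:
            entry = None  # the key has expired
        if nx and entry is not None:
            return None  # redis-py returns None when NX prevents the write
        expires_at = now + ex if ex is not None else None
        self._data[name] = (value, expires_at)
        return True


def try_acquire(client, key, ttl_seconds):
    """One round trip: create the key only if absent, with its expiry attached."""
    return client.set(key, "1", nx=True, ex=ttl_seconds) is True


now = [0.0]  # controllable clock for the demo
r = FakeRedis(clock=lambda: now[0])
first = try_acquire(r, "lock", 10)   # lock is free
second = try_acquire(r, "lock", 10)  # lock is held
now[0] = 11.0                        # the 10 s TTL has elapsed
third = try_acquire(r, "lock", 10)   # lock is free again
print(first, second, third)  # True False True
```

Because creation and expiry travel in one command, there is no window in which the key exists without a TTL.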
Ensuring the Correct Owner Releases the Lock
Store a unique identifier (e.g., a UUID) as the lock value:
# lock value is a UUID
127.0.0.1:6379> SET lock $uuid EX 20 NX
OK
When releasing, verify ownership first. A plain GET followed by DEL is not atomic (the lock can expire and be re-acquired by another client between the two commands), so a Lua script is used:
if redis.call("GET", KEYS[1]) == ARGV[1] then
    return redis.call("DEL", KEYS[1])
else
    return 0
end
Redlock – Multi‑Instance Locking
Redlock is designed to run against several independent Redis master instances, with five as the recommended number. The client performs the following steps:
1. Record the current timestamp T1.
2. Send a SET key value EX ttl NX request to each instance in turn, using a short per-request timeout; if an instance fails or times out, skip it and move on to the next.
3. If the lock was acquired on a majority of instances (at least 3 of 5), record timestamp T2 and check that T2 - T1 < ttl. If the check fails, release the lock on all instances.
4. Proceed with the critical section.
5. If any step fails, release the lock on all instances.
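The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not a production Redlock client: `InMemoryNode` is a hypothetical stand-in for one Redis master, its `set_nx` and `release` methods are invented for this example, and TTL expiry and the clock-drift allowance of the full algorithm are left out.

```python
import time
import uuid


class InMemoryNode:
    """Hypothetical stand-in for one Redis master; up=False simulates an outage."""

    def __init__(self, up=True):
        self.up = up
        self.store = {}  # key -> token (TTL expiry is omitted in this toy node)

    def set_nx(self, key, token, ttl_ms):
        if not self.up:
            raise ConnectionError("node unreachable")
        if key in self.store:
            return False
        self.store[key] = token
        return True

    def release(self, key, token):
        # Compare-and-delete, the job the Lua script does on a real server.
        if self.up and self.store.get(key) == token:
            del self.store[key]


def redlock_acquire(nodes, key, ttl_ms, clock=time.monotonic):
    """Return (token, remaining_validity_ms) on success, else None."""
    token = uuid.uuid4().hex
    quorum = len(nodes) // 2 + 1
    t1 = clock()  # step 1: record T1
    acquired = 0
    for node in nodes:  # step 2: try every instance, skipping failed ones
        try:
            if node.set_nx(key, token, ttl_ms):
                acquired += 1
        except ConnectionError:
            continue
    elapsed_ms = (clock() - t1) * 1000  # step 3: T2 - T1
    if acquired >= quorum and elapsed_ms < ttl_ms:
        return token, ttl_ms - elapsed_ms
    for node in nodes:  # step 5: undo any partial acquisition everywhere
        node.release(key, token)
    return None


nodes = [InMemoryNode(), InMemoryNode(), InMemoryNode(up=False),
         InMemoryNode(), InMemoryNode(up=False)]
result = redlock_acquire(nodes, "lock", ttl_ms=10_000)  # 3 of 5 reachable
nodes[0].up = False  # a third outage leaves only 2 of 5 reachable
other = redlock_acquire(nodes, "lock2", ttl_ms=10_000)
print(result is not None, other)  # True None
```

Returning the remaining validity time alongside the token lets the caller see how much of the TTL was already spent acquiring the lock.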
Criticism of Redlock (Martin’s View)
Distributed systems suffer from Network Delay, Process Pause (GC), and Clock Drift (NPC). These can cause the lock to expire while a client is still working, leading to two clients believing they hold the lock.
Redlock assumes synchronized clocks across nodes, which is unrealistic in practice.
Without a fencing token, Redlock cannot guarantee correctness; a monotonic token (as proposed by Martin) would be needed.
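The fencing-token idea can be illustrated with a minimal sketch. `LockService` and `FencedStorage` are hypothetical names invented here; the only assumptions are that every lock grant carries a strictly increasing number and that the protected resource rejects any write whose token is lower than the highest it has already seen.

```python
import itertools


class LockService:
    """Hypothetical lock service handing out a strictly increasing token per grant."""

    def __init__(self):
        self._counter = itertools.count(1)

    def grant(self):
        return next(self._counter)


class FencedStorage:
    """Hypothetical resource that remembers the highest fencing token seen."""

    def __init__(self):
        self.highest_token = 0
        self.value = None

    def write(self, token, value):
        if token < self.highest_token:
            return False  # stale holder: a newer grant has already written
        self.highest_token = token
        self.value = value
        return True


locks = LockService()
storage = FencedStorage()
token_a = locks.grant()  # client A gets token 1, then stalls (e.g. a long GC pause)
token_b = locks.grant()  # A's lease expires; client B gets token 2
b_ok = storage.write(token_b, "written by B")
a_ok = storage.write(token_a, "written by A")  # A wakes up with a stale token
print(b_ok, a_ok, storage.value)  # True False written by B
```

The check lives in the resource itself, so it still holds when a client wrongly believes it owns the lock.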
Antirez’s Rebuttal
Clock synchronization only needs to be approximate; small drift is acceptable.
Step 3 of Redlock detects excessive latency (including GC) before the lock is considered successful, preventing the described conflict.
Issues that occur after a client has confirmed the lock (e.g., long GC) affect any lock service, not just Redlock.
Zookeeper‑Based Lock
Clients create an ephemeral node (e.g., /lock). The first client to create the node acquires the lock; the node is automatically removed if the client’s session expires (e.g., due to missed heartbeats).
However, if a client suffers a long GC pause, its heartbeats stop, Zookeeper expires the session and deletes the node, and another client can acquire the lock while the first still believes it holds it. This is the same safety gap seen with Redis.
Practical Recommendations
Use simple Redis SET NX EX with a unique value and a Lua script for safe release when high performance is needed and the clock can be trusted.
Prefer Redlock only when you can guarantee reasonably synchronized clocks and can afford the overhead of multiple instances.
For strict correctness, combine a distributed lock with a fencing token or rely on a consensus system such as Zookeeper or etcd that provides stronger guarantees.
Always design the underlying resource operations (e.g., database updates) to be idempotent or to verify ownership, reducing the impact of occasional lock failures.
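The first recommendation (SET NX EX with a unique value, plus a Lua script for release) might be wired together as in the sketch below. It assumes a client with redis-py style `set(name, value, nx=..., ex=...)` and `eval(script, numkeys, *keys_and_args)` methods. `FakeRedis`, `acquire`, and `release` are hypothetical helpers, and the fake `eval` merely emulates the compare-and-delete script so the example runs without a server.

```python
import uuid

# Compare-and-delete script, as shown earlier in the article.
RELEASE_SCRIPT = """
if redis.call("GET", KEYS[1]) == ARGV[1] then
    return redis.call("DEL", KEYS[1])
else
    return 0
end
"""


class FakeRedis:
    """Hypothetical in-memory stand-in; eval only emulates RELEASE_SCRIPT."""

    def __init__(self):
        self._data = {}

    def set(self, name, value, nx=False, ex=None):
        if nx and name in self._data:
            return None
        self._data[name] = value  # TTL handling is omitted in this toy client
        return True

    def eval(self, script, numkeys, *keys_and_args):
        key, token = keys_and_args[0], keys_and_args[1]
        if self._data.get(key) == token:
            del self._data[key]
            return 1
        return 0


def acquire(client, key, ttl_seconds):
    """Return a fresh owner token if the lock was taken, else None."""
    token = uuid.uuid4().hex
    return token if client.set(key, token, nx=True, ex=ttl_seconds) else None


def release(client, key, token):
    """Delete the key only if it still holds our token."""
    return client.eval(RELEASE_SCRIPT, 1, key, token) == 1


r = FakeRedis()
mine = acquire(r, "lock", 20)
theirs = acquire(r, "lock", 20)             # None: lock already held
wrong = release(r, "lock", "not-my-token")  # False: ownership check fails
right = release(r, "lock", mine)            # True: the owner releases
print(theirs, wrong, right)  # None False True
```

Keeping the token in the caller's hands means a client whose lock has already expired cannot delete a lock that now belongs to someone else.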
Conclusion
Redis distributed locks are convenient but have edge-case safety issues. Redlock mitigates many problems but still depends on clock accuracy and cannot fully replace a true consensus-based lock. Zookeeper avoids TTL-based expiration, but a client can still lose its lock when its session times out after missed heartbeats. Understanding these trade-offs helps you choose the right locking strategy for your backend system.
ITPUB