Is Redis Distributed Lock Really Safe? A Deep Dive into Redlock, Pitfalls, and Alternatives

This article examines how safe Redis‑based distributed locks really are. It starts from basic SETNX locking, works through the deadlock and lock‑release problems, and presents the usual fixes: atomic SET with expiration, unique tokens, and Lua scripts. It then compares Redlock with Zookeeper, summarizes the expert debate around Redlock, and closes with best‑practice recommendations.

ITPUB

Jun 29, 2021

Why a Distributed Lock Is Needed

When multiple processes need to modify a shared resource—such as a row in MySQL—across different machines, a single‑process mutex is insufficient, so a distributed lock provided by an external system like Redis or Zookeeper is required.

Basic Redis Lock Using SETNX

Two clients attempt to acquire a lock with SETNX lock 1. The first client succeeds, the second fails. The holder performs its critical section and releases the lock with DEL lock.

127.0.0.1:6379> SETNX lock 1
(integer) 1   // client 1 acquires lock

127.0.0.1:6379> SETNX lock 1
(integer) 0   // client 2 fails

After the work is done, the lock is released:

127.0.0.1:6379> DEL lock   // release lock
(integer) 1

Problems with the Simple Approach

If the client crashes or the process exits before releasing the lock, the lock remains held (deadlock).

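Giving the lock a timeout removes the deadlock but creates a second problem: if the client works longer than the timeout, the lock expires and another client can acquire it while the first is still inside its critical section.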

Adding an Expiration Time

Set an expiration when acquiring the lock:

127.0.0.1:6379> SETNX lock 1
(integer) 1
127.0.0.1:6379> EXPIRE lock 10   // auto‑expire after 10 s
(integer) 1

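This mitigates deadlock, but SETNX and EXPIRE are two separate commands and the pair is not atomic. If the client crashes, Redis restarts, or the connection drops after SETNX succeeds but before EXPIRE runs, the key never expires and the deadlock is back.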

Atomic Lock with SET Options (Redis 2.6.12+)

127.0.0.1:6379> SET lock 1 EX 10 NX
OK

The single command guarantees that the key is created only if it does not exist and that the expiration is set atomically.
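A minimal sketch of the same command issued from application code, in Python. It assumes `client` behaves like a redis-py client, where `set(name, value, nx=True, ex=ttl)` sends `SET name value EX ttl NX` and returns a truthy value only when the key was created. The names `try_lock` and `lock_with_retry` and the retry parameters are illustrative and do not come from the original article.

```python
import time


def try_lock(client, name, ttl_seconds, value="1"):
    """Single attempt: True if the key was created, False if it already exists."""
    # One round trip sets the key and its expiration together,
    # so there is no window in which the key exists without a TTL.
    return bool(client.set(name, value, nx=True, ex=ttl_seconds))


def lock_with_retry(client, name, ttl_seconds, attempts=3, delay=0.05):
    """Retry a few times with a fixed pause (illustrative policy only)."""
    for _ in range(attempts):
        if try_lock(client, name, ttl_seconds):
            return True
        time.sleep(delay)
    return False
```

Whether to retry at all, and how long to wait between attempts, depends on the workload; the fixed delay here is only a placeholder.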

Ensuring the Correct Owner Releases the Lock

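With an expiration in place, a client whose lock has already expired can still run DEL and remove a lock that now belongs to another client. To prevent this, store a unique identifier (e.g., a UUID) as the lock value: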

# lock value is a UUID
127.0.0.1:6379> SET lock $uuid EX 20 NX
OK

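When releasing, verify ownership first. A separate GET followed by DEL is not atomic (the lock can expire and change hands between the two commands), so the check and the delete run together in a Lua script: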

if redis.call("GET", KEYS[1]) == ARGV[1] then
    return redis.call("DEL", KEYS[1])
else
    return 0
end
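The token and the script can be tied together in client code roughly as follows. This is a sketch under the assumption that `client` exposes redis-py style `set(name, value, nx=True, ex=...)` and `eval(script, numkeys, *keys_and_args)`; the helper names `acquire` and `release` are hypothetical.

```python
import uuid

# Same compare-and-delete script as above: delete only if the value is ours.
RELEASE_SCRIPT = """
if redis.call("GET", KEYS[1]) == ARGV[1] then
    return redis.call("DEL", KEYS[1])
else
    return 0
end
"""


def acquire(client, name, ttl_seconds):
    """Return a fresh owner token if the lock was taken, otherwise None."""
    token = uuid.uuid4().hex
    if client.set(name, token, nx=True, ex=ttl_seconds):
        return token
    return None


def release(client, name, token):
    """Delete the lock only if it still holds our token; True if it was deleted."""
    # numkeys=1: `name` becomes KEYS[1], `token` becomes ARGV[1].
    return client.eval(RELEASE_SCRIPT, 1, name, token) == 1
```

A `False` result from `release` means the lock had already expired or been taken over, which is worth logging because the critical section may have overlapped with another client.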

Redlock – Multi‑Instance Locking

Redlock runs against several independent Redis master instances with no replication between them; five is the recommended number. The client performs the following steps:

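1. Record the current timestamp T1.

2. Send SET key value EX ttl NX to each instance in turn, using the same unique value everywhere and a per‑request timeout much shorter than ttl. If an instance fails or times out, skip it and move on to the next.

3. Record timestamp T2. The lock is acquired only if a majority of instances (at least 3 of 5) accepted the SET and the elapsed time T2 - T1 is less than ttl. The lock is then valid for the remaining ttl - (T2 - T1).

4. If the lock was acquired, proceed with the critical section.

5. If it was not acquired, release the lock on all instances, including those where the SET appeared to fail.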
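The steps above can be sketched in Python as follows. This is an illustration, not a production Redlock client: it assumes each element of `clients` offers redis-py style `set` and `eval` methods, it leaves out the clock-drift allowance and retry logic found in real implementations, and the names `redlock_acquire` and `redlock_release` are hypothetical.

```python
import time
import uuid

RELEASE_SCRIPT = """
if redis.call("GET", KEYS[1]) == ARGV[1] then
    return redis.call("DEL", KEYS[1])
else
    return 0
end
"""


def redlock_acquire(clients, name, ttl_seconds, clock=time.monotonic):
    """Return (token, remaining_validity_seconds) on success, otherwise None."""
    token = uuid.uuid4().hex
    quorum = len(clients) // 2 + 1
    start = clock()                                   # step 1: T1
    acquired = 0
    for client in clients:                            # step 2: try every instance
        try:
            if client.set(name, token, nx=True, ex=ttl_seconds):
                acquired += 1
        except Exception:
            # An unreachable or slow instance counts as a failed attempt.
            continue
    elapsed = clock() - start                         # step 3: T2 - T1
    validity = ttl_seconds - elapsed
    if acquired >= quorum and validity > 0:
        return token, validity                        # step 4: lock is held
    redlock_release(clients, name, token)             # step 5: undo partial locks
    return None


def redlock_release(clients, name, token):
    """Run the compare-and-delete script on every instance, ignoring errors."""
    for client in clients:
        try:
            client.eval(RELEASE_SCRIPT, 1, name, token)
        except Exception:
            continue
```

The `clock` parameter exists only so the elapsed-time check can be exercised in tests; the caller is expected to finish its work within the returned validity window.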

Criticism of Redlock (Martin’s View)

Martin Kleppmann's critique starts from three hazards that any distributed system must tolerate, abbreviated NPC: Network delay, Process pause (for example a GC pause), and Clock drift. Any of them can cause the lock to expire while a client is still working, so that two clients both believe they hold the lock.

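Redlock's safety depends on timing assumptions: the clocks of the Redis nodes must advance at roughly the same rate and must not jump. In practice a clock step on a single node (for example after an NTP correction or a manual change) can expire its key early and let a second client reach a majority.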

Without a fencing token, Redlock cannot guarantee correctness; a monotonic token (as proposed by Martin) would be needed.
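The fencing idea can be illustrated with a toy resource that remembers the highest token it has seen and rejects anything older. The sketch assumes the lock service hands out a strictly increasing integer with every grant; the class name `FencedStore` and the token values are made up for the example.

```python
class FencedStore:
    """Toy shared resource that refuses writes carrying a stale fencing token."""

    def __init__(self):
        self.highest_token = 0
        self.value = None

    def write(self, token, value):
        """Apply the write and return True unless a newer token was already seen."""
        if token < self.highest_token:
            return False          # writer lost the lock to a later holder
        self.highest_token = token
        self.value = value
        return True


store = FencedStore()
# Client 1 got token 33 and then paused; client 2 got token 34 and writes first.
accepted_new = store.write(34, "from client 2")   # True: newest token so far
accepted_old = store.write(33, "from client 1")   # False: 33 < 34, write rejected
```

The check has to live in the resource itself (the database or storage service), because the paused client has no way of knowing that its lock is gone.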

Antirez’s Rebuttal

Clock synchronization only needs to be approximate; small drift is acceptable.

Step 3 of Redlock detects excessive latency (including GC) before the lock is considered successful, preventing the described conflict.

Issues that occur after a client has confirmed the lock (e.g., long GC) affect any lock service, not just Redlock.

Zookeeper‑Based Lock

Clients create an ephemeral node (e.g., /lock). The first client to create the node acquires the lock; the node is automatically removed if the client’s session expires (e.g., due to missed heartbeats).

However, if a client experiences a long GC pause, its heartbeat stops, Zookeeper deletes the node, and another client may acquire the lock while the first client still believes it holds it—mirroring the same safety gap as Redis.

Practical Recommendations

Use simple Redis SET NX EX with a unique value and a Lua script for safe release when high performance is needed and the clock can be trusted.

Prefer Redlock only when you can guarantee reasonably synchronized clocks and can afford the overhead of multiple instances.

For strict correctness, combine a distributed lock with a fencing token or rely on a consensus system such as Zookeeper or etcd that provides stronger guarantees.

Always design the underlying resource operations (e.g., database updates) to be idempotent or to verify ownership, reducing the impact of occasional lock failures.
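As one way to make the protected operation idempotent, the sketch below records a request ID in the same transaction as the update, so a retry or a second lock holder cannot apply the change twice. It uses Python's built-in sqlite3 as a stand-in for the MySQL example mentioned earlier; the table names, column names, and the function `apply_once` are all hypothetical.

```python
import sqlite3


def init_schema(conn):
    """Create the two illustrative tables."""
    conn.execute(
        "CREATE TABLE accounts(id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.execute("CREATE TABLE processed(request_id TEXT PRIMARY KEY)")
    conn.commit()


def apply_once(conn, request_id, account_id, amount):
    """Apply a balance change at most once per request_id; True if applied."""
    with conn:  # one transaction: the marker and the update commit together
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed(request_id) VALUES (?)",
            (request_id,),
        )
        if cur.rowcount == 0:
            return False  # this request was already applied earlier
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?",
            (amount, account_id),
        )
        return True
```

With this in place, an occasional double execution caused by an expired lock degrades into a no-op instead of corrupting the data.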

Conclusion

Redis distributed locks are convenient but have edge‑case safety issues. Redlock mitigates many problems but still depends on clock accuracy and cannot fully replace a true consensus‑based lock. Zookeeper needs no lock expiration time, but a paused client can lose its session, and with it the lock, without noticing. Understanding these trade‑offs helps you choose the right locking strategy for your backend system.
