Is Redis Distributed Lock Safe? Deep Dive into Redlock and Zookeeper Pitfalls

This article thoroughly explains why distributed locks are needed, walks through basic Redis lock implementations, exposes deadlock and expiration issues, presents robust solutions with unique IDs and Lua scripts, examines the Redlock algorithm, reviews the Martin‑Antirez debate, and compares Redis with Zookeeper locks.

dbaplus Community

Jul 25, 2021

Why a Distributed Lock?

When multiple processes need to modify a shared resource—such as a MySQL row in a micro‑service architecture—simple in‑process mutexes are insufficient, so an external system that can provide mutual exclusion across processes is required. Redis or Zookeeper are common choices because they can be accessed by all processes.

Basic Redis Lock Using SETNX

The simplest lock uses the SETNX command, which sets a key only if it does not already exist.

127.0.0.1:6379> SETNX lock 1
(integer) 1   // client 1 acquires the lock

If another client tries the same command it fails:

127.0.0.1:6379> SETNX lock 1
(integer) 0   // client 2 fails to acquire the lock

After the critical section the lock is released with DEL:

127.0.0.1:6379> DEL lock // release lock
(integer) 1

Problems with the Simple Approach

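Deadlock: if the client crashes or forgets to release the lock, the key is never deleted and no other client can acquire it.

Adding an expiration with EXPIRE guards against this, but a TTL that is too short releases the lock before the holder has finished.

The two‑step SETNX + EXPIRE sequence is not atomic; if the client fails between the two commands, the lock is left without an expiration.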
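The non‑atomic failure can be reproduced without a server. The following is a minimal sketch, assuming a hypothetical in‑memory class FakeRedis that imitates only SETNX, EXPIRE, and TTL with a simulated clock; it is not a real client library.

```python
class FakeRedis:
    """Hypothetical in-memory stand-in for one Redis node (not a real client)."""

    def __init__(self):
        self.now = 0.0      # simulated clock, in seconds
        self._data = {}     # key -> (value, expires_at or None)

    def _live(self, key):
        entry = self._data.get(key)
        if entry and entry[1] is not None and entry[1] <= self.now:
            del self._data[key]     # expired keys behave as missing
            return None
        return entry

    def setnx(self, key, value):
        if self._live(key) is not None:
            return 0
        self._data[key] = (value, None)
        return 1

    def expire(self, key, seconds):
        entry = self._live(key)
        if entry is None:
            return 0
        self._data[key] = (entry[0], self.now + seconds)
        return 1

    def ttl(self, key):
        entry = self._live(key)
        if entry is None:
            return -2       # key does not exist
        if entry[1] is None:
            return -1       # key exists but has no expiration
        return int(entry[1] - self.now)


def acquire_two_step(r, key, ttl, crash_between=False):
    """SETNX followed by EXPIRE: two commands, so not atomic."""
    if r.setnx(key, "1") == 0:
        return False
    if crash_between:
        raise RuntimeError("client died before EXPIRE was sent")
    r.expire(key, ttl)
    return True


r = FakeRedis()
try:
    acquire_two_step(r, "lock", ttl=10, crash_between=True)
except RuntimeError:
    pass

r.now += 3600                                   # an hour later
stuck_ttl = r.ttl("lock")                       # the key still has no TTL
blocked = acquire_two_step(r, "lock", ttl=10) is False
```

In this simulation the key survives indefinitely with no expiration, so every later acquisition attempt is refused until someone deletes the key by hand.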

Adding an Expiration (Atomic with SET)

Redis 2.6.12 introduced the extended SET syntax that combines setting the value, expiration, and the NX flag in a single atomic operation:

127.0.0.1:6379> SET lock 1 EX 10 NX
OK

This eliminates the race between SETNX and EXPIRE, but the TTL still has to be estimated: if the critical section runs longer than the expiration, the lock is released while its holder is still working.
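The remaining expiration problem can be illustrated with a short simulation. This sketch assumes a hypothetical FakeRedis class whose set method imitates SET with the EX and NX options against a simulated clock; the 10‑second TTL and 12‑second workload are made‑up numbers.

```python
class FakeRedis:
    """Hypothetical in-memory stand-in for one Redis node (not a real client)."""

    def __init__(self):
        self.now = 0.0      # simulated clock, in seconds
        self._data = {}     # key -> (value, expires_at or None)

    def set(self, key, value, ex=None, nx=False):
        """Imitates SET key value [EX seconds] [NX]; returns True or None."""
        entry = self._data.get(key)
        alive = entry is not None and (entry[1] is None or entry[1] > self.now)
        if nx and alive:
            return None
        self._data[key] = (value, None if ex is None else self.now + ex)
        return True


r = FakeRedis()
a_got = r.set("lock", "1", ex=10, nx=True)      # client A acquires, TTL 10 s
b_early = r.set("lock", "1", ex=10, nx=True)    # client B at t=0: refused

r.now = 12                                      # A's work takes 12 s, longer than the TTL
b_late = r.set("lock", "1", ex=10, nx=True)     # B acquires while A is still running
```

At t=12 both clients believe they hold the lock, which is exactly the mis‑estimation case: the acquisition is atomic, but mutual exclusion only holds while the work fits inside the TTL.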

Ensuring the Lock Is Owned by the Client

Store a unique identifier (e.g., a UUID) as the lock value:

// lock value is a UUID
127.0.0.1:6379> SET lock $uuid EX 20 NX
OK

When releasing, verify ownership first:

if redis.get("lock") == $uuid:
    redis.del("lock")

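The check and the delete are two separate commands, so they are not atomic: if the lock expires after the GET and another client acquires it, the DEL removes that client's lock. The two steps must run as a single atomic operation, which a Lua script provides: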

if redis.call("GET", KEYS[1]) == ARGV[1] then
    return redis.call("DEL", KEYS[1])
else
    return 0
end
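The effect of the ownership check can be sketched in Python. This is an illustration only: FakeRedis is a hypothetical in‑memory class, and its compare_and_delete method stands in for the server executing the Lua script as one atomic step.

```python
import uuid


class FakeRedis:
    """Hypothetical in-memory stand-in for a single Redis node."""

    def __init__(self):
        self.now = 0.0      # simulated clock, in seconds
        self._data = {}     # key -> (value, expires_at)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None or entry[1] <= self.now:
            self._data.pop(key, None)   # expired keys behave as missing
            return None
        return entry[0]

    def set_nx_ex(self, key, value, ttl):
        """Like SET key value EX ttl NX."""
        if self.get(key) is not None:
            return False
        self._data[key] = (value, self.now + ttl)
        return True

    def delete(self, key):
        return 1 if self._data.pop(key, None) is not None else 0

    def compare_and_delete(self, key, expected):
        """Plays the role of the Lua script: GET and DEL as one atomic step."""
        if self.get(key) == expected:
            return self.delete(key)
        return 0


r = FakeRedis()
token_a, token_b = str(uuid.uuid4()), str(uuid.uuid4())

r.set_nx_ex("lock", token_a, ttl=20)    # client A acquires
r.now = 25                              # A overruns its 20 s TTL
r.set_nx_ex("lock", token_b, ttl=20)    # client B acquires the expired lock

a_release = r.compare_and_delete("lock", token_a)   # A finishes late
owner_after_a = r.get("lock")                       # B's lock is untouched
b_release = r.compare_and_delete("lock", token_b)   # B releases its own lock
```

Client A's late release is a no‑op because the stored value no longer matches its UUID; with a plain DEL, A would have deleted B's lock.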

Redlock Algorithm (Multiple Redis Instances)

Redlock attempts to provide fault tolerance by acquiring locks on a majority of independent Redis masters (recommended at least five). The steps are:

Record start timestamp T1.

Send SET key value EX ttl NX to each instance, with a short network timeout.

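If locks are obtained on a majority of instances (≥3 of 5), record timestamp T2 and verify that the elapsed time T2‑T1 is less than ttl; the lock then remains valid for roughly ttl − (T2‑T1). If either check fails, release the lock on all instances.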

Perform the critical operation.

If acquisition failed, release any partial locks.

The algorithm relies on majority consensus and on the total acquisition time being less than the lock’s TTL.
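The steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not a production Redlock client: FakeNode is a hypothetical in‑memory stand‑in for a Redis master, TTL expiry on the nodes and any clock‑drift allowance are omitted, and the clock is injected so that the timing check can be exercised deterministically.

```python
import uuid


class FakeNode:
    """Hypothetical stand-in for one independent Redis master."""

    def __init__(self, up=True):
        self.up = up
        self.store = {}     # key -> value (TTL bookkeeping omitted for brevity)

    def set_nx(self, key, value):
        if not self.up:
            raise ConnectionError("node unreachable")
        if key in self.store:
            return False
        self.store[key] = value
        return True

    def release(self, key, value):
        # Delete only if this client's value is stored (compare-and-delete).
        if self.up and self.store.get(key) == value:
            del self.store[key]


def redlock_acquire(nodes, key, ttl_ms, clock):
    """Return (token, remaining_validity_ms), or (None, 0) on failure."""
    token = str(uuid.uuid4())
    t1 = clock()
    acquired = 0
    for node in nodes:
        try:
            if node.set_nx(key, token):
                acquired += 1
        except ConnectionError:
            pass                        # skip unreachable nodes quickly
    elapsed = clock() - t1              # T2 - T1
    quorum = len(nodes) // 2 + 1
    if acquired >= quorum and elapsed < ttl_ms:
        return token, ttl_ms - elapsed
    for node in nodes:                  # failed: undo any partial locks
        node.release(key, token)
    return None, 0


def fake_clock(*readings):
    """Clock that returns the given millisecond readings in order."""
    it = iter(readings)
    return lambda: next(it)


# Two of five nodes down: 3 locks meet the quorum, acquired in 40 ms.
nodes = [FakeNode(), FakeNode(), FakeNode(), FakeNode(up=False), FakeNode(up=False)]
ok_token, ok_validity = redlock_acquire(nodes, "lock", 10_000, fake_clock(0, 40))

# Three of five nodes down: only 2 locks, below quorum, so roll back.
minority = [FakeNode(), FakeNode(), FakeNode(up=False), FakeNode(up=False), FakeNode(up=False)]
fail_token, _ = redlock_acquire(minority, "lock", 10_000, fake_clock(0, 40))

# All nodes up, but acquisition took longer than the TTL.
slow = [FakeNode() for _ in range(5)]
slow_token, _ = redlock_acquire(slow, "lock", 10_000, fake_clock(0, 12_000))
```

The third case shows why the elapsed‑time check matters: all five nodes granted the lock, yet the client must treat it as failed because the lock may already have expired on some nodes by the time the last reply arrived.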

Debate: Is Redlock Really Safe?

Distributed‑systems researcher Martin Kleppmann (University of Cambridge) argued that:

A lock serves one of two purposes, efficiency or correctness. For efficiency, a single‑node Redis is enough and Redlock adds unnecessary complexity; for correctness, Redlock is not safe enough.

Redlock cannot guarantee safety because it assumes synchronized clocks (the “C” in NPC: Network delay, Process pause, Clock drift) and can fail under network delays, process pauses such as GC, or clock drift.

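He proposed a “fencing token” approach: the lock service issues a monotonically increasing token with each acquisition, the client sends the token with every write, and the resource itself rejects any operation whose token is older than the newest one it has seen.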
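The fencing idea can be sketched in a few lines. Both classes here are hypothetical illustrations: LockService hands out increasing tokens, and Storage is a resource that remembers the highest token it has accepted.

```python
class LockService:
    """Hypothetical lock service that issues a new, larger token per acquisition."""

    def __init__(self):
        self._counter = 0

    def acquire(self):
        self._counter += 1
        return self._counter


class Storage:
    """Hypothetical resource that enforces fencing on every write."""

    def __init__(self):
        self.highest_token = 0
        self.value = None

    def write(self, token, value):
        if token < self.highest_token:
            return False            # stale lock holder: reject the write
        self.highest_token = token
        self.value = value
        return True


locks, storage = LockService(), Storage()

token_1 = locks.acquire()           # client 1 gets the lock, then stalls (e.g. GC)
token_2 = locks.acquire()           # lock expired; client 2 gets a newer token

write_2 = storage.write(token_2, "from client 2")
write_1 = storage.write(token_1, "from client 1")   # client 1 wakes up late
```

The late write from client 1 is refused by the resource, so the data stays consistent even though client 1 still believes it holds the lock.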

Redis creator Antirez responded that:

Only coarse clock synchronization is needed; small drift is acceptable.

The third step of Redlock (checking T2‑T1) detects excessive delays before the lock is considered valid.

After a lock is acquired, any NPC (network delay, GC) that occurs during the critical section is a problem for any lock service, not just Redlock.

He questioned the practicality of fencing tokens, noting that many resources (e.g., plain HTTP calls) cannot enforce them.

Zookeeper Locks

Zookeeper implements locks via temporary znodes. A client creates an EPHEMERAL node; if it succeeds it holds the lock, otherwise it watches the node and retries. The session is kept alive by periodic heartbeats. If the client crashes or its heartbeat stops, the node is automatically removed.

However, Zookeeper suffers from a similar failure mode: a long GC pause can stop heartbeats for longer than the session timeout, so the server expires the session and deletes the node, releasing the lock while the client still believes it holds it.
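The session‑expiry scenario can be modeled with a small simulation. This sketch assumes a hypothetical FakeZk class that tracks only heartbeats and ephemeral nodes against a simulated clock; it does not use a real Zookeeper client, and the 10‑second session timeout is a made‑up value.

```python
class FakeZk:
    """Hypothetical in-memory model of ephemeral nodes and session expiry."""

    def __init__(self, session_timeout):
        self.now = 0.0
        self.session_timeout = session_timeout
        self.last_heartbeat = {}    # session id -> time of last heartbeat
        self.ephemeral = {}         # path -> owning session id

    def heartbeat(self, session):
        self.last_heartbeat[session] = self.now

    def _expire_sessions(self):
        dead = {s for s, t in self.last_heartbeat.items()
                if self.now - t > self.session_timeout}
        for s in dead:
            del self.last_heartbeat[s]
        # Ephemeral nodes disappear together with their session.
        self.ephemeral = {p: s for p, s in self.ephemeral.items() if s not in dead}

    def create_ephemeral(self, path, session):
        self._expire_sessions()
        if path in self.ephemeral:
            return False
        self.ephemeral[path] = session
        return True


zk = FakeZk(session_timeout=10)
zk.heartbeat("A")
zk.heartbeat("B")

a_holds = zk.create_ephemeral("/lock", "A")         # A takes the lock at t=0
b_first_try = zk.create_ephemeral("/lock", "B")     # B is refused

# A enters a long GC pause and sends no heartbeats; B keeps heartbeating.
for t in (5, 10, 15):
    zk.now = t
    zk.heartbeat("B")

b_second_try = zk.create_ephemeral("/lock", "B")
# a_holds is still True on the client side, but the server disagrees.
server_owner = zk.ephemeral.get("/lock")
```

After the pause, client A's local view (a_holds) and the server's view (server_owner) diverge, which is the same class of problem as a Redis lock expiring during a stalled critical section.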

Comparison and Practical Guidance

Redis (single node): simple, fast, but vulnerable to deadlock and expiration mis‑estimation.

Redis with SET+EX: atomic acquisition, still needs careful TTL sizing.

Redlock: provides fault tolerance across multiple masters, but requires clock accuracy and majority availability.

Zookeeper: no explicit TTL, uses session heartbeats, but performance is lower and operational cost higher.

In extreme cases no distributed lock can guarantee 100% safety; additional application‑level safeguards (e.g., fencing tokens or idempotent operations) are recommended for critical data.

Author’s Takeaways

Prefer simple Redis locks with unique IDs and Lua‑based release for most use‑cases.

Use Redlock only when you can ensure synchronized clocks and need high availability across multiple nodes.

For absolute correctness, combine a lock with a fencing‑token‑like mechanism at the resource level.

Overall, understanding the failure scenarios—network delays, process pauses, clock drift, and master failover—is essential when choosing a distributed‑lock strategy.
