Why Redlock May Not Be the Ultimate Distributed Lock (And What to Use Instead)

This article reviews the evolution of distributed locking—from simple MySQL table locks to Redis cache locks and the Redlock algorithm—examines expert criticisms of Redlock’s correctness, presents the Redis author’s rebuttal, and ultimately recommends Zookeeper as a more reliable solution for high‑availability distributed locks.

ITFLY8 Architecture Home

Nov 24, 2016

Origin

Recently I read the Redis author’s article "Is Redlock safe?", which responds to a distributed-systems expert’s critique titled "How to do distributed locking". The two articles debate the correctness of Redlock, and this post analyzes their arguments.

Database Lock Table

My first experience with a distributed lock was using a MySQL table. The table schema was:

CREATE TABLE `lockedOrder` (
  `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `type` tinyint(8) unsigned NOT NULL DEFAULT '0' COMMENT 'operation type',
  `order_id` varchar(64) NOT NULL DEFAULT '' COMMENT 'locked order id',
  `memo` varchar(1024) NOT NULL DEFAULT '',
  `update_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'record time',
  PRIMARY KEY (`id`),
  UNIQUE KEY `uidx_order_id` (`order_id`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='locked orders';

The UNIQUE KEY on order_id guarantees that only one transaction can insert a given order ID, turning the database into a simple lock manager. Pseudo‑code for lock and unlock:

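def lock(type, order_id, memo):
    # The insert fails with a duplicate-key error if the order is already locked
    exec sql: insert into lockedOrder(type, order_id, memo) values (:type, :order_id, :memo)
    return result == true

def unlock(order_id):
    # Bind the actual order id, not the literal string 'order_id'
    exec sql: delete from lockedOrder where order_id = :order_id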

Issues with this approach:

It is a non-blocking try-lock; a blocking lock has to be emulated by polling, that is, by retrying the insert until it succeeds.

No expiration – a crashed service leaves the lock forever unless a cleanup job deletes stale rows.

Not re‑entrant – the same client cannot acquire the lock again without additional logic.
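The try-lock and the polling workaround from the first issue can be sketched as follows. This is a minimal illustration, not the original service code: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL, a reduced table with only the unique order_id column, and the hypothetical helper names make_db, try_lock, unlock and lock_blocking.

```python
import sqlite3
import time

def make_db():
    # In-memory SQLite stands in for MySQL here (assumption for the demo).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE lockedOrder ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " order_id TEXT NOT NULL UNIQUE)"
    )
    return conn

def try_lock(conn, order_id):
    """Non-blocking: True if the row was inserted, False if already locked."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO lockedOrder (order_id) VALUES (?)", (order_id,)
            )
        return True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejected a second row for the same order.
        return False

def unlock(conn, order_id):
    with conn:
        conn.execute("DELETE FROM lockedOrder WHERE order_id = ?", (order_id,))

def lock_blocking(conn, order_id, timeout_s=1.0, poll_s=0.05):
    """Emulate a blocking lock by retrying the insert until a deadline."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if try_lock(conn, order_id):
            return True
        time.sleep(poll_s)
    return False
```

The sketch still has the other two weaknesses listed above: a row left by a crashed holder never expires, and the same client cannot re-acquire its own lock.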

Cache Lock

Using a cache service such as Redis for locks offers far higher performance (up to 100 k operations per second with sub-millisecond latency). The classic Redis lock is built on the SETNX command, which sets a key only if it does not already exist.

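Since Redis 2.6.12, the SET command accepts the NX option together with an expiration (EX in seconds, PX in milliseconds), so the key is created and given a TTL in one atomic step. A lock left behind by a crashed client then expires on its own, eliminating the need for a separate cleanup task.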
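A single-node lock built on SET with NX and PX might look like the sketch below. It assumes that client is a redis-py redis.Redis instance (only its set and eval methods are used); the function names acquire and release are hypothetical. The random token and the compare-and-delete script are not described in this article; they follow the pattern from the Redis documentation and prevent a client from deleting a lock that has already expired and been taken by someone else.

```python
import uuid

# Delete the key only if it still holds our token (compare-and-delete).
UNLOCK_SCRIPT = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end
"""

def acquire(client, key, ttl_ms):
    """Try once to take the lock; return a token on success, else None."""
    token = uuid.uuid4().hex
    # SET key token NX PX ttl_ms: create only if absent, with a TTL, atomically.
    if client.set(key, token, nx=True, px=ttl_ms):
        return token
    return None

def release(client, key, token):
    """Release the lock only if we still own it; return True if deleted."""
    return client.eval(UNLOCK_SCRIPT, 1, key, token) == 1
```

A caller would keep the returned token for as long as it holds the lock and pass it back to release when done.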

The drawback is that if the Redis node crashes, the lock disappears. Replication helps but is asynchronous, so a master failure before replication can still cause lock loss.

Distributed Cache Lock – Redlock

To mitigate single‑node failure, the Redis author proposed the Redlock algorithm, which runs on N independent Redis nodes (commonly N=5). The algorithm works as follows:

Client records the current time in milliseconds.

Client attempts to acquire the same key/value on all N nodes, setting a short network timeout (e.g., 5‑50 ms) much smaller than the lock’s TTL (e.g., 10 s).

If the client obtains the lock on a majority of nodes (at least N/2 + 1, i.e. three of five) and the total acquisition time is less than the lock’s TTL, the lock is considered acquired.

The effective lock TTL is the original TTL minus the time spent acquiring the lock.

If acquisition fails, the client releases any partial locks.

With N=5, Redlock can tolerate up to two node failures while still providing a functional lock, because a majority of three nodes remains reachable.
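The steps above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not a production Redlock client: nodes is a list of redis-py-style clients exposing set and eval, each assumed to be configured with a short socket timeout; the names redlock_acquire and redlock_release are hypothetical; and the clock-drift allowance from the Redlock specification is left out. A monotonic clock is used to measure elapsed time instead of wall-clock milliseconds.

```python
import time
import uuid

# Delete the key only if it still holds our token.
UNLOCK_SCRIPT = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end
"""

def redlock_release(nodes, key, token):
    """Best-effort release on every node, including ones we may not hold."""
    for node in nodes:
        try:
            node.eval(UNLOCK_SCRIPT, 1, key, token)
        except Exception:
            pass  # an unreachable node will drop the key when its TTL expires

def redlock_acquire(nodes, key, ttl_ms):
    """Return (token, validity_ms) if a majority was locked in time, else None."""
    token = uuid.uuid4().hex
    quorum = len(nodes) // 2 + 1          # 3 when N = 5
    start = time.monotonic()              # step 1: record the start time
    acquired = 0
    for node in nodes:                    # step 2: same key/value on all nodes
        try:
            if node.set(key, token, nx=True, px=ttl_ms):
                acquired += 1
        except Exception:
            pass  # a failed or timed-out node simply does not count
    elapsed_ms = (time.monotonic() - start) * 1000
    validity_ms = ttl_ms - elapsed_ms     # step 4: effective TTL
    if acquired >= quorum and validity_ms > 0:   # step 3: majority, in time
        return token, validity_ms
    redlock_release(nodes, key, token)    # step 5: undo partial locks
    return None
```

The caller should finish its work within validity_ms, not within the original TTL, since part of the TTL was already spent acquiring the lock.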

Expert Criticism of Redlock

The expert argued that a correct distributed lock must consider both performance and correctness. He highlighted two major problems:

Garbage-collection pauses (or other long pauses) can stall a client for longer than the lock’s TTL. The lock then expires and is granted to another client while the first is still processing, so both work on the same resource concurrently.

Redlock relies on each node’s local clock to expire keys; if a node’s clock drifts or jumps forward, a key can expire early and two clients can both believe they hold the lock.

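He illustrated the clock-drift scenario with diagrams (omitted here) and suggested issuing a monotonically increasing token with every lock grant (a fencing token, similar in spirit to an MVCC version number). The protected resource checks the token on each write and rejects any request carrying a token older than one it has already seen, which exposes stale lock holders.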
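The token check can be illustrated with a small in-memory sketch. The class name FencedStore and its write method are hypothetical; in a real system the check would live inside the storage service that the lock protects, and the tokens would be issued by the lock service.

```python
class FencedStore:
    """Toy resource that rejects writes carrying a stale lock token."""

    def __init__(self):
        self.highest_token = -1   # largest token seen so far
        self.value = None

    def write(self, token, value):
        # A token lower than one already seen belongs to an earlier lock
        # holder whose lock has since been granted to someone else.
        if token < self.highest_token:
            return False
        self.highest_token = token
        self.value = value
        return True


store = FencedStore()
store.write(33, "from client 1")         # accepted
store.write(34, "from client 2")         # accepted, client 2 now holds the lock
late = store.write(33, "client 1 after a long GC pause")  # rejected
```

Here client 1 resumes after its lock has expired, but its write is refused because the store has already seen token 34.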

Redis Author’s Rebuttal

The Redis author opened his reply with: "I asked for an analysis in the original Redlock specification here: http://redis.io/topics/distlock. So thank you Martin. However I don’t agree with the analysis."

The author summarized his response in five points:

Distributed locks are a last‑resort mechanism; if you can use a token, you may not need a lock.

Generating a reliable token itself may require a lock.

Instead of an incrementing token, a UUID can serve as a unique identifier.

Ordered tokens do not solve the GC‑induced timeout problem.

In most cases, locks are used for non‑transactional updates where token‑based solutions are impractical.

He also explained that the client’s effective lock time is the TTL minus the acquisition time, so a client that spends too long acquiring the lock will simply fail to obtain it.

Further Analysis of Redlock

Both sides agree that Redlock improves reliability compared to a single Redis instance, but it incurs higher latency (multiple network round-trips), needs several independent Redis nodes (five in the usual setup), and suffers under network partitions or multiple node failures.

To achieve stronger correctness guarantees, the author suggests using a strongly consistent coordination service such as Zookeeper.

Better Distributed Lock – Zookeeper

Zookeeper implements a Paxos-like consensus protocol (ZAB, the ZooKeeper Atomic Broadcast protocol). Write requests go to a leader, which replicates them to a quorum of followers before acknowledging success.

Key features that make Zookeeper suitable for locks:

Watcher mechanism enables true blocking locks: a client watches the lock node and is notified when it is released.

Ephemeral nodes are automatically removed when the client session expires, providing automatic lock release without an explicit TTL.

Typical lock acquisition: clients attempt to create a node (e.g., /path/lock). The first succeeds and holds the lock; others fail and set a watch on the node. When the lock node is deleted, the watch triggers and the next client can try again.
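The acquisition flow just described can be modeled without a running Zookeeper ensemble. The sketch below is a single-threaded, in-memory simulation of the two features involved (ephemeral nodes and one-shot watches); MiniCoordinator and try_acquire are hypothetical names and not part of any Zookeeper client API. A real client also has to handle the node disappearing between the failed create and the watch registration.

```python
class MiniCoordinator:
    """In-memory stand-in for ephemeral nodes and one-shot watches."""

    def __init__(self):
        self.nodes = {}     # path -> id of the session that created it
        self.watches = {}   # path -> callbacks to fire once on deletion

    def create_ephemeral(self, path, session):
        if path in self.nodes:
            return False    # node exists: someone else holds the lock
        self.nodes[path] = session
        return True

    def watch(self, path, callback):
        self.watches.setdefault(path, []).append(callback)

    def delete(self, path):
        if self.nodes.pop(path, None) is not None:
            # Watches are one-shot: fire each callback once, then forget it.
            for callback in self.watches.pop(path, []):
                callback(path)

    def expire_session(self, session):
        # Ephemeral nodes vanish together with the session that owns them.
        owned = [p for p, s in self.nodes.items() if s == session]
        for path in owned:
            self.delete(path)


def try_acquire(coord, path, session, on_released):
    """Create the lock node, or register a watch if it is already taken."""
    if coord.create_ephemeral(path, session):
        return True
    coord.watch(path, on_released)
    return False
```

When the holder's session expires, the waiting client's callback fires and it can call try_acquire again, which is the retry step in the description above.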

For Java applications, the Curator library (specifically org.apache.curator.framework.recipes.locks) offers a high‑level API for Zookeeper locks.

Conclusion

This article reviewed several distributed‑lock implementations—database locks, Redis cache locks, Redlock, and Zookeeper. While Redlock addresses some reliability concerns of single‑node Redis, it still has correctness limitations and operational overhead. For scenarios demanding strong correctness, Zookeeper (or similar coordination services) provides a more robust solution.

References:

Distributed locks with Redis

Is Redlock safe?

How to do distributed locking

Follow‑along tutorial for Zookeeper locks

From Paxos to Zookeeper: Distributed Consistency Principles and Practice
