
Why Your Redis Distributed Lock May Fail and How to Fix It

This article examines common failure scenarios of Redis‑based distributed locks, compares a simple lock implementation with the Redlock algorithm, and provides practical solutions for single‑point failures, lock expiration issues, clock drift, and high‑concurrency pitfalls.

dbaplus Community

Apr 27, 2020


Redis‑based distributed locks are widely used, but they can fail in subtle ways. The article first asks whether a distributed lock is really needed, then outlines two typical use cases: improving efficiency by avoiding duplicate work and guaranteeing correctness where duplicate execution is unacceptable.

Simple Redis Lock Implementation

A basic lock uses SET key value NX EX seconds to acquire a lock and a Lua script to release it only if the stored unique ID matches:

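// jedis is assumed to be a connected Jedis client defined elsewhere.
public static boolean tryLock(String key, String uniqueId, int seconds) {
    // SET key uniqueId NX EX seconds: succeeds only if the key does not exist yet.
    return "OK".equals(jedis.set(key, uniqueId, "NX", "EX", seconds));
}

public static boolean releaseLock(String key, String uniqueId) {
    // Compare and delete in one atomic script, so a client never deletes another client's lock.
    String luaScript = "if redis.call('get', KEYS[1]) == ARGV[1] then " +
                       "return redis.call('del', KEYS[1]) else return 0 end";
    Object result = jedis.eval(luaScript, Collections.singletonList(key), Collections.singletonList(uniqueId));
    // Null-safe comparison: the script returns 1 only when the key was deleted.
    return Long.valueOf(1L).equals(result);
}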

The key points are using a unique identifier for each lock and setting an expiration to avoid permanent locks.
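To see why both points matter, the following is a minimal in-memory model of the two operations, not a Redis client. The class name InMemoryLockStore and the explicit nowMillis parameter are assumptions introduced here so the behavior can be stepped through deterministically.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a single Redis node; illustration only.
class InMemoryLockStore {
    private static final class Entry {
        final String owner;
        final long expiresAtMillis;

        Entry(String owner, long expiresAtMillis) {
            this.owner = owner;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    // Models SET key uniqueId NX EX seconds: succeeds only if no live entry exists.
    synchronized boolean tryLock(String key, String uniqueId, int seconds, long nowMillis) {
        Entry current = entries.get(key);
        if (current != null && current.expiresAtMillis > nowMillis) {
            return false; // another client still holds an unexpired lock
        }
        entries.put(key, new Entry(uniqueId, nowMillis + seconds * 1000L));
        return true;
    }

    // Models the Lua script: delete only if the stored ID matches the caller's ID.
    synchronized boolean releaseLock(String key, String uniqueId, long nowMillis) {
        Entry current = entries.get(key);
        if (current == null || current.expiresAtMillis <= nowMillis) {
            entries.remove(key);
            return false; // nothing live to release
        }
        if (!current.owner.equals(uniqueId)) {
            return false; // the lock now belongs to another client; leave it alone
        }
        entries.remove(key);
        return true;
    }
}
```

In this model, if client A locks at time 0 with a 5 s expiration and client B locks at 6 s, a late release by A returns false and leaves B's lock in place. Without the unique ID check, A's release would delete B's lock.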

Limitations of the Simple Lock

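Single‑point failure: Redis replication is asynchronous. If the master crashes after the lock is set but before the write reaches a replica, the promoted replica has no record of the lock, so a second client can acquire it while the first still believes it holds it.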

Expiration race: If the task exceeds the lock’s TTL (due to GC pauses, network latency, etc.), the lock expires while the client is still working, allowing another client to proceed and causing duplicate processing.
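The single‑point failure described above can be reproduced with a toy model of asynchronous replication. This is only a sketch: ToyReplicatedRedis, replicate(), and failover() are hypothetical names, and the only behavior modeled is that a replica sees a write only after the replication step has run.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy model of one master with one asynchronously updated replica.
class ToyReplicatedRedis {
    private Map<String, String> master = new HashMap<>();
    private Map<String, String> replica = new HashMap<>();

    // Models SET key value NX on the current master (expiry omitted for brevity).
    boolean setIfAbsent(String key, String value) {
        return master.putIfAbsent(key, value) == null;
    }

    // Models the replication stream catching up with the master.
    void replicate() {
        replica = new HashMap<>(master);
    }

    // Models a master crash followed by promotion of the replica.
    void failover() {
        master = replica;
        replica = new HashMap<>();
    }
}
```

Calling setIfAbsent("lock", "A"), then failover() without an intervening replicate(), lets setIfAbsent("lock", "B") succeed as well, so two clients believe they hold the same lock.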

Redlock Algorithm

Redlock mitigates the single‑point issue by requiring a majority of independent Redis masters (N > 2). The algorithm proceeds as follows:

1. Record the current time.

2. Attempt to acquire the lock on each of the N nodes, reducing the TTL requested on each node by the time already spent.

3. If the client holds the lock on at least N/2 + 1 nodes (integer division, i.e. a strict majority) and all remaining TTLs are positive, the lock is considered acquired; otherwise, release every lock that was acquired.

4. Release the lock on all nodes when done.

If the adjusted TTL becomes ≤ 0 at any step, the acquisition fails.
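The acquisition check in the steps above reduces to a small amount of arithmetic, sketched below. RedlockMath and its method names are hypothetical, and the driftAllowanceMillis parameter is an assumption borrowed from the commonly published Redlock description rather than from this article; pass 0 to follow the steps exactly as listed.

```java
// Hypothetical helper; names and the drift allowance are assumptions for illustration.
class RedlockMath {
    // Smallest number of nodes that forms a strict majority of n.
    static int quorum(int n) {
        return n / 2 + 1;
    }

    // Time the lock can still be relied on after acquisition took elapsedMillis.
    static long remainingValidityMillis(long ttlMillis, long elapsedMillis, long driftAllowanceMillis) {
        return ttlMillis - elapsedMillis - driftAllowanceMillis;
    }

    // The lock counts as acquired only with a majority and a positive remaining validity.
    static boolean acquired(int n, int nodesLocked, long ttlMillis,
                            long elapsedMillis, long driftAllowanceMillis) {
        return nodesLocked >= quorum(n)
                && remainingValidityMillis(ttlMillis, elapsedMillis, driftAllowanceMillis) > 0;
    }
}
```

With five masters the quorum is three, so a client that locked three nodes in 200 ms against a 10 s TTL has acquired the lock, while one that needed 9,950 ms has not once a 100 ms drift allowance is subtracted.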

Practical Pitfalls in High‑Concurrency Scenarios

Performance overhead: Acquiring locks sequentially on many masters adds latency. Issuing the requests in parallel reduces it, but the total acquisition time must still be well below the lock’s TTL, or little validity time remains for the actual work.

Resource granularity: Large locked resources reduce concurrency. Splitting resources (e.g., per‑merchant or bucketed processing) can improve throughput.

Retry storms: Simultaneous retries can cause many clients to contend for the same locks. Adding random jitter to retry intervals helps mitigate this.

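Node crashes: If a master fails after a client has acquired a majority, the lock is still considered held, but safety can break if that master comes back without its lock data. For example, with five masters, a client holds the lock on A, B, and C; C restarts empty, and a second client can then acquire C, D, and E to form its own majority. Adding more masters improves resilience at higher cost.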

Clock drift: Redis uses the system’s realtime clock for expirations. Large clock drift or manual time changes can cause premature expiration. Using monotonic clocks would be safer, but Redis currently relies on realtime.
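For the retry storms mentioned above, one common way to add random jitter is exponential backoff with a fully random delay. The sketch below is an assumption about how that could look, not the article's implementation; JitteredBackoff, baseMillis, and capMillis are hypothetical names.

```java
import java.util.Random;

// Hypothetical helper: exponential backoff with full jitter for lock retries.
class JitteredBackoff {
    private final long baseMillis;
    private final long capMillis;
    private final Random random;

    JitteredBackoff(long baseMillis, long capMillis, Random random) {
        this.baseMillis = baseMillis;
        this.capMillis = capMillis;
        this.random = random;
    }

    // Upper bound for the given 0-based attempt: base * 2^attempt, capped at capMillis.
    long ceilingMillis(int attempt) {
        int shift = Math.min(attempt, 20); // keeps the shift small for typical base values
        return Math.min(capMillis, baseMillis << shift);
    }

    // Random delay in [0, ceiling], so clients that lost the same race do not retry in lockstep.
    long nextDelayMillis(int attempt) {
        long ceiling = ceilingMillis(attempt);
        return (long) (random.nextDouble() * (ceiling + 1));
    }
}
```

A caller would sleep for nextDelayMillis(attempt) after each failed acquisition and give up after a bounded number of attempts. Passing the Random in makes the delays reproducible in tests.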

Renewal (Watchdog) Mechanisms

Redisson implements automatic renewal, commonly called a watchdog: the lock is acquired with a default TTL of 30 s, and a timer task extends it every 10 s for as long as the client is alive and still holds the lock. The renewal logic lives in scheduleExpirationRenewal, which repeatedly executes a Lua script that resets the TTL.
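The renewal rule can be modeled without timers or a Redis connection. The sketch below is not Redisson's code; WatchdogModel and its methods are hypothetical, and time is passed in explicitly so the sequence of events is deterministic. It assumes the renewal interval is one third of the TTL, which matches the 30 s and 10 s defaults.

```java
// Hypothetical model of watchdog-style renewal; not Redisson's actual code.
class WatchdogModel {
    private final long ttlMillis;
    private String owner;
    private long expiresAtMillis;

    WatchdogModel(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Renewal fires at one third of the TTL (10 s for a 30 s TTL).
    long renewIntervalMillis() {
        return ttlMillis / 3;
    }

    // Acquires the lock if it is free or the previous holder's TTL has run out.
    boolean acquire(String clientId, long nowMillis) {
        if (owner != null && expiresAtMillis > nowMillis) {
            return false;
        }
        owner = clientId;
        expiresAtMillis = nowMillis + ttlMillis;
        return true;
    }

    // Extends the TTL only if the caller still owns an unexpired lock.
    boolean renew(String clientId, long nowMillis) {
        if (owner == null || !owner.equals(clientId) || expiresAtMillis <= nowMillis) {
            return false;
        }
        expiresAtMillis = nowMillis + ttlMillis;
        return true;
    }
}
```

If client A renews at 10 s and then stalls, its lock expires at 40 s; a second client can acquire at 45 s, and A's next renewal attempt returns false.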

If renewal fails (e.g., due to GC pauses or network loss), multiple clients may hold the lock simultaneously. The article suggests using a fencing token (monotonically increasing per‑resource identifier) to reject stale writes, but notes drawbacks such as lack of atomicity and reduced concurrency.
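A fencing check lives in the protected resource, not in the lock. The following is a minimal sketch under the assumption that each lock acquisition hands out a strictly larger token than the previous one for the same resource; FencedResource and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical storage service that enforces fencing tokens per resource.
class FencedResource {
    private final Map<String, Long> highestToken = new HashMap<>();
    private final Map<String, String> values = new HashMap<>();

    // Accepts the write unless a newer token has already been seen for this resource.
    synchronized boolean write(String resource, long token, String value) {
        Long seen = highestToken.get(resource);
        if (seen != null && token < seen) {
            return false; // stale lock holder; reject the write
        }
        highestToken.put(resource, token);
        values.put(resource, value);
        return true;
    }

    synchronized String read(String resource) {
        return values.get(resource);
    }
}
```

If a client holding token 33 pauses, and a client with token 34 writes in the meantime, the late write carrying token 33 is rejected. The check and the write must happen atomically inside the resource, which is one reason the approach is harder to apply to storage that has no such hook.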

Summary

The piece walks from a basic Redis lock to the more robust Redlock algorithm, highlighting real‑world pitfalls—single‑point failures, lock expiration races, clock drift, and high‑concurrency contention—and offers concrete mitigation strategies like multi‑master quorum, lock granularity, retry jitter, watchdog renewal, and optional fencing tokens.
