Why Redis Distributed Locks Fail and How to Fix Leak and Mis‑release Issues

A Redis lock timeout caused users to be blocked for two hours, revealing problems with lock leakage and mis‑release; the article explains the root causes, proper use of SETNX with EXPIRE, and Lua scripts for atomic operations and safe lock deletion.

ITPUB

Background

In a production environment, a distributed lock implemented with Redis left users unable to place orders for up to two hours: the SETNX command had succeeded, but after a timeout the lock was never released. The incident led to severe customer complaints.

Distributed lock basics

Lock acquisition uses SETNX, which sets the key only if it does not already exist. It returns 1 when the key was set and 0 when the key already existed (many client libraries surface these as true and false), so only one client can acquire the lock at a time. Releasing the lock is a plain DEL key.

Lock‑leak problem

If a lock is created without an expiration and its holder crashes or hangs before calling DEL, the lock is never released. A timeout can be set with the EXPIRE command, but SETNX followed by EXPIRE is two separate commands, not one atomic operation: if the client fails between them, the key is left without a timeout. A Lua script can perform both steps atomically:

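-- KEYS[1]: lock key; ARGV[1]: lock value; ARGV[2]: timeout in seconds
if redis.call('SETNX', KEYS[1], ARGV[1]) < 1 then
    return 0  -- key already exists, lock not acquired
end
redis.call('EXPIRE', KEYS[1], tonumber(ARGV[2]))
return 1      -- lock acquired with a timeout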

Redis 2.6.12 and later support the combined syntax SET key value NX EX 10 (here 10 is the timeout in seconds), which sets the value and the expiration in one atomic command and eliminates the need for a custom script.
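The effect of setting the value and the timeout in one step can be traced with a small Python sketch. InMemoryStore is a hypothetical stand-in for a Redis client, not a real library class; its set method only imitates the nx and ex keyword arguments that redis-py's set accepts, and the key and worker names are made up for illustration.

```python
import time


class InMemoryStore:
    """Hypothetical stand-in for a Redis client (illustration only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # name -> (value, expires_at or None)

    def _purge(self, name):
        # Drop the key if its expiry time has passed.
        entry = self._data.get(name)
        if entry is not None and entry[1] is not None and self._clock() >= entry[1]:
            del self._data[name]

    def set(self, name, value, nx=False, ex=None):
        self._purge(name)
        if nx and name in self._data:
            return None  # key exists, nothing written
        expires_at = self._clock() + ex if ex is not None else None
        self._data[name] = (value, expires_at)
        return True


def acquire_lock(client, key, value, ttl_seconds):
    """Try once to take the lock; True on success, False if it is held."""
    # Value and expiry are written in a single call, so the key never
    # exists without a timeout.
    return bool(client.set(key, value, nx=True, ex=ttl_seconds))


now = [0.0]  # fake clock so the example does not need to sleep
store = InMemoryStore(clock=lambda: now[0])

first = acquire_lock(store, "lock:order", "worker-1", ttl_seconds=10)
second = acquire_lock(store, "lock:order", "worker-2", ttl_seconds=10)
now[0] = 11.0  # worker-1 crashed without releasing; the timeout has passed
third = acquire_lock(store, "lock:order", "worker-2", ttl_seconds=10)
print(first, second, third)  # True False True
```

Because the expiry travels with the write, a holder that crashes blocks other clients for at most the timeout rather than indefinitely.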

Lock‑mis‑release problem

When a client's lock expires before the client has finished its work, another client may acquire the same key. If the original client later calls DEL, it unintentionally removes the new client's lock. To avoid this, the lock value should contain an identifier unique to the holder (e.g., a UUID, or a combination such as IP plus thread ID; a bare timestamp is weaker because two clients can generate the same one). Deletion must verify that the stored value matches the client's identifier before deleting, which is a compare-and-delete (CAD) operation.

The required CAD logic can also be implemented with Lua:

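-- KEYS[1]: lock key; ARGV[1]: the caller's unique lock value
local v = redis.call('GET', KEYS[1])
if v == false then
    return 0   -- key does not exist
end
if v ~= ARGV[1] then
    return -1  -- lock is held by another client
end
if redis.call('DEL', KEYS[1]) == 1 then
    return 1   -- value matched, lock deleted
end
return -2      -- value matched but DEL removed nothing

Load the script with redis-cli SCRIPT LOAD "$(cat cad.lua)" and invoke it via EVALSHA <sha1> 1 KEY VALUE. The lock value is passed as an argument (ARGV[1]) rather than as a second key, because it is data to compare against and not a key name. The return codes are: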

1 – value matches, delete succeeded

0 – key does not exist

-1 – value does not match

-2 – value matches but delete failed (theoretically impossible after the prior check)
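The mis-release scenario, and how the value check prevents it, can be traced with a short Python simulation. This is a sketch under stated assumptions: a plain dict stands in for Redis, the client names are made up, and compare_and_delete only mirrors the decision logic and return codes listed above. It runs client-side and is therefore not atomic the way the Lua script is on a real server; the -2 case is left out because a dict deletion cannot fail.

```python
def unsafe_release(store, key):
    """Plain DEL: removes whatever is stored, regardless of owner."""
    return 1 if store.pop(key, None) is not None else 0


def compare_and_delete(store, key, token):
    """Same decision logic and return codes as the CAD script."""
    current = store.get(key)
    if current is None:
        return 0    # key does not exist
    if current != token:
        return -1   # held by another client; leave it alone
    del store[key]
    return 1        # our lock, deleted


def scenario(release):
    store = {}                         # dict standing in for Redis
    store["lock:order"] = "client-A"   # A acquires the lock
    del store["lock:order"]            # A's timeout expires while A is still working
    store["lock:order"] = "client-B"   # B acquires the now-free lock
    result = release(store)            # A finishes late and releases
    return result, store.get("lock:order")


unsafe = scenario(lambda s: unsafe_release(s, "lock:order"))
safe = scenario(lambda s: compare_and_delete(s, "lock:order", "client-A"))
print(unsafe)  # (1, None)
print(safe)    # (-1, 'client-B')
```

With the plain delete, client B's lock disappears; with the value check, client A's late release returns -1 and B's lock survives.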

Why the lock value matters

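Storing a unique value also lets a client work out where it stands after a timeout, for example when a request times out and the client cannot tell whether it holds the lock. The client queries the lock with GET. If the key exists, the client compares the stored value with its own identifier to determine whether it actually holds the lock or whether another client has taken it.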
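That check can be sketched as a small Python helper. The function name, the status strings, and the get callable are assumptions made for illustration; a dict's get method stands in for a Redis GET here. A real client may return bytes rather than str, in which case the comparison needs matching types.

```python
def lock_status(get, key, token):
    """Classify a lock from this client's point of view.

    `get` is any callable that returns the stored value or None,
    for example a wrapper around a Redis GET (assumed interface).
    """
    current = get(key)
    if current is None:
        return "free"           # no key: never set, released, or expired
    if current == token:
        return "held_by_me"     # our identifier is stored
    return "held_by_other"      # another client acquired the lock


store = {"lock:order": "client-B"}  # dict standing in for Redis
print(lock_status(store.get, "lock:order", "client-A"))    # held_by_other
print(lock_status(store.get, "lock:order", "client-B"))    # held_by_me
print(lock_status(store.get, "lock:missing", "client-A"))  # free
```

The result is only a snapshot: the lock can still expire or change hands right after the check, so it informs the decision to retry but does not replace the atomic compare-and-delete on release.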

Conclusion

Lock acquisition should set a unique value together with an expiration, and release should verify that value before deleting: the expiration prevents lock leaks, and the value check prevents mis-release. After a timeout, a client should GET the lock and verify the value before deciding to retry or consider the lock free. Performing each step atomically, with Lua scripts or the newer SET ... NX EX syntax, keeps Redis-based distributed locks reliable.

Written by

ITPUB
