How Redis Powers Distributed Locks: Design, Pitfalls, and Solutions

This article explores the design and implementation of a Redis‑based distributed lock used since 2013, analyzes lock acquisition and release mechanisms, discusses common pitfalls such as expiration handling and race conditions, and presents possible improvements like Lua scripts and Redisson.

Java Backend Technology

Preface

When you hear terms like data consistency and atomic operations, what comes to mind first? Most people think of "locks", and for good reason. In a monolithic application we can rely on the locks the JVM provides, but in a distributed system those locks only protect a single process, so we must use a resource shared across JVMs, such as Redis or ZooKeeper, to provide lock semantics.

Getting to the Point

I'll share the Redis‑based distributed lock our company has been using since 2013. It has served us well, but it also has design flaws that I will discuss. I hope readers will engage in a constructive exchange and remember that the best solution is the one that fits your own needs.

Lock Acquisition Analysis

Q1: Why does the lock check expiration in application code instead of using the SET key value [EX seconds|PX milliseconds] [NX|XX] command and letting the key expire automatically?

A1: When we first implemented the lock, the SET command did not support the NX and PX options, so we devised a manual expiration check. Those options were added in Redis 2.6.12.
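To make the difference concrete, here is a toy in-memory model of what SET key value NX PX milliseconds does on the server side. It is only a sketch of the semantics: the class SetNxPxModel and its method are hypothetical names, the clock is passed in as a parameter, and no Redis client is involved.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy in-memory model of `SET key value NX PX ms`; hypothetical, not a Redis client. */
class SetNxPxModel {
    private static final class Entry {
        final String value;
        final long expiresAtMillis;

        Entry(String value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> store = new HashMap<>();

    /** Returns true when the key was absent (or already expired) and is now set with a TTL. */
    synchronized boolean setIfAbsentWithTtl(String key, String value, long ttlMillis, long nowMillis) {
        Entry current = store.get(key);
        if (current != null && current.expiresAtMillis > nowMillis) {
            return false; // NX: the key exists and has not expired, so nothing is written
        }
        store.put(key, new Entry(value, nowMillis + ttlMillis)); // PX: expiry is attached in the same step
        return true;
    }
}
```

The point of the single command is that the existence check, the write, and the expiry happen in one step, so the application never has to compare timestamps itself.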

Q2: After confirming that the timestamp stored in the key has expired, why do we still call GETSET instead of simply overwriting with SET?

A2: This relates to concurrency. If we used SET directly, multiple clients could acquire the lock simultaneously. By using GETSET and checking the old value's expiration, we avoid that race condition. For example, if client C1 acquires the lock and crashes, its lock is not released. Clients C2 and C3 see the expired key; if they both use SET, both would acquire the lock, causing a serious problem. With GETSET, the faster client wins while the slower one sees the updated timestamp and fails to acquire the lock.
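The acquisition flow discussed in these two answers can be sketched as follows. This is an illustration under assumptions, not our production code: FakeRedis is a hypothetical in-memory stand-in for the SETNX, GET, and GETSET commands, TimestampLock is a made-up name, and the stored value is assumed to be the lock's expiry time in milliseconds.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory stand-in for the three Redis commands the lock needs. */
class FakeRedis {
    private final ConcurrentHashMap<String, String> data = new ConcurrentHashMap<>();

    boolean setnx(String key, String value) { return data.putIfAbsent(key, value) == null; }

    String get(String key) { return data.get(key); }

    /** Like GETSET: writes the new value and returns the previous one atomically. */
    String getSet(String key, String value) { return data.put(key, value); }
}

class TimestampLock {
    /** Tries once to take the lock; the stored value is the lock's expiry time in millis. */
    static boolean tryLock(FakeRedis redis, String key, long nowMillis, long ttlMillis) {
        String newExpiry = Long.toString(nowMillis + ttlMillis);
        if (redis.setnx(key, newExpiry)) {
            return true; // the key was free
        }
        String seen = redis.get(key);
        if (seen != null && Long.parseLong(seen) > nowMillis) {
            return false; // the lock is held and has not expired yet
        }
        // The key looks expired: swap in our expiry and inspect what was there before.
        String previous = redis.getSet(key, newExpiry);
        // We win only if the value we replaced was itself expired (or the key had vanished).
        return previous == null || Long.parseLong(previous) <= nowMillis;
    }
}
```

If C2 and C3 both read the expired timestamp, the first GETSET returns the expired value and wins, while the second GETSET returns the fresh timestamp written by the winner and fails. One side effect worth knowing: the loser has still overwritten the stored timestamp, so the effective expiry can drift slightly.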

Lock Release Analysis

Q1: Why do we need to check whether the key has expired before releasing the lock, instead of simply calling DEL for better performance?

A1: Consider this scenario: C1 acquires the lock and then gets blocked; meanwhile C2 acquires the lock after the key expires and succeeds via GETSET. When C1 finally wakes up, it attempts to release the lock with DEL, unintentionally deleting C2's lock. Then C3 acquires the lock and proceeds, causing errors. To avoid this, the release process must (1) check whether the stored value has already expired and (2) only delete the key if it has not expired. Even this two‑step approach can still be unsafe if a long gap occurs between the check and the delete.
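A minimal sketch of the two-step release, assuming the client remembers the expiry timestamp it wrote when it acquired the lock (one plausible reading of the check described above). A ConcurrentHashMap stands in for Redis, and the class and parameter names are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;

class TimestampLockRelease {
    /**
     * Deletes the key only while our own lease is still valid.
     * The check and the delete are two separate steps, so this is NOT atomic.
     */
    static boolean release(ConcurrentHashMap<String, String> redis, String key,
                           long myExpiryMillis, long nowMillis) {
        if (nowMillis >= myExpiryMillis) {
            return false; // step 1: our lease has lapsed, the key may belong to someone else now
        }
        return redis.remove(key) != null; // step 2: the equivalent of DEL
    }
}
```

In the scenario above, C1 wakes up after its expiry has passed, so the first check fails and C2's key survives. The remaining risk is a pause between the two steps, which the sketch does nothing to prevent.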

Is there a better solution? Using a Lua script can make the check‑and‑delete atomic.
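As a sketch of that idea, the snippet below holds a compare-and-delete Lua script as a Java string, in the form a client could send with EVAL, next to an in-memory analogue that shows the intended behavior. The assumptions are mine: the script compares the stored value with the value the client wrote, AtomicRelease is a hypothetical class, and a ConcurrentHashMap stands in for Redis.

```java
import java.util.concurrent.ConcurrentHashMap;

class AtomicRelease {
    /** Lua source for EVAL; KEYS[1] is the lock key, ARGV[1] is the value this client wrote. */
    static final String RELEASE_SCRIPT =
        "if redis.call('get', KEYS[1]) == ARGV[1] then "
      + "return redis.call('del', KEYS[1]) "
      + "else return 0 end";

    /** In-memory analogue: remove(key, value) deletes only if the value still matches, atomically. */
    static boolean release(ConcurrentHashMap<String, String> store, String key, String myValue) {
        return store.remove(key, myValue);
    }
}
```

Because Redis runs a script as a single unit, no other client can slip in between the comparison and the delete. If the stored value is only a timestamp, two clients could in principle write the same value, so a value that is unique per owner is the safer choice here.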

Facing the Shortcomings

Q1: How to renew a Redis lock when its expiration time is shorter than the business operation?

A1: This feature is not yet implemented in our lock. The Redisson library provides such functionality, and we are evaluating its adoption.
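For illustration only, one way such a renewal could look is a background task that extends the lease while the holder still owns the key. This is not how Redisson implements it and not something our lock does today. LeaseRenewer is a hypothetical name, a ConcurrentHashMap stands in for Redis, and against a real server the compare-and-replace step would itself have to be made atomic, for example with a script.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class LeaseRenewer {
    /** Extends the lease only if the key still holds the value we wrote; returns the new value or null. */
    static String renewOnce(ConcurrentHashMap<String, String> store, String key,
                            String myValue, long nowMillis, long ttlMillis) {
        String extended = Long.toString(nowMillis + ttlMillis);
        return store.replace(key, myValue, extended) ? extended : null;
    }

    /** Renews every ttl/3 until the returned future is cancelled or ownership is lost. */
    static ScheduledFuture<?> start(ScheduledExecutorService scheduler,
                                    ConcurrentHashMap<String, String> store,
                                    String key, String initialValue, long ttlMillis) {
        AtomicReference<String> current = new AtomicReference<>(initialValue);
        long period = Math.max(1, ttlMillis / 3);
        return scheduler.scheduleAtFixedRate(() -> {
            String next = renewOnce(store, key, current.get(), System.currentTimeMillis(), ttlMillis);
            if (next == null) {
                // Throwing from the task suppresses all further scheduled runs.
                throw new IllegalStateException("lock lost, stopping renewal");
            }
            current.set(next);
        }, period, period, TimeUnit.MILLISECONDS);
    }
}
```

The business code would cancel the returned future right before releasing the lock, so that the renewal task cannot extend a lease the holder is about to give up.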

Q2: How is high availability achieved?

A2: We use a failover mechanism with a Redis connection pool. When acquiring or releasing a lock, we write to multiple Redis nodes to ensure consistency. If a node becomes unavailable, the client switches to another node. However, because the multi‑write is not atomic, it is possible for two clients to acquire the lock simultaneously under certain network partitions. Some services mitigate this by using a unique database index; we plan to fix this bug in the future.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
