How to Build a Reliable Redis‑Based Distributed Lock (Lessons from a 2013 Implementation)
This article explains the design, pitfalls, and improvement ideas of a Redis‑based distributed lock used since 2013, covering lock acquisition and release processes, expiration handling, high‑availability strategies, and practical lessons for building robust concurrency control in distributed systems.
Preface
When you hear terms like data consistency and atomicity, you probably think of “locks”. In a monolithic application a JVM lock works fine, but it only coordinates threads inside a single process. A distributed application runs across several JVMs, so we must rely on a shared external resource such as Redis or ZooKeeper.
Getting into the Topic
Here I share the Redis‑based distributed lock our company has been using since 2013, its design flaws, and the lessons learned. I hope readers will discuss civilly and remember that the best solution is the one that fits your own scenario.
Lock Acquisition Analysis
Q1: Why not use SET key value [EX seconds|PX milliseconds] [NX|XX] to let the key expire automatically, and instead check expiration in application code?
A1: When we first developed the lock, the SET command did not yet accept the EX, PX, NX, or XX options; they were added in Redis 2.6.12.
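For readers on Redis 2.6.12 or later, the single-command form mentioned in Q1 might look like the sketch below. The `client.set(key, value, nx=True, px=...)` call follows the keyword arguments of redis-py; `InMemoryStore`, `try_lock`, and the random owner token are hypothetical names added here for illustration and are not part of the lock described in this article. The in-memory class only imitates the NX and PX behavior so the example runs without a server.

```python
import time
import uuid


class InMemoryStore:
    """Hypothetical stand-in for a Redis client, so the sketch runs without a server."""

    def __init__(self):
        self._data = {}  # key -> (value, deadline in monotonic seconds or None)

    def set(self, key, value, nx=False, px=None):
        now = time.monotonic()
        current = self._data.get(key)
        if current is not None and current[1] is not None and current[1] <= now:
            current = None  # the previous entry has expired
        if nx and current is not None:
            return None  # redis-py also returns None when the NX condition fails
        deadline = now + px / 1000.0 if px is not None else None
        self._data[key] = (value, deadline)
        return True


def try_lock(client, key, ttl_ms):
    """Return an owner token if the lock was taken, otherwise None."""
    token = uuid.uuid4().hex  # assumption: a random value identifies the owner
    if client.set(key, token, nx=True, px=ttl_ms):
        return token
    return None
```

With this form the server expires the key itself, so the application no longer needs to compare timestamps when acquiring.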
Q2: After confirming that the key’s timestamp is expired, why call GETSET again instead of directly using SET to overwrite?
A2: This involves concurrency concerns. If we use plain SET, multiple clients might acquire the lock simultaneously. Using GETSET and checking the old value’s expiration avoids this race. For example:
1. Client C1 acquires the lock but crashes before releasing it.
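2. Clients C2 and C3 both read the key and see that its timestamp has expired. If both then used plain SET, both writes would succeed and both clients would believe they hold the lock, which is a serious problem. With GETSET, the client that runs first receives the old, expired value and gets the lock; the later client receives the non‑expired value just written by the first and fails to acquire the lock.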
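The acquisition flow discussed above can be sketched roughly as follows. This is a reconstruction under assumptions, not our production code: it assumes the stored value is the expiry time as a Unix timestamp string, and `InMemoryStore`, `try_lock`, and the `now` parameter (there to make the example deterministic) are hypothetical. The stand-in mirrors the SETNX, GET, and GETSET commands but stores `str` values, whereas a real client may return bytes.

```python
import time


class InMemoryStore:
    """Hypothetical stand-in exposing the three Redis commands the flow needs."""

    def __init__(self):
        self._data = {}

    def setnx(self, key, value):
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def get(self, key):
        return self._data.get(key)

    def getset(self, key, value):
        old = self._data.get(key)
        self._data[key] = value
        return old


def try_lock(store, key, ttl_seconds, now=None):
    """Return the expiry timestamp we wrote if the lock was taken, else None."""
    now = time.time() if now is None else now
    expires_at = str(now + ttl_seconds)

    # Fast path: nobody holds the key.
    if store.setnx(key, expires_at):
        return expires_at

    current = store.get(key)
    if current is None or float(current) > now:
        return None  # released in between (caller may retry) or still held

    # The holder's timestamp has expired: swap in ours and inspect what we replaced.
    previous = store.getset(key, expires_at)
    if previous is None or float(previous) <= now:
        return expires_at  # we replaced an expired (or vanished) value
    return None  # another client refreshed the key first
```

One caveat of this pattern: a client that loses the GETSET race has still overwritten the winner's timestamp with its own, so the stored expiry can drift slightly from what the winner wrote.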
Lock Release Analysis
Q1: Why check whether the key is expired before releasing the lock instead of simply calling DEL for higher performance?
A1: Consider this scenario: C1 acquires the lock and then gets blocked; the key expires and C2 acquires the lock; when C1 finally wakes up it blindly executes DEL and releases C2’s lock by mistake. C3 can then acquire the lock immediately while C2 is still working, which breaks mutual exclusion. The release process must therefore first verify that the stored value has not expired; if it has, the delete is skipped. This check improves safety, but because the check and the DEL are separate commands, a small window remains between them.
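One reading of that release check, sketched under assumptions: the client remembers the expiry timestamp it wrote when acquiring and skips the delete once that lease has run out. The additional comparison against the stored value goes beyond what is described here and is an assumption; `InMemoryStore`, `release`, and the `now` parameter are hypothetical names.

```python
import time


class InMemoryStore:
    """Hypothetical stand-in for the GET and DEL commands."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def delete(self, key):
        return 1 if self.data.pop(key, None) is not None else 0


def release(store, key, my_expires_at, now=None):
    """Delete the key only if our own lease is still valid; return True if deleted."""
    now = time.time() if now is None else now
    if float(my_expires_at) <= now:
        return False  # our lease ran out; the key may belong to someone else
    if store.get(key) != my_expires_at:
        return False  # assumption beyond the article: require our own value too
    # GET and DEL are separate commands, so a small window remains here.
    store.delete(key)
    return True
```

In the scenario above, C1 wakes up after its lease has expired, so its `release` call returns without touching C2's key.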
Facing Its Shortcomings
Q1: How to renew a lock when its expiration time is shorter than the business operation?
A1: This feature is not yet implemented. A library called Redisson provides automatic renewal, and we are considering switching to it.
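As a rough idea of what automatic renewal involves, the sketch below runs a background thread that keeps extending the lease while the business operation is in progress. It is not Redisson's implementation and not something we run today; `LeaseRenewer`, `make_renew`, and the plain `dict` used in place of Redis are all hypothetical.

```python
import threading
import time


class LeaseRenewer:
    """Calls renew() every `interval` seconds until stopped or until it returns False."""

    def __init__(self, renew, interval):
        self._renew = renew
        self._interval = interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Event.wait() returns False on timeout and True once stop is requested.
        while not self._stop.wait(self._interval):
            if not self._renew():
                return  # the lease is no longer ours, so stop extending it

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, exc_type, exc, tb):
        self._stop.set()
        self._thread.join()
        return False


def make_renew(store, key, ttl_seconds, lease):
    """Build a renew() callback; `lease["value"]` holds the expiry string we last wrote."""

    def renew():
        if store.get(key) != lease["value"]:
            return False  # someone else holds the key now
        lease["value"] = str(time.time() + ttl_seconds)
        store[key] = lease["value"]  # against Redis, check and write would need to be atomic
        return True

    return renew
```

A caller would wrap the business operation in `with LeaseRenewer(renew, interval):`, choosing an interval comfortably shorter than the lock's expiration time.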
Q2: How is high availability achieved?
A2: We use a failover mechanism. When initializing the Redis lock we maintain a connection pool and write to multiple Redis nodes (multi‑write) for consistency. If a node becomes unavailable, the client switches to another node. However, because the multi‑write is not atomic, a network issue can let two clients obtain the lock simultaneously. Some services mitigate this by using a database unique index; we plan to fix this bug in the future.
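The unique-index mitigation can be illustrated with a small sketch. It uses SQLite in memory purely so the example runs anywhere; the table, index, and function names are hypothetical, and the services mentioned above use their own schemas and databases.

```python
import sqlite3


def open_guard():
    """Create an in-memory table whose unique index acts as the last line of defense."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE processed (business_key TEXT NOT NULL)")
    conn.execute("CREATE UNIQUE INDEX ux_processed_key ON processed (business_key)")
    return conn


def claim(conn, business_key):
    """Return True for the first caller to claim the key, False for any later one."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO processed (business_key) VALUES (?)",
                (business_key,),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the unique index rejected a duplicate
```

Even if two clients both believe they hold the lock, only one insert for a given business key succeeds, so the duplicate work can be detected and dropped.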
Conclusion
I hope this helps some readers; respectful discussion is welcome.
Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"
