Mastering ZooKeeper Distributed Locks: Principles, Implementation, and Best Practices
This article explains how ZooKeeper provides reliable distributed locks using ephemeral sequential nodes and watchers, details the underlying mechanism, walks through a step‑by‑step Java implementation, surveys the lock recipes offered by the Curator framework, and discusses the advantages, drawbacks, and real‑world use cases.
The ZooKeeper distributed lock is a classic distributed-systems pattern for enforcing mutual exclusion over shared resources across multiple processes or nodes. Leveraging ZooKeeper’s strong consistency, ordering guarantees, and ephemeral nodes, it provides a reliable locking solution.
The core mechanism combines ephemeral sequential nodes with watchers. A client creates an ephemeral sequential child node under a designated directory and checks whether its node carries the smallest sequence number; if so, it holds the lock; otherwise it watches the node with the next smaller sequence number and waits for that node to be deleted.
This approach keeps the lock reliable even if a client crashes (its ephemeral node is removed automatically), guarantees fairness by granting the lock in request order, and supports blocking acquisition; it can also be extended with reentrancy.
1. ZooKeeper Distributed Lock Implementation Principle
A ZooKeeper distributed lock relies on three key features:
1.1 Ephemeral Node
Ephemeral nodes are bound to the client session; they are automatically deleted when the session ends, ensuring that locks are released if a client crashes, thus preventing deadlocks.
1.2 Sequential Node
Sequential nodes are created with an auto‑incremented suffix, preserving creation order. This enables locks to be granted in request order, achieving fairness.
1.3 Watcher
Watchers allow clients to receive notifications when a node’s state changes (e.g., deletion), enabling waiting clients to be promptly informed when the lock becomes available.
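A minimal sketch of how these three features combine (the class LockPrimitivesSketch and the predecessorPath parameter are illustrative, and an already-connected ZooKeeper handle plus an existing /locks parent node are assumed; the full lock implementation follows in the next section):
import org.apache.zookeeper.*;
import org.apache.zookeeper.data.Stat;

public class LockPrimitivesSketch {
    // Demonstrates the three primitives against an existing /locks parent.
    static String createAndWatch(ZooKeeper zk, String predecessorPath) throws Exception {
        // Ephemeral + sequential: ZooKeeper appends an auto-incremented
        // 10-digit suffix, and the node is deleted automatically if this
        // client's session ends (e.g., the client crashes).
        String myNode = zk.create("/locks/lock_", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
        // One-shot watch: fires when the predecessor node is deleted,
        // i.e., when the previous lock holder releases or crashes.
        Stat stat = zk.exists(predecessorPath, event -> {
            if (event.getType() == Watcher.Event.EventType.NodeDeleted) {
                System.out.println("Predecessor gone; re-check lock ownership");
            }
        });
        if (stat == null) {
            // exists() returns null if the predecessor is already gone
            System.out.println("Predecessor already deleted");
        }
        return myNode;
    }
}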
2. Steps to Implement ZooKeeper Distributed Lock
The basic workflow is as follows:
Create a persistent parent node for locks, e.g., /locks.
When a client wants the lock, it creates an ephemeral sequential child node under the parent, e.g., /locks/lock_0000000001.
Retrieve all child nodes under the parent and sort them by sequence number.
If the client’s node has the smallest sequence number, the lock is acquired.
If not, watch the node with the next smaller sequence number.
When the watched node is deleted, the client re-checks the child list and acquires the lock if its node now has the smallest sequence number.
After completing its business logic, the client deletes its own node to release the lock.
Below is a Java implementation using ZooKeeper:
import org.apache.zookeeper.*;
import org.apache.zookeeper.data.Stat;
import java.io.IOException;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
public class ZKDistributedLock {
    private ZooKeeper zooKeeper;
    private String lockPath = "/distributed_locks";
    private String currentLockPath;
    private String waitLockPath;
    // Released once the ZooKeeper session is connected
    private CountDownLatch connectedLatch = new CountDownLatch(1);
    // Released once a blocked lock() call is granted the lock
    private CountDownLatch lockLatch = new CountDownLatch(1);
    public ZKDistributedLock(String connectString) throws IOException, InterruptedException, KeeperException {
        // Connect to the ZooKeeper server
        zooKeeper = new ZooKeeper(connectString, 5000, new Watcher() {
            @Override
            public void process(WatchedEvent event) {
                // Connection established: release the thread waiting in the constructor
                if (event.getState() == Event.KeeperState.SyncConnected) {
                    System.out.println("Connected to ZooKeeper server");
                    connectedLatch.countDown();
                }
                // The watched predecessor was deleted: re-check lock ownership
                if (event.getType() == Event.EventType.NodeDeleted && event.getPath().equals(waitLockPath)) {
                    System.out.println("Previous lock node released, attempting to acquire the lock");
                    try {
                        if (tryLock()) {
                            // Wake up the thread blocked in lock()
                            lockLatch.countDown();
                        }
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            }
        });
        // Wait for the connection to be established
        connectedLatch.await();
        // Ensure the parent lock directory exists
        Stat stat = zooKeeper.exists(lockPath, false);
        if (stat == null) {
            try {
                zooKeeper.create(lockPath, new byte[0], ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
                System.out.println("Created lock directory: " + lockPath);
            } catch (KeeperException.NodeExistsException ignored) {
                // Another client created the directory concurrently; safe to continue
            }
        }
    }
    public boolean lock() throws KeeperException, InterruptedException {
        // Create an ephemeral sequential node under the lock directory
        currentLockPath = zooKeeper.create(lockPath + "/lock_", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
        System.out.println("Created lock node: " + currentLockPath);
        if (!tryLock()) {
            // Block until the watcher sees the predecessor's deletion and grants the lock
            lockLatch.await();
        }
        return true;
    }
    private boolean tryLock() throws KeeperException, InterruptedException {
        // List and sort all children of the lock directory
        List<String> children = zooKeeper.getChildren(lockPath, false);
        Collections.sort(children);
        // Locate this client's node in the sorted list
        String currentNode = currentLockPath.substring(lockPath.length() + 1);
        int index = children.indexOf(currentNode);
        if (index == 0) {
            // Smallest sequence number: the lock is ours
            System.out.println("Lock acquired, current lock node: " + currentLockPath);
            return true;
        } else {
            // Watch the node immediately before ours
            String previousNode = children.get(index - 1);
            waitLockPath = lockPath + "/" + previousNode;
            Stat stat = zooKeeper.exists(waitLockPath, true);
            if (stat == null) {
                // Predecessor vanished between getChildren and exists: retry
                System.out.println("Previous node no longer exists, retrying lock acquisition");
                return tryLock();
            } else {
                System.out.println("Waiting for previous lock to be released: " + waitLockPath);
                return false;
            }
        }
    }
    public void unlock() throws KeeperException, InterruptedException {
        if (currentLockPath != null) {
            // Delete our node unconditionally (version -1 skips the version check)
            zooKeeper.delete(currentLockPath, -1);
            System.out.println("Lock released: " + currentLockPath);
            currentLockPath = null;
        }
    }
    public void close() throws InterruptedException {
        if (zooKeeper != null) {
            zooKeeper.close();
        }
    }
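    // A minimal usage sketch (assumptions: one single-use lock instance per
    // acquisition, and a ZooKeeper server reachable at the placeholder
    // address localhost:2181).
    public static void main(String[] args) throws Exception {
        ZKDistributedLock lock = new ZKDistributedLock("localhost:2181");
        try {
            lock.lock(); // blocks until the lock is acquired
            System.out.println("Running critical section");
        } finally {
            lock.unlock(); // always release the lock
            lock.close();
        }
    }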
}
3. Curator Framework Implementation of Distributed Locks
In practice, developers often use the Curator framework to simplify ZooKeeper lock implementation. Curator ships several lock recipes (a minimal sketch of the most common one, InterProcessMutex, follows this list):
Shared Lock (InterProcessSemaphoreMutex): a basic, non-reentrant exclusive lock.
Shared Reentrant Read Write Lock (InterProcessReadWriteLock): allows multiple readers but only one writer.
Shared Reentrant Lock (InterProcessMutex): permits the same client to acquire the same lock multiple times.
Shared Semaphore (InterProcessSemaphoreV2): limits the number of clients that can access a resource concurrently.
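A minimal usage sketch of InterProcessMutex (the connection string, retry-policy values, and lock path below are placeholders):
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;
import org.apache.curator.retry.ExponentialBackoffRetry;
import java.util.concurrent.TimeUnit;

public class CuratorLockExample {
    public static void main(String[] args) throws Exception {
        // Retry policy: exponential backoff starting at 1s, up to 3 retries
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "localhost:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();
        InterProcessMutex lock = new InterProcessMutex(client, "/distributed_locks/demo");
        // Wait up to 10 seconds for the lock; acquire returns false on timeout
        if (lock.acquire(10, TimeUnit.SECONDS)) {
            try {
                System.out.println("Lock acquired, running critical section");
            } finally {
                lock.release();
            }
        }
        client.close();
    }
}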
4. Advantages and Disadvantages of ZooKeeper Distributed Locks
4.1 Advantages
High reliability: ZooKeeper’s replicated ensemble keeps the lock service available even if individual servers fail.
Automatic fault recovery: ephemeral nodes are deleted when a client crashes, so its lock is released automatically.
Fair locking: sequential nodes grant the lock in request order and avoid starvation.
Reentrancy: recipes such as Curator’s InterProcessMutex let the same client re-acquire a lock it already holds.
Read‑write lock support: read‑write separation improves concurrency for read‑heavy workloads.
4.2 Disadvantages
Performance overhead: every acquisition and release is a network round trip to ZooKeeper, far more expensive than a local lock.
Herd effect: a naive implementation in which every waiter watches the lock node itself wakes all waiting clients at once on release; watching only the immediate predecessor, as in the implementation above, avoids this.
Session timeout issues: a network partition can expire a client’s session and release its lock while the client still believes it holds it.
Resource consumption: frequent creation and deletion of ephemeral nodes adds load on the ZooKeeper ensemble.
5. Real‑World Use Cases for Distributed Locks
Distributed task scheduling: ensures a scheduled job runs on only one node at a time (see the sketch after this list).
Distributed counters: serializes concurrent updates to a shared counter across nodes.
Distributed ID generation: guarantees uniqueness and ordering of generated IDs.
Distributed transactions: coordinates commit and rollback across services.
Resource contention control: manages concurrent access to shared resources.
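As an illustration of the first use case, a sketch of a scheduled-job guard built on Curator’s InterProcessMutex (the class name SingletonJobRunner and the lock path are hypothetical):
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;
import java.util.concurrent.TimeUnit;

// Every node runs runIfLeader on its own timer, but only the node that
// wins the lock actually executes the job for that tick.
public class SingletonJobRunner {
    private final InterProcessMutex jobLock;

    public SingletonJobRunner(CuratorFramework client) {
        this.jobLock = new InterProcessMutex(client, "/distributed_locks/nightly_job");
    }

    public void runIfLeader(Runnable job) throws Exception {
        // Non-blocking attempt: if another node holds the lock, skip this run
        if (jobLock.acquire(0, TimeUnit.SECONDS)) {
            try {
                job.run();
            } finally {
                jobLock.release();
            }
        }
    }
}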