Mastering ZooKeeper: Core Concepts, Distributed Locks, and Cluster Management
This article explains ZooKeeper's role as a distributed coordination service, covering its architecture, znode types, watch mechanism, configuration management, distributed locking, queue handling, data replication, leader election, and synchronization processes for building reliable backend systems.
1. What is ZooKeeper?
ZooKeeper is an open‑source distributed coordination service, often described as an open‑source counterpart to Google’s Chubby lock service. It acts as a cluster manager: it tracks the state of the nodes in a cluster and exposes a small, simple API with high performance and stability.
2. What does ZooKeeper provide?
File‑system‑like hierarchical namespace
Notification (watch) mechanism
3. ZooKeeper File System
ZooKeeper offers a hierarchical namespace of nodes called znodes. Unlike a traditional file system, every znode can both store data and have children. The entire tree is kept in memory for high throughput and low latency, and each znode holds at most 1 MB of data by default.
4. Four Types of Znode
PERSISTENT: Remains after the client that created it disconnects; it exists until explicitly deleted.
PERSISTENT_SEQUENTIAL: Persistent as above, and ZooKeeper appends a monotonically increasing sequence number to the node name.
EPHEMERAL: Deleted automatically when the session of the client that created it ends.
EPHEMERAL_SEQUENTIAL: Deleted when the creating session ends, and its name receives a sequence number.
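The four create modes can be illustrated with a toy in-memory model. This is a sketch, not a ZooKeeper client: `MiniZnodeStore`, `create`, and `close_session` are hypothetical names, and the only ZooKeeper behavior it borrows is that sequence numbers are zero-padded to 10 digits and counted per parent node.

```python
class MiniZnodeStore:
    """Toy in-memory model of the four znode create modes (hypothetical, not a real client)."""

    def __init__(self):
        self.nodes = {}     # path -> owning session id (None for persistent nodes)
        self.counters = {}  # parent path -> next sequence number

    def create(self, path, session_id, ephemeral=False, sequential=False):
        if sequential:
            parent = path.rsplit("/", 1)[0]
            seq = self.counters.get(parent, 0)
            self.counters[parent] = seq + 1
            path = f"{path}{seq:010d}"  # zero-padded 10-digit suffix
        if path in self.nodes:
            raise KeyError(f"node exists: {path}")
        self.nodes[path] = session_id if ephemeral else None
        return path

    def close_session(self, session_id):
        # Ephemeral nodes disappear together with the session that created them.
        self.nodes = {p: s for p, s in self.nodes.items() if s != session_id}


store = MiniZnodeStore()
store.create("/app/config", session_id=1)                                    # PERSISTENT
worker = store.create("/app/worker-", 1, ephemeral=True, sequential=True)    # EPHEMERAL_SEQUENTIAL
task = store.create("/app/task-", 1, sequential=True)                        # PERSISTENT_SEQUENTIAL
store.close_session(1)
print(worker, task, sorted(store.nodes))
```

After the session closes, only the two persistent nodes remain; the ephemeral worker node is gone.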
5. ZooKeeper Notification Mechanism
Clients set a watcher on a znode; when the znode changes, ZooKeeper sends a one‑time notification to the client, which can then react to the change.
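The one‑time nature of a watch is easy to miss, so here is a minimal sketch of it. `WatchedNode` is a hypothetical stand‑in for a znode, not part of any ZooKeeper library; the point is only that registered watchers are cleared when they fire.

```python
class WatchedNode:
    """Hypothetical stand-in for a znode whose data watches fire once."""

    def __init__(self, data):
        self.data = data
        self._watchers = []

    def get_data(self, watcher=None):
        # Reading with a watcher registers a one-shot callback.
        if watcher is not None:
            self._watchers.append(watcher)
        return self.data

    def set_data(self, data):
        self.data = data
        # Clear the watcher list before notifying: each watch fires at most once.
        fired, self._watchers = self._watchers, []
        for callback in fired:
            callback("NodeDataChanged")


events = []
node = WatchedNode(b"v1")
node.get_data(watcher=events.append)
node.set_data(b"v2")  # the watch fires
node.set_data(b"v3")  # no watch is registered any more, so nothing fires
print(events)
```

Only the first change is reported; a client that wants the second one has to read the node again and register a new watch.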
6. What does ZooKeeper do?
Naming service
Configuration management
Cluster management
Distributed lock
Queue management
7. Naming Service (File System)
A globally unique path created in ZooKeeper can be used as a name to locate resources or services across the cluster.
8. Configuration Management (File System & Watch)
Application configuration is stored in znodes; when a configuration changes, watchers notify clients to update their settings.
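Because watches fire only once, a configuration client typically re‑arms its watch every time it reloads. The sketch below assumes a hypothetical `FakeConfigNode` with one‑shot watches and a hypothetical `ConfigClient`; it is meant to show the re‑arming pattern, not a real client API.

```python
class FakeConfigNode:
    """Hypothetical config znode with one-shot data watches."""

    def __init__(self, data):
        self.data = data
        self._watchers = []

    def get(self, watcher):
        self._watchers.append(watcher)
        return self.data

    def set(self, data):
        self.data = data
        fired, self._watchers = self._watchers, []
        for callback in fired:
            callback()


class ConfigClient:
    """Keeps a local copy of the configuration and reloads it on every change."""

    def __init__(self, node):
        self.node = node
        self.current = None
        self.reloads = 0
        self._load()

    def _load(self):
        # Read the value and register the next watch in the same call,
        # so no update can slip in between reading and re-arming.
        self.current = self.node.get(watcher=self._load)
        self.reloads += 1


node = FakeConfigNode("timeout=30")
client = ConfigClient(node)
node.set("timeout=60")
node.set("timeout=90")
print(client.current, client.reloads)  # timeout=90 3
```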
9. Cluster Management (File System & Watch)
Machines create ephemeral znodes under a shared parent znode to signal their presence, and every machine watches that parent's children. When a machine fails, its session ends and its ephemeral node disappears, which notifies the watchers; a new machine joining is detected the same way. Leader election can be performed by having each machine create an ephemeral sequential node and selecting the one with the smallest sequence number.
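Membership tracking and smallest‑sequence leader selection both reduce to simple operations on the list of child names. The following sketch assumes node names that end in a 10‑digit sequence suffix; the `member-` prefix and the helper names are illustrative only.

```python
def sequence_of(name):
    """Extract the numeric sequence suffix from a sequential node name."""
    return int(name[-10:])


def diff_members(before, after):
    """Return (joined, left) between two snapshots of the children list."""
    joined = sorted(set(after) - set(before))
    left = sorted(set(before) - set(after))
    return joined, left


def pick_leader(children):
    """The member holding the smallest sequence number acts as leader."""
    return min(children, key=sequence_of) if children else None


before = ["member-0000000003", "member-0000000001", "member-0000000002"]
after = ["member-0000000003", "member-0000000002", "member-0000000004"]

joined, left = diff_members(before, after)
print(joined, left)
print(pick_leader(before), pick_leader(after))
```

When `member-0000000001` disappears, leadership passes to `member-0000000002`, the next smallest survivor.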
10. Distributed Lock (File System & Watch)
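A lock can be implemented in two ways. For a simple exclusive lock, all clients try to create the same znode (e.g., /distribute_lock); only one create succeeds, and that client holds the lock until the node is deleted. For an ordered lock, clients create ephemeral sequential nodes under a lock node, and the lock is granted to the node with the smallest sequence number.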
11. Distributed Lock Acquisition Process
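Each client creates an ephemeral sequential node under the lock node (here called locker) and lists all of its children. If the client's node has the smallest sequence number, it holds the lock. Otherwise it watches the node immediately preceding its own; when that node is deleted, it lists the children again and repeats the check until its node is the smallest. Watching only the predecessor avoids waking every waiting client on each release.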
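The decision each client makes after listing the children can be sketched as a pure function. `lock_decision` and the `lock-` prefix are hypothetical names, and node names are assumed to end in a 10‑digit sequence suffix.

```python
def lock_decision(children, my_node):
    """Decide whether my_node holds the lock, or which node it should watch.

    Returns ("acquired", None) or ("wait", predecessor_name).
    """
    ordered = sorted(children, key=lambda name: int(name[-10:]))
    index = ordered.index(my_node)
    if index == 0:
        return ("acquired", None)
    # Watch only the node directly ahead, not the lock holder or the parent.
    return ("wait", ordered[index - 1])


children = ["lock-0000000007", "lock-0000000005", "lock-0000000006"]
first = lock_decision(children, "lock-0000000005")
last = lock_decision(children, "lock-0000000007")

# The client owning ...06 goes away; ...07 re-checks after its watch fires.
children.remove("lock-0000000006")
recheck = lock_decision(children, "lock-0000000007")
print(first, last, recheck)
```

Note that the re‑check does not grant the lock automatically: the predecessor may have vanished because its session ended, in which case the client simply starts watching the next node ahead.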
12. Queue Management (File System & Watch)
Two queue types are supported: synchronous queues, which become usable only once all expected members have arrived, and FIFO queues, where producers create sequential nodes and consumers remove the node with the smallest sequence number.
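Both queue types come down to simple checks on the children of a queue node. The sketch below models the children as a plain dictionary; `next_item`, `barrier_ready`, and the `item-` prefix are illustrative names, not ZooKeeper APIs.

```python
def next_item(children):
    """FIFO queue: the consumer takes the child with the smallest sequence number."""
    return min(children, key=lambda name: int(name[-10:])) if children else None


def barrier_ready(children, expected):
    """Synchronous queue: proceed only once all expected members have registered."""
    return len(children) >= expected


# Producers enqueue by creating sequential nodes (modelled here as dict keys).
queue = {}
for seq, payload in enumerate(["a", "b", "c"]):
    queue[f"item-{seq:010d}"] = payload

# The consumer repeatedly removes the head of the queue.
consumed = []
while queue:
    head = next_item(list(queue))
    consumed.append(queue.pop(head))

print(consumed)
print(barrier_ready(["m1", "m2"], expected=3), barrier_ready(["m1", "m2", "m3"], expected=3))
```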
13. Data Replication
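ZooKeeper replicates data across the cluster to provide fault tolerance, scalability, and performance. Replicated systems generally follow one of two write models: Write‑Master (clients send writes to a designated master) or Write‑Any (clients may send writes to any node). From the client's point of view ZooKeeper is Write‑Any: a client can submit a write to whichever server it is connected to, and that server forwards the request to the leader, which orders it and broadcasts it to the ensemble.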
14. Working Principle
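The core of ZooKeeper is atomic broadcast, implemented by the Zab protocol. Zab has two modes: recovery mode, entered at startup or after the leader fails, in which a leader is elected and the followers synchronize their state with it; and broadcast mode, in which the leader broadcasts proposals to the followers.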
15. Transaction Order Consistency
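Each proposal receives a monotonically increasing 64‑bit zxid. The high 32 bits hold the epoch, which changes whenever a new leader takes over, and the low 32 bits hold a counter that increases with every proposal, so comparing zxids yields a total order.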
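As a small worked example, a zxid can be packed and unpacked with bit operations, assuming the usual layout of a 32‑bit epoch in the high half and a 32‑bit counter in the low half. The function names are illustrative.

```python
def make_zxid(epoch, counter):
    """Pack an epoch and a counter into one 64-bit transaction id."""
    return (epoch << 32) | (counter & 0xFFFFFFFF)


def split_zxid(zxid):
    """Unpack a zxid into (epoch, counter)."""
    return zxid >> 32, zxid & 0xFFFFFFFF


zxid = make_zxid(epoch=2, counter=5)
print(hex(zxid), split_zxid(zxid))  # 0x200000005 (2, 5)

# Any proposal from a newer epoch orders after every proposal of an older one.
print(make_zxid(3, 0) > make_zxid(2, 0xFFFFFFFF))  # True
```

Because the epoch occupies the high bits, plain integer comparison is enough to order transactions across leader changes.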
16. Server States
LOOKING – searching for a leader
LEADING – acting as the elected leader
FOLLOWING – synchronizing with the leader
17. Leader Election
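ZooKeeper elects a leader with the FastLeaderElection algorithm, which is the default; older write‑ups loosely call it "fast Paxos", but it is not a Paxos implementation. Servers exchange votes, each preferring the candidate with the newest epoch, then the highest zxid, then the highest server id, and a candidate becomes leader once a majority of the ensemble votes for it.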
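The vote comparison can be sketched in a few lines. This is a heavily simplified model, not the real election: it assumes every reachable server has already exchanged proposals and converges in one step, and it orders candidates by epoch, then zxid, then server id. `Vote` and `elect` are hypothetical names.

```python
from collections import namedtuple

# Field order matters: tuples compare by epoch first, then zxid, then server id.
Vote = namedtuple("Vote", ["epoch", "zxid", "server_id"])


def elect(proposals, ensemble_size):
    """Return the winning server id, or None if no majority can be formed."""
    if len(proposals) <= ensemble_size // 2:
        return None  # fewer than a majority of servers are participating
    return max(proposals).server_id


proposals = [
    Vote(epoch=1, zxid=0x100000005, server_id=1),
    Vote(epoch=1, zxid=0x100000007, server_id=2),
    Vote(epoch=1, zxid=0x100000007, server_id=3),
]
winner = elect(proposals, ensemble_size=5)
no_quorum = elect(proposals[:2], ensemble_size=5)
print(winner, no_quorum)
```

Servers 2 and 3 tie on zxid, so the higher server id wins; with only two of five servers reachable, no leader is chosen.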
18. Synchronization Process
Leader waits for followers to connect.
Followers send their highest zxid.
Leader determines a sync point.
Leader notifies followers they are up‑to‑date.
Followers resume serving client requests.
19. Distributed Notification and Coordination
Operators change node states via a console; ZooKeeper notifies all registered watchers, enabling real‑time progress monitoring.
20. Why a Leader Exists
Some tasks should be performed by a single machine to avoid duplicate work and improve performance, necessitating leader election.
21. Handling Node Failures
A ZooKeeper ensemble should have at least three servers, and the cluster remains functional as long as a majority of them are up, so an ensemble of 2N+1 servers tolerates N failures. If a follower fails, the remaining servers continue serving; if the leader fails, a new leader is elected.
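The majority rule translates into simple arithmetic, sketched here with illustrative function names. It also shows why ensembles are usually sized with an odd number of servers.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can fail while a majority is still available."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (3, 4, 5, 6):
    print(n, quorum_size(n), tolerated_failures(n))
```

Going from three servers to four raises the quorum from two to three but still tolerates only one failure, so the extra server adds no fault tolerance.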
22. ZooKeeper vs. Nginx Load Balancing
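Load balancing built on ZooKeeper is programmable: clients read the list of registered servers and apply whatever selection logic the application needs. Nginx provides statically configured, weight‑based balancing with higher throughput.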
23. Watch Mechanism
Watches are one‑time triggers: after a watch fires, the client must set it again to receive further notifications. They are lightweight, delivered asynchronously, and can be set on a znode's data or on its children. A watch can be missed in certain edge cases; for example, an existence watch on a znode that is created and then deleted while the client is disconnected never fires.
Java Backend Technology
