ZooKeeper Core Knowledge and Typical Application Scenarios
Although many platforms are moving away from ZooKeeper, this guide explains its CP-oriented architecture, znode structure, watcher mechanism, Zab consensus protocol, and leader election, along with common patterns such as publish/subscribe, load balancing, naming services, master election, and distributed locks and queues — fundamentals every architect working with coordination services should know.
In recent years, major tech companies have been moving away from ZooKeeper (ZK); even Kafka began replacing it with its built-in KRaft mode starting in version 2.8. Still, understanding ZK's core concepts remains valuable for architects and developers. This article covers ZooKeeper's fundamentals and its typical applications.
1. ZooKeeper Overview
ZooKeeper is a CP-oriented distributed coordination framework that encapsulates complex distributed consistency services into efficient primitives. It enables features like data publish/subscribe, load balancing, naming services, cluster management, Master election, distributed locks, and distributed queues.
2. Data Structure - Znode
ZK maintains a tree-like, file-system-style structure in memory; each node in the tree is called a znode. On the server, a znode is represented by the DataNode class, which carries four key properties:
```java
public class DataNode implements Record {
    byte data[];
    Long acl;
    public StatPersisted stat;
    private Set<String> children = null;
}
```

- data: business data (unlike a file system, a znode can hold data even when it has children)
- children: references to child nodes (held in memory, so fan-out is limited)
- stat: status info including the transaction ID, version, and timestamps
- acl: access-control permissions
Important notes: znode operations are atomic; the default maximum data size is 1 MB (payloads should be kept much smaller); and every znode is addressed by a unique path.
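To make the tree structure concrete, here is a minimal in-memory sketch of a znode tree with path-based lookup. The `ZnodeTree` and `Node` names are illustrative, not ZooKeeper's real classes, and the real server adds stat, ACLs, sessions, and persistence on top:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: a tiny znode tree with path-based create/get.
public class ZnodeTree {
    static class Node {
        byte[] data = new byte[0];
        Map<String, Node> children = new HashMap<>();
    }

    private final Node root = new Node();

    // Create a node at an absolute path like "/app/config"; parents must exist,
    // mirroring ZooKeeper's requirement that create() fails without a parent.
    public synchronized void create(String path, byte[] data) {
        int slash = path.lastIndexOf('/');
        Node parent = lookup(path.substring(0, slash));
        Node child = new Node();
        child.data = data;
        parent.children.put(path.substring(slash + 1), child);
    }

    public synchronized byte[] getData(String path) {
        return lookup(path).data;
    }

    private Node lookup(String path) {
        Node cur = root;
        for (String part : path.split("/")) {
            if (part.isEmpty()) continue;
            cur = cur.children.get(part);
            if (cur == null) throw new IllegalStateException("no node: " + path);
        }
        return cur;
    }
}
```

Note that `/app` here holds data *and* a child, which a real znode can also do.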
2.1.1 Transaction ID (Zxid)
Zxid consists of two parts: the epoch (high 32 bits), the leader's term number, which increments with each new leader election; and the transaction counter (low 32 bits), which increments with each data change and resets when the epoch changes. A larger zxid indicates newer data, which is crucial for data consistency and leader election.
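The epoch/counter split is plain bit arithmetic on the 64-bit zxid. A small sketch (the `Zxid` helper class is illustrative, not part of ZooKeeper's API):

```java
// Sketch: split a 64-bit zxid into epoch (high 32 bits) and counter (low 32 bits).
public class Zxid {
    public static long epoch(long zxid)   { return zxid >>> 32; }
    public static long counter(long zxid) { return zxid & 0xFFFFFFFFL; }

    public static long make(long epoch, long counter) {
        return (epoch << 32) | (counter & 0xFFFFFFFFL);
    }
}
```

Because the epoch occupies the high bits, comparing two zxids as plain longs automatically prefers the newer leader term first, then the newer transaction.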
2.1.2 Znode Types
Four types based on lifecycle: PERSISTENT (permanent nodes), EPHEMERAL (temporary nodes tied to client session), PERSISTENT_SEQUENTIAL, and EPHEMERAL_SEQUENTIAL. Sequential nodes append an incrementing sequence number useful for distributed locks.
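The sequence number ZooKeeper appends is a 10-digit, zero-padded counter, so lexicographic order of the child names matches creation order. A sketch of the naming scheme (the `SequentialName` class is illustrative):

```java
// Sketch: sequential znodes get a 10-digit zero-padded suffix, so string
// sorting of sibling names reproduces creation order.
public class SequentialName {
    public static String next(String prefix, int parentCounter) {
        return String.format("%s%010d", prefix, parentCounter);
    }
}
```

This zero-padding is why distributed-lock recipes can simply sort the children of the lock path as strings.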
2.2 Watcher Mechanism
ZK allows clients to register Watchers on nodes to listen for data changes, deletions, and child node changes. The mechanism involves three parts: ZK server, ZK client, and client's WatchManager. Key characteristics: one-time notification, sequential callback execution, minimal event structure (status, type, path only), and validity tied to session.
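The one-time-notification rule is the part that most often surprises newcomers: once a watch fires, it is gone until the client re-registers. A minimal sketch of that semantics (this `WatchManager` is a simplified stand-in, not ZooKeeper's actual class):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Sketch of one-shot watch semantics: a registered watcher fires at most
// once; further changes on the same path notify nobody until re-registration.
public class WatchManager {
    private final Map<String, List<Consumer<String>>> watches = new HashMap<>();

    public synchronized void register(String path, Consumer<String> watcher) {
        watches.computeIfAbsent(path, p -> new ArrayList<>()).add(watcher);
    }

    // Fire and clear in one step: the remove() is what makes watches one-time.
    public synchronized void trigger(String path) {
        List<Consumer<String>> fired = watches.remove(path);
        if (fired != null) fired.forEach(w -> w.accept(path));
    }
}
```

A real client callback therefore typically re-registers the watch (and re-reads the data, since the event carries only status, type, and path) before acting.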
2.3 ZK Cluster
ZK uses cluster deployment with three roles: Leader (handles all writes, one active at a time), Follower (responds to reads, forwards writes to Leader, participates in voting), and Observer (read-only, doesn't vote - introduced in 3.3.0 to improve scalability). Server states: LOOKING, LEADING, FOLLOWING, OBSERVING.
2.3.1 Zab Protocol
ZooKeeper Atomic Broadcast (Zab) ensures data consistency with a two-phase flow: 1) the Leader broadcasts a proposal to Followers and waits for ACKs from a majority; 2) the Leader broadcasts COMMIT to apply the proposal. This yields sequential consistency (stronger than mere eventual consistency): every server applies the same updates in the same order, though a follower may briefly lag the leader. The leader tracks in-flight proposals in outstandingProposals (a ConcurrentHashMap keyed by zxid) to preserve strict transaction ordering.
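The leader's commit rule can be sketched in a few lines. This is a simplified model (the `ZabLeader` class is illustrative; the real leader also replays proposals to lagging followers, times out slow peers, and broadcasts COMMIT messages):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch of Zab's commit rule: a proposal keyed by zxid commits once ACKs
// from a strict majority of the ensemble (leader included) have arrived.
public class ZabLeader {
    static class Proposal { int ackCount = 1; } // the leader implicitly ACKs itself

    private final int ensembleSize;
    private final ConcurrentMap<Long, Proposal> outstandingProposals = new ConcurrentHashMap<>();

    public ZabLeader(int ensembleSize) { this.ensembleSize = ensembleSize; }

    public void propose(long zxid) { outstandingProposals.put(zxid, new Proposal()); }

    // Returns true when this ACK pushes the proposal past quorum (time to COMMIT).
    public synchronized boolean ack(long zxid) {
        Proposal p = outstandingProposals.get(zxid);
        if (p == null) return false;           // unknown or already committed
        if (++p.ackCount > ensembleSize / 2) {
            outstandingProposals.remove(zxid); // real leader would broadcast COMMIT here
            return true;
        }
        return false;
    }
}
```

With a 5-server ensemble, the leader's own implicit ACK plus two follower ACKs reach the majority of three, so a write survives two server failures.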
2.3.2 Leader Election
ZK uses the FastLeaderElection algorithm. Key parameters: myid (the server ID read from the myid file), zxid (the server's latest transaction ID — higher wins), and the logical clock (the election epoch, i.e. the voting round). Election scenarios include initial cluster startup, a Follower restart, and a Leader restart. ZK prevents split brain by requiring a majority quorum: more than half of the voting servers must agree on a Leader.
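The heart of the algorithm is the vote-ordering rule: compare election epoch first, then zxid, then myid. A sketch of that comparison (the `VoteComparator` class is illustrative; the real implementation lives in ZooKeeper's FastLeaderElection):

```java
// Sketch of FastLeaderElection's vote ordering: compare election epoch,
// then zxid, then server id (myid); the larger tuple wins the round.
public class VoteComparator {
    // Returns true if the candidate (epoch, zxid, myid) beats the current vote.
    public static boolean wins(long newEpoch, long newZxid, long newId,
                               long curEpoch, long curZxid, long curId) {
        if (newEpoch != curEpoch) return newEpoch > curEpoch;
        if (newZxid != curZxid)   return newZxid > curZxid;
        return newId > curId;
    }
}
```

Comparing zxid before myid is what guarantees the elected leader holds the most up-to-date committed history.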
3. Typical Application Scenarios
3.1 Data Publish/Subscribe : Using Watcher mechanism - clients register watchers and pull data on startup; when data changes, ZK notifies clients who then pull latest data.
3.2 Load Balancing : Using temporary nodes and Watcher - service providers register as ephemeral nodes; consumers get service list and register watchers; changes trigger cache updates.
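On the consumer side, this pattern reduces to keeping a cached provider list that a watch event refreshes, and rotating through it. A sketch under those assumptions (the `RoundRobinBalancer` class is illustrative, and `onProvidersChanged` stands in for the watcher callback):

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: a consumer caches the children of the service path and rotates
// through them; a watch event on the path swaps in a fresh snapshot.
public class RoundRobinBalancer {
    private volatile List<String> providers;
    private final AtomicInteger next = new AtomicInteger();

    public RoundRobinBalancer(List<String> initial) { this.providers = initial; }

    // Called from the watcher callback after re-fetching the child list.
    public void onProvidersChanged(List<String> fresh) { this.providers = fresh; }

    public String pick() {
        List<String> snapshot = providers; // read once for a consistent view
        return snapshot.get(Math.floorMod(next.getAndIncrement(), snapshot.size()));
    }
}
```

Because providers register as ephemeral nodes, a crashed provider disappears from the child list automatically once its session expires.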
3.3 Naming Service : Using znode paths as names (e.g., Dubbo uses /dubbo/com.foo.BarService/providers/consumers structure).
3.4 Cluster Management : Using ephemeral nodes and Watcher to monitor cluster status.
3.5 Master Election : Multiple clients compete to create the same ephemeral node; only one succeeds and becomes Master; others watch parent node for re-election when Master fails.
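Creating the same ephemeral znode behaves like an atomic put-if-absent: exactly one contender's create succeeds. A sketch of that semantics (this `MasterElection` class models the znode store with a map; session expiry, which deletes the ephemeral node, is modeled as a remove):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch: competing create() calls on one ephemeral path act like
// putIfAbsent — exactly one client wins and becomes master.
public class MasterElection {
    private final ConcurrentMap<String, String> nodes = new ConcurrentHashMap<>();

    // Returns true if this client's create succeeded, i.e. it is now master.
    public boolean tryBecomeMaster(String path, String clientId) {
        return nodes.putIfAbsent(path, clientId) == null;
    }

    // When the master's session expires, ZK deletes its ephemeral node,
    // which is what triggers the watchers and re-opens the election.
    public void sessionExpired(String path) { nodes.remove(path); }
}
```

In the real recipe the losers watch for the node's deletion and retry, so failover needs no external coordinator.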
3.6 Distributed Lock : Using sequential ephemeral nodes - clients create sequential nodes, check if they're first in order (lock holder), otherwise watch previous node. This prevents thundering herd problem.
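The decision step of this recipe is pure string ordering over the lock path's children. A sketch (the `LockOrder` class is illustrative; fetching the children and setting the watch are real ZK calls omitted here):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch of the lock protocol's decision step: the smallest sequence holds
// the lock; everyone else watches only the node immediately before theirs,
// so a release wakes exactly one waiter (no thundering herd).
public class LockOrder {
    // Returns null if `mine` holds the lock, else the name of the node to watch.
    public static String nodeToWatch(List<String> children, String mine) {
        List<String> sorted = new ArrayList<>(children);
        Collections.sort(sorted); // zero-padded suffixes sort in creation order
        int idx = sorted.indexOf(mine);
        return idx == 0 ? null : sorted.get(idx - 1);
    }
}
```

Because the lock nodes are ephemeral, a client that crashes while holding the lock releases it automatically when its session expires.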
3.7 Distributed Queue : Similar to distributed lock - using sequential nodes for FIFO ordering, consumers compete to process smallest sequence number.
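The FIFO property falls out of the same zero-padded naming: consuming means taking (and deleting) the child with the smallest sequence suffix. A sketch modeling the queue path's children as a sorted set (the `SeqQueue` class is illustrative):

```java
import java.util.TreeSet;

// Sketch: FIFO order comes from always consuming the child with the
// smallest sequence suffix; deleting it atomically "dequeues" the element.
public class SeqQueue {
    private final TreeSet<String> nodes = new TreeSet<>();

    // Models creating a PERSISTENT_SEQUENTIAL child under the queue path.
    public void enqueue(String name) { nodes.add(name); }

    // Models get-children + delete-smallest; returns null when empty.
    public String dequeue() { return nodes.pollFirst(); }
}
```

In the real recipe, concurrent consumers race on the delete of the head node; the one whose delete succeeds owns that element, and the others retry with the next-smallest child.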
Tencent Cloud Developer
Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.