
Introduction to ZooKeeper: Architecture, Data Model, Sessions, Watches, Consistency, Leader Election, and Zab Protocol

This article provides a comprehensive overview of ZooKeeper, covering its purpose as a distributed coordination service, design goals, hierarchical data model, session handling, watch mechanism, consistency guarantees, leader election, role responsibilities, and the Zab atomic broadcast protocol.

Architecture Digest

ZooKeeper is an open‑source distributed coordination service that offers a simple set of primitives for synchronization, configuration maintenance, and naming services.

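Its design goals emphasize consistency (a client sees the same view of the service regardless of which server it connects to), reliability (an update accepted by one server is accepted by all and persists), timeliness (a client's view is up to date within a bounded time), wait-free behavior (a slow or failed client does not block other clients), atomicity (an update either succeeds completely or fails), and a global ordering of updates.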

The data model is a hierarchical namespace similar to a file system. Each node, called a znode, is uniquely identified by its path, may have children, stores data, and maintains a version number. A znode is classified along two independent axes: it is either persistent or ephemeral (deleted when the creating session ends; an ephemeral znode cannot have children), and either non-sequential or sequential (the server appends a monotonically increasing, zero-padded 10-digit decimal suffix to the requested name).
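The znode properties described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `ToyTree`, `Znode`, and their methods are hypothetical names, not the ZooKeeper client API, and the per-parent counter used for sequential suffixes is a simplification of how a real server assigns them.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Znode:
    data: bytes = b""
    version: int = 0                       # bumped on every successful set_data
    ephemeral_owner: Optional[int] = None  # owning session id, None if persistent
    seq_counter: int = 0                   # next suffix for sequential children (toy)


class ToyTree:
    """Hypothetical in-memory stand-in for a znode namespace."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Znode] = {"/": Znode()}

    @staticmethod
    def _parent(path: str) -> str:
        head = path.rsplit("/", 1)[0]
        return head or "/"

    def create(self, path, data=b"", ephemeral_owner=None, sequential=False) -> str:
        parent = self.nodes.get(self._parent(path))
        if parent is None:
            raise KeyError("parent does not exist")
        if parent.ephemeral_owner is not None:
            raise ValueError("ephemeral znodes cannot have children")
        if sequential:
            # Append a zero-padded 10-digit suffix to the requested name.
            path = f"{path}{parent.seq_counter:010d}"
            parent.seq_counter += 1
        if path in self.nodes:
            raise KeyError("node already exists")
        self.nodes[path] = Znode(data, 0, ephemeral_owner)
        return path

    def set_data(self, path, data, expected_version=-1) -> int:
        node = self.nodes[path]
        # -1 means "any version"; otherwise the write is conditional.
        if expected_version != -1 and expected_version != node.version:
            raise ValueError("bad version")
        node.data = data
        node.version += 1
        return node.version

    def close_session(self, session_id) -> None:
        # Ephemeral znodes disappear together with the session that created them.
        owned = [p for p, n in self.nodes.items() if n.ephemeral_owner == session_id]
        for p in owned:
            del self.nodes[p]


tree = ToyTree()
tree.create("/locks")
first = tree.create("/locks/lock-", ephemeral_owner=1, sequential=True)
second = tree.create("/locks/lock-", ephemeral_owner=2, sequential=True)
tree.close_session(1)  # removes the znode owned by session 1 only
```

Combining ephemeral and sequential znodes in this way is the usual basis for lock and election recipes, since the suffix gives an order and the session tie gives automatic cleanup.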

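Clients establish a session with the ZooKeeper ensemble. If a client loses its connection, the client library moves to the CONNECTING state and tries to reconnect to a server in the ensemble. Session expiration is decided by the servers, not the client: if the ensemble hears nothing from a session within its timeout, it expires the session and removes the ephemeral znodes that session created.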
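The point that expiry is a server-side decision can be sketched with a toy tracker. Everything here is an assumption made for illustration: `ToySessionTracker` is a hypothetical class, time is passed in explicitly as plain numbers so the example is deterministic, and a real server's bookkeeping is more involved.

```python
class ToySessionTracker:
    """Hypothetical server-side tracker: only the server decides expiry."""

    def __init__(self) -> None:
        self.timeout = {}   # session id -> negotiated timeout
        self.deadline = {}  # session id -> time at which the session expires

    def open(self, session_id, timeout, now) -> None:
        self.timeout[session_id] = timeout
        self.deadline[session_id] = now + timeout

    def touch(self, session_id, now) -> bool:
        """Called when a request or ping arrives. False means the session is gone."""
        deadline = self.deadline.get(session_id)
        if deadline is None or deadline <= now:
            self.deadline.pop(session_id, None)
            self.timeout.pop(session_id, None)
            return False
        self.deadline[session_id] = now + self.timeout[session_id]
        return True

    def expire(self, now):
        """Periodic sweep; returns the sessions that were just expired."""
        dead = [s for s, d in self.deadline.items() if d <= now]
        for s in dead:
            del self.deadline[s]
            del self.timeout[s]
        return dead


tracker = ToySessionTracker()
tracker.open(7, timeout=10, now=0)
alive_before = tracker.touch(7, now=6)      # regular heartbeat
alive_reconnect = tracker.touch(7, now=12)  # client reconnected within the timeout
expired = tracker.expire(now=30)            # nothing heard since 12: server expires it
alive_after = tracker.touch(7, now=31)      # a late reconnect learns the session is gone
```

While disconnected, the client cannot know whether its session is still valid. It only finds out after it reaches a server again.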

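Watches are one-time triggers set on read operations (getData, getChildren, exists). When the watched data changes, the server sends a single notification to the client, and the client must set the watch again to hear about later changes. Watches are lightweight, but events can be missed: for example, a watch on the existence of a znode that does not yet exist will miss the change if the znode is created and then deleted while the client is disconnected.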
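The one-time nature of a watch can be shown with a minimal sketch. `ToyWatchStore` and its method names are hypothetical and only mimic the shape of the read operations named above. No networking or session handling is modeled.

```python
from collections import defaultdict


class ToyWatchStore:
    """Hypothetical key-value store with one-shot watches on reads."""

    def __init__(self) -> None:
        self.data = {}
        self.watches = defaultdict(list)  # path -> callbacks waiting for a change

    def get_data(self, path, watcher=None):
        # A watch is registered as a side effect of the read.
        if watcher is not None:
            self.watches[path].append(watcher)
        return self.data.get(path)

    def set_data(self, path, value) -> None:
        self.data[path] = value
        # Remove the watchers before firing them: each fires at most once.
        for callback in self.watches.pop(path, []):
            callback(path)


store = ToyWatchStore()
events = []
store.get_data("/config", watcher=events.append)
store.set_data("/config", b"v1")  # fires the watch
store.set_data("/config", b"v2")  # no watch left, so no notification
```

A client that wants continuous updates re-reads the znode and sets a new watch inside its callback. Any change that happens between the notification and the new read is not reported separately, so the callback should treat the re-read value as the current state.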

ZooKeeper guarantees sequential consistency, atomicity, a single system image, reliability, and timeliness, ensuring that all clients see a consistent view within a bounded time.

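The core of ZooKeeper is the Zab (ZooKeeper Atomic Broadcast) protocol, which provides atomic broadcast through a scheme that resembles two-phase commit but needs acknowledgments from only a quorum of servers rather than from all of them. Servers assume roles (leader, follower, observer) and states (LOOKING, LEADING, FOLLOWING, OBSERVING). Leader election is often described as Paxos-like, with a "fast" and a "basic" variant; in current releases the default implementation is FastLeaderElection, and a candidate must collect votes from a quorum to become leader.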
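The quorum requirement can be sketched as follows. This is a deliberately collapsed illustration, not the real election: actual servers exchange votes over several rounds, whereas this hypothetical `elect` function just checks for a majority and picks the highest vote. Ordering votes by (epoch, zxid, server id) follows how ZooKeeper's fast leader election is commonly described. The names `Vote`, `has_quorum`, and `elect` are assumptions.

```python
from typing import List, NamedTuple, Optional


class Vote(NamedTuple):
    epoch: int  # election epoch of the candidate
    zxid: int   # last transaction id the candidate has seen
    sid: int    # server id, used as the final tie-breaker


def has_quorum(count: int, ensemble_size: int) -> bool:
    # A quorum is a strict majority of the voting ensemble.
    return count > ensemble_size // 2


def elect(reachable: List[Vote], ensemble_size: int) -> Optional[int]:
    """Return the winning server id, or None if no quorum is reachable."""
    if not has_quorum(len(reachable), ensemble_size):
        return None
    # Tuples compare field by field: epoch first, then zxid, then sid.
    return max(reachable).sid


votes = [Vote(1, 100, 1), Vote(1, 120, 2), Vote(1, 120, 3)]
winner = elect(votes, ensemble_size=5)          # 3 of 5 reachable: quorum
no_winner = elect(votes[:2], ensemble_size=5)   # 2 of 5 reachable: no quorum
```

Preferring the server with the most recent zxid means the new leader already holds the latest transactions any reachable server has seen, which keeps recovery simple.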

The leader's responsibilities include data recovery, maintaining heartbeats with followers, processing client requests, and coordinating proposals. Followers serve reads locally, forward client write requests to the leader, and handle proposals, commits, sync messages, and other protocol messages.

Zab ensures ordered execution of transactions across the ensemble: the leader sends a PROPOSAL, followers write it to disk and ACK, and the leader commits after receiving a quorum of ACKs. It also handles leader crashes by preserving committed transactions and discarding uncommitted proposals.
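The PROPOSAL, ACK, and commit flow above can be sketched in a few lines. This is a single-process illustration under stated assumptions: `ToyLeader` and `ToyFollower` are hypothetical classes, "writing to disk" is an in-memory list, and recovery after a leader crash is not modeled. The zxid layout (epoch in the high 32 bits, counter in the low 32 bits) matches how ZooKeeper transaction ids are structured.

```python
class ToyFollower:
    def __init__(self, up=True) -> None:
        self.up = up
        self.log = []        # proposals "written to disk"
        self.committed = []  # zxids this follower has committed

    def on_proposal(self, zxid, txn) -> bool:
        if not self.up:
            return False     # unreachable follower: no ACK
        self.log.append((zxid, txn))
        return True          # ACK after the write

    def on_commit(self, zxid) -> None:
        if self.up:
            self.committed.append(zxid)


class ToyLeader:
    def __init__(self, followers, epoch=1) -> None:
        self.followers = followers
        self.epoch = epoch
        self.counter = 0
        self.committed = []

    def broadcast(self, txn) -> bool:
        self.counter += 1
        zxid = (self.epoch << 32) | self.counter  # ordered across epochs
        acks = 1                                  # the leader's own write
        acks += sum(f.on_proposal(zxid, txn) for f in self.followers)
        ensemble = len(self.followers) + 1
        if acks <= ensemble // 2:
            return False                          # no quorum: stays uncommitted
        self.committed.append(zxid)
        for f in self.followers:
            f.on_commit(zxid)
        return True


followers = [ToyFollower(), ToyFollower(up=False)]
leader = ToyLeader(followers)
ok = leader.broadcast("set /a 1")        # 2 of 3 ACKs: committed
followers[0].up = False
stalled = leader.broadcast("set /a 2")   # 1 of 3 ACKs: not committed
```

Because the counter increases with every proposal and the epoch increases with every new leader, comparing two zxids is enough to order any two transactions.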

The article concludes with a brief recap of ZooKeeper’s principles, data model, session and watch mechanisms, consistency guarantees, leader election, role workflows, and the Zab protocol, followed by references.
