ZooKeeper Overview: Architecture, Data Model, Sessions, Watches, Consistency, and Leader Election

This article provides a comprehensive overview of ZooKeeper, covering its design goals, hierarchical data model, session handling, watch mechanism, consistency guarantees, leader election process, and the Zab protocol that underpins its distributed coordination capabilities.

Selected Java Interview Questions

Jul 11, 2020

ZooKeeper Overview

ZooKeeper is an open-source distributed application coordination service that provides a simple set of primitives for synchronization, configuration maintenance, and naming services.

Design Goals

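Consistency (single system image): A client sees the same view of the service regardless of which server it connects to. Reads may lag slightly behind the latest write, so this is not strict linearizability for reads.

Reliability: Once an update has been accepted and applied, it is propagated to all servers and persists until a client overwrites it.

Timeliness: Clients receive updates or failure notifications within a bounded time, but all clients are not guaranteed to see an update at the same instant. A client that needs the freshest data should call sync() before reading.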

Wait‑free: Slow or failed clients do not block fast clients.

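Atomicity: An update either succeeds completely or fails; there are no partial results.

Ordering: Updates are totally ordered across the ensemble, and requests from a single client are applied in the order in which that client sent them.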

Data Model

ZooKeeper maintains a hierarchical namespace similar to a file system, where each node (znode) is identified by a unique path.

Key characteristics of znodes:

Each child node is uniquely identified by its full path (e.g., /NameService/Server1).

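Znodes can both store data and have children; the exception is ephemeral znodes, which cannot have children.

Each znode is versioned: only the current data is stored, and the version number increments automatically on every update.

Node types: a znode is either persistent or ephemeral, and either kind can additionally be sequential (the server appends a monotonically increasing counter to the name) or non‑sequential.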

Znodes can be watched for data changes or child‑list changes.

Every state change generates a globally ordered zxid (ZooKeeper Transaction Id).
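The characteristics above can be sketched with a small in-memory model. This is a toy illustration, not the ZooKeeper API: the class name ToyZnodeTree, its methods, and the per-parent sequence counter are assumptions made for the example, and a simple local counter stands in for the zxid.

```python
import itertools


class ToyZnodeTree:
    """In-memory toy model of a znode namespace (hypothetical, not the ZooKeeper API)."""

    def __init__(self):
        self._zxid = itertools.count(1)  # stand-in for the globally ordered zxid
        self.last_zxid = 0
        # path -> node record; "seq" is the counter used for sequential children
        self.nodes = {"/": {"data": b"", "version": 0, "ephemeral": False, "seq": 0}}

    @staticmethod
    def _parent(path):
        return path.rsplit("/", 1)[0] or "/"

    def create(self, path, data=b"", ephemeral=False, sequential=False):
        parent = self._parent(path)
        if parent not in self.nodes:
            raise KeyError("parent does not exist: " + parent)
        if self.nodes[parent]["ephemeral"]:
            raise ValueError("ephemeral znodes cannot have children")
        if sequential:
            # Append a zero-padded counter kept by the parent.
            path = "%s%010d" % (path, self.nodes[parent]["seq"])
            self.nodes[parent]["seq"] += 1
        if path in self.nodes:
            raise KeyError("node exists: " + path)
        self.nodes[path] = {"data": data, "version": 0, "ephemeral": ephemeral, "seq": 0}
        self.last_zxid = next(self._zxid)  # every state change gets a new id
        return path

    def set_data(self, path, data, expected_version=-1):
        node = self.nodes[path]
        # -1 means "any version"; otherwise the update is conditional.
        if expected_version not in (-1, node["version"]):
            raise ValueError("version mismatch")
        node["data"] = data
        node["version"] += 1
        self.last_zxid = next(self._zxid)
        return node["version"]


tree = ToyZnodeTree()
tree.create("/NameService")
server = tree.create("/NameService/Server1", b"10.0.0.1", ephemeral=True)
tree.set_data(server, b"10.0.0.2")
```

The conditional update in set_data shows why versions matter: a writer that passes a stale expected_version is rejected instead of silently overwriting a newer value.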

Session

A client establishes a session with the ZooKeeper ensemble when it connects, and the session then moves between states such as CONNECTING and CONNECTED.

If the client loses its connection to a server (for example, because of a network timeout), it enters the CONNECTING state and tries to reconnect. If it reconnects within the session timeout, the state returns to CONNECTED; otherwise the session expires and its ephemeral znodes are removed.

Note: Session expiration is decided by the server, not the client.
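To make the server-side decision concrete, here is a minimal sketch of session tracking. It is a toy model under stated assumptions: ToySessionTracker and client_state_after_reconnect are hypothetical names, time is passed in explicitly as a number, and real servers track sessions in a more elaborate way.

```python
class ToySessionTracker:
    """Server-side view (hypothetical): a session expires when the server
    has not heard from the client within the session timeout."""

    def __init__(self):
        self.sessions = {}  # session_id -> (timeout, last_heard)

    def open(self, session_id, timeout, now):
        self.sessions[session_id] = (timeout, now)

    def expire(self, now):
        """Drop every session that has been silent for longer than its timeout."""
        expired = [sid for sid, (timeout, last) in self.sessions.items()
                   if now - last > timeout]
        for sid in expired:
            del self.sessions[sid]
        return expired

    def touch(self, session_id, now):
        """Record a heartbeat or request; False means the session is already gone."""
        self.expire(now)
        if session_id not in self.sessions:
            return False
        timeout, _ = self.sessions[session_id]
        self.sessions[session_id] = (timeout, now)
        return True


def client_state_after_reconnect(tracker, session_id, now):
    """The client only learns the outcome when it reaches a server again."""
    return "CONNECTED" if tracker.touch(session_id, now) else "EXPIRED"


tracker = ToySessionTracker()
tracker.open("s1", timeout=10, now=0)
print(client_state_after_reconnect(tracker, "s1", now=5))
print(client_state_after_reconnect(tracker, "s1", now=20))
```

In this sketch the client never decides expiration on its own: it can only ask the tracker, which mirrors the note above.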

Watch Mechanism

Watches are one‑time triggers that notify the client when the watched data changes.

Key points:

One‑time trigger: After a change, the watch fires once; the client must set a new watch for subsequent changes.

Sent to the client: Watch events are delivered asynchronously over the client's connection. ZooKeeper's ordering guarantee ensures that a client never sees the changed data before it has received the watch event for that change.

Watched data: Data watches (getData, exists) and child watches (getChildren) are separate.

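Write operations trigger the matching watches: setData fires data watches on the znode; create fires data watches on the new znode (set through exists) and child watches on its parent; delete fires data and child watches on the znode and child watches on its parent.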
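The one-time trigger behaviour and the split between data watches and child watches can be sketched as follows. This is a toy in-memory model, not the ZooKeeper client: ToyWatchedStore and parent_of are hypothetical names, callbacks run synchronously rather than over a socket, and get_children returns full paths for simplicity.

```python
from collections import defaultdict


def parent_of(path):
    return path.rsplit("/", 1)[0] or "/"


class ToyWatchedStore:
    """Hypothetical store with separate, one-shot data and child watches."""

    def __init__(self):
        self.data = {"/": b""}
        self.data_watches = defaultdict(list)   # path -> callbacks
        self.child_watches = defaultdict(list)  # path -> callbacks

    def _fire(self, table, path, event):
        # pop() removes the registered watches before calling them,
        # so each watch fires at most once.
        for callback in table.pop(path, []):
            callback(event, path)

    def get_data(self, path, watch=None):
        if watch is not None:
            self.data_watches[path].append(watch)
        return self.data[path]

    def get_children(self, path, watch=None):
        if watch is not None:
            self.child_watches[path].append(watch)
        return sorted(p for p in self.data if p != "/" and parent_of(p) == path)

    def create(self, path, value=b""):
        self.data[path] = value
        self._fire(self.child_watches, parent_of(path), "NodeChildrenChanged")

    def set_data(self, path, value):
        self.data[path] = value
        self._fire(self.data_watches, path, "NodeDataChanged")


events = []
store = ToyWatchedStore()
store.create("/config", b"v1")
store.get_data("/config", watch=lambda event, path: events.append((event, path)))
store.set_data("/config", b"v2")  # fires the watch
store.set_data("/config", b"v3")  # no watch left, nothing fires
```

After the two writes, events holds a single entry. A client that wants to keep following /config has to call get_data with a watch again, typically from inside its callback.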

Consistency Guarantees

ZooKeeper provides sequential consistency, atomicity, a single system image, reliability, and timeliness.

How ZooKeeper Works

Servers assume one of three roles (leader, follower, observer) and four states (looking, leading, following, observing). The core protocol is Zab (ZooKeeper Atomic Broadcast), which operates in recovery and broadcast modes.

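Leader election is often described as Paxos‑based, but ZooKeeper's default algorithm is its own fast leader election (FastLeaderElection). Each server in the looking state exchanges votes with its peers and prefers the candidate with the highest epoch, then the highest zxid, then the highest server ID; a candidate becomes leader once a quorum of servers agrees on it.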
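As a rough sketch of how candidates can be ranked, the example below assumes the ordering used by ZooKeeper's fast leader election: higher epoch first, then higher zxid, then higher server ID. It also assumes the usual zxid layout of an epoch in the high 32 bits and a counter in the low 32 bits. The function names are hypothetical, and the real algorithm runs several rounds of message exchange that are not modelled here.

```python
def make_zxid(epoch, counter):
    """Pack an epoch (high 32 bits) and a counter (low 32 bits) into one 64-bit id."""
    return (epoch << 32) | (counter & 0xFFFFFFFF)


def split_zxid(zxid):
    """Return (epoch, counter) for a packed zxid."""
    return zxid >> 32, zxid & 0xFFFFFFFF


def elect(proposals, ensemble_size):
    """Pick a leader from (epoch, zxid, server_id) proposals.

    Returns the winning server id, or None when fewer than a majority
    of the ensemble took part (no quorum, so no leader).
    """
    if len(proposals) <= ensemble_size // 2:
        return None
    # Tuple comparison gives exactly the order: epoch, then zxid, then server id.
    best = max(proposals)
    return best[2]


proposals = [
    (1, make_zxid(1, 5), 1),
    (1, make_zxid(1, 7), 2),
    (1, make_zxid(1, 7), 3),
]
print(elect(proposals, ensemble_size=5))
```

Preferring the highest zxid means the new leader has seen the most recent transactions among the voters, which keeps the amount of data to synchronize during recovery small.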

Leader Workflow

The leader recovers data, maintains heartbeats with followers, and processes follower messages (PING, REQUEST, ACK, REVALIDATE).

Follower Workflow

Followers send requests to the leader, handle leader messages, forward client write requests, and return results. Message types include PING, PROPOSAL, COMMIT, UPTODATE, REVALIDATE, SYNC.

Zab: Broadcasting State Updates

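Write requests are forwarded to the leader, which broadcasts them using a protocol that resembles two‑phase commit: the leader sends a PROPOSAL, followers log it and reply with an ACK, and once a quorum (a majority of the ensemble) has acknowledged, the leader sends COMMIT. Unlike classic two‑phase commit, the leader does not wait for every follower. The protocol guarantees ordered execution across the ensemble and handles leader crashes by ensuring that committed transactions are not lost and that proposals which were never committed are discarded.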
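The PROPOSAL, ACK, COMMIT exchange can be illustrated with a small simulation. It is a simplified sketch, not Zab itself: ToyLeader and ToyFollower are hypothetical classes, messages are plain method calls, and a leader that cannot reach a quorum simply returns False here instead of giving up leadership.

```python
class ToyFollower:
    def __init__(self, alive=True):
        self.alive = alive
        self.pending = {}    # zxid -> value, logged but not yet applied
        self.committed = []  # (zxid, value) in commit order

    def on_proposal(self, zxid, value):
        """Log the proposal and acknowledge it; a dead follower never answers."""
        if not self.alive:
            return False
        self.pending[zxid] = value
        return True

    def on_commit(self, zxid):
        if self.alive and zxid in self.pending:
            self.committed.append((zxid, self.pending.pop(zxid)))


class ToyLeader:
    def __init__(self, followers):
        self.followers = followers
        self.next_zxid = 1
        self.committed = []

    def broadcast(self, value):
        zxid = self.next_zxid
        self.next_zxid += 1
        # The leader counts its own acknowledgement.
        acks = 1 + sum(f.on_proposal(zxid, value) for f in self.followers)
        quorum = (len(self.followers) + 1) // 2 + 1  # majority of the ensemble
        if acks < quorum:
            return False
        self.committed.append((zxid, value))
        for follower in self.followers:
            follower.on_commit(zxid)
        return True


followers = [ToyFollower(), ToyFollower(), ToyFollower(alive=False), ToyFollower(alive=False)]
leader = ToyLeader(followers)
print(leader.broadcast("set /config v2"))  # 3 of 5 acknowledge, so this commits
```

With five servers the quorum is three, so the write above commits even though two followers are down. Because every commit carries an increasing zxid, all live followers apply updates in the same order.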

Summary

The article briefly introduces ZooKeeper’s principles, data model, session and watch mechanisms, consistency guarantees, leader election, leader/follower workflows, and the Zab protocol.
