
Mastering ZooKeeper: Core Concepts and Real-World Big Data Applications

This article explains ZooKeeper’s architecture, key concepts such as roles, sessions, ZNodes, versioning, ACLs, and watchers, and demonstrates how it powers essential big‑data components like Hadoop’s ResourceManager and HBase’s master election, naming service, and distributed locking.

ITPUB

Overview

ZooKeeper is an open‑source distributed coordination service originally created by Yahoo as an implementation of Google’s Chubby. It provides strong consistency, a hierarchical in‑memory data model (ZNodes), and a lightweight watch mechanism that enables data publish/subscribe, naming, leader election, distributed locks, and other coordination patterns.

Basic Concepts

Cluster Roles and Configuration

Leader – the only server that processes write requests.

Follower – serves read requests, forwards write requests to the leader, and votes on proposals and in leader election.

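Observer – serves read requests and forwards writes like a follower, but does not vote. To enable it, set peerType=observer in the observer's own zoo.cfg and append :observer to that server's entry in every node's zoo.cfg (e.g., server.1=localhost:2888:3888:observer).

Apart from the observer's peerType line, all servers can share the same zoo.cfg file; the other per‑node difference is the myid file in dataDir, which must contain the numeric identifier used in that node's server.{id}=... entry.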

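Run zkServer.sh status on a node to display its role; the output reports the mode as leader, follower, or observer. Some packaged distributions expose the same command as zookeeper-server status.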
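
To make the per-node differences concrete, here is a minimal sketch that renders the zoo.cfg text for a voting member and for an observer. The hostnames, the dataDir path, and the helper name render_zoo_cfg are assumptions for illustration; the timing values are common sample settings, not recommendations.

```python
def render_zoo_cfg(servers, observer_ids, my_id):
    """Return zoo.cfg text for the node whose id is my_id.

    servers: dict of server id -> hostname (hypothetical hosts).
    observer_ids: set of server ids that run as observers.
    """
    lines = [
        "tickTime=2000",
        "initLimit=10",
        "syncLimit=5",
        "dataDir=/var/lib/zookeeper",  # assumed path; the myid file lives here
        "clientPort=2181",
    ]
    # Only the observer's own file carries peerType=observer.
    if my_id in observer_ids:
        lines.append("peerType=observer")
    # Every node lists every server; observers get the :observer suffix.
    for sid in sorted(servers):
        suffix = ":observer" if sid in observer_ids else ""
        lines.append(f"server.{sid}={servers[sid]}:2888:3888{suffix}")
    return "\n".join(lines) + "\n"


servers = {
    1: "zk1.example.com",
    2: "zk2.example.com",
    3: "zk3.example.com",
    4: "zk4.example.com",
}
cfg_voter = render_zoo_cfg(servers, observer_ids={4}, my_id=1)
cfg_observer = render_zoo_cfg(servers, observer_ids={4}, my_id=4)
print(cfg_observer)
```

The two rendered files differ only in the peerType line; the myid file on each node would contain just that node's id (1 or 4 here).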

Session

A client establishes a long‑lived TCP connection (default port 2181). The session starts on connection and expires after sessionTimeout if the client cannot reconnect to any server in the ensemble.
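
The timeout a client requests is not necessarily the one it gets: the server bounds it between minSessionTimeout and maxSessionTimeout, which default to 2 and 20 times tickTime. The sketch below models only that clamping; the function name is hypothetical and no real client is involved.

```python
def negotiate_session_timeout(requested_ms, tick_time_ms=2000,
                              min_ms=None, max_ms=None):
    """Model how a server bounds the session timeout a client requests."""
    # Defaults follow the documented rule: 2x and 20x tickTime.
    min_ms = 2 * tick_time_ms if min_ms is None else min_ms
    max_ms = 20 * tick_time_ms if max_ms is None else max_ms
    return max(min_ms, min(requested_ms, max_ms))


print(negotiate_session_timeout(1000))    # 4000: raised to the minimum
print(negotiate_session_timeout(30000))   # 30000: within bounds
print(negotiate_session_timeout(120000))  # 40000: capped at the maximum
```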

ZNode

Data is stored in a tree of ZNodes identified by paths such as /hbase/master. ZNodes can be:

Persistent – remain until explicitly deleted.

Ephemeral (temporary) – automatically removed when the creating client’s session ends, whether it is closed explicitly or expires. Ephemeral nodes cannot have children.

The SEQUENTIAL flag can be added to a create request for either node type; ZooKeeper then appends a monotonically increasing, zero‑padded 10‑digit counter to the node name. The counter is maintained per parent node.
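
As a rough illustration of the naming scheme, the following in-memory sketch mimics how a sequential create turns a requested path into the final node name. The class SequentialNamer is a simplified stand-in, not the server's implementation; it keeps one counter per parent path and formats it as ten zero-padded digits.

```python
from collections import defaultdict


class SequentialNamer:
    """Toy model of SEQUENTIAL naming: one counter per parent path."""

    def __init__(self):
        self._next = defaultdict(int)

    def create(self, path):
        # The counter belongs to the parent, not to the name prefix.
        parent = path.rsplit("/", 1)[0] or "/"
        seq = self._next[parent]
        self._next[parent] += 1
        return f"{path}{seq:010d}"


namer = SequentialNamer()
first = namer.create("/locks/lock-")
second = namer.create("/locks/lock-")
other = namer.create("/queue/item-")
print(first, second, other)
```

Because the suffix is fixed-width, sorting the child names lexicographically also sorts them by creation order, which is what lock and queue recipes rely on.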

Versioning

Each ZNode has a Stat structure with three version counters:

version – data version (used for optimistic locking).

cversion – children version.

aversion – ACL version.
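
The optimistic-locking role of the data version can be sketched with a small in-memory model. A write succeeds only if the caller's expected version matches the node's current version (or is -1, meaning "any version"). VersionedNode and BadVersionError are illustrative names, not client-library classes.

```python
class BadVersionError(Exception):
    """Raised when the expected version does not match the current one."""


class VersionedNode:
    """Toy ZNode that tracks only data and its data version."""

    def __init__(self, data=b""):
        self.data = data
        self.version = 0

    def set_data(self, data, expected_version):
        # -1 skips the check; any other value must match exactly.
        if expected_version != -1 and expected_version != self.version:
            raise BadVersionError(
                f"expected {expected_version}, current {self.version}")
        self.data = data
        self.version += 1
        return self.version


node = VersionedNode(b"v0")
seen_by_a = node.version          # client A reads the node at version 0
node.set_data(b"from-B", 0)       # client B writes first; version is now 1
try:
    node.set_data(b"from-A", seen_by_a)  # A's write is based on stale state
    conflict = False
except BadVersionError:
    conflict = True
print(conflict, node.data, node.version)
```

On a conflict, the usual pattern is for the client to re-read the node, reapply its change to the fresh data, and retry with the new version.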

Transaction IDs (ZXID)

Every state‑changing operation receives a globally unique 64‑bit transaction ID (ZXID) that defines a total order of updates.
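
A ZXID is conventionally read as two 32-bit halves: the high half is the leader epoch and the low half is a counter that increases within that epoch. The helpers below are a small sketch of that decomposition; the function names are made up for this example.

```python
def split_zxid(zxid):
    """Split a 64-bit ZXID into (epoch, counter)."""
    return zxid >> 32, zxid & 0xFFFFFFFF


def make_zxid(epoch, counter):
    """Combine an epoch and a counter into a 64-bit ZXID."""
    return (epoch << 32) | counter


zxid = 0x500000003
epoch, counter = split_zxid(zxid)
print(epoch, counter)  # 5 3

# A transaction from a newer epoch always orders after an older one,
# regardless of the counter value.
print(make_zxid(6, 0) > make_zxid(5, 0xFFFFFFFF))  # True
```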

Watcher

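Clients can register a watcher on a ZNode. ZooKeeper sends a one‑time notification when the node is created or deleted, or when its data or children change, enabling asynchronous coordination. The notification carries only the event type and path, so the client must read the node again to see the new state and re‑register the watcher to receive further events.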
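
The one-time behaviour is easy to get wrong, so here is a toy in-memory model of it. WatchedNode is an illustrative class, not part of any client library; it drops each watcher after delivering a single event.

```python
class WatchedNode:
    """Toy ZNode whose watchers fire once and are then discarded."""

    def __init__(self, data):
        self.data = data
        self._watchers = []

    def get_data(self, watcher=None):
        # Reading with a watcher registers it for the next change only.
        if watcher is not None:
            self._watchers.append(watcher)
        return self.data

    def set_data(self, data):
        self.data = data
        fired, self._watchers = self._watchers, []
        for callback in fired:
            callback("NodeDataChanged")  # event type only, no payload


events = []
node = WatchedNode("v1")
node.get_data(watcher=events.append)
node.set_data("v2")  # the watcher fires once and is removed
node.set_data("v3")  # nothing is registered, so this change goes unnoticed
missed = list(events)

node.get_data(watcher=events.append)  # re-register while re-reading
node.set_data("v4")
print(missed, events)
```

The change to "v3" produced no event, which is why clients typically re-register the watcher in the same call that re-reads the data.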

Access Control List (ACL)

ZooKeeper defines five permissions: CREATE, READ, WRITE, DELETE, and ADMIN. CREATE and DELETE apply only to child nodes.
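
Internally the five permissions are bit flags that can be combined. The sketch below uses the numeric values from ZooKeeper's permission constants and renders a permission set in the compact cdrwa form that the command-line client prints; the helper names perm_string and allows are assumptions for illustration.

```python
# Bit values of the five permissions.
READ, WRITE, CREATE, DELETE, ADMIN = 1, 2, 4, 8, 16
ALL = READ | WRITE | CREATE | DELETE | ADMIN


def perm_string(perms):
    """Render a permission bitmask in the compact 'cdrwa' form."""
    flags = [(CREATE, "c"), (DELETE, "d"), (READ, "r"),
             (WRITE, "w"), (ADMIN, "a")]
    return "".join(ch for bit, ch in flags if perms & bit)


def allows(perms, needed):
    """Return True if every bit in needed is present in perms."""
    return (perms & needed) == needed


read_only = READ
print(perm_string(ALL))            # cdrwa
print(perm_string(READ | WRITE))   # rw
print(allows(read_only, WRITE))    # False
```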

Typical Use Cases

Configuration Center (Publish/Subscribe)

Small, frequently changing configuration data is stored in ZNodes. Clients register watchers on the configuration node; when the data changes, ZooKeeper pushes a notification and the client pulls the latest value.

Naming Service

Creating a sequential ZNode yields a globally unique name that can be used as a service identifier or RPC endpoint.

Distributed Coordination / Notification

Multiple processes register watchers on the same ZNode. Any change triggers a notification to all watchers, allowing real‑time coordination.

Master Election

Clients compete to create a designated ephemeral ZNode (e.g., /master_election). The client that succeeds becomes the master; the others set a watcher on the node and compete again when it is deleted, for example because the master's session expired.
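
The election logic can be sketched without a real ensemble. In the toy model below, Ensemble stands in for ZooKeeper and tracks which session owns each ephemeral node; all class and function names are hypothetical.

```python
class NodeExistsError(Exception):
    """Raised when the node to be created already exists."""


class Ensemble:
    """Toy stand-in for ZooKeeper that tracks ephemeral nodes only."""

    def __init__(self):
        self.ephemerals = {}  # path -> owning session id

    def create_ephemeral(self, path, session):
        if path in self.ephemerals:
            raise NodeExistsError(path)
        self.ephemerals[path] = session

    def expire_session(self, session):
        # Ephemeral nodes vanish together with their owner's session.
        self.ephemerals = {p: s for p, s in self.ephemerals.items()
                           if s != session}


def try_become_master(zk, session, path="/master_election"):
    """Return True if this session won the election."""
    try:
        zk.create_ephemeral(path, session)
        return True
    except NodeExistsError:
        return False


zk = Ensemble()
results = [try_become_master(zk, s) for s in ("A", "B", "C")]
zk.expire_session("A")              # the master fails
retry = try_become_master(zk, "B")  # a standby competes again
print(results, retry)
```

In a real deployment the standbys would not poll; the watcher notification for the deleted node is what triggers the retry.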

Distributed Lock

Exclusive (Write) Lock

Define a lock path, e.g., /exclusive_lock.

Each client attempts to create an ephemeral child node /exclusive_lock/lock. ZooKeeper guarantees that only one client succeeds, and that client holds the lock.

All other clients set a watcher on /exclusive_lock/lock so that they are notified when the node is deleted, either because the holder released the lock or because its session ended, and can then try again.

Shared (Read) Lock

Each client creates its own ephemeral node under /shared_lock/. The lock is held in shared mode as long as at least one such node exists, and a client may acquire an exclusive lock only after all shared lock nodes have been removed. Common recipes make these nodes sequential and encode the request type (read or write) in the node name so that waiting clients can be ordered.
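
One common way to implement shared and exclusive locks together is the sequential-node recipe, which differs slightly from the single-node scheme described above: every request creates an ephemeral sequential child named read-… or write-…, and each client decides from the child list whether it holds the lock. The function below sketches only that decision; the node names and the helper can_acquire are assumptions for illustration.

```python
def seq_of(name):
    """Extract the sequence number from a name like 'read-0000000003'."""
    return int(name.rsplit("-", 1)[1])


def can_acquire(children, mine):
    """Decide whether the request node 'mine' currently holds the lock.

    A write request needs to be the lowest-numbered child.
    A read request only needs no write request ahead of it.
    """
    my_seq = seq_of(mine)
    earlier = [c for c in children if seq_of(c) < my_seq]
    if mine.startswith("write-"):
        return not earlier
    return not any(c.startswith("write-") for c in earlier)


children = ["read-0000000000", "read-0000000001",
            "write-0000000002", "read-0000000003"]
print(can_acquire(children, "read-0000000001"))   # True: only readers ahead
print(can_acquire(children, "write-0000000002"))  # False: readers still hold it
print(can_acquire(children, "read-0000000003"))   # False: a writer is ahead

# Once both earlier readers release, the writer is first in line.
remaining = ["write-0000000002", "read-0000000003"]
print(can_acquire(remaining, "write-0000000002"))  # True
```

A client that cannot acquire the lock would watch the node it is waiting on and re-run this check when notified.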

ZooKeeper in Large‑Scale Systems

Hadoop

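ZooKeeper provides high availability for the HDFS NameNode and the YARN ResourceManager. Both elect the active instance through an ephemeral lock node, for example /yarn-leader-election/appcluster-yarn/ActiveStandbyElectorLock. The ResourceManager that successfully creates this node becomes Active; the others remain Standby and register a watcher on the node to detect failover. The sibling node ActiveBreadCrumb is persistent: it records which instance was last active so that a stale active instance can be fenced.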

YARN RMStateStore can be persisted in ZooKeeper under /rmstore, with sub‑nodes like /rmstore/RMAppRoot and /rmstore/RMDTSecretManagerRoot.

HBase

Master Election & HA – works like Hadoop’s leader election: the active HMaster holds an ephemeral node (e.g., /hbase/master), and backup masters watch it.

RegionServer Fault Detection – each RegionServer creates an ephemeral node under /hbase/rs/[hostname]. HMaster watches this path; deletion of a child node indicates a RegionServer failure.

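Meta Region Location – /hbase/meta-region-server stores the address of the RegionServer hosting the hbase:meta region. (Older releases tracked a separate -ROOT- region, which was removed in HBase 0.96.) Watchers on this node keep clients aware of the current location.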

Region State Management – Region transitions (offline/online) are coordinated through ZNodes; the state is visible to the whole cluster.

Distributed SplitWAL – HMaster creates a persistent node /hbase/SplitWAL containing a list of WAL split tasks. RegionServers claim tasks by updating this node, enabling parallel log recovery.

Summary

ZooKeeper’s strong consistency, hierarchical ZNode model, and one‑time watch notifications make it a versatile backbone for coordination, configuration management, naming, leader election, and distributed locking in large‑scale data platforms such as Hadoop and HBase.
