Mastering ZooKeeper: Modes, Roles, Data Model, and Docker-Compose Cluster Setup

This article introduces Apache ZooKeeper’s core concepts—including its three deployment modes, node roles, hierarchical data model, and common use cases—then provides a step‑by‑step guide to building a ZooKeeper cluster with Docker‑Compose, configuring the environment, and performing basic CLI operations.

MaGe Linux Operations

What is ZooKeeper

ZooKeeper is an Apache top‑level project that provides efficient, highly available distributed coordination services for applications, offering features such as data publish/subscribe, load balancing, naming service, distributed coordination/notification, and distributed locks. It is widely used in large systems like Hadoop, HBase, Kafka, and Dubbo.

ZooKeeper Running Modes

Standalone mode: suitable for development and testing environments where resources are limited and high stability is not required.

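Cluster mode: a ZooKeeper ensemble typically consists of three or more machines, usually an odd number; each server maintains its state in memory and communicates with the others, and the service stays available as long as a majority of the servers are up.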

Pseudo‑cluster mode: all servers run on a single machine using different ports, allowing multiple ZooKeeper instances to provide cluster‑like services without additional hardware.
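
The preference for three or more servers in cluster mode follows from majority voting: an ensemble can keep serving only while more than half of its voting servers are reachable. The sketch below is plain arithmetic rather than ZooKeeper code, and the function names are made up for illustration.

```python
def quorum_size(voters: int) -> int:
    """Smallest number of voting servers that forms a strict majority."""
    if voters < 1:
        raise ValueError("an ensemble needs at least one voting server")
    return voters // 2 + 1


def tolerated_failures(voters: int) -> int:
    """How many voting servers can be down while a majority is still up."""
    return voters - quorum_size(voters)


if __name__ == "__main__":
    # Print ensemble size, quorum size, and tolerated failures side by side.
    for n in (1, 3, 4, 5):
        print(n, quorum_size(n), tolerated_failures(n))
```

A four-server ensemble tolerates the same single failure as a three-server one, which is why odd sizes are the usual choice.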

ZooKeeper Roles

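Leader: processes all write requests, initiates proposals and decides the outcome of the votes on them, and updates system state.

Follower: serves client read requests directly, forwards write requests to the leader, and votes on proposals and in leader elections.

Observer: receives client connections and forwards write requests to the leader but does not vote, so observers add read throughput without adding voting overhead.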
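
As a rough illustration of how the three roles divide the work, the sketch below maps a role and a request type to what the server does with it. It is a toy model, not ZooKeeper code; the names Role, handle, and votes are assumptions made for this example.

```python
from enum import Enum


class Role(Enum):
    LEADER = "leader"
    FOLLOWER = "follower"
    OBSERVER = "observer"


def handle(role: Role, is_write: bool) -> str:
    """Describe what a server in the given role does with a client request."""
    if not is_write:
        # Every role answers reads from its own copy of the data.
        return "serve from local replica"
    if role is Role.LEADER:
        return "propose to voters and commit on majority ack"
    # Followers and observers never apply a write on their own.
    return "forward to leader"


def votes(role: Role) -> bool:
    """Only the leader and followers take part in voting."""
    return role is not Role.OBSERVER
```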

ZooKeeper Data Model

Hierarchical directory structure similar to a Linux file system.

Each node is called a Znode and is identified by a unique path.

Znodes can contain data and child nodes; EPHEMERAL nodes cannot have children.

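Each Znode carries a version number that increases with every change to its data; only the latest data is kept, and an update or delete can specify the expected version so that it is applied only if the node has not changed in the meantime.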

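Clients can set watches on nodes; a standard watch fires once when the node changes and must be re-registered to receive further notifications.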

Read/write operations are atomic; partial reads/writes are not supported.
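
To make the data model concrete, here is a minimal in-memory sketch of a Znode tree with unique paths, parent/child structure, and a per-node version counter that guards writes. ToyTree and BadVersion are hypothetical names for this illustration and are not the ZooKeeper client API; the only detail borrowed from ZooKeeper is the convention that an expected version of -1 means "do not check".

```python
class BadVersion(Exception):
    """Raised when the caller's expected version does not match the stored one."""


class ToyTree:
    """In-memory stand-in for a znode tree; not the ZooKeeper client API."""

    def __init__(self):
        # path -> [data, version]; the root always exists
        self._nodes = {"/": [b"", 0]}

    def create(self, path, data=b""):
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self._nodes:
            raise KeyError(f"parent {parent} does not exist")
        if path in self._nodes:
            raise KeyError(f"{path} already exists")
        self._nodes[path] = [data, 0]

    def get(self, path):
        data, version = self._nodes[path]
        return data, version

    def set(self, path, data, expected_version=-1):
        node = self._nodes[path]
        # -1 means "do not check", mirroring ZooKeeper's convention
        if expected_version != -1 and expected_version != node[1]:
            raise BadVersion(path)
        node[0] = data   # the whole value is replaced, never patched
        node[1] += 1
        return node[1]

    def children(self, path):
        prefix = path.rstrip("/") + "/"
        return sorted(
            p[len(prefix):] for p in self._nodes
            if p != "/" and p.startswith(prefix) and "/" not in p[len(prefix):]
        )
```

A writer that read version 0 and calls set with expected_version=0 succeeds only if nobody else updated the node first; a second writer still holding version 0 gets BadVersion and has to re-read.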

ZooKeeper Node Types

PERSISTENT: remains until explicitly deleted.

EPHEMERAL: disappears when the client session that created it ends.

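SEQUENTIAL: a flag that can be combined with PERSISTENT or EPHEMERAL; ZooKeeper appends a monotonically increasing sequence number to the node name.

Nodes created without the SEQUENTIAL flag get no sequence number.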
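
The node types can be illustrated with a toy parent node that tracks who owns each child and hands out sequence suffixes. ToyParent is a hypothetical class, not the ZooKeeper client API, and its counter is simplified to advance only on sequential creates; the 10-digit zero-padded suffix does match the format ZooKeeper uses.

```python
class ToyParent:
    """Toy parent node that names and tracks its children; not the real client API."""

    def __init__(self):
        self._counter = 0
        self.children = {}  # child name -> owning session id, or None if persistent

    def create(self, name, *, owner=None, sequential=False):
        """Create a child; `owner` marks it ephemeral, `sequential` adds a suffix."""
        if sequential:
            # 10-digit, zero-padded suffix (counter handling simplified here)
            name = f"{name}{self._counter:010d}"
            self._counter += 1
        if name in self.children:
            raise KeyError(f"{name} already exists")
        self.children[name] = owner
        return name

    def close_session(self, session_id):
        """Remove every ephemeral child owned by the closed session."""
        for name in [n for n, o in self.children.items() if o == session_id]:
            del self.children[name]
```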

Typical Application Scenarios

ZooKeeper keeps its replicas consistent with ZAB (ZooKeeper Atomic Broadcast), a consensus protocol in the same family as Paxos, which makes it suitable for configuration centers, distributed log collection, service naming (e.g., Dubbo), load balancing, distributed notifications/coordination, distributed locks, and distributed queues.

Configuration Center (Publish/Subscribe)

Publishers write configuration data to ZooKeeper nodes; subscribers watch those nodes to receive real‑time updates, such as global settings or service address lists.
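
The publish/subscribe flow can be sketched with a toy config node whose watches fire once, so the subscriber re-reads the data and re-arms its watch on every notification. ToyConfigNode and Subscriber are hypothetical names and the config strings are made up; a real client would go through a ZooKeeper client library instead.

```python
class ToyConfigNode:
    """One config znode with one-shot watches; a toy, not the ZooKeeper client API."""

    def __init__(self, data):
        self._data = data
        self._watchers = []

    def get(self, watch=None):
        """Return the current data and optionally register a one-time callback."""
        if watch is not None:
            self._watchers.append(watch)
        return self._data

    def set(self, data):
        self._data = data
        # Watches fire once and are cleared; subscribers must re-register.
        pending, self._watchers = self._watchers, []
        for callback in pending:
            callback()


class Subscriber:
    """Keeps a local history of every config value it has observed."""

    def __init__(self, node):
        self._node = node
        self.seen = []
        self._refresh()

    def _refresh(self):
        # The notification carries no data, so re-read and re-arm in one call.
        self.seen.append(self._node.get(watch=self._refresh))
```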

Distributed Log Collection

Each application gets a path in ZooKeeper, and the IP of every machine running that application is registered as a child node under it; when machines are added or removed, the log collectors see the change and redistribute their collection tasks.

Service Naming (Dubbo Example)

Providers write their URLs to /dubbo/${serviceName}/providers; consumers subscribe to this path and also register under /dubbo/${serviceName}/consumers. All registrations use EPHEMERAL nodes, so a provider or consumer that goes offline is removed when its session ends and subscribers watching the path detect the change automatically.
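
A minimal in-memory sketch of the provider side of this layout is shown below: registrations are tied to a session and vanish when that session expires. ToyRegistry is a hypothetical class, and the service name and URLs in the usage are invented; this is neither Dubbo's nor ZooKeeper's API.

```python
class ToyRegistry:
    """In-memory model of /dubbo/<service>/providers; not Dubbo's or ZooKeeper's API."""

    def __init__(self):
        self._providers = {}  # service name -> {provider url: session id}

    def register(self, service, url, session_id):
        """Record a provider URL as an ephemeral entry owned by the session."""
        self._providers.setdefault(service, {})[url] = session_id

    def lookup(self, service):
        """Return the provider URLs currently registered for the service."""
        return sorted(self._providers.get(service, {}))

    def expire_session(self, session_id):
        """Drop every entry created by the session, as ephemeral nodes would be."""
        for urls in self._providers.values():
            for url in [u for u, s in urls.items() if s == session_id]:
                del urls[url]


if __name__ == "__main__":
    reg = ToyRegistry()
    reg.register("com.example.DemoService", "dubbo://10.0.0.1:20880", "session-a")
    reg.register("com.example.DemoService", "dubbo://10.0.0.2:20880", "session-b")
    reg.expire_session("session-a")  # provider 10.0.0.1 went offline
    print(reg.lookup("com.example.DemoService"))
```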

Distributed Notification / Coordination

Watchers on a Znode receive notifications when the node or its children change, allowing systems to react to data updates in real time.

Distributed Lock

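Exclusive lock: clients attempt to create the same designated Znode; because a path is unique, only one create succeeds and that client holds the lock, while the others watch the node and retry once it is deleted.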

Sequenced lock: clients create EPHEMERAL_SEQUENTIAL nodes under a lock parent; the node with the smallest sequence number obtains the lock, and when it is released (or the holder's session ends) the next node in line is notified.
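
The decision each contender makes in the sequenced lock can be sketched as a pure function over the list of child names. It assumes names end in the 10-digit suffix ZooKeeper appends to sequential nodes, and it has each waiter watch only its immediate predecessor, as in the commonly used lock recipe; the "lock-" prefix and the function names are illustrative.

```python
def seq(name):
    """Extract the numeric suffix ZooKeeper appended to a sequential node name."""
    return int(name[-10:])


def lock_state(my_node, children):
    """Return (holds_lock, node_to_watch) for one contender."""
    ordered = sorted(children, key=seq)
    index = ordered.index(my_node)
    if index == 0:
        # Smallest sequence number: this client holds the lock.
        return True, None
    # Watch only the immediate predecessor to avoid waking every waiter.
    return False, ordered[index - 1]
```

When the predecessor node disappears, the waiter calls lock_state again with the fresh child list and either takes the lock or picks a new node to watch.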

Distributed Queue

Two types: a simple FIFO queue and a barrier queue that starts processing only after a predefined number of participants have joined.
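
Both queue types reduce to simple rules over the children of a queue node, sketched below under the assumption that FIFO items are sequential nodes (consumed in sequence order) and that the barrier opens once enough participants have registered. The names fifo_order and ToyBarrier are hypothetical, and nothing here talks to a real ZooKeeper server.

```python
def fifo_order(children):
    """Order queue items by the 10-digit sequence suffix on their names."""
    return sorted(children, key=lambda name: int(name[-10:]))


class ToyBarrier:
    """Opens only after `size` distinct participants have joined."""

    def __init__(self, size):
        self.size = size
        self.members = set()

    def join(self, member):
        """Register a participant and report whether the barrier is now open."""
        self.members.add(member)
        return len(self.members) >= self.size
```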

Setting Up a ZooKeeper Cluster with Docker‑Compose

Create a directory containing docker-compose.yml:

├── docker-compose.yml

docker-compose.yml content:

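version: '3.4'

services:
  zoo1:
    image: zookeeper
    restart: always
    hostname: zoo1
    ports:
      # host port 2181 -> container client port 2181
      - 2181:2181
    environment:
      # unique server ID within the ensemble
      ZOO_MY_ID: 1
      # ensemble members: server.N=host:quorum-port:election-port;client-port
      ZOO_SERVERS: server.1=0.0.0.0:2888:3888;2181 server.2=zoo2:2888:3888;2181 server.3=zoo3:2888:3888;2181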

  zoo2:
    image: zookeeper
    restart: always
    hostname: zoo2
    ports:
      - 2182:2181
    environment:
      ZOO_MY_ID: 2
      ZOO_SERVERS: server.1=zoo1:2888:3888;2181 server.2=0.0.0.0:2888:3888;2181 server.3=zoo3:2888:3888;2181

  zoo3:
    image: zookeeper
    restart: always
    hostname: zoo3
    ports:
      - 2183:2181
    environment:
      ZOO_MY_ID: 3
      ZOO_SERVERS: server.1=zoo1:2888:3888;2181 server.2=zoo2:2888:3888;2181 server.3=0.0.0.0:2888:3888;2181

Run docker-compose up in that directory to start the three ZooKeeper containers (add -d to run them in the background).

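The official zookeeper image generates zoo.cfg from the ZOO_* environment variables when no configuration file is mounted, so the defaults work for this setup. For a hand-written zoo.cfg, the main parameters are tickTime (the base time unit, in milliseconds), initLimit and syncLimit (timeouts expressed in ticks), dataDir, clientPort, and the server list server.N=IP:A:B, where N is the server ID, A is the port followers use to connect to the leader, and B is the port used for leader election.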
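
To show how the server entries break down, the sketch below parses a ZOO_SERVERS string like the ones in the compose file into its fields. parse_servers and the dictionary keys are names chosen for this example; the official image does its own processing, and this helper only illustrates the format.

```python
def parse_servers(zoo_servers):
    """Split a ZOO_SERVERS string into per-server fields keyed by server ID."""
    servers = {}
    for entry in zoo_servers.split():
        key, value = entry.split("=", 1)          # "server.1", "zoo1:2888:3888;2181"
        server_id = int(key.split(".", 1)[1])
        # The ";client-port" suffix is optional.
        address, _, client_port = value.partition(";")
        host, quorum_port, election_port = address.split(":")
        servers[server_id] = {
            "host": host,
            "quorum_port": int(quorum_port),      # followers connect to the leader here
            "election_port": int(election_port),  # used during leader election
            "client_port": int(client_port) if client_port else None,
        }
    return servers


if __name__ == "__main__":
    spec = "server.1=zoo1:2888:3888;2181 server.2=zoo2:2888:3888;2181"
    for sid, fields in parse_servers(spec).items():
        print(sid, fields)
```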

Connect to the cluster using the client CLI:

./bin/zkCli.sh -server 127.0.0.1:2181

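Basic CLI commands:

ls / : list the children of the root node.

create /zk myData : create the Znode /zk with the data myData.

get /zk : retrieve the node's data (in ZooKeeper 3.5 and later, use get -s /zk to also print the metadata).

delete /zk : remove the Znode.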

Further articles will demonstrate code implementations for the scenarios described above.
