
Understanding Apache ZooKeeper: Concepts, Modes, Data Model, and Practical Docker‑Compose Deployment

This article introduces Apache ZooKeeper as a high‑availability coordination service, explains its three deployment modes, core roles, hierarchical data model and node types, outlines common use cases such as configuration management, naming service, distributed locks and queues, and provides a step‑by‑step Docker‑Compose setup with example commands and configuration files.

Top Architect

What is ZooKeeper

ZooKeeper is an Apache top‑level project that provides efficient, highly available distributed coordination services such as data publish/subscribe, load balancing, naming, distributed notifications, and distributed locks. It is widely used in large systems like Hadoop, HBase, Kafka, and Dubbo.

Running Modes

Standalone mode: a single server instance, suitable for development and testing when resources are limited.

Cluster mode: a production‑grade ensemble of three or more machines, each maintaining server state and communicating with peers.

Pseudo‑cluster mode: multiple ZooKeeper instances run on a single machine using different ports, useful when only one powerful host is available.
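For a quick standalone instance (the first mode above), the official Docker image can be used. This is a sketch, assuming Docker is installed and host port 2181 is free; the container name zk-standalone is arbitrary:

```shell
# Start a single standalone ZooKeeper node (development/testing only)
docker run --name zk-standalone -p 2181:2181 -d zookeeper

# Tail the server log to confirm startup
docker logs -f zk-standalone
```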

ZooKeeper Roles

Leader: initiates and resolves proposals (votes), processes write requests, and updates system state.

Follower: handles client read requests, forwards write requests to the leader, and votes in proposals and leader elections.

Observer: receives client connections and forwards writes to the leader but does not vote, improving read scalability.
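Once an ensemble is running, each server reports its current role. Two common ways to check it (a sketch: assumes nc is available, and uses the service names from the compose file later in this article; srvr is in ZooKeeper's default four‑letter‑word whitelist):

```shell
# Ask a server for its status; output includes "Mode: leader" or "Mode: follower"
echo srvr | nc localhost 2181 | grep Mode

# Or, with the docker-compose setup below, ask inside a container
docker-compose exec zoo1 zkServer.sh status
```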

Data Model

ZooKeeper stores data in a hierarchical namespace similar to a Linux file system. Each node, called a znode, has a unique path, can hold both data and children (EPHEMERAL znodes cannot have children), keeps version numbers for its data, and allows watchers to be set on it. Reads and writes are atomic on the whole znode: a read returns all of its data, and a write replaces all of it.

Node Types

PERSISTENT: remains until explicitly deleted.

EPHEMERAL: deleted automatically when the client session that created it ends; ephemeral znodes cannot have children.

SEQUENTIAL: a flag that can be combined with either type; the server appends a monotonically increasing, zero‑padded 10‑digit counter to the znode name.
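In the ZooKeeper CLI (shown later in this article), these node types map to flags on the create command. A sketch, assuming the parent paths /workers and /queue already exist:

```shell
# PERSISTENT (default): survives until deleted explicitly
create /config v1

# EPHEMERAL: -e; removed automatically when this CLI session ends
create -e /workers/w1 10.0.0.5

# SEQUENTIAL: -s; the server appends a zero-padded counter to the name,
# e.g. /queue/item-0000000000
create -s /queue/item- task
```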

Typical Application Scenarios

Configuration Center (Publish/Subscribe): services publish configuration data to znodes; clients watch those znodes and receive dynamic updates when the data changes.

Distributed Log Collection: log collectors register their IPs under a common path, enabling real‑time task redistribution when machines join or leave.

Load Balancing: soft (client‑side) load balancing, where multiple service instances register themselves and each client selects one from the registered list; often used in messaging systems.

Naming Service: frameworks like Dubbo store service address lists in ZooKeeper paths, allowing dynamic service discovery.

Distributed Locks: for an exclusive lock, all clients race to create the same lock znode and the one that succeeds holds the lock; ordered locks use EPHEMERAL_SEQUENTIAL nodes to enforce acquisition order.

Distributed Queues: FIFO queues map to sequential znodes; synchronization (barrier) queues wait until a predefined number of participants are present before processing.
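The ordered-lock recipe above can be observed directly from two CLI sessions. A sketch using a hypothetical /locks parent znode, which must exist beforehand:

```shell
# Each contender creates an ephemeral sequential node under /locks:
create -e -s /locks/lock-   # session A -> e.g. /locks/lock-0000000000
create -e -s /locks/lock-   # session B -> e.g. /locks/lock-0000000001

# List the contenders; the session owning the lowest-numbered node holds
# the lock. Each other session watches the node immediately before its
# own and retries when that node disappears (i.e. its session ends).
ls /locks
```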

Deploying a ZooKeeper Cluster with Docker‑Compose

The following docker-compose.yml defines three ZooKeeper containers, each exposing a different host port (2181-2183) and setting the required environment variables ZOO_MY_ID and ZOO_SERVERS:

version: '3.4'

services:
  zoo1:
    image: zookeeper
    restart: always
    hostname: zoo1
    ports:
      - 2181:2181
    environment:
      ZOO_MY_ID: 1
      ZOO_SERVERS: server.1=0.0.0.0:2888:3888;2181 server.2=zoo2:2888:3888;2181 server.3=zoo3:2888:3888;2181

  zoo2:
    image: zookeeper
    restart: always
    hostname: zoo2
    ports:
      - 2182:2181
    environment:
      ZOO_MY_ID: 2
      ZOO_SERVERS: server.1=zoo1:2888:3888;2181 server.2=0.0.0.0:2888:3888;2181 server.3=zoo3:2888:3888;2181

  zoo3:
    image: zookeeper
    restart: always
    hostname: zoo3
    ports:
      - 2183:2181
    environment:
      ZOO_MY_ID: 3
      ZOO_SERVERS: server.1=zoo1:2888:3888;2181 server.2=zoo2:2888:3888;2181 server.3=0.0.0.0:2888:3888;2181

Run docker-compose up in the directory containing the file to start the three‑node ensemble.
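A quick way to verify that the ensemble formed correctly (a sketch: docker-compose exec addresses containers by service name, and zkServer.sh is on the PATH in the official image):

```shell
# Start the ensemble in the background
docker-compose up -d

# All three containers should be listed as "Up"
docker-compose ps

# One node should report "Mode: leader", the other two "Mode: follower"
docker-compose exec zoo1 zkServer.sh status
docker-compose exec zoo2 zkServer.sh status
docker-compose exec zoo3 zkServer.sh status
```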

Connecting to ZooKeeper

After the cluster is up, use the ZooKeeper CLI to interact with it. zkCli.sh ships in ZooKeeper's bin directory; with the compose setup above it can also be run inside a container via docker-compose exec zoo1 zkCli.sh:

./zkCli.sh -server 127.0.0.1:2181

Typical commands:

ls / – list top‑level znodes.

create /zk myData – create a new znode with data.

get /zk – retrieve the data and metadata of the znode.

delete /zk – remove the znode.

The CLI output shows session establishment, node metadata (cZxid, mtime, version numbers, etc.), and confirms successful operations.
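Watches, which underpin the configuration-center scenario above, can also be tried from the CLI. In ZooKeeper 3.5+ the -w flag sets a one-shot watch:

```shell
# Session 1: read /zk and register a watch on its data
get -w /zk

# Session 2: change the data
set /zk newData

# Session 1 now receives a WatchedEvent of type NodeDataChanged;
# the watch is one-shot and must be re-registered to fire again.
```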

Next Steps

Subsequent posts will demonstrate code implementations for the use cases listed above. The source project is available on GitHub (github.com/modouxiansheng/about-docker/tree/master/ZooKeeper) and can be launched in two steps: clone the repository and run docker-compose up inside the ZooKeeper directory.

Tags: configuration management, ZooKeeper, cluster, distributed coordination, znode, docker-compose
Written by Top Architect

Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, as well as evolving architectures with internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.
