Understanding Apache ZooKeeper: Concepts, Modes, Data Model, and Practical Docker‑Compose Deployment
This article introduces Apache ZooKeeper as a high-availability coordination service; explains its three deployment modes, core roles, hierarchical data model, and node types; outlines common use cases such as configuration management, naming services, distributed locks, and queues; and walks through a step-by-step Docker-Compose setup with example commands and configuration files.
What is ZooKeeper
ZooKeeper is an Apache top‑level project that provides efficient, highly available distributed coordination services such as data publish/subscribe, load balancing, naming, distributed notifications, and distributed locks. It is widely used in large systems like Hadoop, HBase, Kafka, and Dubbo.
Running Modes
Standalone mode: a single server, suitable for development and testing when resources are limited.
Cluster mode: a production-grade ensemble of three or more machines, each maintaining server state and communicating with its peers.
Pseudo-cluster mode: multiple ZooKeeper instances run on a single machine using different ports, useful when a single powerful host is available.
ZooKeeper Roles
Leader: initiates and resolves votes and applies updates to system state.
Follower: handles client requests, returns results, and votes in leader elections.
Observer: accepts client connections and forwards write requests to the leader, but does not vote, improving read scalability without enlarging the quorum.
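Leader election and writes both require a majority of the voting members, which is why ensembles are deployed with an odd number of servers. The arithmetic can be sketched in a few lines of plain Python (no ZooKeeper required):

```python
# Majority quorum: a ZooKeeper ensemble of n voting members stays
# available as long as floor(n/2) + 1 of them can reach each other.
def quorum(n: int) -> int:
    return n // 2 + 1

def tolerated_failures(n: int) -> int:
    return n - quorum(n)

for n in (3, 4, 5):
    print(n, quorum(n), tolerated_failures(n))
# A 3-node and a 4-node ensemble both tolerate only 1 failure,
# so an even-sized ensemble adds cost without adding fault tolerance.
```

Observers are excluded from this count: adding them raises read capacity without changing n.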
Data Model
ZooKeeper stores data in a hierarchical namespace similar to a Linux file system. Each node, called a znode, has a unique path, can hold both data and children (EPHEMERAL znodes cannot have children), keeps version numbers that increment on every update, and lets clients set watches that fire when the node changes. Reads and writes replace or return a znode's data as a whole, atomically.
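The versioned, whole-node read/write semantics can be illustrated with a toy in-memory store. This is a sketch only, not the real server protocol; the class name ZnodeStore is invented for illustration:

```python
# Toy model of ZooKeeper's data model: each path maps to a znode
# holding a data blob and a version that increments on every write.
# Real ZooKeeper also replaces the whole blob atomically; there are
# no partial reads or writes within a znode.
class ZnodeStore:
    def __init__(self):
        self._nodes = {}  # path -> (data, version)

    def create(self, path: str, data: bytes) -> None:
        if path in self._nodes:
            raise ValueError(f"node exists: {path}")
        self._nodes[path] = (data, 0)

    def get(self, path: str):
        return self._nodes[path]  # returns (data, version)

    def set(self, path: str, data: bytes, version: int) -> int:
        _, current = self._nodes[path]
        if version != current:               # optimistic concurrency check,
            raise ValueError("bad version")  # analogous to BadVersion errors
        self._nodes[path] = (data, current + 1)
        return current + 1

store = ZnodeStore()
store.create("/app/config", b"v1")
store.set("/app/config", b"v2", version=0)
print(store.get("/app/config"))  # (b'v2', 1)
```

The version check is what lets ZooKeeper clients implement compare-and-set updates: a write with a stale version is rejected rather than silently overwriting a concurrent change.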
Node Types
PERSISTENT: remains until explicitly deleted.
EPHEMERAL: deleted automatically when the creating client's session ends; cannot have children.
SEQUENTIAL: appends a monotonically increasing sequence number to the node name; combines with the other two types as PERSISTENT_SEQUENTIAL and EPHEMERAL_SEQUENTIAL.
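The sequence number appended to SEQUENTIAL znodes is maintained by the parent znode and zero-padded to ten digits, so lexicographic order matches creation order. A small illustration in plain Python (the counter here stands in for the parent's real counter):

```python
# ZooKeeper formats the sequence suffix as a 10-digit zero-padded
# counter kept by the parent znode, e.g. /queue/task-0000000003.
def sequential_name(prefix: str, counter: int) -> str:
    return f"{prefix}{counter:010d}"

names = [sequential_name("/queue/task-", i) for i in (0, 1, 2, 10)]
print(names)
# Zero-padding means sorted() order equals creation order,
# which is what the lock and queue recipes below rely on:
assert names == sorted(names)
```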
Typical Application Scenarios
Configuration Center (Publish/Subscribe): services publish configuration data to znodes; clients set watches on those znodes to receive dynamic updates when the configuration changes.
Distributed Log Collection: log collectors register their IPs under a common path, enabling real-time task redistribution when machines join or leave.
Load Balancing: multiple service instances register themselves under one path and clients select one, a soft load-balancing pattern often used in messaging systems.
Naming Service: frameworks like Dubbo store service address lists under ZooKeeper paths, allowing dynamic discovery.
Distributed Locks: an exclusive lock is implemented by racing to create a single lock znode; ordered (fair) locks use EPHEMERAL_SEQUENTIAL nodes to enforce acquisition order.
Distributed Queues: FIFO queues map entries to sequential znodes; synchronized queues wait until a predefined number of participants are present before processing.
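The ordering trick behind the sequential-lock recipe: each client creates an EPHEMERAL_SEQUENTIAL child under the lock path, holds the lock only if its node has the smallest sequence number, and otherwise watches the next-smaller node. The decision step can be sketched in plain Python, operating on child names as a getChildren call would list them (no server involved):

```python
# Given the children of a lock znode and the name of the node this
# client created, decide whether the client holds the lock, and if
# not, which predecessor it should watch.
def lock_decision(children: list[str], mine: str):
    ordered = sorted(children)  # sequence suffixes sort lexicographically
    idx = ordered.index(mine)
    if idx == 0:
        return True, None           # smallest sequence number: lock held
    return False, ordered[idx - 1]  # watch only the immediate predecessor

children = ["lock-0000000007", "lock-0000000005", "lock-0000000006"]
print(lock_decision(children, "lock-0000000005"))  # (True, None)
print(lock_decision(children, "lock-0000000007"))  # (False, 'lock-0000000006')
```

Watching only the immediate predecessor, rather than the lock znode itself, avoids a thundering herd: when a lock is released, exactly one waiting client is notified. Because the nodes are EPHEMERAL, a crashed client's lock is released automatically when its session ends.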
Deploying a ZooKeeper Cluster with Docker‑Compose
The following docker-compose.yml defines three ZooKeeper containers, each exposing a different host port (2181-2183) and setting the required environment variables ZOO_MY_ID and ZOO_SERVERS:
```yaml
version: '3.4'
services:
  zoo1:
    image: zookeeper
    restart: always
    hostname: zoo1
    ports:
      - 2181:2181
    environment:
      ZOO_MY_ID: 1
      ZOO_SERVERS: server.1=0.0.0.0:2888:3888;2181 server.2=zoo2:2888:3888;2181 server.3=zoo3:2888:3888;2181
  zoo2:
    image: zookeeper
    restart: always
    hostname: zoo2
    ports:
      - 2182:2181
    environment:
      ZOO_MY_ID: 2
      ZOO_SERVERS: server.1=zoo1:2888:3888;2181 server.2=0.0.0.0:2888:3888;2181 server.3=zoo3:2888:3888;2181
  zoo3:
    image: zookeeper
    restart: always
    hostname: zoo3
    ports:
      - 2183:2181
    environment:
      ZOO_MY_ID: 3
      ZOO_SERVERS: server.1=zoo1:2888:3888;2181 server.2=zoo2:2888:3888;2181 server.3=0.0.0.0:2888:3888;2181
```

Run docker-compose up in the directory containing the file to start the three-node ensemble.
Connecting to ZooKeeper
After the cluster is up, use the ZooKeeper CLI to interact with it:
```shell
./zkCli.sh -server 127.0.0.1:2181
```

Typical commands:
ls / – list top‑level znodes.
create /zk myData – create a new znode with data.
get /zk – retrieve the data and metadata of the znode.
delete /zk – remove the znode.
The CLI output shows session establishment, node metadata (cZxid, mtime, version numbers, etc.), and confirms successful operations.
Next Steps
Subsequent posts will demonstrate code implementations for the listed use cases. The source project is available on GitHub (github.com/modouxiansheng/about-docker/tree/master/ZooKeeper) and can be launched in two steps: clone the repository and run docker-compose up inside the ZooKeeper directory.
Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.