Mastering Zookeeper: From Basics to Advanced Coordination in Distributed Systems

This article provides a comprehensive guide to Zookeeper, covering its role in high‑concurrency distributed environments, core concepts, installation steps, key features such as ordering, replication, and watches, as well as practical command usage and session management.

Programmer DD

Jan 2, 2020

After several Spring‑focused posts, I now turn to the vast topic of high‑concurrency distributed development, starting with Zookeeper as a fundamental coordination tool.

1. Challenges in a concurrent environment

Running multiple processes across a cluster raises questions such as keeping configuration consistent, handling node failures, scaling without restarting, and coordinating writes to shared network files.

2. Introduction to Zookeeper

① Origin of the name

Many Apache projects in the Hadoop ecosystem are named after animals or use animal mascots (Hadoop's elephant, Hive, Pig). Zookeeper takes its name from its job: coordinating the actions of these “animals”.

② Overview

Zookeeper is a high‑performance coordination service for distributed applications. Data is held in memory and persisted to disk through a transaction log and periodic snapshots. It exposes a tree‑like namespace used for configuration management, service registration, distributed locks, and similar tasks. A cluster (ensemble) stays available only while a majority (quorum) of its servers is running and able to communicate.
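The quorum requirement comes down to simple arithmetic. The following is a minimal sketch, assuming a plain majority quorum; the function names are illustrative and not part of any Zookeeper API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble must contain at least one server")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can fail while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

Under this model a 4-server ensemble tolerates no more failures than a 3-server one, which is why ensembles are usually sized with an odd number of servers.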

③ Installation (Linux)

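1. Install a JDK. Version 1.6 or higher is the minimum stated for older releases; newer Zookeeper releases require Java 8 or later, so check the requirements of the version you download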
2. Download: https://archive.apache.org/dist/zookeeper/zookeeper-3.5.2/zookeeper-3.5.2.tar.gz
3. After extracting the archive, create a zoo.cfg file in the conf directory (for example, by copying the bundled zoo_sample.cfg)
4. Start the server: bin/zkServer.sh start
5. Test client connection: bin/zkCli.sh -server 127.0.0.1:2181

Key zoo.cfg parameters:
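- tickTime=2000: the basic time unit in milliseconds, used for heartbeats and for deriving session timeouts
- dataDir: directory for snapshots of the in‑memory database and, unless dataLogDir is set, the transaction log
- clientPort: port on which the server listens for client connections (2181 in the example above)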
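To make the three parameters concrete, here is a small sketch that embeds a sample zoo.cfg and reads it back as key/value pairs. The dataDir path is a hypothetical choice, and parse_zoo_cfg is an illustrative helper, not something shipped with Zookeeper.

```python
SAMPLE_ZOO_CFG = """\
# Basic time unit in milliseconds
tickTime=2000
# Where snapshots are stored (hypothetical path; pick your own)
dataDir=/var/lib/zookeeper
# Port on which clients connect
clientPort=2181
"""


def parse_zoo_cfg(text: str) -> dict:
    """Parse simple key=value lines, skipping blank lines and # comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


config = parse_zoo_cfg(SAMPLE_ZOO_CFG)
print(config)
```

The text in SAMPLE_ZOO_CFG can be saved as conf/zoo.cfg for a single-server test setup.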

④ Features

1. Simple data structure

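Zookeeper stores data in a hierarchical tree of znodes. Unlike a file system entry, a znode can both hold data like a file and have children like a directory. Node names must be unique under the same parent, nodes are always addressed by absolute path, and the data stored in a single node is limited in size (1 MB by default).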

2. Data model characteristics

The namespace is hierarchical, similar to a Unix file system, and every znode is addressed by an absolute path. There are four znode types: persistent, persistent‑sequential, ephemeral, and ephemeral‑sequential.

3. Naming conventions

Node names may use any Unicode character except the null character (\u0000), control characters (\u0001-\u001F and \u007F-\u009F), and the reserved ranges \ud800-\uF8FF and \uFFF0-\uFFFF. “.” can appear within a name but “.” and “..” are not allowed as standalone path components, and “zookeeper” is a reserved name.
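These rules can be expressed as a small path checker. This is a simplified sketch written for illustration, not Zookeeper's own validation code; validate_path and is_valid_path are hypothetical names, and the check for the reserved “zookeeper” name is left out.

```python
def validate_path(path: str) -> None:
    """Raise ValueError if `path` breaks the naming rules described above."""
    if not path.startswith("/"):
        raise ValueError("path must be absolute (start with '/')")
    if path == "/":
        return  # the root itself is always valid
    if path.endswith("/"):
        raise ValueError("path must not end with '/'")
    for component in path[1:].split("/"):
        if component == "":
            raise ValueError("empty node name (consecutive '/')")
        if component in (".", ".."):
            raise ValueError("'.' and '..' are not allowed as node names")
        for ch in component:
            code = ord(ch)
            if code == 0:
                raise ValueError("null character is not allowed")
            if 0x01 <= code <= 0x1F or 0x7F <= code <= 0x9F:
                raise ValueError("control characters are not allowed")
            if 0xD800 <= code <= 0xF8FF or 0xFFF0 <= code <= 0xFFFF:
                raise ValueError("character is in a reserved range")


def is_valid_path(path: str) -> bool:
    """Boolean wrapper around validate_path."""
    try:
        validate_path(path)
        return True
    except ValueError:
        return False


print(is_valid_path("/app/config.v1"))
print(is_valid_path("/app/../secrets"))
```

The first call passes because “.” appears inside a name; the second fails because “..” is used as a whole path component.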

4. Some commands

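Typical zkCli commands include ls / to list the children of the root, create /zk 123 to create the node /zk with the data 123, and get /zk to read it back. A node can be created only if its parent already exists, so /zk must be created before /zk/child.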
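The parent-must-exist rule is easy to see in a toy model. The class below is an in-memory stand-in written for illustration only; it is not a Zookeeper client, and ToyTree and its methods are hypothetical names that loosely mirror the create and ls commands.

```python
class ToyTree:
    """In-memory stand-in for a znode tree; not a real Zookeeper client."""

    def __init__(self):
        self.nodes = {"/": b""}  # maps absolute path -> node data

    @staticmethod
    def _parent(path: str) -> str:
        head = path.rsplit("/", 1)[0]
        return head or "/"

    def create(self, path: str, data: bytes = b"") -> None:
        if path in self.nodes:
            raise KeyError(f"node already exists: {path}")
        parent = self._parent(path)
        if parent not in self.nodes:
            raise KeyError(f"parent does not exist: {parent}")
        self.nodes[path] = data

    def ls(self, path: str) -> list:
        if path not in self.nodes:
            raise KeyError(f"no such node: {path}")
        return sorted(
            p.rsplit("/", 1)[1]
            for p in self.nodes
            if p != "/" and self._parent(p) == path
        )


tree = ToyTree()
tree.create("/zk", b"123")
tree.create("/zk/child", b"x")
print(tree.ls("/"))
print(tree.ls("/zk"))
try:
    tree.create("/missing/child", b"y")
except KeyError as err:
    print("rejected:", err)
```

The last create is rejected because /missing was never created, which mirrors the behaviour described for the real command.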

5. Ordered feature

Zxid: a globally ordered transaction ID for each write request.
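A zxid is a 64-bit number whose high 32 bits hold the leader epoch and whose low 32 bits hold a counter within that epoch. The sketch below shows the bit arithmetic; split_zxid and make_zxid are illustrative helper names.

```python
def split_zxid(zxid: int) -> tuple:
    """Return (epoch, counter) from a 64-bit zxid."""
    epoch = zxid >> 32
    counter = zxid & 0xFFFFFFFF
    return epoch, counter


def make_zxid(epoch: int, counter: int) -> int:
    """Combine an epoch and a counter into a zxid."""
    return (epoch << 32) | counter


zxid = 0x500000003
print(split_zxid(zxid))  # (5, 3)
```

Because the epoch occupies the high bits, any zxid from a later epoch compares greater than every zxid from an earlier one, which is what gives writes a single global order.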

Version numbers: dataVersion, cversion, and aclVersion track changes to data, children, and ACLs respectively.
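One common use of dataVersion is a conditional update: a write carries the version the client last read, and is rejected if the node has changed since. The class below is a toy model of that check, not the real client API; ToyZnode and BadVersionError are hypothetical names, and -1 is treated as "match any version", as it is in Zookeeper.

```python
class BadVersionError(Exception):
    """Raised when the expected version does not match the current one."""


class ToyZnode:
    """Toy znode that tracks only its data and dataVersion."""

    def __init__(self, data: bytes):
        self.data = data
        self.data_version = 0

    def set_data(self, data: bytes, expected_version: int = -1) -> int:
        # -1 means "do not check the version"
        if expected_version != -1 and expected_version != self.data_version:
            raise BadVersionError(
                f"expected {expected_version}, current {self.data_version}"
            )
        self.data = data
        self.data_version += 1
        return self.data_version


node = ToyZnode(b"v0")
node.set_data(b"v1", expected_version=0)      # succeeds, version becomes 1
try:
    node.set_data(b"stale", expected_version=0)  # based on an outdated read
except BadVersionError as err:
    print("rejected:", err)
print(node.data, node.data_version)
```

The second write fails because it was based on version 0, while the node had already moved to version 1.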

Ticks: servers use tickTime to define the timing of events such as session timeouts. The minimum session timeout is twice the tickTime.

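Real time: Zookeeper does not rely on wall‑clock time for ordering; it uses it only to put timestamps into a znode's stat structure on creation and modification.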

6. Replicable feature

Data is replicated across the ensemble, providing fault tolerance and eliminating single points of failure.

7. Fast feature

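Zookeeper offers low latency and high throughput, especially in read‑dominant workloads, because any server can answer reads from its in‑memory copy of the data. This makes it suitable for large‑scale distributed systems.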

3. Zookeeper theory

① Session mechanism

1. A client establishes a session with a unique session ID assigned by Zookeeper.
2. The client sends periodic heartbeats to keep the session alive.
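3. If the server receives no heartbeat within the session timeout, the session expires. The timeout is negotiated when the session is created and, by default, is limited to between 2×tickTime and 20×tickTime.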
4. Requests within a session are processed in FIFO order.
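The server does not simply accept whatever timeout a client asks for: by default it clamps the request to the range 2×tickTime to 20×tickTime. The sketch below reproduces that arithmetic under the default bounds; negotiate_session_timeout is an illustrative name, not a Zookeeper function.

```python
def negotiate_session_timeout(requested_ms: int, tick_time_ms: int = 2000) -> int:
    """Clamp a requested session timeout to the default server bounds."""
    min_timeout = 2 * tick_time_ms   # default minimum session timeout
    max_timeout = 20 * tick_time_ms  # default maximum session timeout
    return max(min_timeout, min(requested_ms, max_timeout))


for requested in (1000, 30000, 60000):
    print(requested, "->", negotiate_session_timeout(requested))
```

With tickTime=2000, a request for 1000 ms is raised to 4000 ms, 30000 ms is accepted as is, and 60000 ms is cut down to 40000 ms.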

② Znode data composition

Node data: basic information (state, config, location, etc.)
Node metadata: data returned by the stat command
Data size limit: 1 MB per znode by default

③ Znode node types

1. Persistent node: created with create path value
2. Ephemeral node: created with create -e path value
3. Sequential node: created with create -s path value; -s can be combined with -e to create an ephemeral‑sequential node

Notes:
- Ephemeral nodes are removed when the session ends.
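- Sequential nodes receive a 10‑digit, zero‑padded decimal suffix appended to the requested name; the counter is a signed 32‑bit integer and overflows when incremented past 2 147 483 647.
- Sequential nodes persist after the session ends unless they were also created as ephemeral (-e).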
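The suffix format and the overflow can be reproduced in a few lines. This is a sketch of the naming arithmetic only, assuming a signed 32-bit counter as described above; sequential_name and next_counter are hypothetical helpers, and the path /locks/lock- is an invented example.

```python
INT_MAX = 2_147_483_647
INT_MIN = -2_147_483_648


def sequential_name(prefix: str, counter: int) -> str:
    """Append the counter as a 10-digit, zero-padded decimal suffix."""
    return f"{prefix}{counter:010d}"


def next_counter(counter: int) -> int:
    """Increment with signed 32-bit wrap-around."""
    return INT_MIN if counter == INT_MAX else counter + 1


print(sequential_name("/locks/lock-", 7))
print(sequential_name("/locks/lock-", next_counter(INT_MAX)))
```

The first name ends in 0000000007. After the wrap-around the counter is negative, so the suffix is no longer a plain 10-digit number; the limit is far too high to matter in most deployments.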

④ Watch mechanism

Clients can set watches on znodes to be notified when a node is created, deleted, or changed, or when its children change. Watches have two defining properties:

1. One‑time: a watch fires once and is then removed, so continuous monitoring requires re‑registration.
2. Ordering: a client receives the watch notification before it can see the changed data.

Points to keep in mind when using watches:

1. Latency: there is a delay between a change and the delivery of its notification, and a further delay before the client can re‑register. Changes made in that window are not reported, so a client cannot rely on seeing every intermediate update.
2. A single watch object registered through several operations (e.g., exists and getData) on the same znode is invoked only once per event.
3. Connectivity: while a client is disconnected from the server, it receives no notifications until the connection is re‑established.
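The one-time behaviour and its consequence, missed intermediate changes, can be demonstrated with a toy node. This is an in-memory illustration rather than the real client API; WatchedNode and on_event are hypothetical names, and the event string merely imitates a Zookeeper event type.

```python
class WatchedNode:
    """Toy znode whose watchers fire once and are then dropped."""

    def __init__(self, data: bytes):
        self.data = data
        self._watchers = []

    def get_data(self, watcher=None) -> bytes:
        # Registering the same watcher twice still yields one notification.
        if watcher is not None and watcher not in self._watchers:
            self._watchers.append(watcher)
        return self.data

    def set_data(self, data: bytes) -> None:
        self.data = data
        # Clear the list before notifying: each watch is a one-time trigger.
        pending, self._watchers = self._watchers, []
        for watcher in pending:
            watcher("NodeDataChanged")


events = []


def on_event(event_type: str) -> None:
    events.append(event_type)


node = WatchedNode(b"v1")
node.get_data(watcher=on_event)
node.set_data(b"v2")             # fires the watch
node.set_data(b"v3")             # no watch registered: change goes unreported
node.get_data(watcher=on_event)  # re-register
node.set_data(b"v4")             # fires again
print(events)
```

Three changes were made but only two notifications arrived, so a client that needs the current state should read the data again when it re-registers rather than count events.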

⑤ Zookeeper properties

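1. Sequential consistency: updates from a client are applied in the order in which they were sent.
2. Atomicity: an update either succeeds completely or fails; there are no partial results.
3. Single system image: a client sees the same view of the service regardless of which server it connects to.
4. Reliability: once an update has been applied, it persists until a client overwrites it.
5. Timeliness: a client's view is guaranteed to be up to date within a bounded time; a read may briefly return stale data, so it is not guaranteed to reflect the very latest write.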

Through the above discussion, we have gained an initial understanding of Zookeeper. Future posts will cover distributed locks, cluster management, and various application scenarios.

Written by

Programmer DD

A tinkering programmer and author of "Spring Cloud Microservices in Action"
