Mastering Zookeeper: Installation, Configuration, and Real-World Use Cases
This article provides a comprehensive guide to Zookeeper, covering its purpose in distributed systems, step‑by‑step installation for both standalone and cluster modes, detailed configuration options, core data model, essential Java APIs, and typical scenarios such as naming service, configuration management, leader election, locks, and queue handling.
Installation and Configuration Details
Zookeeper 3.2.2 is used as an example; newer versions can be downloaded from the official Apache site. Installation is straightforward for both single‑node and cluster deployments.
Standalone Mode
Download the Zookeeper archive, extract it (e.g., to /home/zookeeper-3.2.2), and start the service with the zkServer.sh script in the bin directory. This release does not include a Windows startup script, so on Windows a custom batch file is required.
Listing 1. Windows Zookeeper startup script
setlocal
set ZOOCFGDIR=%~dp0%..\conf
set ZOO_LOG_DIR=%~dp0%..
set ZOO_LOG4J_PROP=INFO,CONSOLE
set CLASSPATH=%ZOOCFGDIR%
set CLASSPATH=%~dp0..\*;%~dp0..\lib\*;%CLASSPATH%
set CLASSPATH=%~dp0..\build\classes;%~dp0..\build\lib\*;%CLASSPATH%
set ZOOCFG=%ZOOCFGDIR%\zoo.cfg
set ZOOMAIN=org.apache.zookeeper.server.ZooKeeperServerMain
java "-Dzookeeper.log.dir=%ZOO_LOG_DIR%" "-Dzookeeper.root.logger=%ZOO_LOG4J_PROP%" -cp "%CLASSPATH%" %ZOOMAIN% "%ZOOCFG%" %*
endlocal
Before starting, rename zoo_sample.cfg in the conf directory to zoo.cfg and edit the essential parameters:
tickTime=2000
dataDir=D:/devtools/zookeeper-3.2.2/build
clientPort=2181
tickTime : the basic time unit in milliseconds; it sets the heartbeat interval between servers and between client and server.
dataDir : directory where Zookeeper stores its snapshots and, unless dataLogDir is set separately, its transaction logs.
clientPort : TCP port on which the server listens for client connections.
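As a rough illustration of how these three settings fit together, the sketch below reads them with java.util.Properties, which can parse the key=value lines of zoo.cfg. The class and method names (ZooCfgReader, parse, and the accessors) are hypothetical helpers written for this article and are not part of Zookeeper.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper (not part of Zookeeper): reads the three basic settings.
class ZooCfgReader {
    // Parse the text of a zoo.cfg file into a Properties object.
    static Properties parse(String cfgText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(cfgText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // tickTime is expressed in milliseconds.
    static int tickTimeMs(Properties props) {
        return Integer.parseInt(props.getProperty("tickTime").trim());
    }

    static String dataDir(Properties props) {
        return props.getProperty("dataDir").trim();
    }

    static int clientPort(Properties props) {
        return Integer.parseInt(props.getProperty("clientPort").trim());
    }
}
```

Calling ZooCfgReader.parse on the three lines shown above and then clientPort on the result yields 2181. Note that Properties treats a backslash as an escape character, which is one reason the Windows dataDir above is written with forward slashes.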
After configuring, start Zookeeper and verify that the clientPort is listening (e.g., using netstat -ano).
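The same check can be scripted. The sketch below simply attempts a TCP connection to the client port; PortProbe and isListening are hypothetical names, and a successful connection only shows that something is listening on the port, not that the Zookeeper server behind it is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: reports whether a TCP port accepts connections.
class PortProbe {
    static boolean isListening(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            // connect() throws if nothing accepts the connection in time
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

For the configuration above, PortProbe.isListening("localhost", 2181, 1000) would be the equivalent of looking for port 2181 in the netstat output.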
Cluster Mode
Zookeeper can run as a multi‑node ensemble. In addition to the three basic settings, the following entries are required:
initLimit=5
syncLimit=2
server.1=192.168.211.1:2888:3888
server.2=192.168.211.2:2888:3888
initLimit : maximum number of ticks (tickTime intervals) a follower may take to connect to and synchronize with the leader during initialization.
syncLimit : maximum number of ticks allowed between a request and its response in leader–follower communication.
server.A=B:C:D : defines each server’s ID (A), IP address (B), peer communication port (C), and election port (D).
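To make the units concrete, the following sketch converts the tick-based limits into milliseconds and splits a server.A=B:C:D line into its parts. EnsembleConfig and its members are hypothetical names used only for illustration. The quorumSize method assumes the usual majority rule (more than half of the servers), which is worth keeping in mind when sizing an ensemble.

```java
// Hypothetical helper: interprets the cluster-mode entries of zoo.cfg.
class EnsembleConfig {
    // One parsed "server.A=B:C:D" entry.
    static final class ServerEntry {
        final int id;
        final String host;
        final int peerPort;
        final int electionPort;

        ServerEntry(int id, String host, int peerPort, int electionPort) {
            this.id = id;
            this.host = host;
            this.peerPort = peerPort;
            this.electionPort = electionPort;
        }
    }

    static ServerEntry parseServer(String line) {
        String[] keyValue = line.trim().split("=", 2);
        if (keyValue.length != 2 || !keyValue[0].startsWith("server.")) {
            throw new IllegalArgumentException("not a server entry: " + line);
        }
        int id = Integer.parseInt(keyValue[0].substring("server.".length()));
        String[] parts = keyValue[1].split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected host:port:port in " + line);
        }
        return new ServerEntry(id, parts[0],
                Integer.parseInt(parts[1]), Integer.parseInt(parts[2]));
    }

    // initLimit and syncLimit are counted in ticks; multiply by tickTime.
    static long limitMs(int tickTimeMs, int ticks) {
        return (long) tickTimeMs * ticks;
    }

    // Majority quorum: more than half of the ensemble.
    static int quorumSize(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }
}
```

With tickTime=2000, limitMs(2000, 5) gives 10000 ms for initLimit and limitMs(2000, 2) gives 4000 ms for syncLimit. Under the majority assumption, the two-server example above needs both servers to be up, which is why ensembles are commonly sized with an odd number of nodes.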
Each server also needs a myid file placed in dataDir containing its numeric ID (the same A value used above).
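Creating the myid file is usually done by hand or by a provisioning script; the sketch below shows one possible way to do it from Java. MyIdFile is a hypothetical helper, and the only convention it relies on is the one stated above: a file named myid inside dataDir that contains the server's numeric ID.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: writes and reads the myid file in dataDir.
class MyIdFile {
    static Path write(Path dataDir, int serverId) {
        try {
            Files.createDirectories(dataDir);
            Path myid = dataDir.resolve("myid");
            // The file holds nothing but the numeric ID as text
            Files.write(myid, Integer.toString(serverId).getBytes(StandardCharsets.US_ASCII));
            return myid;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static int read(Path dataDir) {
        try {
            byte[] raw = Files.readAllBytes(dataDir.resolve("myid"));
            return Integer.parseInt(new String(raw, StandardCharsets.US_ASCII).trim());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

On the machine configured as server.2, for example, MyIdFile.write would be called with that machine's dataDir and the value 2.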
Data Model
Zookeeper maintains a hierarchical namespace similar to a file system. Each node (znode) is identified by a unique path.
Figure 1. Zookeeper data structure
Key characteristics of znodes:
Each znode is uniquely identified by its path (e.g., /NameService/Server1).
Znodes can have children and store data; EPHEMERAL nodes cannot have children.
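Data is versioned: each znode carries a version number that increases with every data update, so a write can be made conditional on the expected version.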
Ephemeral nodes are automatically removed when the client session ends.
Sequential nodes receive an auto‑incremented suffix.
Watchers can monitor data changes or child‑node modifications and receive notifications.
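The versioning rule can be illustrated with a small in-memory model. VersionedNode below is a toy class written for this article, not the Zookeeper client: it only mimics the rule that an update succeeds when the expected version matches the current one, or when the caller passes -1 to skip the check. The real client signals a mismatch with KeeperException.BadVersionException rather than a boolean.

```java
// Toy in-memory model of a znode's data version (not the Zookeeper client).
class VersionedNode {
    private byte[] data;
    private int version = 0;

    VersionedNode(byte[] initialData) {
        this.data = initialData.clone();
    }

    // Applies the update only if expectedVersion matches, or is -1 (any version).
    synchronized boolean setData(byte[] newData, int expectedVersion) {
        if (expectedVersion != -1 && expectedVersion != version) {
            return false; // stale writer: someone else updated the node first
        }
        data = newData.clone();
        version++;
        return true;
    }

    synchronized int version() {
        return version;
    }

    synchronized byte[] data() {
        return data.clone();
    }
}
```

Two writers that both read version 0 illustrate the point: the first setData(..., 0) succeeds and moves the node to version 1, while the second setData(..., 0) is rejected and has to re-read before retrying.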
How to Use Zookeeper
Zookeeper solves consistency problems in distributed applications by providing a reliable, hierarchical namespace and a set of watch‑based notifications.
Common API Overview
Clients create a ZooKeeper instance and invoke methods such as create, exists, delete, getChildren, setData, getData, addAuthInfo, and setACL. Watchers can be attached to monitor state changes.
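One detail that often surprises newcomers is that a watch set through these methods fires once and must be registered again to receive further notifications. The toy registry below models only that one-shot behavior; OneShotWatches is a hypothetical class and does not use the Zookeeper client.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Toy model of one-shot watches: each registration is delivered at most once.
class OneShotWatches {
    private final Map<String, List<Consumer<String>>> watches = new HashMap<>();

    synchronized void watch(String path, Consumer<String> callback) {
        watches.computeIfAbsent(path, key -> new ArrayList<>()).add(callback);
    }

    // Notifies and clears every watch on the path; returns how many fired.
    int fire(String path) {
        List<Consumer<String>> toNotify;
        synchronized (this) {
            toNotify = watches.remove(path);
        }
        if (toNotify == null) {
            return 0;
        }
        for (Consumer<String> callback : toNotify) {
            callback.accept(path);
        }
        return toNotify.size();
    }
}
```

After one call to fire("/testRootPath") the registered callback has run and been removed, so a second call notifies nobody until watch is called again. This is why watcher code typically re-reads the node, and re-registers the watch, inside the callback.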
Basic Operations Example
The following code demonstrates connecting to a server, creating nodes, reading data, updating nodes, and cleaning up.
Listing 2. Basic Zookeeper operations
// Create a connection. CLIENT_PORT and ClientBase.CONNECTION_TIMEOUT are
// assumed to be defined elsewhere; substitute your own port and a session
// timeout in milliseconds.
ZooKeeper zk = new ZooKeeper("localhost:" + CLIENT_PORT,
        ClientBase.CONNECTION_TIMEOUT, new Watcher() {
            // Default watcher: receives events for calls made with watch=true
            public void process(WatchedEvent event) {
                System.out.println("Event triggered: " + event.getType());
            }
        });
// Create a persistent node
zk.create("/testRootPath", "testRootData".getBytes(), Ids.OPEN_ACL_UNSAFE,
        CreateMode.PERSISTENT);
// Create a child node
zk.create("/testRootPath/testChildPathOne", "testChildDataOne".getBytes(),
        Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
// Read the root node's data without setting a watch
System.out.println(new String(zk.getData("/testRootPath", false, null)));
// List the children and watch for child changes
System.out.println(zk.getChildren("/testRootPath", true));
// Update the child's data; version -1 matches any version
zk.setData("/testRootPath/testChildPathOne", "modifyChildDataOne".getBytes(), -1);
System.out.println("Node status: [" + zk.exists("/testRootPath", true) + "]");
// Create a second child, then read it back
zk.create("/testRootPath/testChildPathTwo", "testChildDataTwo".getBytes(),
        Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
System.out.println(new String(zk.getData("/testRootPath/testChildPathTwo", true, null)));
// Delete children before the parent: a node with children cannot be deleted
zk.delete("/testRootPath/testChildPathTwo", -1);
zk.delete("/testRootPath/testChildPathOne", -1);
zk.delete("/testRootPath", -1);
// Close the session
zk.close();
Sample output shows event notifications and the results of each operation.
Typical Zookeeper Use Cases
Zookeeper’s observer‑based design enables several common distributed patterns.
Unified Naming Service
Provides a hierarchical namespace for unique, human‑readable names, similar to JNDI but more general.
Configuration Management
Store configuration data in znodes; all services watch the node and automatically reload when the data changes.
Figure 2. Configuration management architecture
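On the service side, the reload step might look like the sketch below. It assumes the configuration is stored in the znode as key=value text, which is a choice made for this example rather than anything Zookeeper prescribes. ReloadableConfig is a hypothetical class; in a real service, onDataChanged would be called from the watcher after re-reading the znode.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical holder for configuration that is replaced when the znode changes.
class ReloadableConfig {
    private final AtomicReference<Properties> current =
            new AtomicReference<>(new Properties());

    // Parse the new znode content and swap it in atomically.
    void onDataChanged(byte[] znodeData) {
        Properties fresh = new Properties();
        try {
            fresh.load(new StringReader(new String(znodeData, StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        current.set(fresh);
    }

    String get(String key, String fallback) {
        return current.get().getProperty(key, fallback);
    }
}
```

Swapping the whole Properties object through an AtomicReference means readers see either the old or the new configuration, never a half-applied mix of the two.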
Group Membership & Leader Election
Servers create EPHEMERAL (or EPHEMERAL_SEQUENTIAL) nodes under a common parent. The node with the smallest sequence number becomes the leader; when it disappears, a new leader is elected automatically.
Figure 3. Group membership structure
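The "smallest sequence number wins" rule can be sketched without any Zookeeper calls, given the child names returned for the common parent. The sketch assumes every child was created with a sequential mode, so its name ends in the 10-digit zero-padded counter that Zookeeper appends. LeaderPicker and the member- prefix are hypothetical.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical helper: picks the leader from sequential child names.
class LeaderPicker {
    // Sequential names end in a 10-digit zero-padded counter.
    static int sequenceOf(String childName) {
        return Integer.parseInt(childName.substring(childName.length() - 10));
    }

    // The child with the smallest sequence number is the leader.
    static Optional<String> leader(List<String> children) {
        return children.stream().min(Comparator.comparingInt(LeaderPicker::sequenceOf));
    }
}
```

Because the election nodes are ephemeral, the leader's node disappears when its session ends, and running the same selection over the remaining children yields the next leader.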
Listing 3. Leader election core code
void findLeader() throws InterruptedException {
    byte[] leader = null;
    try {
        // Read the current leader and leave a watch on the node
        leader = zk.getData(root + "/leader", true, null);
    } catch (Exception e) { logger.error(e); }
    if (leader != null) {
        following();
    } else {
        String newLeader = null;
        try {
            // No leader yet: try to claim the role with an ephemeral node
            byte[] localhost = InetAddress.getLocalHost().getAddress();
            newLeader = zk.create(root + "/leader", localhost,
                    ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
        } catch (Exception e) { logger.error(e); }
        if (newLeader != null) {
            leading();
        } else {
            // wait() must be called while holding the object's monitor
            synchronized (mutex) {
                mutex.wait();
            }
        }
    }
}
Distributed Locks
Clients create an EPHEMERAL_SEQUENTIAL node and watch the node with the next‑lowest sequence number. When that node disappears, the client acquires the lock.
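The node a client should watch can be computed from the list of children alone. The sketch below assumes all lock nodes share one name prefix, so that the zero-padded sequence suffix makes lexicographic order equal to creation order. LockOrder and the lock- prefix are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical helper: decides which lock node a client should watch.
class LockOrder {
    // Returns the child just before myName, or empty if myName is first,
    // which means this client holds the lock.
    static Optional<String> predecessor(List<String> children, String myName) {
        List<String> sorted = new ArrayList<>(children);
        Collections.sort(sorted); // same prefix + zero padding: text order = sequence order
        int index = sorted.indexOf(myName);
        if (index < 0) {
            throw new IllegalStateException("own node is missing: " + myName);
        }
        return index == 0 ? Optional.empty() : Optional.of(sorted.get(index - 1));
    }
}
```

Watching only the immediate predecessor means that releasing a lock wakes exactly one waiting client instead of all of them.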
Listing 4. Lock acquisition code
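void getLock() throws KeeperException, InterruptedException {
    List<String> list = zk.getChildren(root, false);
    String[] nodes = list.toArray(new String[0]);
    Arrays.sort(nodes);
    if (myZnode.equals(root + "/" + nodes[0])) {
        // Our node has the lowest sequence number: the lock is ours
        doAction();
    } else {
        // Simplification: this listing watches the lowest node rather than
        // the immediate predecessor
        waitForLock(nodes[0]);
    }
}
void waitForLock(String lower) throws InterruptedException, KeeperException {
    Stat stat = zk.exists(root + "/" + lower, true);
    if (stat != null) {
        // wait() must be called while holding the object's monitor
        synchronized (mutex) {
            mutex.wait();
        }
    } else {
        // The watched node is already gone: check the lock again
        getLock();
    }
}
Queue Management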
Zookeeper can implement both synchronized (barrier) queues and FIFO queues using EPHEMERAL and SEQUENTIAL nodes.
Figure 4. Lock workflow
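For the FIFO case, two small pieces do most of the work: encoding the payload as bytes and choosing the head of the queue. The sketch below assumes every queue node shares the element prefix and a sequential suffix, so the smallest name in text order is the oldest element. QueueCodec is a hypothetical helper that makes no Zookeeper calls.

```java
import java.nio.ByteBuffer;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for a FIFO queue built on sequential nodes.
class QueueCodec {
    // Store an int as 4 big-endian bytes, the ByteBuffer default order.
    static byte[] encode(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    static int decode(byte[] data) {
        return ByteBuffer.wrap(data).getInt();
    }

    // Oldest element: smallest name, since the suffix is zero-padded.
    // Throws NoSuchElementException if the queue is empty.
    static String head(List<String> children) {
        return Collections.min(children);
    }
}
```

Returning the full child name, rather than the parsed number, matters because the node path has to be rebuilt exactly, including the leading zeros of the suffix.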
Listing 5. Synchronized queue code
void addQueue() throws KeeperException, InterruptedException {
    // Watch for the /start node that signals a full queue
    zk.exists(root + "/start", true);
    // Join the queue
    zk.create(root + "/" + name, new byte[0], Ids.OPEN_ACL_UNSAFE,
            CreateMode.EPHEMERAL_SEQUENTIAL);
    synchronized (mutex) {
        List<String> list = zk.getChildren(root, false);
        if (list.size() < size) {
            // Not everyone has arrived yet
            mutex.wait();
        } else {
            // Last member to arrive creates /start to release the others
            zk.create(root + "/start", new byte[0], Ids.OPEN_ACL_UNSAFE,
                    CreateMode.PERSISTENT);
        }
    }
}
Listing 6. Producer code
boolean produce(int i) throws KeeperException, InterruptedException {
    // Encode the value as 4 bytes
    ByteBuffer b = ByteBuffer.allocate(4);
    b.putInt(i);
    byte[] value = b.array();
    // The server appends an increasing sequence number to "element"
    zk.create(root + "/element", value, ZooDefs.Ids.OPEN_ACL_UNSAFE,
            CreateMode.PERSISTENT_SEQUENTIAL);
    return true;
}
Listing 7. Consumer code
int consume() throws KeeperException, InterruptedException {
    while (true) {
        synchronized (mutex) {
            List<String> list = zk.getChildren(root, true);
            if (list.isEmpty()) {
                // Nothing queued: wait until the children watch fires
                mutex.wait();
            } else {
                // Find the child with the smallest sequence number.
                // "element" is 7 characters long, so substring(7) is the suffix.
                String minNode = list.get(0);
                int min = Integer.parseInt(minNode.substring(7));
                for (String s : list) {
                    int cur = Integer.parseInt(s.substring(7));
                    if (cur < min) {
                        min = cur;
                        minNode = s;
                    }
                }
                // Use the full child name: the suffix is zero-padded, so a
                // path rebuilt from the parsed integer would not match
                byte[] b = zk.getData(root + "/" + minNode, false, null);
                zk.delete(root + "/" + minNode, 0);
                return ByteBuffer.wrap(b).getInt();
            }
        }
    }
}
Conclusion
Zookeeper is an essential component of the Hadoop ecosystem, providing reliable coordination for services such as NameNode management, HBase master election, and inter‑server state synchronization. This guide covered its core concepts, installation steps, configuration details, Java API usage, and several typical patterns that illustrate how Zookeeper enables robust distributed system design.
ITFLY8 Architecture Home