
How to Replace a ZooKeeper Node in a 5‑Node Cluster Without Downtime

This guide walks through replacing a faulty ZooKeeper node (myid 5) in a five‑node cluster: the zoo.cfg change, the matching updates to Hadoop's hdfs-site.xml and yarn-site.xml and HBase's hbase-site.xml, and the service restarts needed to keep the cluster highly available.

dbaplus Community

Environment

Production ZooKeeper version 3.4.6 runs as a 5‑node cluster, providing high‑availability for an 8‑node Hadoop and HBase environment.

Problem

The node with myid 5 (IP 10.10.10.30) must be replaced by a new host 10.10.10.37, which already functions as a namenode and meets all ZooKeeper deployment prerequisites.

Key ZooKeeper Concepts

All servers share an identical zoo.cfg. Each server’s data directory (defined by dataDir) contains a myid file holding the server ID. A quorum of more than half of the nodes must be operational, so a 5‑node cluster tolerates up to two failures.
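
The quorum arithmetic above can be checked with a few lines of shell. This is a minimal sketch; the function names `quorum_size` and `tolerated_failures` are illustrative and not part of ZooKeeper.

```shell
#!/bin/sh
# Majority quorum for an ensemble of N voting servers: floor(N/2) + 1.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# Servers that can be lost while a majority is still running.
tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

echo "ensemble=5 quorum=$(quorum_size 5) tolerated=$(tolerated_failures 5)"
# prints: ensemble=5 quorum=3 tolerated=2
```

Replacing one node therefore leaves four of five servers running, which is still above the quorum of three.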

Deploying the New Node

Update zoo.cfg to include the new server line:

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/data/ZooKeeper
clientPort=2181
server.1=10.10.10.33:2888:3888
server.2=10.10.10.34:2888:3888
server.3=10.10.10.35:2888:3888
server.4=10.10.10.36:2888:3888
server.5=10.10.10.37:2888:3888

Shut down the target host (10.10.10.37) before installing ZooKeeper.

Back up the existing configuration: cp zoo.cfg zoo.cfg0420.

Copy the ZooKeeper package to the new host, create the myid file with the value 5 (echo "5" > /data/ZooKeeper/myid), and set the server.5 address in zoo.cfg to the new host's IP.

Distribute the updated zoo.cfg to the remaining four nodes via scp, overwriting the old file.
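
Before starting anything, it may be worth confirming that the myid file agrees with zoo.cfg and previewing the copy commands. The sketch below assumes a hypothetical install path in `CONF_DIR` (the article does not state one), and the helper names `expected_myid` and `scp_cmds` are illustrative. It only prints the scp commands; nothing is copied.

```shell
#!/bin/sh
# CONF_DIR is an assumed location for zoo.cfg on the remote servers; adjust it.
CONF_DIR="${CONF_DIR:-/opt/zookeeper/conf}"

# expected_myid CFG IP: print N from the line "server.N=IP:...".
expected_myid() {
  local cfg="$1" ip_re
  ip_re=$(printf '%s' "$2" | sed 's/\./\\./g')   # escape dots for the regex
  sed -n "s/^server\.\([0-9][0-9]*\)=${ip_re}:.*/\1/p" "$cfg"
}

# scp_cmds CFG SELF_IP: print one scp command for every server in CFG except SELF_IP.
scp_cmds() {
  local cfg="$1" self="$2" host
  sed -n 's/^server\.[0-9][0-9]*=\([^:]*\):.*/\1/p' "$cfg" |
    grep -v -x -F -- "$self" |
    while read -r host; do
      echo "scp $cfg $host:$CONF_DIR/zoo.cfg"
    done
}

# Usage on the new host (10.10.10.37):
#   [ "$(cat /data/ZooKeeper/myid)" = "$(expected_myid zoo.cfg 10.10.10.37)" ] && echo "myid matches"
#   scp_cmds zoo.cfg 10.10.10.37        # review the output, then run the commands
```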

Updating Hadoop Configuration

Replace the old host in hdfs-site.xml and yarn-site.xml:

# In hdfs-site.xml
<name>ha.zookeeper.quorum</name>
<value>host-10-10-10-49:2181,host-10-10-10-50:2181,host-10-10-10-36:2181,host-10-10-10-38:2181,host-10-10-10-30:2181</value>
# Change host-10-10-10-30 to host-10-10-10-37
# In yarn-site.xml
<name>yarn.resourcemanager.zk-address</name>
<value>host-10-10-10-49:2181,host-10-10-10-50:2181,host-10-10-10-36:2181,host-10-10-10-38:2181,host-10-10-10-30:2181</value>
# Change host-10-10-10-30 to host-10-10-10-37
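
The host swap can also be scripted instead of edited by hand. The following is a sketch under the assumption that the old hostname is always followed by `:`, `,` or `<` in the file, as in the values shown above; `replace_zk_host` is an illustrative name, and the config directory in the usage comment is whatever your installation uses.

```shell
#!/bin/sh
# replace_zk_host FILE OLD NEW: keep FILE.bak as a rollback copy, then swap OLD for NEW.
replace_zk_host() {
  local file="$1" old="$2" new="$3"
  cp "$file" "$file.bak"
  # Require a delimiter after OLD so a longer hostname with the same prefix is left alone.
  sed "s/${old}\([:,<]\)/${new}\1/g" "$file.bak" > "$file"
}

# Usage (path is an assumption):
#   replace_zk_host "$HADOOP_CONF_DIR/hdfs-site.xml" host-10-10-10-30 host-10-10-10-37
#   replace_zk_host "$HADOOP_CONF_DIR/yarn-site.xml" host-10-10-10-30 host-10-10-10-37
```

Writing from the backup into the original avoids `sed -i`, whose syntax differs between GNU and BSD sed.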

Updating HBase Configuration

# In hbase-site.xml
<name>hbase.zookeeper.quorum</name>
<value>host-10-10-10-49,host-10-10-10-50,host-10-10-10-36,host-10-10-10-38,host-10-10-10-30</value>
# Change host-10-10-10-30 to host-10-10-10-37
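
Once the files are edited, a quick search can confirm that no entry still points at the old host. This is a minimal sketch; `check_no_stale` is an illustrative helper, and the file paths in the usage comment are assumptions about where your configuration lives.

```shell
#!/bin/sh
# check_no_stale OLD FILE...: succeed only if none of the files mentions OLD.
check_no_stale() {
  local old="$1" f
  shift
  for f in "$@"; do
    [ -r "$f" ] || { echo "cannot read $f" >&2; return 2; }
  done
  if grep -n -F -- "$old" "$@"; then      # prints the offending lines
    echo "stale reference to $old found" >&2
    return 1
  fi
  echo "no reference to $old remains"
}

# Usage (paths are assumptions):
#   check_no_stale host-10-10-10-30 hdfs-site.xml yarn-site.xml hbase-site.xml
```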

Service Restart Sequence

After the configuration changes, restart ZooKeeper first, then stop the dependent services in the reverse of their start order (HBase before Hadoop) and start them again (Hadoop before HBase):

Restart ZooKeeper on all nodes:

./zkServer.sh restart

Stop HBase cluster:

./stop-hbase.sh

Stop Hadoop services:

./stop-yarn.sh && ./stop-dfs.sh
./yarn-daemon.sh stop resourcemanager

Start Hadoop services:

./start-yarn.sh && ./start-dfs.sh
./yarn-daemon.sh start resourcemanager

Start HBase cluster:

./start-hbase.sh
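
When scripting this sequence, it can help to wait until each ZooKeeper server answers before touching HBase or Hadoop. The sketch below polls with the `ruok` four-letter command, which a healthy 3.4.x server answers with `imok`. It assumes `nc` is installed; `wait_until` and `zk_ok` are illustrative names, and the retry counts in the usage comment are arbitrary.

```shell
#!/bin/sh
# wait_until TRIES DELAY CMD...: rerun CMD until it succeeds or TRIES attempts are used up.
wait_until() {
  local tries="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$tries" ]; do
    "$@" && return 0
    sleep "$delay"
    i=$(( i + 1 ))
  done
  return 1
}

# zk_ok HOST: true when the server on HOST:2181 replies "imok" to "ruok".
zk_ok() {
  [ "$(echo ruok | nc -w 2 "$1" 2181 2>/dev/null)" = "imok" ]
}

# Usage: give the new node up to a minute before continuing.
#   wait_until 30 2 zk_ok 10.10.10.37 || echo "ZooKeeper on 10.10.10.37 is not answering"
```

Note that `imok` only shows the process is running; it does not prove the server has joined the quorum.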

Verification

Confirm that the new ZooKeeper node has joined the ensemble with the updated configuration and that the dependent services are healthy:

Web UI checks (ports may differ):

http://10.10.10.37:8088 – ResourceManager UI

http://10.10.10.37:50070 – NameNode UI

http://10.10.10.37:60010 – HBase Master UI
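
On the ZooKeeper side, each server's role can be read from the `Mode:` line of the `stat` four-letter command (or from `./zkServer.sh status`). The sketch below assumes `nc` is available; `zk_mode` and `check_roles` are illustrative helpers, not ZooKeeper tools.

```shell
#!/bin/sh
# zk_mode: read `stat` output on stdin and print the role (leader, follower or standalone).
zk_mode() {
  sed -n 's/^Mode: //p'
}

# check_roles N: read one role per line on stdin; succeed only with
# exactly one leader and N - 1 followers.
check_roles() {
  local want="$1" roles leaders followers
  roles=$(cat)
  leaders=$(printf '%s\n' "$roles" | grep -c -x leader)
  followers=$(printf '%s\n' "$roles" | grep -c -x follower)
  echo "leaders=$leaders followers=$followers"
  [ "$leaders" -eq 1 ] && [ "$followers" -eq $(( want - 1 )) ]
}

# Usage against the five servers listed in zoo.cfg:
#   for h in 10.10.10.33 10.10.10.34 10.10.10.35 10.10.10.36 10.10.10.37; do
#     echo stat | nc -w 2 "$h" 2181 | zk_mode
#   done | check_roles 5
```

A healthy five-node ensemble should report one leader and four followers, with 10.10.10.37 among them.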

Encountered Issue

Starting the new ZooKeeper node before restarting the existing four nodes caused an error; the new host could not join the ensemble until the original nodes were restarted.

Key Takeaways

ZooKeeper 3.4.x does not support dynamic node addition (dynamic reconfiguration arrived in 3.5.0); all nodes must be restarted when adding or removing a server.

Modifications to zoo.cfg must be backed up and applied consistently across the cluster.

Hadoop NameNode HA relies on ZooKeeper, so hdfs-site.xml must be updated.

YARN ResourceManager HA also depends on ZooKeeper; update yarn-site.xml.

HBase HA depends on ZooKeeper; update hbase-site.xml.

Other services such as Hive or Flume that use ZooKeeper require similar configuration changes.

If a maintenance window is unavailable, individual services can be restarted selectively to minimize impact.
