
How to Migrate Hundreds of ZooKeeper Instances to Kubernetes Without Downtime

This guide details a zero‑downtime migration of large ZooKeeper clusters to Kubernetes by wrapping each server in a ClusterIP service, reconfiguring clients, updating server configs, and replacing physical nodes with Pods, while outlining required network prerequisites and practical steps.

dbaplus Community

Traditional ZooKeeper Migration Method

ZooKeeper clusters are defined by a static configuration file that lists every server (for example server.1=host1:2888:3888, server.2=host2:2888:3888, server.3=host3:2888:3888), plus a myid file on each host containing that host's numeric identifier. Because ZooKeeper 3.4.x does not support dynamic reconfiguration, adding or removing a node means editing the configuration on every server, performing a rolling restart, and updating client connection strings. The usual procedure is:

Start a new host and add an entry such as server.4=host4:2888:3888 to the configuration.

Update the configuration files on all existing hosts to add or remove entries.

Perform a rolling restart of the old hosts.

Update client connection strings to reflect the new server list.

This manual process is error-prone and hard to automate reliably. Each restart also carries risk: restarting the current leader forces a new leader election, during which the ensemble cannot serve requests.
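To make the cost of the static approach concrete, here is a minimal sketch that renders the server list and myid value for each host and shows that adding one server changes the configuration of every existing host. The function name and host names are illustrative, not taken from any ZooKeeper tooling.

```python
def render_static_ensemble(hosts):
    """Return {host: (server_lines, myid)} for a statically configured ensemble."""
    server_lines = "\n".join(
        f"server.{i}={host}:2888:3888" for i, host in enumerate(hosts, start=1)
    )
    # Every host gets the same server list but its own myid.
    return {host: (server_lines, str(i)) for i, host in enumerate(hosts, start=1)}


before = render_static_ensemble(["host1", "host2", "host3"])
after = render_static_ensemble(["host1", "host2", "host3", "host4"])

# Hosts whose server list differs after adding host4: each one needs its
# configuration file rewritten and a restart.
changed = [h for h in before if before[h][0] != after[h][0]]
print(changed)
```

Every pre-existing host appears in the list, which is why a single membership change fans out into a full rolling restart.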

Migration Method Using Kubernetes

Wrap each ZooKeeper server in a Kubernetes ClusterIP service and replace the physical server with a Pod that uses the same ZooKeeper ID. Only a single rolling restart of the original servers is required before they are swapped out for Pods.

The migration consists of five high‑level steps:

Prepare the environment.

Create a ClusterIP service for each ZooKeeper server.

Reconfigure ZooKeeper clients to connect to the ClusterIP services.

Update ZooKeeper server configuration to use the service DNS names for quorum communication.

Run the ZooKeeper instances inside Kubernetes Pods.

1. Prepare Prerequisites

Start with a running ZooKeeper cluster and make sure that its hosts, and the hosts of its clients, meet the network prerequisites listed in section 6 (for example, by using a CNI plugin that gives Pods VPC-routable IP addresses). The diagram below shows a two-node ZooKeeper cluster before migration.

Initial two‑node ZooKeeper cluster diagram

2. Create ClusterIP Services

For each ZooKeeper server, create a ClusterIP service with a matching Endpoints resource that points at the server's current host, exposing the client port 2181 and the internal quorum and leader-election ports 2888 and 3888. The service name becomes the DNS name used by clients and other ZooKeeper nodes.
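One way to put a ClusterIP service in front of a server that still runs outside the cluster is a Service without a selector plus a manually managed Endpoints object of the same name. The sketch below builds both as plain dictionaries and prints them as a JSON List that kubectl can apply. The service name is taken from the configuration example later in this article; the IP address 10.0.0.11, the port names, and the helper function are assumptions for illustration.

```python
import json

# Port names are illustrative; Service and Endpoints must use the same names.
ZK_PORTS = {"client": 2181, "quorum": 2888, "election": 3888}


def service_and_endpoints(name, host_ip):
    """Build a selector-less ClusterIP Service and an Endpoints object for one server."""
    ports = [{"name": n, "port": p, "protocol": "TCP"} for n, p in ZK_PORTS.items()]
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "ClusterIP",
            # No selector: Kubernetes will not manage the endpoints for us.
            "ports": [dict(p, targetPort=p["port"]) for p in ports],
        },
    }
    endpoints = {
        "apiVersion": "v1",
        "kind": "Endpoints",
        "metadata": {"name": name},  # must match the Service name
        "subsets": [{"addresses": [{"ip": host_ip}], "ports": ports}],
    }
    return service, endpoints


# Hypothetical host IP for ZooKeeper server 1.
items = list(service_and_endpoints("zk1-kube-svc-0", "10.0.0.11"))
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2))
```

The output can be saved to a file and applied with kubectl apply -f; repeat once per ZooKeeper server with that server's name and host IP.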

ClusterIP service access diagram

3. Reconfigure Clients

Update client connection strings to point to the new ClusterIP DNS names. If clients reach ZooKeeper through CNAME records, repoint those DNS entries at the services; otherwise replace the connection string in each client's configuration and restart the client processes.
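As a small illustration, the new connection string is just the service names joined with the client port. The helper below is a sketch; the function name is hypothetical, the service names follow the configuration example in the next section, and the optional chroot suffix assumes clients that use ZooKeeper's standard chroot syntax.

```python
def connection_string(service_names, client_port=2181, chroot=""):
    """Join service DNS names into a ZooKeeper connection string."""
    hosts = ",".join(f"{name}:{client_port}" for name in service_names)
    # A chroot path, if any, is appended once after the last host.
    return hosts + chroot


services = ["zk1-kube-svc-0", "zk2-kube-svc-1", "zk3-kube-svc-2"]
print(connection_string(services))
print(connection_string(services, chroot="/app"))
```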

Clients communicating via ClusterIP services

4. Reconfigure ZooKeeper Instances

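Modify each server’s zoo.cfg so that the server entries reference the ClusterIP service names, for example:

server.1=zk1-kube-svc-0:2888:3888
server.2=zk2-kube-svc-1:2888:3888
server.3=zk3-kube-svc-2:2888:3888
quorumListenOnAllIPs=true

The last line matters. The source article refers to this setting as zk_quorum_listen_all_ips; in a stock zoo.cfg the property is named quorumListenOnAllIPs. Without it, each instance tries to bind its quorum and election ports to the address in its own server entry, which is now a Kubernetes service IP that does not exist on any interface of the host, so the bind fails. With it, the instance listens on all local addresses. After a rolling restart of the physical hosts, the cluster is ready for Pod replacement.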
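The edit to zoo.cfg can be scripted. The following sketch rewrites the host part of each server entry and makes sure the listen-on-all-IPs setting discussed above is present, using the property name from the ZooKeeper administrator's guide, quorumListenOnAllIPs. The function name and the sample input are assumptions, and the pattern only handles hostnames or IPv4 addresses.

```python
import re

# server.<id>=<host>:<quorum port>:<election port>[:role]
SERVER_LINE = re.compile(r"^server\.(\d+)=[^:]+:(\d+):(\d+)(.*)$")


def rewrite_zoo_cfg(cfg_text, service_by_id):
    """Point server entries at service names and enable quorumListenOnAllIPs."""
    out = []
    listen_set = False
    for line in cfg_text.splitlines():
        match = SERVER_LINE.match(line.strip())
        if match and int(match.group(1)) in service_by_id:
            sid = int(match.group(1))
            line = (
                f"server.{sid}={service_by_id[sid]}:"
                f"{match.group(2)}:{match.group(3)}{match.group(4)}"
            )
        elif line.strip().startswith("quorumListenOnAllIPs="):
            line = "quorumListenOnAllIPs=true"
            listen_set = True
        out.append(line)
    if not listen_set:
        out.append("quorumListenOnAllIPs=true")
    return "\n".join(out) + "\n"


old_cfg = "tickTime=2000\nserver.1=host1:2888:3888\nserver.2=host2:2888:3888\n"
new_cfg = rewrite_zoo_cfg(old_cfg, {1: "zk1-kube-svc-0", 2: "zk2-kube-svc-1"})
print(new_cfg)
```

Unrelated settings such as tickTime pass through unchanged, so the same script can run against every host before the rolling restart.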

ZooKeeper instances communicating via ClusterIP

5. Replace Hosts with Pods

For each server perform the following sub‑steps:

Select a ZooKeeper server and its corresponding ClusterIP service.

Stop the ZooKeeper process on the physical host.

Launch a Pod that uses the same myid file and server list.

Wait until the Pod starts, joins the quorum, and synchronizes data.

When all hosts have been swapped, the entire cluster runs inside Kubernetes while preserving all data.
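Before moving on to the next server, it helps to confirm that the new Pod has actually rejoined the ensemble. One option is ZooKeeper's srvr four-letter command on the client port, whose output includes a Mode line. The sketch below is an assumption about how such a check could look, not the procedure from the source article; it only inspects the reported mode, and on ZooKeeper 3.5 and later the command must be allowed by the four-letter-word whitelist.

```python
import socket


def four_letter_word(host, port, command, timeout=5.0):
    """Send a four-letter command (e.g. 'srvr') and return the full reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(srvr_output):
    """Extract the value of the 'Mode:' line, or None if it is absent."""
    for line in srvr_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def has_joined_quorum(srvr_output):
    """True when the server reports itself as a voting member that is serving."""
    return parse_mode(srvr_output) in {"leader", "follower"}
```

A polling loop could call has_joined_quorum(four_letter_word("zk1-kube-svc-0", 2181, "srvr")) every few seconds and proceed to the next host only once it returns True.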

Final cluster after Pod replacement

6. Network Prerequisites

The migration requires the following network conditions:

All servers that need to reach ZooKeeper must be able to route to the Pod IPs.

Servers must resolve Kubernetes service DNS names.

Servers must run kube-proxy so that they can reach ClusterIP services.

The first condition can be met with a VPC-native CNI plugin such as the Amazon VPC CNI (https://github.com/aws/amazon-vpc-cni-k8s) or the Lyft IPVLAN VPC plugin (https://github.com/lyft/cni-ipvlan-vpc-k8s), both of which assign VPC IP addresses directly to Pods. An overlay network (e.g., Flannel) also works, provided every server that talks to ZooKeeper is part of the overlay.
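A quick preflight script run from each host can catch missing DNS resolution or unreachable services before the migration starts. This is a minimal sketch using only the standard library; the function names are hypothetical. It checks the client and election ports by default because only the current leader listens on the quorum port 2888, so probing that port on every server would report false failures.

```python
import socket


def resolves(name):
    """True if the name can be resolved from this host."""
    try:
        socket.getaddrinfo(name, None)
        return True
    except socket.gaierror:
        return False


def port_open(host, port, timeout=3.0):
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def preflight(service_names, ports=(2181, 3888)):
    """Return a per-service report: 'ok' or a short description of the failure."""
    report = {}
    for name in service_names:
        if not resolves(name):
            report[name] = "DNS lookup failed"
            continue
        closed = [p for p in ports if not port_open(name, p)]
        report[name] = "ok" if not closed else f"unreachable ports: {closed}"
    return report
```

Calling preflight(["zk1-kube-svc-0", "zk2-kube-svc-1", "zk3-kube-svc-2"]) on a client host returns a dictionary that can be logged or used to fail a deployment step.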

References

ZooKeeper to Kubernetes Migration Blog: https://product.hubspot.com/blog/zookeeper-to-kubernetes-migration
