How to Migrate Hundreds of ZooKeeper Instances to Kubernetes Without Downtime
This guide details a zero‑downtime migration of large ZooKeeper clusters to Kubernetes by wrapping each server in a ClusterIP service, reconfiguring clients, updating server configs, and replacing physical nodes with Pods, while outlining required network prerequisites and practical steps.
Traditional ZooKeeper Migration Method
ZooKeeper clusters are defined by a static configuration file that lists each server, e.g. server.1=host1:2888:3888, server.2=host2:2888:3888, server.3=host3:2888:3888, and a myid file on each host containing its numeric identifier. Adding or removing nodes requires editing the configuration on every server, updating client connection strings, and performing a rolling restart because ZooKeeper 3.4.x does not support dynamic reconfiguration.
Start a new host and add an entry such as server.4=host4:2888:3888 to the configuration.
Update the configuration files on all existing hosts to add or remove entries.
Perform a rolling restart of the old hosts.
Update client connection strings to reflect the new server list.
This manual process is error‑prone and makes reliable automation difficult; each restart also risks a leader election failure.
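To make the bookkeeping concrete, here is a minimal Python sketch (not part of the original article) that renders the static server list and shows which existing hosts must be edited and restarted when the list changes. The helper names and the host4 entry are hypothetical and only mirror the example entries above.

```python
def render_server_lines(hosts):
    """Return zoo.cfg server lines for a dict of {myid: hostname}."""
    return [f"server.{myid}={host}:2888:3888" for myid, host in sorted(hosts.items())]


def hosts_needing_update(old_hosts, new_hosts):
    """Return the surviving hosts whose config must be edited and restarted.

    With a static configuration every host carries the full server list,
    so any change to that list touches every host that stays in the ensemble.
    """
    if render_server_lines(old_hosts) == render_server_lines(new_hosts):
        return []
    return sorted(set(old_hosts.values()) & set(new_hosts.values()))


old = {1: "host1", 2: "host2", 3: "host3"}
new = {**old, 4: "host4"}  # hypothetical new host

print("\n".join(render_server_lines(new)))
print("edit and restart:", hosts_needing_update(old, new))
```

Even for a single added server, every existing host appears in the restart list, which is the cost the Kubernetes-based approach tries to pay only once.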
Migration Method Using Kubernetes
Wrap each ZooKeeper server in a Kubernetes ClusterIP service and replace the physical server with a Pod that uses the same ZooKeeper ID. Only a single rolling restart of the original servers is required before they are swapped out for Pods.
The migration consists of five high‑level steps:
Prepare the environment.
Create a ClusterIP service for each ZooKeeper server.
Reconfigure ZooKeeper clients to connect to the ClusterIP services.
Update ZooKeeper server configuration to use the service DNS names for quorum communication.
Run the ZooKeeper instances inside Kubernetes Pods.
1. Prepare Prerequisites
Start with a running ZooKeeper cluster and ensure that the hosts can reach the Kubernetes cluster (e.g., via an internal CNI plugin). The diagram below shows a two‑node ZooKeeper cluster before migration.
2. Create ClusterIP Services
For each ZooKeeper server, create a matching ClusterIP service that forwards the client port (2181), the quorum port (2888), and the leader-election port (3888). The service name becomes the DNS name used by clients and by the other ZooKeeper nodes.
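The article does not spell out how a service is pointed at a server that still runs on a physical host. One standard Kubernetes pattern, assumed here, is a Service without a selector plus a manually managed Endpoints object of the same name. The Python sketch below builds both manifests as JSON, which kubectl accepts; the service name is taken from the example configuration in this guide, and the host IP is a placeholder.

```python
import json

# Port names are illustrative; Service and Endpoints port names must match.
ZK_PORTS = {"client": 2181, "quorum": 2888, "election": 3888}


def service_manifest(name):
    """ClusterIP Service with no selector, so its endpoints are managed by hand."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "ClusterIP",
            "ports": [
                {"name": n, "port": p, "targetPort": p, "protocol": "TCP"}
                for n, p in ZK_PORTS.items()
            ],
        },
    }


def endpoints_manifest(name, host_ip):
    """Endpoints object (same name as the Service) pointing at the current host."""
    return {
        "apiVersion": "v1",
        "kind": "Endpoints",
        "metadata": {"name": name},
        "subsets": [
            {
                "addresses": [{"ip": host_ip}],
                "ports": [
                    {"name": n, "port": p, "protocol": "TCP"}
                    for n, p in ZK_PORTS.items()
                ],
            }
        ],
    }


# "10.0.0.11" is a placeholder for the physical host's IP address.
bundle = {
    "apiVersion": "v1",
    "kind": "List",
    "items": [
        service_manifest("zk1-kube-svc-0"),
        endpoints_manifest("zk1-kube-svc-0", "10.0.0.11"),
    ],
}
print(json.dumps(bundle, indent=2))
```

The output could be saved to a file and applied with kubectl apply -f. When the server later moves into a Pod, the hand-written Endpoints object would be replaced by a selector on the Service or by an updated address.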
3. Reconfigure Clients
Update client connection strings to point to the new ClusterIP DNS names. If CNAME records are used, modify the DNS entries; otherwise replace the connection string in the client configuration and restart the client processes.
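As a small illustration, a ZooKeeper connection string is a comma-separated list of host:port pairs with an optional chroot path at the end. The sketch below assembles one from the service names used in this guide's example configuration; the function name and the /myapp chroot are hypothetical.

```python
def connection_string(service_names, client_port=2181, chroot=""):
    """Build a ZooKeeper connection string from service DNS names.

    The optional chroot suffix (for example "/myapp") applies to the whole string.
    """
    hosts = ",".join(f"{name}:{client_port}" for name in service_names)
    return hosts + chroot


services = ["zk1-kube-svc-0", "zk2-kube-svc-1", "zk3-kube-svc-2"]
print(connection_string(services))
print(connection_string(services, chroot="/myapp"))
```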
4. Reconfigure ZooKeeper Instances
Modify each server’s zoo.cfg to reference the ClusterIP service names, for example:
server.1=zk1-kube-svc-0:2888:3888
server.2=zk2-kube-svc-1:2888:3888
server.3=zk3-kube-svc-2:2888:3888
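zk_quorum_listen_all_ips=true
Set zk_quorum_listen_all_ips=true so that the instance listens for quorum traffic on all interfaces. Without it, the instance tries to bind to the address in its own server entry, which is now a service IP that does not exist on any host interface, and the bind fails. (zk_quorum_listen_all_ips is the name used in the source; in a stock zoo.cfg the corresponding option is quorumListenOnAllIPs.) After a rolling restart of the physical hosts, the cluster is ready for Pod replacement.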
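The server-list rewrite is mechanical, so it lends itself to a script. The following is a minimal Python sketch, assuming server lines of the form server.N=host:port:port as in the examples above; the function name and the sample configuration are illustrative, and lines that are not server entries are passed through unchanged.

```python
import re

# Matches lines such as: server.1=host1:2888:3888
SERVER_LINE = re.compile(r"^server\.(\d+)=([^:\s]+):(\d+):(\d+)\s*$")


def rewrite_server_lines(cfg_text, service_by_id):
    """Replace the host in each server.N line with the matching service name.

    service_by_id maps the numeric ZooKeeper ID to a service DNS name.
    IDs without a mapping, and all non-server lines, are left untouched.
    """
    out = []
    for line in cfg_text.splitlines():
        match = SERVER_LINE.match(line)
        if match and int(match.group(1)) in service_by_id:
            myid, _old_host, quorum_port, election_port = match.groups()
            new_host = service_by_id[int(myid)]
            line = f"server.{myid}={new_host}:{quorum_port}:{election_port}"
        out.append(line)
    return "\n".join(out) + "\n"


sample_cfg = """\
clientPort=2181
server.1=host1:2888:3888
server.2=host2:2888:3888
server.3=host3:2888:3888
"""

services = {1: "zk1-kube-svc-0", 2: "zk2-kube-svc-1", 3: "zk3-kube-svc-2"}
print(rewrite_server_lines(sample_cfg, services), end="")
```

Because the numeric IDs are preserved, each server keeps its identity in the ensemble; only the address that its peers use to reach it changes.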
5. Replace Hosts with Pods
For each server perform the following sub‑steps:
Select a ZooKeeper server and its corresponding ClusterIP service.
Stop the ZooKeeper process on the physical host.
Launch a Pod that uses the same myid file and server list.
Wait until the Pod starts, joins the quorum, and synchronizes data.
When all hosts have been swapped, the entire cluster runs inside Kubernetes while preserving all data.
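The waiting sub-step can be automated by polling the replaced server until it reports a serving role. The sketch below is one possible approach, not the article's tooling: it sends ZooKeeper's srvr four-letter command to the client port and looks for the Mode line in the reply. The function names, retry count, and delay are assumptions, and on ZooKeeper versions that restrict four-letter commands srvr would have to be allowed.

```python
import socket
import time


def four_letter_word(host, port, command, timeout=5.0):
    """Send a four-letter command and return the server's full text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(srvr_output):
    """Extract the value of the 'Mode:' line, or None if it is absent."""
    for line in srvr_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def wait_until_serving(host, port=2181, attempts=60, delay=5.0):
    """Poll until the server reports itself as leader or follower."""
    for _ in range(attempts):
        try:
            mode = parse_mode(four_letter_word(host, port, "srvr"))
        except OSError:  # not listening yet, refused, or timed out
            mode = None
        if mode in ("leader", "follower"):
            return mode
        time.sleep(delay)
    raise TimeoutError(f"{host}:{port} did not join the quorum in time")
```

A call such as wait_until_serving("zk1-kube-svc-0") would go between stopping one host and moving on to the next, so that only one server is out of the quorum at a time.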
Network Prerequisites
The migration requires the following network conditions:
All servers that need to reach ZooKeeper must be able to route to the Pod IPs.
Servers must resolve Kubernetes service DNS names.
Servers must run kube-proxy (or an equivalent service proxy) so that they can reach ClusterIP services.
These conditions can be satisfied with an internal CNI plugin such as the Amazon VPC CNI (https://github.com/aws/amazon-vpc-cni-k8s) or the Lyft IPVLAN VPC plugin (https://github.com/lyft/cni-ipvlan-vpc-k8s). Directly assigning VPC IPs to Pods or using an overlay network (e.g., Flannel) also works as long as routing is consistent.
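Before starting, these conditions can be spot-checked from any server that needs to reach ZooKeeper. The following Python sketch is a hypothetical pre-flight check, not something the article prescribes: it verifies that a service name resolves and that a TCP connection succeeds on the ports that should be listening. The quorum port 2888 is left out of the defaults because only the current leader accepts connections on it.

```python
import socket


def resolves(name):
    """Return True if the name (or IP literal) can be resolved from this host."""
    try:
        socket.getaddrinfo(name, None)
        return True
    except socket.gaierror:
        return False


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_service(name, ports=(2181, 3888)):
    """Check DNS resolution and TCP reachability for one service name.

    Defaults cover the client and leader-election ports; 2888 is only
    open on the current leader, so a failure there is not conclusive.
    """
    if not resolves(name):
        return {"dns": False}
    result = {"dns": True}
    for port in ports:
        result[port] = can_connect(name, port)
    return result


if __name__ == "__main__":
    # Service names taken from the example configuration in this guide.
    for svc in ("zk1-kube-svc-0", "zk2-kube-svc-1", "zk3-kube-svc-2"):
        print(svc, check_service(svc))
```

A result with dns set to False points at the name-resolution prerequisite, while a failed port check points at routing or the service proxy.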
References
ZooKeeper to Kubernetes Migration Blog: https://product.hubspot.com/blog/zookeeper-to-kubernetes-migration
