Why Kafka 2.8 Is Dropping Zookeeper and What It Means for You
This article explains how Kafka 2.8 makes it possible to run without Zookeeper (initially as an early‑access mode), describes the roles of brokers, topics, partitions, and the controller in the Zookeeper‑based architecture, and outlines how KIP‑500 replaces Zookeeper with a quorum‑based KRaft controller to improve scalability and operational simplicity.
1. Kafka Overview
Apache Kafka, originally developed by LinkedIn and later donated to the Apache Software Foundation, is a distributed streaming platform known for high throughput, persistence, and horizontal scalability. Its core functions include a message queue, distributed storage, real‑time stream processing via Kafka Streams, and data integration via Kafka Connect.
2. Kafka and Zookeeper Relationship
2.1 Registration Center
2.1.1 Broker registration
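When a broker starts, it registers itself in Zookeeper by creating a node under /brokers/ids/[broker_id] that stores its IP address and port. The node is ephemeral: it is tied to the broker's Zookeeper session and is removed automatically when that session ends, for example when the broker crashes.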
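The ephemeral-node behavior can be illustrated with a small in-memory model. This is only a sketch: `ToyRegistry`, the session names, and the host addresses are made up for illustration, and nothing here talks to a real Zookeeper ensemble.

```python
class ToyRegistry:
    """In-memory stand-in for Zookeeper's ephemeral nodes (not a real client)."""

    def __init__(self):
        self._nodes = {}  # path -> (owning session id, data)

    def create_ephemeral(self, session_id, path, data):
        if path in self._nodes:
            raise KeyError(f"node already exists: {path}")
        self._nodes[path] = (session_id, data)

    def expire_session(self, session_id):
        # Every ephemeral node owned by the expired session disappears.
        self._nodes = {p: v for p, v in self._nodes.items() if v[0] != session_id}

    def children(self, prefix):
        prefix = prefix.rstrip("/") + "/"
        return sorted(p[len(prefix):] for p in self._nodes if p.startswith(prefix))


registry = ToyRegistry()
registry.create_ephemeral("session-a", "/brokers/ids/11", {"host": "10.0.0.11", "port": 9092})
registry.create_ephemeral("session-b", "/brokers/ids/12", {"host": "10.0.0.12", "port": 9092})
before_crash = registry.children("/brokers/ids")

registry.expire_session("session-a")  # simulate broker 11 crashing
after_crash = registry.children("/brokers/ids")

print(before_crash)  # ['11', '12']
print(after_crash)   # ['12']
```

Because the node's lifetime follows the session, the rest of the cluster learns about a dead broker without the broker having to deregister itself.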
2.1.2 Topic registration
Zookeeper records each topic under /brokers/topics/[topic_name] and stores the mapping between partitions and brokers.
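As a rough illustration, the per-partition record kept under /brokers/topics/[topic_name]/partitions/[n]/state is a small JSON document naming the leader broker and the in-sync replicas (ISR). The sketch below parses one such record; the helper name `describe_partition_state` is hypothetical.

```python
import json


def describe_partition_state(raw):
    """Summarize a partition state record (hypothetical helper)."""
    state = json.loads(raw)
    return {
        "leader": state["leader"],
        "isr": state["isr"],
        "leader_in_isr": state["leader"] in state["isr"],
    }


raw_state = '{"controller_epoch":15,"leader":11,"version":1,"leader_epoch":2,"isr":[11,12,13]}'
summary = describe_partition_state(raw_state)
print(summary)  # {'leader': 11, 'isr': [11, 12, 13], 'leader_in_isr': True}
```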
2.1.3 Consumer registration
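In older Kafka versions, consumer groups register under /consumers/[group_id], allowing Zookeeper to track partition offsets and the relationship between partitions and consumers. Newer consumer clients store offsets in the internal __consumer_offsets topic instead.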
2.2 Load balancing
Producers discover broker list changes via Zookeeper, enabling dynamic load balancing. Consumer groups use topic node information to pull messages from specific partitions.
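One way to picture how the partitions of a topic are spread over the consumers in a group is a range-style assignment: sort the consumers, give each an equal contiguous slice, and hand any leftover partitions to the first consumers. The function below is a simplified sketch in that spirit, not Kafka's actual assignor code; the consumer names are made up.

```python
def range_assign(partitions, consumers):
    """Split partitions into contiguous ranges, one range per consumer."""
    consumers = sorted(consumers)
    per_consumer, extra = divmod(len(partitions), len(consumers))
    assignment = {}
    start = 0
    for index, consumer in enumerate(consumers):
        # The first `extra` consumers each take one additional partition.
        count = per_consumer + (1 if index < extra else 0)
        assignment[consumer] = partitions[start:start + count]
        start += count
    return assignment


assignment = range_assign(list(range(5)), ["c2", "c1"])
print(assignment)  # {'c1': [0, 1, 2], 'c2': [3, 4]}
```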
3. Controller Introduction
The controller, elected among brokers, interacts with Zookeeper to manage the state of partitions and replicas, listen to metadata changes, and synchronize updates across all brokers.
The controller's main responsibilities are monitoring partition changes, monitoring topic changes, monitoring broker changes, and managing cluster metadata.
The controller processes Zookeeper events, timer tasks, and other events via a LinkedBlockingQueue, updating its metadata cache and notifying brokers as needed.
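The single-queue design can be sketched as follows. Python's `queue.Queue` stands in for Java's LinkedBlockingQueue, and `ToyController` with its two event kinds is invented for illustration. The point is that one worker thread drains the queue, so metadata updates are applied one at a time without extra locking.

```python
import queue
import threading

STOP = object()  # sentinel that tells the worker thread to exit


class ToyController:
    """Toy model of a controller that serializes events through one queue."""

    def __init__(self):
        self.events = queue.Queue()
        self.live_brokers = set()   # stand-in for the metadata cache
        self.notifications = []     # stand-in for updates sent to brokers
        self._worker = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._worker.start()

    def submit(self, kind, broker_id):
        self.events.put((kind, broker_id))

    def shutdown(self):
        self.events.put(STOP)
        self._worker.join()

    def _run(self):
        while True:
            event = self.events.get()  # blocks until an event arrives
            if event is STOP:
                return
            kind, broker_id = event
            if kind == "broker_up":
                self.live_brokers.add(broker_id)
            elif kind == "broker_down":
                self.live_brokers.discard(broker_id)
            self.notifications.append((kind, sorted(self.live_brokers)))


controller = ToyController()
controller.start()
controller.submit("broker_up", 11)
controller.submit("broker_up", 12)
controller.submit("broker_down", 11)
controller.shutdown()
print(controller.notifications)
# [('broker_up', [11]), ('broker_up', [11, 12]), ('broker_down', [12])]
```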
4. Problems Introduced by Zookeeper
4.1 Operational complexity
Running Kafka with Zookeeper requires deploying and operating two distributed systems, increasing the operational burden on administrators.
4.2 Controller failure handling
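If the controller broker fails, a new controller must be elected, load the full cluster metadata from Zookeeper, and propagate it to all other brokers. Until this completes, controller-driven operations such as partition leader election are stalled, and the delay grows with the number of partitions.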
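The election step can be modeled as a race to claim a single slot: the first broker to claim it becomes controller, and each successful election bumps the controller epoch. The class below is a toy model of that idea only; `ToyElection` and its methods are hypothetical, and the metadata reload that follows a real failover is not modeled.

```python
class ToyElection:
    """Toy model: first broker to claim the controller slot wins."""

    def __init__(self):
        self.controller = None
        self.epoch = 0

    def try_become_controller(self, broker_id):
        if self.controller is not None:
            return False  # someone else already holds the slot
        self.controller = broker_id
        self.epoch += 1   # every new controller gets a higher epoch
        return True

    def controller_failed(self):
        self.controller = None  # the slot is released when the holder dies


election = ToyElection()
first_round = [election.try_become_controller(b) for b in (11, 12, 13)]
election.controller_failed()  # broker 11 crashes
second_round = [election.try_become_controller(b) for b in (12, 13)]

print(first_round)                          # [True, False, False]
print(second_round)                         # [True, False]
print(election.controller, election.epoch)  # 12 2
```

The increasing epoch lets brokers ignore requests from a stale controller that has not yet noticed it was replaced.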
4.3 Partition bottleneck
As the number of partitions grows, Zookeeper stores more metadata, increasing load and latency, which can limit Kafka’s scalability in large‑scale or cloud‑native deployments.
5. Upgrade Path (KIP‑500)
KIP‑500 replaces Zookeeper with a quorum of controller nodes that run KRaft, Kafka's Raft‑based consensus protocol. Each controller node holds the full cluster metadata, and Raft keeps the copies consistent, so Kafka can run without Zookeeper. The design aims to support millions of partitions.
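The consistency guarantee rests on majority agreement: a metadata change counts as committed once more than half of the controller voters have acknowledged it. The arithmetic is sketched below; the function names are illustrative and do not come from Kafka.

```python
def quorum_size(voters):
    """Smallest number of voters that forms a majority."""
    return voters // 2 + 1


def tolerated_failures(voters):
    """How many voters can be down while a majority is still reachable."""
    return voters - quorum_size(voters)


def is_committed(acks, voters):
    """A record is committed once a majority has acknowledged it."""
    return acks >= quorum_size(voters)


for n in (3, 5):
    print(n, quorum_size(n), tolerated_failures(n))
# 3 2 1
# 5 3 2

print(is_committed(acks=2, voters=3))  # True
print(is_committed(acks=2, voters=5))  # False
```

This is why controller quorums are usually sized with an odd number of nodes: going from three to four voters raises the majority to three without tolerating any additional failure.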
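For comparison, in the Zookeeper‑based architecture the state of each partition is stored in a Zookeeper node that can be read from the Zookeeper shell:
[root@master] get /brokers/topics/xxx/partitions/1/state
{"controller_epoch":15,"leader":11,"version":1,"leader_epoch":2,"isr":[11,12,13]}
6. Summary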
Removing Zookeeper reduces operational complexity and eliminates a scalability bottleneck, aligning Kafka with cloud‑native, simplified architecture principles.
Su San Talks Tech
Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.
