Why Kafka 2.8 Is Dropping ZooKeeper: Architecture, Challenges, and the Path to KIP‑500
This article explains how Kafka 2.8 begins removing its dependency on ZooKeeper. It covers Kafka's core concepts, ZooKeeper's role in broker registration, load balancing, and controller election, the operational drawbacks of this coupling, and how KIP‑500 and its Quorum Controller modernize the architecture.
Kafka Overview
Apache Kafka is a distributed streaming platform originally developed by LinkedIn and now an Apache project. It provides high‑throughput, durable, horizontally scalable message queuing, distributed storage and real‑time processing via Kafka Streams and Kafka Connect.
Key concepts:
Producer – publishes records to a topic.
Consumer – reads records from a topic.
Consumer group – a set of consumers that jointly consume a topic; within a group, each partition is assigned to exactly one consumer instance.
Broker – a Kafka server that stores partitions and serves client requests.
Topic – a logical stream of records.
Partition – an ordered, immutable sequence of records within a topic; each record is identified by its offset within the partition, and each partition is replicated across brokers.
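To make the partition, offset and consumer‑group concepts concrete, here is a minimal in‑memory sketch. It is not Kafka client code: the `Partition` class and the `assign_round_robin` helper are hypothetical names invented for illustration, and the round‑robin strategy is only one simple way to give each partition a single owner within a group.

```python
from collections import defaultdict


class Partition:
    """Toy append-only log: a record's offset is its position in the log."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Append a record and return the offset it was written at."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset):
        """Return the record stored at the given offset."""
        return self._records[offset]


def assign_round_robin(partitions, consumers):
    """Give each partition to exactly one consumer of the group."""
    members = sorted(consumers)
    assignment = defaultdict(list)
    for index, partition in enumerate(sorted(partitions)):
        assignment[members[index % len(members)]].append(partition)
    return dict(assignment)


log = Partition()
first_offset = log.append("order-created")
second_offset = log.append("order-paid")

# Four partitions shared by a group of two consumers.
assignment = assign_round_robin([0, 1, 2, 3], ["c1", "c2"])
```

Each partition appears under exactly one consumer, which is what lets a group process a topic in parallel without two members reading the same partition.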
Interaction with ZooKeeper (pre‑KIP‑500)
Before KIP‑500, Kafka used ZooKeeper to store cluster metadata. Important ZooKeeper znodes include:
/brokers/ids – holds one ephemeral node per broker, created on startup; each node contains the broker ID, host and port. A node disappears when its broker's ZooKeeper session ends, for example after a crash.
/brokers/topics/[topic]/partitions/[partition]/state – stores the leader, ISR (in‑sync replica) list and version for each partition.
/consumers/[group_id] – stores consumer‑group offsets and partition assignments for the legacy ZooKeeper‑based consumer; newer consumers keep their offsets in Kafka itself.
Example command (using the ZooKeeper CLI) to retrieve the state of partition 1 of topic xxx:
get /brokers/topics/xxx/partitions/1/state
{"controller_epoch":15,"leader":11,"version":1,"leader_epoch":2,"isr":[11,12,13]}
Producers watch /brokers/ids to discover the current broker list, while consumer groups read partition metadata to perform load balancing.
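The partition state returned above is plain JSON, so it is easy to inspect programmatically. The sketch below parses that exact payload and derives a small health summary. The function name `summarize_partition_state` and the `replication_factor` argument are assumptions made for illustration; the state znode lists only the ISR, so the expected replica count has to be supplied from elsewhere.

```python
import json

# Payload copied from the ZooKeeper CLI output shown above.
STATE_JSON = (
    '{"controller_epoch":15,"leader":11,"version":1,'
    '"leader_epoch":2,"isr":[11,12,13]}'
)


def summarize_partition_state(raw, replication_factor):
    """Parse a partition state payload and flag basic health conditions."""
    state = json.loads(raw)
    isr = state["isr"]
    return {
        "leader": state["leader"],
        # The leader should always be a member of the in-sync replica set.
        "leader_in_isr": state["leader"] in isr,
        # Fewer in-sync replicas than expected means the partition is under-replicated.
        "under_replicated": len(isr) < replication_factor,
    }


summary = summarize_partition_state(STATE_JSON, replication_factor=3)
```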
Controller Role
One broker is elected as the Controller. Brokers compete to create the ephemeral /controller znode; the broker that succeeds becomes the Controller, reads the full metadata from ZooKeeper, and manages:
Partition leader election and ISR updates.
Topic creation, deletion and partition reassignment.
Broker registration and removal.
Propagation of metadata changes to all brokers.
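If the current Controller fails, its /controller znode disappears and the remaining brokers compete to create it again. The winner becomes the new Controller, reloads the metadata from ZooKeeper and notifies all brokers. During this transition, metadata changes such as leader elections and topic creation are stalled, so affected partitions can be unavailable to clients.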
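The election and failover behaviour can be sketched with a toy model. `FakeZooKeeper` below is a hypothetical in‑memory stand‑in, not a real ZooKeeper client; it imitates only the one property that matters here, namely that an ephemeral znode can be created by a single winner and vanishes when its owner's session ends.

```python
class FakeZooKeeper:
    """In-memory stand-in that models only ephemeral znodes (illustrative)."""

    def __init__(self):
        self._nodes = {}  # path -> id of the broker that owns the znode

    def create_ephemeral(self, path, owner):
        """Create the znode if absent; return True only for the winning caller."""
        if path in self._nodes:
            return False
        self._nodes[path] = owner
        return True

    def get(self, path):
        """Return the owner of the znode, or None if it does not exist."""
        return self._nodes.get(path)

    def expire_session(self, owner):
        """A crashed broker loses its session, which deletes its ephemeral znodes."""
        self._nodes = {p: o for p, o in self._nodes.items() if o != owner}


def elect_controller(zk, broker_ids):
    """Every live broker tries to create /controller; exactly one succeeds."""
    for broker_id in broker_ids:
        zk.create_ephemeral("/controller", broker_id)
    return zk.get("/controller")


zk = FakeZooKeeper()
first_controller = elect_controller(zk, [11, 12, 13])

# The controller crashes: its znode disappears and the survivors race again.
zk.expire_session(first_controller)
second_controller = elect_controller(zk, [12, 13])
```

In this model the winner is simply the first caller. In a real cluster the outcome depends on which broker's create request reaches ZooKeeper first, and the new Controller must still reload all metadata before it can act.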
Problems Caused by ZooKeeper
Operational complexity: Running a separate ZooKeeper ensemble means a second distributed system to deploy, secure, monitor and upgrade alongside Kafka.
Controller failover latency: Electing a new Controller and reloading metadata can be time‑consuming, causing a temporary service outage.
Metadata bottleneck: As the number of topics and partitions grows, so does the amount of metadata stored in ZooKeeper, which raises read/write latency and degrades Kafka performance.
KIP‑500 and the Quorum Controller
KIP‑500 replaces the ZooKeeper‑based metadata store with a built‑in Raft‑based quorum controller (KRaft). Each controller node stores the complete metadata and achieves consensus via the Raft protocol, eliminating the single point of failure.
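The fault tolerance of a Raft quorum follows from simple majority arithmetic, sketched below. The function names are illustrative and not part of any Kafka API; the only fact relied on is that Raft needs a strict majority of voters to elect a leader or commit an entry.

```python
def majority(voters):
    """Smallest number of voters that forms a strict majority."""
    return voters // 2 + 1


def tolerated_failures(voters):
    """How many voters can fail while a majority remains reachable."""
    return voters - majority(voters)


def is_committed(acks, voters):
    """A log entry is committed once a majority of voters has stored it."""
    return acks >= majority(voters)
```

For example, three controllers tolerate one failure and five tolerate two, while four still tolerate only one. That is why quorum sizes are normally odd.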
Key benefits:
Designed to scale to millions of partitions per cluster.
Removes the need for a separate ZooKeeper deployment, simplifying operations.
Provides fast failover because leader election is handled by Raft.
Kafka 2.8 ships the KIP‑500 code as an early‑access KRaft mode that is not recommended for production; ZooKeeper mode remains the default. Kafka 3.0 can run with either the legacy ZooKeeper controller or the new Quorum controller, enabling gradual migration.
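As a rough illustration of what running without ZooKeeper looks like, the sketch below renders the core KRaft settings for one node. The property names `process.roles`, `node.id` and `controller.quorum.voters` are the ones used by KRaft mode; the helper functions, host names and ports are hypothetical, and the output is deliberately incomplete (listeners, log directories and other required settings are omitted), so check the documentation for your Kafka version before using anything like it.

```python
def quorum_voters(controllers):
    """Render controller nodes as id@host:port entries separated by commas."""
    return ",".join(f"{node_id}@{host}:{port}" for node_id, host, port in controllers)


def kraft_properties(node_id, roles, controllers):
    """Build a partial server.properties fragment for one KRaft node."""
    lines = [
        f"process.roles={','.join(roles)}",
        f"node.id={node_id}",
        f"controller.quorum.voters={quorum_voters(controllers)}",
    ]
    return "\n".join(lines)


# Hypothetical three-node controller quorum.
CONTROLLERS = [(1, "kafka1", 9093), (2, "kafka2", 9093), (3, "kafka3", 9093)]

fragment = kraft_properties(1, ["broker", "controller"], CONTROLLERS)
```

Here node 1 acts as both broker and controller, which is convenient for experiments; dedicated controller nodes are the other option.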
Conclusion
In large‑scale and cloud‑native deployments, ZooKeeper adds significant operational overhead and becomes a performance bottleneck for metadata access. Migrating to the KRaft‑based Quorum controller removes this dependency, simplifies the architecture, and prepares Kafka for future scalability requirements.
dbaplus Community
