Understanding Kafka's Transition from ZooKeeper to KRaft: Architecture, Installation, Raft Algorithm, and Common Issues
This article explains how Kafka 3.x replaces ZooKeeper with the internal KRaft consensus layer, detailing Raft‑based metadata storage and a step‑by‑step KRaft cluster installation and configuration, and covering related concepts such as leader election, consumer‑group rebalancing, reliability settings, and performance optimizations.
Kafka 3.0 removes the dependency on ZooKeeper and adopts the internal consensus mechanism KRaft (Kafka Raft). This article introduces the built‑in consensus algorithm, explains how metadata is stored without ZooKeeper, and provides a complete installation and configuration guide for a KRaft‑based Kafka cluster.
1. Kafka Core Components
Producer – sends messages to brokers.
Consumer – pulls messages from brokers.
Consumer Group – a logical subscriber that balances partitions among its members.
Broker – a server that stores topic partitions.
Topic – a logical queue; messages are written to partitions.
Partition – a unit of parallelism and scalability; each partition is ordered.
Replication – each partition has multiple replicas; one replica is the leader, the others are followers.
2. ZooKeeper Metadata (Kafka 2.x)
In the ZooKeeper‑based architecture, the following ZK nodes store critical metadata:
/admin – core internal information (e.g., deleted topics).
/brokers – broker and topic metadata.
/cluster – unique cluster ID and version.
/controller – controller election and management.
/isr_change_notification – ISR list changes.
…and several other nodes for controller epochs, leader election, etc.
These ZK paths cause operational overhead, network traffic, and strong coupling between Kafka and ZooKeeper.
3. KRaft Architecture (Kafka 3.x)
When ZooKeeper is removed, Kafka stores its metadata in the internal topic __cluster_metadata. Controller duties move into a Raft quorum formed by the nodes configured as voters. Important concepts:
process.roles – defines the node's role(s): broker, controller, or both (broker,controller).
controller.quorum.voters – the voters that form the Raft quorum, written as id@host:port entries (e.g., 1@bigdata01:9093).
Metadata is replicated using the Raft algorithm, providing strong consistency without ZooKeeper.
Key benefits include faster controller elections, reduced network overhead, and a single source of truth for metadata.
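As a sketch of how these parameters fit together, a three‑controller quorum might be configured as follows (the node IDs and hostnames follow the bigdata01–bigdata03 cluster used in the installation section below; values are illustrative):

```properties
# Role of this node within the KRaft cluster: broker, controller, or both
process.roles=broker,controller
# Unique ID of this node; must appear in controller.quorum.voters if it votes
node.id=1
# All voting controllers, as id@host:port
controller.quorum.voters=1@bigdata01:9093,2@bigdata02:9093,3@bigdata03:9093
```

Each voter's id must match that node's own node.id, and the controller port (9093 here) is separate from the client listener port.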
4. Installation & Configuration (KRaft)
Download and extract Kafka 3.1.0:
[hadoop@bigdata01 soft]$ wget http://archive.apache.org/dist/kafka/3.1.0/kafka_2.12-3.1.0.tgz
[hadoop@bigdata01 soft]$ tar -zxf kafka_2.12-3.1.0.tgz -C /opt/install/
Edit config/kraft/server.properties (or broker.properties) to set the KRaft parameters:
node.id=1
controller.quorum.voters=1@bigdata01:9093
listeners=PLAINTEXT://bigdata01:9092
advertised.listeners=PLAINTEXT://bigdata01:9092
log.dirs=/opt/install/kafka_2.12-3.1.0/kraftlogs
Create the required log directories:
mkdir -p /opt/install/kafka_2.12-3.1.0/kraftlogs
mkdir -p /opt/install/kafka_2.12-3.1.0/topiclogs
Initialize the storage UUID and format the KRaft log:
[hadoop@bigdata01 kafka]$ ./bin/kafka-storage.sh random-uuid
YkJwr6RESgSJv-sxa1R1mA
[hadoop@bigdata01 kafka]$ ./bin/kafka-storage.sh format -t YkJwr6RESgSJv-sxa1R1mA -c ./config/kraft/server.properties
Start the broker:
[hadoop@bigdata01 kafka]$ ./bin/kafka-server-start.sh ./config/kraft/server.properties
Create a topic:
./bin/kafka-topics.sh --create --topic kafka_test --partitions 3 --replication-factor 2 --bootstrap-server bigdata01:9092,bigdata02:9092,bigdata03:9092
Produce and consume messages using the console tools:
[hadoop@bigdata01 kafka]$ bin/kafka-console-producer.sh --bootstrap-server bigdata01:9092,bigdata02:9092,bigdata03:9092 --topic kafka_test
[hadoop@bigdata02 kafka]$ bin/kafka-console-consumer.sh --bootstrap-server bigdata01:9092,bigdata02:9092,bigdata03:9092 --topic kafka_test --from-beginning
5. Viewing KRaft Metadata
KRaft stores metadata in the internal topic __cluster_metadata. Two useful tools for inspecting it are:
kafka-dump-log.sh --cluster-metadata-decoder – dumps raw metadata logs.
kafka-metadata-shell.sh – provides a ZK‑like CLI for metadata inspection.
Example dump command:
bin/kafka-dump-log.sh --cluster-metadata-decoder --skip-record-metadata --files /opt/install/kafka_2.12-3.1.0/topiclogs/__cluster_metadata-0/00000000000000000000.index,/opt/install/kafka_2.12-3.1.0/topiclogs/__cluster_metadata-0/00000000000000000000.log > /opt/metadata.txt
6. Raft Algorithm Overview
Raft ensures consensus among the controller replicas. The algorithm defines three roles:
Leader – receives client requests, appends entries to its log, and replicates them to followers.
Follower – copies entries from the leader and applies committed entries.
Candidate – a temporary role during leader election.
Key phases:
Leader Election – nodes increment their term, vote for themselves, request votes, and become leader when a majority is reached.
Log Replication – the leader sends AppendEntries RPCs; entries are considered committed once a majority of replicas have them.
Safety – a candidate can win an election only if its log is at least as up‑to‑date as the logs of a majority of voters, so committed entries are never lost on leader change.
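The commit rule in Log Replication can be sketched as follows (a simplified illustration of the majority rule, not Kafka's actual code):

```python
# Sketch: an entry is committed once a majority of replicas store it.

def commit_index(match_index: dict[int, int], cluster_size: int) -> int:
    """match_index maps replica id -> highest log index replicated there
    (the leader's own log is included). The commit index is the largest
    index present on a strict majority of the cluster."""
    indexes = sorted(match_index.values(), reverse=True)
    majority = cluster_size // 2 + 1
    return indexes[majority - 1]

# Leader has written up to index 7; followers lag behind at 5 and 3.
print(commit_index({1: 7, 2: 5, 3: 3}, cluster_size=3))  # → 5
```

Only index 5 is on two of three replicas, so entries 6 and 7 stay uncommitted until another follower catches up.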
Raft divides time into sequentially numbered terms; each term has at most one leader. If a leader crashes, a new election starts in a higher term.
7. Consumer Groups and Rebalancing
Kafka uses consumer groups to achieve both point‑to‑point and publish‑subscribe models. Partition assignment strategies include:
Range – assigns contiguous partition ranges to consumers (may be unbalanced with many topics).
RoundRobin – distributes partitions evenly across consumers.
Sticky – tries to keep previous assignments stable to reduce rebalancing overhead.
Rebalancing is triggered when the number of consumers, topics, or partitions changes. The coordinator handles JoinGroup, SyncGroup, and Heartbeat requests to compute new assignments.
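The difference between the Range and RoundRobin strategies can be sketched as follows (a simplified illustration; Kafka's real assignors live in the client library and handle multiple topics):

```python
# Sketch of two partition-assignment strategies for one topic.

def range_assign(partitions: list[int], consumers: list[str]) -> dict[str, list[int]]:
    """Range: contiguous chunks per consumer; the first consumers absorb
    the remainder, which is why Range can be unbalanced across topics."""
    per, extra = divmod(len(partitions), len(consumers))
    out, start = {}, 0
    for i, c in enumerate(sorted(consumers)):
        n = per + (1 if i < extra else 0)
        out[c] = partitions[start:start + n]
        start += n
    return out

def round_robin_assign(partitions: list[int], consumers: list[str]) -> dict[str, list[int]]:
    """RoundRobin: deal partitions out one at a time."""
    cs = sorted(consumers)
    out = {c: [] for c in cs}
    for i, p in enumerate(partitions):
        out[cs[i % len(cs)]].append(p)
    return out

parts = [0, 1, 2, 3, 4]
print(range_assign(parts, ["c1", "c2"]))        # {'c1': [0, 1, 2], 'c2': [3, 4]}
print(round_robin_assign(parts, ["c1", "c2"]))  # {'c1': [0, 2, 4], 'c2': [1, 3]}
```

With a single topic the totals are similar, but Range repeats the "first consumer gets the extra partition" pattern for every subscribed topic, which is where the imbalance accumulates.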
8. Reliability Settings
To avoid message loss:
Set acks=all so the leader waits for all in‑sync replicas.
Configure retries to a high value.
Use replication.factor ≥ 2 and min.insync.replicas ≥ 2 (ideally replication.factor greater than min.insync.replicas, so one replica can fail without halting writes).
Set unclean.leader.election.enable=false to prevent an out‑of‑sync replica from becoming leader.
For consumers, disable auto‑commit ( enable.auto.commit=false ) and commit offsets manually after processing; set auto.offset.reset=earliest so a consumer with no committed offset starts from the beginning of the partition instead of skipping data.
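Taken together, a loss‑averse setup might look like the following (a hedged sketch mixing producer‑side, broker/topic‑side, and consumer‑side settings; the specific values are illustrative, not prescriptive):

```properties
# Producer side
acks=all
retries=2147483647
enable.idempotence=true   # retries no longer introduce duplicates

# Broker / topic side (replication.factor=3 chosen at topic creation)
min.insync.replicas=2
unclean.leader.election.enable=false

# Consumer side
enable.auto.commit=false
auto.offset.reset=earliest
```

With replication.factor=3 and min.insync.replicas=2, one broker can fail while acks=all writes continue.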
9. Performance Optimizations
Sequential I/O – Kafka appends to log files, avoiding random disk seeks.
PageCache and zero‑copy – writes go through the OS page cache; reads use sendfile to avoid copying data between kernel and user space.
Batching and compression – producers batch multiple records, and both producers and brokers can compress data to reduce network and storage usage.
This comprehensive guide demonstrates how Kafka 3.x replaces ZooKeeper with KRaft, how to install and configure a KRaft cluster, and how the underlying Raft consensus, consumer group mechanics, and reliability settings work together to provide a high‑performance, fault‑tolerant messaging system.
Tencent Cloud Developer