Step‑by‑Step Guide to Deploying a Multi‑Node Kafka Cluster
This tutorial walks through setting up a Kafka cluster on four Linux VMs (three brokers plus one Zookeeper node). It covers Zookeeper installation, broker configuration, service startup, replication settings, fault handling, and leader election, with the commands and configuration needed at each step.
Kafka Cluster Overview
This guide shows how to build a three‑node Kafka cluster with a dedicated Zookeeper node, assuming familiarity with basic Kafka concepts.
Cluster Basics
Kafka runs in cluster mode by default; a single broker is also a cluster.
Cluster coordination is handled by Zookeeper, which stores metadata such as broker registrations.
All brokers that register with the same Zookeeper ensemble belong to the same cluster.
Each broker is uniquely identified by broker.id.
Key Roles
Broker: a Kafka server process.
Leader: the replica that handles all produce and consume requests for a partition.
Follower: a replica that copies the leader's log to provide fault tolerance.
Demo Environment
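Four virtual machines (VMs) are used: three Kafka brokers at 192.168.99.1, 192.168.99.2, and 192.168.99.3, and one Zookeeper node at 192.168.99.4.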
Zookeeper Installation
Prerequisite: JDK 11 installed.
# cd /usr/local/src
# wget https://mirror.bit.edu.cn/apache/zookeeper/zookeeper-3.6.1/apache-zookeeper-3.6.1-bin.tar.gz
# tar -zxvf apache-zookeeper-3.6.1-bin.tar.gz
# mv apache-zookeeper-3.6.1-bin ../zookeeper
Configure Zookeeper:
# cd ../zookeeper/conf
# cp zoo_sample.cfg zoo.cfg
# vim zoo.cfg # set dataDir=/data/zookeeper
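For unattended installs, the same edit can be made without opening an editor. The following is a minimal sketch, not part of the original procedure: set_property is a hypothetical helper name, and it assumes the value contains no '|' or '&' characters.

```shell
# set_property FILE KEY VALUE
# Replace an existing "KEY=..." line in FILE, or append one if KEY is absent.
# Assumes VALUE contains no '|' or '&' (both are special in the sed expression).
set_property() {
  local file="$1" key="$2" value="$3"
  if grep -q "^${key}=" "$file"; then
    # Write to a temporary copy, then overwrite, to stay portable across sed variants.
    sed "s|^${key}=.*|${key}=${value}|" "$file" > "${file}.tmp" &&
      cat "${file}.tmp" > "$file" &&
      rm -f "${file}.tmp"
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Example, using the paths from this guide:
# set_property /usr/local/zookeeper/conf/zoo.cfg dataDir /data/zookeeper
```

Commented-out lines in the file are left alone, because only lines that start with the key itself are matched.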
# mkdir -p /data/zookeeper
Start the service and verify it listens on port 2181:
# cd ../bin
# ./zkServer.sh start
# netstat -lntp | grep 2181
If a firewall is active, open the port:
# firewall-cmd --zone=public --add-port=2181/tcp --permanent
# firewall-cmd --reload
Kafka Installation (v2.5.0)
# cd /usr/local/src
# wget https://mirror.bit.edu.cn/apache/kafka/2.5.0/kafka_2.13-2.5.0.tgz
# tar -xvf kafka_2.13-2.5.0.tgz
# mv kafka_2.13-2.5.0 ../kafka
Edit server.properties for each broker (example for broker 0):
# cd ../kafka/config
# vim server.properties
broker.id=0
listeners=PLAINTEXT://192.168.99.1:9092
advertised.listeners=PLAINTEXT://192.168.99.1:9092
log.dirs=/usr/local/kafka/kafka-logs
zookeeper.connect=192.168.99.4:2181
Add Kafka binaries to the system PATH:
# vim /etc/profile
export KAFKA_HOME=/usr/local/kafka
export PATH=$PATH:$KAFKA_HOME/bin
# source /etc/profile
Start the broker and confirm it listens on port 9092:
# kafka-server-start.sh /usr/local/kafka/config/server.properties &
# netstat -lntp | grep 9092
If a firewall is enabled, open the port:
# firewall-cmd --zone=public --add-port=9092/tcp --permanent
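# firewall-cmd --reload
Deploy the same Kafka installation to the other two broker VMs, then adjust broker.id, listeners, and advertised.listeners on each. The rsync destination is the parent directory so that a second kafka directory is not nested inside the first, and kafka-logs is excluded because it holds the first broker's data, including its recorded broker.id:
# rsync -av --exclude=kafka-logs /usr/local/kafka 192.168.99.2:/usr/local/
# rsync -av --exclude=kafka-logs /usr/local/kafka 192.168.99.3:/usr/local/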
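The per-broker settings can also be applied with a script instead of by hand. This is a sketch under the addressing used in this guide (broker 0 on 192.168.99.1, broker 1 on 192.168.99.2, and so on); broker_overrides and apply_overrides are hypothetical helper names, not Kafka tools.

```shell
# broker_overrides INDEX
# Print the three settings that differ per broker.
# Assumes the guide's scheme: broker N listens on 192.168.99.(N+1), port 9092.
broker_overrides() {
  local idx="$1"
  local ip="192.168.99.$((idx + 1))"
  printf 'broker.id=%s\n' "$idx"
  printf 'listeners=PLAINTEXT://%s:9092\n' "$ip"
  printf 'advertised.listeners=PLAINTEXT://%s:9092\n' "$ip"
}

# apply_overrides FILE INDEX
# Rewrite FILE so that it carries the settings for broker INDEX.
apply_overrides() {
  local file="$1" idx="$2" tmp
  tmp=$(mktemp)
  # Drop the old per-broker lines and keep everything else unchanged.
  grep -Ev '^(broker\.id|listeners|advertised\.listeners)=' "$file" > "$tmp"
  broker_overrides "$idx" >> "$tmp"
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Example, run on 192.168.99.2:
# apply_overrides /usr/local/kafka/config/server.properties 1
```

Setting advertised.listeners together with listeners matters here: a copied file that still advertises 192.168.99.1 would send clients to the first broker.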
On 192.168.99.2:
# vim /usr/local/kafka/config/server.properties # set broker.id=1, listeners=PLAINTEXT://192.168.99.2:9092, advertised.listeners=PLAINTEXT://192.168.99.2:9092
On 192.168.99.3:
# vim /usr/local/kafka/config/server.properties # set broker.id=2, listeners=PLAINTEXT://192.168.99.3:9092, advertised.listeners=PLAINTEXT://192.168.99.3:9092
Start the additional brokers and verify cluster membership via Zookeeper:
# /usr/local/zookeeper/bin/zkCli.sh
[zk: localhost:2181(CONNECTED) 4] ls /brokers/ids
[0, 1, 2]
Replication Mechanics
Kafka replicates each partition’s log to multiple brokers for durability. The replication factor determines how many copies exist. Kafka assigns partitions and replicas using a simple modulo algorithm:
Sort the N brokers and the partitions to be assigned.
The first replica (the preferred leader) of partition i is placed on broker (i mod N).
Replica j of partition i is placed on broker ((i + j) mod N).
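The modulo rule can be checked with a few lines of shell. This is a sketch of the simplified rule as stated here, with replica_brokers as a hypothetical helper name. Kafka's actual assignment additionally starts from a randomly chosen broker and shifts follower placement, so a real cluster will generally not produce exactly this layout.

```shell
# replica_brokers PARTITION REPLICATION_FACTOR NUM_BROKERS
# Print the broker indexes that hold the partition, first replica (leader) first,
# following the simplified rule: replica j of partition i -> (i + j) mod N.
replica_brokers() {
  local i="$1" rf="$2" n="$3" j=0 out=""
  if [ "$rf" -gt "$n" ]; then
    echo "replication factor $rf exceeds broker count $n" >&2
    return 1
  fi
  while [ "$j" -lt "$rf" ]; do
    out="$out$(( (i + j) % n )) "
    j=$((j + 1))
  done
  echo "${out% }"
}

# Three partitions, replication factor 3, three brokers:
# for p in 0 1 2; do echo "partition $p: $(replica_brokers "$p" 3 3)"; done
# partition 0: 0 1 2
# partition 1: 1 2 0
# partition 2: 2 0 1
```

Each broker leads exactly one partition in this layout, which is the point of the rotation: leadership, and with it client traffic, is spread evenly.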
Broker Failure Handling
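A broker is considered failed when its Zookeeper session expires, that is, when it stops sending heartbeats to Zookeeper.
A follower that falls too far behind the leader (it has not caught up within replica.lag.time.max.ms) is dropped from the ISR (in-sync replica) set, even though its broker may still be running.
When a broker fails, Kafka removes its replicas from the ISR of each affected partition and elects a new leader from the remaining in-sync replicas for every partition the broker was leading. Requests continue to be served, and messages that had already been replicated to the whole ISR are not lost.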
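The lag rule can be pictured with a toy model. The sketch below is not Kafka code: shrink_isr is a hypothetical helper, and each replica is described by an assumed "id:milliseconds since it last caught up with the leader" pair. The threshold plays the role of the broker setting replica.lag.time.max.ms; 30000 is only an example value.

```shell
# shrink_isr MAX_LAG_MS ID:MS_SINCE_CAUGHT_UP...
# Print the replica ids that stay in the ISR: those that caught up with the
# leader no longer than MAX_LAG_MS milliseconds ago.
shrink_isr() {
  local max_lag_ms="$1" entry id lag out=""
  shift
  for entry in "$@"; do
    id="${entry%%:*}"   # text before the colon
    lag="${entry##*:}"  # text after the colon
    if [ "$lag" -le "$max_lag_ms" ]; then
      out="$out$id "
    fi
  done
  echo "${out% }"
}

# Replica 2 has been behind for 45 s, so it is dropped:
# shrink_isr 30000 0:0 1:1200 2:45000   # prints "0 1"
```

On a running cluster, partitions whose ISR has shrunk can be listed with kafka-topics.sh --bootstrap-server 192.168.99.1:9092 --describe --under-replicated-partitions.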
Leader Election
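Kafka does not elect partition leaders by majority vote. Instead, it maintains an ISR list for each partition, and when a leader fails the controller picks the new leader from that list (by default, the first live in-sync replica in the partition's replica assignment order). If every replica in the ISR is unavailable, Kafka can fall back to an “unclean” leader election, which promotes a replica that is not in sync and may therefore cause data loss.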
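A rough model of ISR-based election follows. It is a sketch rather than Kafka's implementation: elect_leader is a hypothetical helper, and it assumes the controller's default strategy of taking the first replica, in assignment order, that is both alive and in the ISR.

```shell
# elect_leader "ASSIGNED" "ISR" "LIVE"
# Each argument is a space-separated list of broker ids.
# Print the first assigned replica that is both in the ISR and alive.
# Return 1 if there is none, i.e. only an unclean election could proceed.
elect_leader() {
  local assigned="$1" isr="$2" live="$3" id
  for id in $assigned; do
    case " $isr " in
      *" $id "*)
        case " $live " in
          *" $id "*) echo "$id"; return 0 ;;
        esac
        ;;
    esac
  done
  return 1
}

# Broker 1 (the old leader) is down; 2 is next in assignment order:
# elect_leader "1 2 0" "1 2 0" "2 0"   # prints 2
# Only broker 1 was in sync and it is down: no clean candidate, returns 1
# elect_leader "1 2 0" "1" "2 0"
```

The second call is the case the unclean election setting decides: either the partition stays offline until broker 1 returns, or an out-of-sync replica takes over and some messages are lost.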
Recommended configuration to protect data integrity:
Disable unclean leader election (unclean.leader.election.enable=false).
Set a minimum number of in-sync replicas, e.g., min.insync.replicas=2. This setting is enforced for producers that send with acks=all.
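How these settings interact with the replication factor comes down to simple arithmetic, sketched here. tolerated_failures is a hypothetical helper, and the topic name demo in the commented command is likewise an assumption for illustration.

```shell
# tolerated_failures REPLICATION_FACTOR MIN_ISR
# Print how many replicas of a partition can drop out of the ISR while
# producers using acks=all can still write to it.
tolerated_failures() {
  local rf="$1" min_isr="$2"
  if [ "$min_isr" -gt "$rf" ]; then
    echo "min.insync.replicas ($min_isr) exceeds replication factor ($rf)" >&2
    return 1
  fi
  echo $((rf - min_isr))
}

# tolerated_failures 3 2   # prints 1

# Creating a topic with these values on the cluster from this guide:
# kafka-topics.sh --bootstrap-server 192.168.99.1:9092 --create --topic demo \
#   --partitions 3 --replication-factor 3 --config min.insync.replicas=2
```

With three brokers, a replication factor of 3 and min.insync.replicas=2 allows one broker to be down without blocking writes. Setting min.insync.replicas=3 would make every write depend on all three brokers being in sync.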
IT Architects Alliance