Master Kafka: From Basics to Multi‑Broker Cluster Setup
This comprehensive guide introduces Apache Kafka's core concepts—topics, partitions, producers, consumers, and APIs—covers common use cases, walks through downloading, installing, and configuring a single‑node broker, demonstrates multi‑broker clustering, and explains how to use Kafka Connect for data import and export.
1. Understanding Kafka
1.1 Kafka Overview
Kafka is a distributed streaming platform. Its website is http://kafka.apache.org/.
Key functions: publish/subscribe record streams, fault‑tolerant persistent storage, and processing of records as they occur.
1.2 Topics and Partitions
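A topic is a named category to which records are published. Each topic is split into partitions; a partition is an ordered, immutable, append‑only log stored on disk, and each record in it is identified by a sequential offset. Each record consists of a key, a value, and a timestamp.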
1.3 Distribution
Partitions are spread across the brokers in the cluster. For each partition, one broker acts as the leader and followers replicate it. If the leader fails, one of the followers automatically becomes the new leader.
1.4 Producers and Consumers
1.4.1 Producers
Producers publish records to the topics of their choice and decide which partition each record goes to, for example round‑robin to balance load, or by a function of the record key.
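As a rough illustration of key‑based partition selection, the sketch below maps a record key to a partition by hashing the key and taking the remainder. The helper name partition_for_key is hypothetical, and cksum is only a stand‑in for the hash function Kafka's client actually uses; the point is that the same key always lands on the same partition.

```shell
# Hypothetical helper: map a record key to a partition number.
# cksum (a CRC) stands in for Kafka's real hash function, purely for illustration.
partition_for_key() {
  key=$1
  num_partitions=$2
  # cksum prints "<checksum> <byte-count>"; keep only the checksum
  hash=$(printf '%s' "$key" | cksum | cut -d ' ' -f 1)
  echo $(( hash % num_partitions ))
}

# Records with the same key map to the same partition
for key in user-1 user-2 user-1; do
  echo "key=$key -> partition $(partition_for_key "$key" 3)"
done
```

Because all records sharing a key end up in one partition, their relative order is preserved for consumers of that partition.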
1.4.2 Consumers
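Each consumer belongs to a consumer group, and every group subscribed to a topic receives its own copy of the topic's data. Within a group, partitions are load‑balanced across the consumer instances, so each record is delivered to one instance in the group.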
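To make the load balancing within a group concrete, here is a simplified sketch that spreads six partitions over three consumer instances in round‑robin fashion. The counts and the consumer naming are assumptions chosen for illustration, and Kafka's built‑in assignment strategies may produce a different layout.

```shell
# Simplified sketch: assign each partition of a topic to one consumer in a group.
partitions=6   # assumed partition count
consumers=3    # assumed number of consumer instances in the group
p=0
plan=""
while [ "$p" -lt "$partitions" ]; do
  owner=$(( p % consumers ))          # round-robin owner for this partition
  plan="${plan}p${p}:c${owner} "
  echo "partition $p -> consumer-$owner"
  p=$(( p + 1 ))
done
```

Each partition has exactly one owner inside the group, so adding instances beyond the number of partitions would leave the extra instances idle.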
1.5 Use Cases
Messaging – replaces traditional brokers with higher throughput, built‑in partitioning, replication, and fault tolerance.
Website activity tracking – real‑time pipelines for page views, searches, etc.
Metrics aggregation – central feed for operational statistics.
Log aggregation – replaces systems like Scribe or Flume with lower latency and stronger durability.
Stream processing – Kafka Streams API enables stateful transformations and joins.
Event sourcing – durable log of state changes for applications.
Commit log – external durable log for distributed systems.
2. Kafka Installation
Download the desired version from http://kafka.apache.org/downloads.html (e.g., 2.1.0) and extract it.
[root@host]# wget http://mirrors.shu.edu.cn/apache/kafka/2.1.0/kafka_2.11-2.1.0.tgz
[root@host]# tar -C /data/ -xvf kafka_2.11-2.1.0.tgz
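[root@host]# cd /data/kafka_2.11-2.1.0/
Configure and start ZooKeeper (required for Kafka). The commands below assume that Kafka's bin/ directory is on your PATH; otherwise, prefix each script with bin/.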
[root@host]# yum -y install java-1.8.0
[root@host]# nohup zookeeper-server-start.sh config/zookeeper.properties &
Configure config/server.properties (broker.id, listeners, log.dirs, zookeeper.connect, etc.) and start the broker.
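For reference, a minimal single‑broker configuration might look like the sketch below. It writes the settings to a scratch file whose name is hypothetical, so nothing in config/ is overwritten; the host, port, and log directory are assumed values for a local test setup and should be adapted before use.

```shell
# Sketch only: write an example single-broker config to a scratch file.
# The file name is hypothetical; review the values before copying them
# into config/server.properties.
conf_file="server.properties.example"
cat > "$conf_file" <<'EOF'
# Unique integer ID of this broker within the cluster
broker.id=0
# Address and port the broker listens on
listeners=PLAINTEXT://localhost:9092
# Directory where partition logs are stored
log.dirs=/tmp/kafka-logs
# ZooKeeper connection string (host:port)
zookeeper.connect=localhost:2181
EOF
```

Note that /tmp is usually cleared on reboot, so a persistent path for log.dirs is the safer choice outside of experiments.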
[root@host]# kafka-server-start.sh -daemon config/server.properties
3. Simple Operations
3.1 Create a Topic
kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic along
3.2 Produce Messages
kafka-console-producer.sh --broker-list localhost:9092 --topic along
>First message
>Second message
3.3 Consume Messages
kafka-console-consumer.sh --bootstrap-server localhost:9092 --from-beginning --topic along
4. Multi‑Broker Cluster
Copy server.properties to server-1.properties and server-2.properties, change broker.id, listeners, and log.dirs in each copy so that every broker has a unique ID, port, and log directory, then start each broker.
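The copy‑and‑edit step can be scripted. The following is a minimal sketch, assuming the base file contains broker.id, listeners (possibly commented out), and log.dirs lines; the helper name make_broker_config, the ports 9093 and 9094, and the /tmp/kafka-logs-N directories are illustrative assumptions.

```shell
# Hypothetical helper: derive a per-broker config from a base server.properties.
# Usage: make_broker_config <base-file> <broker-id> <port> <output-file>
make_broker_config() {
  base=$1; id=$2; port=$3; out=$4
  # Rewrite the three settings that must be unique per broker;
  # the listeners line is matched whether or not it is commented out.
  sed -e "s|^broker\.id=.*|broker.id=${id}|" \
      -e "s|^#\{0,1\}listeners=.*|listeners=PLAINTEXT://:${port}|" \
      -e "s|^log\.dirs=.*|log.dirs=/tmp/kafka-logs-${id}|" \
      "$base" > "$out"
}

# Example invocations (run from the Kafka directory):
#   make_broker_config config/server.properties 1 9093 config/server-1.properties
#   make_broker_config config/server.properties 2 9094 config/server-2.properties
```

Distinct ports and log directories matter only because all brokers here run on one machine; on separate hosts, the broker.id alone would need to differ.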
[root@host]# kafka-server-start.sh -daemon config/server-1.properties
[root@host]# kafka-server-start.sh -daemon config/server-2.properties
Verify replication and leader election with kafka-topics.sh --describe.
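One way to try this, assuming ZooKeeper at localhost:2181 and three running brokers, is to create a topic with a replication factor of 3 and then describe it. The helper name create_and_describe and the topic name used in the usage line are hypothetical.

```shell
# Hypothetical helper: create a replicated topic, then show its partition layout.
# Usage: create_and_describe <topic> [zookeeper-host:port]
create_and_describe() {
  topic=$1
  zk=${2:-localhost:2181}   # assumed ZooKeeper address
  kafka-topics.sh --create --zookeeper "$zk" \
    --replication-factor 3 --partitions 1 --topic "$topic" &&
  kafka-topics.sh --describe --zookeeper "$zk" --topic "$topic"
}

# Example invocation:
#   create_and_describe my-replicated-topic
```

The describe output lists, for each partition, the Leader, Replicas, and Isr (in‑sync replicas) fields. Stopping the broker currently shown as leader and running the describe command again should show a different leader.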
5. Kafka Connect
Use connect-standalone.sh with a source file connector and a sink file connector to move data between a local file and a Kafka topic.
connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties config/connect-file-sink.properties
Check the sink file and the topic with the console consumer.
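Assuming the connector configs shipped with Kafka are unchanged (the source reads test.txt, the sink writes test.sink.txt, and both use the topic connect-test), the check might look like the following sketch, run from the Kafka directory while the connect process is running.

```shell
# Append sample lines to the file watched by the source connector
# (test.txt is the default in connect-file-source.properties).
printf 'foo\nbar\n' >> test.txt

# The sink connector should copy them into test.sink.txt shortly afterwards.
if [ -f test.sink.txt ]; then
  cat test.sink.txt
fi

# To read the same records from the topic, run (needs a live broker):
#   kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic connect-test --from-beginning
```

If the file or topic names were changed in the connector properties, substitute those names here.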
6. Additional Resources
For more details, refer to the original blog post.
Raymond Ops
