Comprehensive Introduction to Apache Kafka: Concepts, Architecture, Installation, and Usage
This article provides a comprehensive guide to Apache Kafka, covering its core concepts, architecture, key APIs, topics and partitions, deployment steps, multi‑broker clustering, fault tolerance, and data integration using Kafka Connect, with detailed command‑line examples.
Kafka Overview
Kafka is a distributed streaming platform that provides publish/subscribe messaging, fault‑tolerant storage, and real‑time stream processing capabilities.
Core Concepts
Publish and Subscribe – messages are written to and read from topics.
Durable Storage – records are persisted in an append‑only log.
Stream Processing – consumers can process records as they arrive.
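To make the append-only log idea concrete, here is a toy sketch in shell. It is not how Kafka stores data on disk: the file layout, the function names append_record and read_from_offset, and the one-record-per-line format are all assumptions made for illustration. It only shows that records are added at the end and that a reader can replay from any offset.

```shell
#!/bin/sh
# Toy append-only log: one record per line, offset = zero-based line index.
# Illustration only; this is not Kafka's on-disk format.
LOG_FILE=$(mktemp)

# Append a record to the end of the log; existing lines are never rewritten.
append_record() {
    printf '%s\n' "$1" >> "$LOG_FILE"
}

# Print every record at or after the given offset, like a consumer
# that resumes from a stored position.
read_from_offset() {
    tail -n "+$(( $1 + 1 ))" "$LOG_FILE"
}

append_record "first"
append_record "second"
append_record "third"

read_from_offset 0   # replays the whole log
read_from_offset 2   # prints only the newest record
```

Reading from offset 0 corresponds loosely to what the console consumer does with --from-beginning.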
Key Terminology
Topic: a logical category of records, possibly spanning multiple partitions.
Partition: an ordered, immutable sequence of records stored as a log file.
Broker: a server that hosts partitions and serves client requests.
Leader / Follower: each partition has one leader handling reads/writes; followers replicate the leader.
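Because ordering is guaranteed only within a partition, keyed records are normally routed so that the same key always lands in the same partition. The sketch below illustrates that routing with a checksum modulo the partition count. The function name partition_for and the use of cksum are assumptions for illustration; Kafka's default partitioner uses a murmur2 hash, so the indices printed here will not match what a real producer chooses.

```shell
#!/bin/sh
# Map a record key to a partition index: hash(key) mod partition count.
# Assumption: cksum (a CRC) stands in for the hash function. Kafka's default
# partitioner uses murmur2, so these indices will differ from Kafka's.
partition_for() {
    key=$1
    partitions=$2
    sum=$(printf '%s' "$key" | cksum | cut -d ' ' -f 1)
    echo $(( sum % partitions ))
}

# The repeated key (user-1) is assigned the same partition both times.
for key in user-1 user-2 user-3 user-1; do
    echo "$key -> partition $(partition_for "$key" 3)"
done
```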
Core APIs
Producer API: publish records to topics.
Consumer API: subscribe to topics and read records.
Streams API: build stream processing applications.
Connector API: integrate Kafka with external systems (e.g., databases).
Installation
Download the desired version from kafka.apache.org and extract it.
[root@along ~]# wget http://mirrors.shu.edu.cn/apache/kafka/2.1.0/kafka_2.11-2.1.0.tgz
[root@along ~]# tar -C /data/ -xvf kafka_2.11-2.1.0.tgz
[root@along ~]# cd /data/kafka_2.11-2.1.0/
Zookeeper Configuration
Kafka requires Zookeeper for cluster coordination.
[root@along ~]# yum -y install java-1.8.0
Modify config/zookeeper.properties as needed (e.g., dataDir, clientPort).
Kafka Broker Configuration
Edit config/server.properties to set broker ID, listeners, log directories, and Zookeeper connection.
broker.id=0
listeners=PLAINTEXT://localhost:9092
log.dirs=/tmp/kafka-logs
zookeeper.connect=localhost:2181
Starting Services
Start Zookeeper and then Kafka:
# nohup zookeeper-server-start.sh config/zookeeper.properties &
# service kafka start
Basic Operations
Create a topic:
# kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic along
Send messages with the console producer; each line typed at the > prompt is sent as one record:
# kafka-console-producer.sh --broker-list localhost:9092 --topic along
> This is a message
Consume messages with the console consumer:
# kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic along --from-beginning
Multi‑Broker Cluster
Copy server.properties to create server-1.properties and server-2.properties, give each copy a unique broker.id, listeners port, and log.dirs path, then start each broker.
# nohup kafka-server-start.sh config/server-1.properties &
# nohup kafka-server-start.sh config/server-2.properties &
Verify the cluster with kafka-topics.sh --describe. To test fault tolerance, kill the broker that leads a partition of a replicated topic; leadership moves automatically to one of the remaining in-sync replicas.
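The copy-and-edit step for the extra brokers can be scripted. The following is a minimal sketch that runs in a scratch directory, so it never touches a real config/ directory. The helper name make_broker_config, the port scheme (9092 plus the broker id), and the "-N" log directory suffix are assumptions; any values work as long as they are unique per broker.

```shell
#!/bin/sh
# Work in a scratch directory so a real config/ directory is never touched.
WORK_DIR=$(mktemp -d)
BASE="$WORK_DIR/server.properties"

# Stand-in for config/server.properties, using the single-broker values.
cat > "$BASE" <<'EOF'
broker.id=0
listeners=PLAINTEXT://localhost:9092
log.dirs=/tmp/kafka-logs
zookeeper.connect=localhost:2181
EOF

# Write server-N.properties with a unique id, port, and log directory.
# Assumptions: port = 9092 + N, log directory = /tmp/kafka-logs-N.
make_broker_config() {
    id=$1
    port=$(( 9092 + id ))
    sed -e "s|^broker\.id=.*|broker.id=$id|" \
        -e "s|^listeners=.*|listeners=PLAINTEXT://localhost:$port|" \
        -e "s|^log\.dirs=.*|log.dirs=/tmp/kafka-logs-$id|" \
        "$BASE" > "$WORK_DIR/server-$id.properties"
}

make_broker_config 1
make_broker_config 2
cat "$WORK_DIR/server-1.properties" "$WORK_DIR/server-2.properties"
```

The zookeeper.connect line is copied unchanged, since all brokers in one cluster point at the same Zookeeper.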
Kafka Connect
Kafka Connect enables importing/exporting data without custom code. Run in standalone mode with configuration files:
# connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties config/connect-file-sink.properties
Example: the source connector reads lines from test.txt into the topic connect-test; the sink connector writes the topic's records to test.sink.txt.
# echo -e "foo\nbar" > test.txt
# cat test.sink.txt
foo
bar
Consume the topic directly to see the JSON payloads.
# kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic connect-test --from-beginning
{"schema":{"type":"string","optional":false},"payload":"foo"}
{"schema":{"type":"string","optional":false},"payload":"bar"}
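Each record in connect-test is a JSON envelope with a schema and a payload. If only the original lines are needed, the payload can be pulled out with a small filter. This is a rough sketch: the function name extract_payloads is hypothetical, and the sed pattern assumes the payload is a plain string with no escaped quotes. A real JSON parser is the safer choice for anything more complex.

```shell
#!/bin/sh
# Print the "payload" string from each Kafka Connect JSON envelope on stdin.
# Simplification: assumes a plain string payload without escaped quotes.
extract_payloads() {
    sed -n 's/.*"payload":"\([^"]*\)".*/\1/p'
}

# Sample input copied from the console consumer output.
extract_payloads <<'EOF'
{"schema":{"type":"string","optional":false},"payload":"foo"}
{"schema":{"type":"string","optional":false},"payload":"bar"}
EOF
```

In practice the console consumer output could be piped into extract_payloads instead of the inline sample.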