Master Apache Kafka: Architecture, Setup, and Essential Commands Explained
This article gives an overview of Apache Kafka: its core concepts, architecture, common use cases, installation steps, and the essential command-line operations for managing topics and brokers.
Hello, I am mikechen. Kafka is a core messaging middleware essential for large-scale architectures and high concurrency.
What is Kafka
Apache Kafka is a distributed publish‑subscribe messaging system originally developed by LinkedIn and now an Apache top‑level project. It is written in Scala and Java and is used for log collection and messaging scenarios.
Typical Use Cases
Log collection : Companies can collect logs from various services using Kafka.
Message system : Decouples producers and consumers, acting as a message cache.
User activity tracking : Records web or app user actions such as page views, searches, and clicks.
Operational metrics : Stores monitoring data, alerts, and reports from distributed applications.
Stream processing : Works with Spark Streaming, Storm, etc.
Kafka Principles
Kafka’s operation involves three main components: producers, brokers, and consumers.
Producer : Publishes messages to Kafka.
Consumer : Subscribes to and pulls messages from Kafka.
Broker : Stores messages and serves them to consumers.
In the typical relationship among the three, producers push data to brokers, while consumers pull data from brokers.
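To make the push/pull split concrete, here is a toy sketch in shell. It is not Kafka: a plain temporary file stands in for a broker's log, and the function names produce and consume_from are made up for illustration. The only idea it is meant to show is that producers append, while each consumer decides from which offset it pulls.

```shell
#!/bin/sh
# Toy model only: a plain file stands in for a broker's message log.
LOG_FILE="${LOG_FILE:-$(mktemp)}"

# Producer side: push one message by appending a line to the log.
produce() {
    printf '%s\n' "$1" >> "$LOG_FILE"
}

# Consumer side: pull every message after the given offset
# (offset = number of messages this consumer has already read).
consume_from() {
    tail -n "+$(( $1 + 1 ))" "$LOG_FILE"
}

produce "page_view /home"
produce "search kafka"
produce "click buy-button"

consume_from 0   # a new consumer pulls all three messages
consume_from 2   # a consumer that already read two pulls only the last one
```

Because the consumer supplies the offset, two consumers can read the same log at different speeds without affecting each other.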
Kafka Architecture
The architecture consists of the following components:
Topic : Logical category of messages; each message belongs to a topic.
Partition : A subdivision of a topic; spreading a topic over several partitions increases throughput and balances load.
Producer : Sends messages to Kafka.
Broker : Kafka server node; a cluster contains one or more brokers.
Consumer : Pulls messages from brokers.
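One way to picture how partitions spread load: if each message carries a key (an assumption here, since the component list above does not mention keys), a producer can hash the key and take the result modulo the partition count, so the same key always lands in the same partition. The sketch below uses cksum as a stand-in hash and a hypothetical partition_for function; Kafka's own partitioner uses a different hash function.

```shell
#!/bin/sh
# Map a message key to a partition number in [0, num_partitions).
# cksum (a CRC) is a stand-in hash here, not the hash Kafka itself uses.
partition_for() {
    key=$1
    num_partitions=$2
    # cksum prints "<checksum> <byte count>"; keep only the checksum.
    hash=$(printf '%s' "$key" | cksum | cut -d ' ' -f 1)
    echo $(( hash % num_partitions ))
}

# The repeated key user-1 maps to the same partition both times.
for key in user-1 user-2 user-3 user-1; do
    echo "$key -> partition $(partition_for "$key" 3)"
done
```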
Setting Up a Kafka Cluster
The cluster set up here relies on three components: a JDK, ZooKeeper, and Kafka itself. Each node gets its own configuration file.
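As a rough sketch of what separate configuration files could look like, the script below writes one minimal properties fragment per broker. The helper write_broker_config, the broker-configs output directory, and the file names are hypothetical; the ZooKeeper host names are the placeholders used in this article's configuration step. A real config/server.properties carries more settings than these two lines.

```shell
#!/bin/sh
# Hypothetical helper: write one minimal properties fragment per broker.
# Host names are placeholders; replace them with your ZooKeeper nodes.
ZK_CONNECT="zookeeper1:2181,zookeeper2:2181,zookeeper3:2181"
OUT_DIR="${OUT_DIR:-./broker-configs}"

write_broker_config() {
    broker_id=$1
    mkdir -p "$OUT_DIR"
    # Only the two per-cluster settings covered in this article are written.
    cat > "$OUT_DIR/server-$broker_id.properties" <<EOF
broker.id=$broker_id
zookeeper.connect=$ZK_CONNECT
EOF
}

# One file per broker, each with a unique broker.id.
for id in 1 2 3; do
    write_broker_config "$id"
done
```

Each generated fragment would then be merged into that machine's copy of config/server.properties.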
1. Download Kafka
wget https://mirror.bit.edu.cn/apache/kafka/2.5.0/kafka_2.13-2.5.0.tgz
2. Extract Kafka
tar -xvf kafka_2.13-2.5.0.tgz
3. Modify Configuration
Set a unique broker.id for each machine:
broker.id=1
broker.id=2
broker.id=3
Configure ZooKeeper connection:
zookeeper.connect=zookeeper1:2181,zookeeper2:2181,zookeeper3:2181
4. Start Kafka
bin/kafka-server-start.sh config/server.properties
Common Kafka Commands
Start Kafka service :
bin/kafka-server-start.sh -daemon config/server.properties
Stop Kafka service :
bin/kafka-server-stop.sh
Create a topic :
bin/kafka-topics.sh --create --topic test0 --zookeeper 127.0.0.1:2181 --partitions 1 --replication-factor 1
List all topics :
bin/kafka-topics.sh --list --zookeeper localhost:2181
Describe all topics (ZooKeeper chroot /kafka) :
bin/kafka-topics.sh --describe --zookeeper cdh-worker-1:2181/kafka
Show partition and replica info for one topic :
bin/kafka-topics.sh --describe --zookeeper 127.0.0.1:2181 --topic test0
Delete a topic :
bin/kafka-topics.sh --delete --zookeeper 127.0.0.1:2181 --topic test0
Send messages :
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
Consume messages from the beginning :
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning
Consume latest messages :
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test
The above commands cover the essential operations for managing a Kafka deployment.
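The topic commands above could be strung together into a small smoke-test script, sketched below under a few assumptions: KAFKA_HOME, ZK, DRY_RUN, and the helper functions are hypothetical names, and the script defaults to dry-run mode so it only prints the commands it would run. The --partitions and --replication-factor values on create are chosen for a single-broker test.

```shell
#!/bin/sh
# Sketch: topic lifecycle as one script. With DRY_RUN=1 (the default)
# the commands are only printed, so no running cluster is needed.
KAFKA_HOME="${KAFKA_HOME:-.}"     # assumed: Kafka install directory
ZK="${ZK:-127.0.0.1:2181}"        # ZooKeeper address (port 2181, not the broker's 9092)
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Every kafka-topics.sh call here goes through the same ZooKeeper address.
topic_cmd() {
    run "$KAFKA_HOME/bin/kafka-topics.sh" "$@" --zookeeper "$ZK"
}

# Create, list, describe, then delete a throwaway topic.
topic_lifecycle() {
    topic=$1
    topic_cmd --create --topic "$topic" --partitions 1 --replication-factor 1
    topic_cmd --list
    topic_cmd --describe --topic "$topic"
    topic_cmd --delete --topic "$topic"
}

topic_lifecycle test0
```

Running it with DRY_RUN=0 from the Kafka install directory would execute the same four commands against a live cluster.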
Mike Chen's Internet Architecture
Over ten years of BAT architecture experience, shared generously!
