
Unlocking Kafka: Deep Dive into Architecture, Reliability, and Deployment

This article explains Kafka's core concepts—including topics, partitions, log segmentation, offset management, and acknowledgment levels—while also providing a step‑by‑step guide to deploying Zookeeper, Kafka, Filebeat, and ELK, troubleshooting common issues, and visualizing logs in Kibana.

MaGe Linux Operations

Kafka Architecture Overview

In Kafka, messages are categorized by topic: producers write to topics and consumers read from them. A topic is a logical concept, while a partition is physical: each partition is an append-only log on disk in which every record is assigned a sequential offset.

Each partition is split into multiple segments, and each segment consists of an .index file and a .log file. The files are stored in a directory named topicName-partitionId. The .index file maps offsets to positions in the .log file, so a record can be located by its offset.

Example: a topic test with three partitions creates the directories test-0, test-1, and test-2.
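
The naming scheme and the offset lookup can be sketched in a few lines of Python. This is an illustration rather than Kafka's own code: the helper names are hypothetical and the base offsets in the usage note are made-up values. It relies on the fact that Kafka names each segment file after the first offset it contains, zero-padded to 20 digits.

```python
from bisect import bisect_right
from typing import Sequence


def partition_dir(topic: str, partition: int) -> str:
    """Directory name for one partition, e.g. 'test-0'."""
    return f"{topic}-{partition}"


def segment_file(base_offset: int, ext: str = "log") -> str:
    """Segment file name: the segment's first offset, zero-padded to 20 digits."""
    return f"{base_offset:020d}.{ext}"


def find_segment(base_offsets: Sequence[int], offset: int) -> int:
    """Return the base offset of the segment that holds `offset`.

    `base_offsets` must be sorted ascending. The owning segment is the one
    with the largest base offset that is <= the requested offset.
    """
    i = bisect_right(base_offsets, offset) - 1
    if i < 0:
        raise ValueError(f"offset {offset} is before the first segment")
    return base_offsets[i]
```

For example, with hypothetical segments starting at offsets 0, 170410, and 239430, find_segment([0, 170410, 239430], 200000) selects the segment with base offset 170410, whose files would be named with segment_file(170410) and segment_file(170410, "index").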

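Producers receive acknowledgments (acks) from the leader of each partition. Kafka offers three acknowledgment levels, set with request.required.acks in the legacy producer and with acks in current clients:

0 – The producer does not wait for any acknowledgment: highest throughput, lowest reliability.

1 – The leader acknowledges once it has written the record to its own log (the default before Kafka 3.0).

-1 / all – All in‑sync replicas must acknowledge: highest reliability (the default since Kafka 3.0).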
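
The three levels can be summarized as a small decision function. The sketch below is a toy model in Python, not the Kafka client API: the function name and its parameters are hypothetical and only capture when a producer would treat a send as complete.

```python
def is_acknowledged(acks: str, leader_written: bool,
                    isr_size: int, isr_written: int) -> bool:
    """Toy model of when a producer treats a send as complete.

    acks           -- "0", "1", "-1" or "all"
    leader_written -- the partition leader has appended the record
    isr_size       -- number of in-sync replicas, leader included
    isr_written    -- how many in-sync replicas hold the record
    """
    if acks == "0":
        # Fire and forget: the producer never waits for the broker.
        return True
    if acks == "1":
        # Only the leader's own write matters.
        return leader_written
    if acks in ("-1", "all"):
        # Every in-sync replica must have the record.
        return leader_written and isr_written == isr_size
    raise ValueError(f"unknown acks level: {acks!r}")
```

With three in-sync replicas of which only two hold the record, is_acknowledged("1", True, 3, 2) is True while is_acknowledged("all", True, 3, 2) is False, which is the trade-off between latency and durability described above.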

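Kafka keeps replicas consistent using two offsets: the LEO (Log End Offset), the offset at which the next record will be written in a replica's log, and the HW (High Watermark), the smallest LEO among the in-sync replicas (ISR). Consumers can read only up to the HW. When a follower fails, it is removed from the ISR; once it recovers, it truncates its log to the HW and resynchronizes from the leader until it catches up and rejoins the ISR. When the leader fails, a new leader is elected from the ISR and the remaining followers truncate their logs to the HW before syncing from the new leader. This keeps the replicas consistent with one another, but it does not by itself guarantee that no data is lost or duplicated.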
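
A minimal Python sketch of this HW-based recovery follows. It models each replica's log as a plain list, so the LEO is simply the list length; the function names, replica names, and messages are all hypothetical, and real brokers handle many details this toy model ignores.

```python
from typing import Dict, List, Set


def high_watermark(logs: Dict[str, List[str]], isr: Set[str]) -> int:
    """HW is the smallest LEO (here: log length) among in-sync replicas."""
    return min(len(logs[replica]) for replica in isr)


def fail_leader(logs: Dict[str, List[str]], isr: Set[str],
                failed_leader: str, new_leader: str) -> Set[str]:
    """Simulate a leader failure; mutates `logs` and returns the new ISR."""
    hw = high_watermark(logs, isr)  # HW at the moment of failure
    remaining = isr - {failed_leader}
    if new_leader not in remaining:
        raise ValueError("new leader must come from the remaining ISR")
    for replica in remaining:
        if replica == new_leader:
            continue
        del logs[replica][hw:]                       # truncate to HW
        logs[replica].extend(logs[new_leader][hw:])  # resync from new leader
    return remaining
```

Suppose replicas A (leader) and B each hold m0 to m4 while C holds only m0 to m3, so the HW is 4. If A fails and C is elected, B drops m4 and ends up identical to C. The replicas agree, but m4 is gone, which is why this mechanism provides consistency rather than durability.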

Deployment Guide: Zookeeper, Kafka, Filebeat, ELK

Prepare the environment with nodes for Elasticsearch, Kibana, Logstash, Apache/Nginx/MySQL, Filebeat, and three Zookeeper/Kafka brokers.

node1: 192.168.67.11   elasticsearch  kibana
node2: 192.168.67.12   elasticsearch
apache: 192.168.67.10   logstash  apache/nginx/mysql
Filebeat node: 192.168.67.13   Filebeat
zk‑kfk01: 192.168.67.21   zookeeper, kafka
zk‑kfk02: 192.168.67.22   zookeeper, kafka
zk‑kfk03: 192.168.67.23   zookeeper, kafka

Start services, configure Filebeat to ship logs to Kafka, and create a Logstash pipeline that reads from the httpd topic and writes to Elasticsearch indices httpd_access-* and httpd_error-*.

output.kafka:
  enabled: true
  hosts: ["192.168.67.21:9092","192.168.67.22:9092","192.168.67.23:9092"]
  topic: "httpd"

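Run Logstash with the pipeline configuration. If it fails to start because another Logstash instance is already using the data directory, point it at a separate directory with --path.data. Then verify the data in Kibana by creating the index pattern httpd_access-*.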
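
The routing that such a pipeline performs can be illustrated in Python. This is only a sketch of the logic, not Logstash configuration: the fields.log_type key and the daily date suffix are assumptions made for the example and do not come from the original setup, which specifies only the httpd_access-* and httpd_error-* index names.

```python
from datetime import date


def target_index(event: dict, day: date) -> str:
    """Pick the Elasticsearch index for one event read from the httpd topic.

    Assumes (hypothetically) that Filebeat tags each event with
    fields.log_type set to "access" or "error".
    """
    log_type = event.get("fields", {}).get("log_type")
    if log_type not in ("access", "error"):
        raise ValueError(f"unexpected log type: {log_type!r}")
    # The daily suffix is an assumed convention; it matches httpd_access-* etc.
    return f"httpd_{log_type}-{day:%Y.%m.%d}"
```

Under these assumptions, an access-log event received on 5 January 2024 would go to httpd_access-2024.01.05, which the Kibana index pattern httpd_access-* matches.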
