Kafka Architecture, Core Concepts, and Operational Best Practices

This article provides a comprehensive overview of Kafka's architecture, core concepts, high‑throughput design, replication, network model, capacity planning, producer and consumer tuning, custom partitioning, rebalance strategies, broker management, and operational tools for building and maintaining robust distributed messaging systems.

IT Architects Alliance

Kafka is a high‑throughput, distributed messaging system that decouples services, enables asynchronous processing, and smooths traffic spikes such as those caused by flash‑sale events.

Core concepts include producers, consumers, topics, partitions, consumer groups, and the controller, a broker elected through ZooKeeper to manage cluster metadata.

In a Kafka cluster each broker stores partitions as directories on disk. Each partition's log is split into segment files (1 GB by default), and Kafka relies on sequential disk writes for write throughput and zero‑copy transfer (sendfile) for read throughput.
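To make the zero‑copy idea concrete, here is a minimal sketch using Java's FileChannel.transferTo, which the JVM can map to sendfile on Linux. The class ZeroCopySketch and the helper sendSegment are hypothetical names chosen for illustration; this is not broker code, and the target here is a plain file channel rather than a socket.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only; not Kafka code.
class ZeroCopySketch {

    /** Sends the whole file to the target channel and returns the bytes transferred. */
    static long sendSegment(Path segment, WritableByteChannel target) throws IOException {
        try (FileChannel source = FileChannel.open(segment, StandardOpenOption.READ)) {
            long position = 0;
            long size = source.size();
            while (position < size) {
                // transferTo may move fewer bytes than requested, so keep going
                // until the whole file is sent. The data is not copied into a
                // user-space buffer by this code.
                position += source.transferTo(position, size - position, target);
            }
            return position;
        }
    }

    public static void main(String[] args) throws IOException {
        Path segment = Files.createTempFile("segment", ".log");
        Path copy = Files.createTempFile("copy", ".log");
        Files.write(segment, "hello kafka".getBytes(StandardCharsets.UTF_8));
        try (FileChannel out = FileChannel.open(copy, StandardOpenOption.WRITE)) {
            System.out.println(sendSegment(segment, out) + " bytes transferred");
        }
    }
}
```

With a non‑blocking socket as the target, transferTo can return 0 when the send buffer is full, so a real server would wait for write readiness instead of looping as this sketch does.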

Log indexing employs sparse indexes with binary search to locate messages quickly, while replication provides high availability: each partition has one leader replica and one or more followers, and the leader tracks the set of in‑sync replicas (ISR).
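The sparse‑index lookup can be sketched as follows. Only some offsets are indexed, so the lookup finds the greatest indexed offset that does not exceed the target and returns its file position; the reader then scans forward from there. The class SparseIndexSketch and its in‑memory arrays are assumptions made for illustration and do not reflect Kafka's on‑disk index format.

```java
import java.util.Arrays;

// Simplified in-memory model of a sparse offset index; not Kafka's file format.
class SparseIndexSketch {
    private final long[] offsets;   // indexed message offsets, ascending
    private final long[] positions; // byte position in the segment for each indexed offset

    SparseIndexSketch(long[] offsets, long[] positions) {
        if (offsets.length != positions.length) {
            throw new IllegalArgumentException("offsets and positions must have equal length");
        }
        this.offsets = offsets.clone();
        this.positions = positions.clone();
    }

    /**
     * Returns the file position of the greatest indexed offset <= targetOffset,
     * or 0 (start of the segment) when the target precedes every index entry.
     */
    long floorPosition(long targetOffset) {
        int i = Arrays.binarySearch(offsets, targetOffset);
        if (i < 0) {
            // Not found: binarySearch returns -(insertionPoint) - 1,
            // and the floor entry sits just before the insertion point.
            i = -i - 2;
        }
        return i < 0 ? 0L : positions[i];
    }

    public static void main(String[] args) {
        SparseIndexSketch index = new SparseIndexSketch(
                new long[] {0, 40, 85, 130},
                new long[] {0, 4096, 8192, 12288});
        // Offset 100 is not indexed; scanning starts at the entry for offset 85.
        System.out.println(index.floorPosition(100));
    }
}
```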

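The network layer follows a reactor pattern: an acceptor thread hands new connections to processor threads, each with its own selector, and requests pass through queues to a pool of handler threads. 10 GbE NICs help keep network bandwidth from becoming the bottleneck under very high concurrency.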

Production‑grade deployment requires capacity planning: estimating request volume, storage (e.g., 10 billion daily requests ≈ 276 TB with a replication factor of 2), the number of physical servers, the choice between SSD and HDD, memory for the OS page cache (≈ 60 GB), and CPU cores (≥ 16).
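The storage part of such an estimate is simple multiplication, sketched below. The text above does not state the average message size or the retention period behind its figure, so every input in this example is a hypothetical value and the result is not meant to reproduce the 276 TB number. CapacityEstimate and storageBytes are illustrative names.

```java
// Back-of-the-envelope storage estimate; all inputs in main are hypothetical.
class CapacityEstimate {

    /** Raw log storage in bytes: messages/day x average size x replicas x days retained. */
    static long storageBytes(long messagesPerDay, long avgMessageBytes,
                             int replicationFactor, int retentionDays) {
        // multiplyExact throws ArithmeticException on overflow instead of wrapping silently.
        long perDay = Math.multiplyExact(messagesPerDay, avgMessageBytes);
        long replicated = Math.multiplyExact(perDay, (long) replicationFactor);
        return Math.multiplyExact(replicated, (long) retentionDays);
    }

    public static void main(String[] args) {
        // Assumed workload: 100 million messages/day, 2 KiB each, 2 replicas, 7 days.
        long bytes = storageBytes(100_000_000L, 2_048L, 2, 7);
        System.out.printf("%.2f TiB%n", bytes / Math.pow(1024, 4));
    }
}
```

A real plan would add headroom for index files, disk fragmentation, and growth on top of this raw figure.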

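Producer tuning parameters such as buffer.memory, compression.type, batch.size, linger.ms and acks affect throughput and durability: larger batches and a longer linger time raise throughput at the cost of latency, while acks determines how many replicas must confirm a write.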
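As an illustration, the settings named above could be collected in a Properties object like this. The property keys are the producer configuration names from the text; the values are hypothetical starting points for a throughput‑leaning setup, not recommendations, and ProducerTuningSketch is an illustrative name. The sketch uses only java.util.Properties, so it does not need the Kafka client on the classpath.

```java
import java.util.Properties;

// Illustrative values only; tune against your own workload.
class ProducerTuningSketch {

    static Properties throughputLeaningConfig() {
        Properties props = new Properties();
        props.setProperty("buffer.memory", "67108864"); // 64 MiB of memory for unsent records
        props.setProperty("compression.type", "lz4");   // compress whole batches on the producer
        props.setProperty("batch.size", "65536");       // up to 64 KiB per partition batch
        props.setProperty("linger.ms", "20");           // wait up to 20 ms for a batch to fill
        props.setProperty("acks", "all");               // wait for all in-sync replicas
        return props;
    }

    public static void main(String[] args) {
        throughputLeaningConfig().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Setting acks to 1 or 0 trades durability for lower latency, so the choice depends on how much data loss the application can tolerate.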

Consumer offset management moved from ZooKeeper to the internal __consumer_offsets topic, with configurable commit intervals and offset reset policies.
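The commit interval and reset policy mentioned above correspond to consumer settings such as enable.auto.commit, auto.commit.interval.ms and auto.offset.reset. The sketch below gathers them in a Properties object; the group name and the values are hypothetical, and ConsumerOffsetConfigSketch is an illustrative name.

```java
import java.util.Properties;

// Illustrative values only; the group id is a made-up example.
class ConsumerOffsetConfigSketch {

    static Properties autoCommitConfig() {
        Properties props = new Properties();
        props.setProperty("group.id", "example-group");        // hypothetical consumer group
        props.setProperty("enable.auto.commit", "true");       // commit offsets in the background
        props.setProperty("auto.commit.interval.ms", "5000");  // commit every 5 seconds
        // Where to start when the group has no committed offset:
        // "earliest", "latest", or "none" (fail instead of guessing).
        props.setProperty("auto.offset.reset", "earliest");
        return props;
    }

    public static void main(String[] args) {
        autoCommitConfig().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

With auto commit, a crash between processing and the next commit can cause messages to be reprocessed, so applications that need tighter control usually disable it and commit explicitly.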

Custom partitioners can be implemented in Java, for example the HotDataPartitioner shown below, and registered via props.put("partitioner.class", "com.zhss.HotDataPartitioner").

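import java.util.Map;
import java.util.Random;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;

public class HotDataPartitioner implements Partitioner {
    private Random random;

    @Override
    public void configure(Map<String, ?> configs) {
        random = new Random();
    }

    @Override
    public int partition(String topic, Object keyObj, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        // Total partition count keeps the hot partition index stable.
        int partitionCount = cluster.partitionsForTopic(topic).size();
        int hotDataPartition = partitionCount - 1; // last partition takes hot keys
        String key = (String) keyObj;
        boolean hot = key != null && key.contains("hot_data");
        // nextInt(0) throws, so a one-partition topic always uses partition 0.
        return (hot || partitionCount == 1) ? hotDataPartition : random.nextInt(partitionCount - 1);
    }

    @Override
    public void close() { }
}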

Rebalance strategies (range, round‑robin, sticky) determine how partitions are assigned to consumers; the group coordinator handles join, sync and rebalance cycles.
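To show what the range strategy does for a single topic, the sketch below sorts the consumers and gives each a contiguous block of partitions, with the first few consumers taking one extra partition when the division is uneven. This is a simplified model written for illustration; RangeAssignmentSketch and assign are hypothetical names, not the Kafka client's own assignor classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Simplified single-topic model of range assignment; not the Kafka client API.
class RangeAssignmentSketch {

    /** Assigns partitions 0..partitionCount-1 to consumers in contiguous ranges. */
    static Map<String, List<Integer>> assign(List<String> consumers, int partitionCount) {
        if (consumers.isEmpty()) {
            throw new IllegalArgumentException("at least one consumer is required");
        }
        List<String> sorted = new ArrayList<>(consumers);
        Collections.sort(sorted);

        int base = partitionCount / sorted.size();  // partitions every consumer gets
        int extra = partitionCount % sorted.size(); // first 'extra' consumers get one more

        Map<String, List<Integer>> assignment = new TreeMap<>();
        int next = 0;
        for (int i = 0; i < sorted.size(); i++) {
            int count = base + (i < extra ? 1 : 0);
            List<Integer> owned = new ArrayList<>();
            for (int p = 0; p < count; p++) {
                owned.add(next++);
            }
            assignment.put(sorted.get(i), owned);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Seven partitions over three consumers: the first consumer takes the extra one.
        System.out.println(assign(List.of("c1", "c2", "c3"), 7));
    }
}
```

Because the extra partitions always go to the first consumers in sort order, range assignment can become lopsided across many topics, which is one reason round‑robin and sticky strategies exist.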

Broker management includes tracking each replica's Log End Offset (LEO) and the partition's High Watermark (HW), controller election, delayed operations, and a timing‑wheel scheduler that inserts delayed tasks in O(1).
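The relationship between LEO and HW can be sketched in a few lines: the high watermark is the smallest log end offset among the in‑sync replicas, and consumers can read only messages below it. The class HighWatermarkSketch, the replica ids, and the offsets are hypothetical values for illustration, not broker internals.

```java
import java.util.Map;
import java.util.Set;

// Simplified model: HW = min(LEO) over the in-sync replica set.
class HighWatermarkSketch {

    static long highWatermark(Map<Integer, Long> logEndOffsets, Set<Integer> isr) {
        if (isr.isEmpty()) {
            throw new IllegalArgumentException("ISR must not be empty");
        }
        long hw = Long.MAX_VALUE;
        for (Integer replicaId : isr) {
            Long leo = logEndOffsets.get(replicaId);
            if (leo == null) {
                throw new IllegalArgumentException("no LEO known for replica " + replicaId);
            }
            hw = Math.min(hw, leo);
        }
        return hw;
    }

    public static void main(String[] args) {
        // Replica 3 has fallen out of the ISR, so its lagging LEO does not hold the HW back.
        Map<Integer, Long> leos = Map.of(1, 120L, 2, 100L, 3, 90L);
        System.out.println(highWatermark(leos, Set.of(1, 2)));
    }
}
```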

Operational tools such as Kafka‑Manager and command‑line utilities (kafka-topics.sh, kafka-reassign-partitions.sh) assist with topic creation, partition reassignment, and load balancing.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

IT Architects Alliance