Explanation of Kafka Components and Architecture

This article gives an overview of Kafka’s core components (brokers, topics, partitions, producers, and consumers), explaining their roles, the partition log structure, the replication mechanism, and the overall system architecture, with illustrative diagrams.

IT Architects Alliance

1. Broker

Each Kafka server is called a broker, and multiple brokers together form a Kafka cluster. A single machine can run one or more brokers; brokers that register with the same ZooKeeper ensemble belong to the same cluster.

2. Topic

Kafka is a publish‑subscribe messaging system. A topic represents a category of messages; each topic typically holds one class of messages and has one or more subscribers (consumers). Producers publish messages to a topic, and consumers pull messages from the topic they subscribe to.

3. Topic and Broker

A broker can host one or many topics, and the same topic can be distributed across multiple brokers within the same cluster.

4. Partition Log

Each topic is divided into one or more partitions. Every partition corresponds to a logical, append‑only log that is stored on disk as a sequence of segment files. When a message is published to a partition, the broker appends it to the last (active) segment of that log. Segments are flushed to disk after a configurable number of messages has been written or a configurable amount of time has elapsed.

Each message appended to a partition is assigned a sequential offset. An offset identifies a message only within its own partition, not across the whole topic.

Partitions retain all published records for a configurable retention period (e.g., 2 days), regardless of consumption status.
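The log behaviour described in this section can be sketched with a small in-memory model. This is only an illustration, not Kafka's implementation: the `PartitionLog` class and its method names are hypothetical, records are kept in a Python list rather than in segment files, and retention is applied per record, whereas a real broker deletes whole segments.

```python
import time
from dataclasses import dataclass, field


@dataclass
class PartitionLog:
    """Toy in-memory model of one partition's append-only log (hypothetical)."""
    retention_seconds: float
    base_offset: int = 0                         # offset of the oldest retained record
    records: list = field(default_factory=list)  # (timestamp, value) pairs

    def append(self, value, now=None):
        """Append a record at the end and return the offset assigned to it."""
        ts = time.time() if now is None else now
        self.records.append((ts, value))
        return self.base_offset + len(self.records) - 1

    def read(self, offset, max_records=10):
        """Return (offset, value) pairs starting at `offset`."""
        start = max(offset - self.base_offset, 0)
        chunk = self.records[start:start + max_records]
        return [(self.base_offset + start + i, v) for i, (_, v) in enumerate(chunk)]

    def enforce_retention(self, now=None):
        """Drop records older than the retention period, consumed or not."""
        now = time.time() if now is None else now
        expired = 0
        for ts, _ in self.records:
            if now - ts <= self.retention_seconds:
                break
            expired += 1
        del self.records[:expired]
        self.base_offset += expired


log = PartitionLog(retention_seconds=2 * 24 * 3600)  # 2 days, as in the example above
first = log.append("a", now=0)
second = log.append("b", now=100_000)
log.enforce_retention(now=200_000)   # "a" is now older than 2 days
remaining = log.read(0)
```

Note that offsets are never reused: after the oldest record expires, the surviving record keeps the offset it was originally given.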

5. Partition Distribution

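Partitions are distributed across the brokers of a cluster, and each partition can be replicated to several brokers for fault tolerance (the number of replicas is configurable). One replica acts as the leader and handles read and write requests for the partition, while the others are followers that replicate the leader's data asynchronously. If the leader fails, one of the followers takes over as the new leader.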

Example leader/follower assignments for several topics are illustrated, showing how each partition’s leader may reside on a different broker to balance load.
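As a rough illustration of how leaders can end up on different brokers, the following sketch places replicas round-robin. The function name `assign_replicas` and the placement rule are assumptions made for this example; Kafka's actual assignment logic is more involved.

```python
def assign_replicas(num_partitions, brokers, replication_factor):
    """Map each partition to a leader and followers (simplified round-robin)."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    assignment = {}
    for p in range(num_partitions):
        # Consecutive brokers hold the replicas; the first one is the leader.
        replicas = [brokers[(p + r) % len(brokers)] for r in range(replication_factor)]
        assignment[p] = {"leader": replicas[0], "followers": replicas[1:]}
    return assignment


layout = assign_replicas(num_partitions=4, brokers=[0, 1, 2], replication_factor=2)
for partition, replicas in layout.items():
    print(partition, replicas)
```

With three brokers and four partitions, the leaders rotate over brokers 0, 1, 2, and 0 again, so no single broker serves all read and write traffic.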

6. Producer

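The producer is the source of messages. After creating a message, it sends the message to a specific partition of a topic. The partition is chosen either by a partition‑selection algorithm (for example, hashing the message key so that messages with the same key land in the same partition) or randomly.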
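A minimal sketch of partition selection on the producer side might look like this. The `SimplePartitioner` class is hypothetical, `zlib.crc32` is only a stand-in for whatever hash function a real client uses, and keyless messages are spread round-robin here instead of randomly so that the result is reproducible.

```python
import itertools
import zlib


class SimplePartitioner:
    """Illustrative partitioner: hash keyed messages, rotate keyless ones."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._counter = itertools.count()

    def partition(self, key=None):
        if key is None:
            # No key: spread messages evenly over all partitions.
            return next(self._counter) % self.num_partitions
        # Same key -> same hash -> same partition, which preserves per-key order.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions


partitioner = SimplePartitioner(num_partitions=3)
p1 = partitioner.partition("user-42")
p2 = partitioner.partition("user-42")
keyless = [partitioner.partition() for _ in range(4)]
```

Because ordering is only maintained within a partition, routing all messages that share a key to one partition is what keeps them in order relative to each other.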

7. Consumer

Consumers are organized into consumer groups; each group may consist of processes on different machines.

Each message in a topic can be consumed by multiple consumer groups, but only one consumer within a group processes a given message.

Consumers can subscribe to multiple topics.

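Each consumer group tracks its progress as a committed offset per partition. Early Kafka versions stored these offsets in ZooKeeper; later versions store them in Kafka itself, in the internal `__consumer_offsets` topic.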
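The consumer-group rules above can be modelled in a few lines. This is a sketch under stated assumptions: `assign_partitions` and `OffsetStore` are hypothetical names, the round-robin split is just one possible assignment strategy, and the store is a plain dictionary standing in for wherever offsets are actually persisted.

```python
def assign_partitions(partitions, consumers):
    """Give every partition exactly one owner within a consumer group."""
    members = sorted(consumers)
    if not members:
        raise ValueError("a consumer group needs at least one consumer")
    assignment = {c: [] for c in members}
    for i, p in enumerate(sorted(partitions)):
        assignment[members[i % len(members)]].append(p)
    return assignment


class OffsetStore:
    """Committed offsets keyed by (group, topic, partition)."""

    def __init__(self):
        self._offsets = {}

    def commit(self, group, topic, partition, offset):
        self._offsets[(group, topic, partition)] = offset

    def fetch(self, group, topic, partition):
        # A group that has never committed starts from offset 0 in this model.
        return self._offsets.get((group, topic, partition), 0)


# Two independent groups read the same four partitions.
group_a = assign_partitions([0, 1, 2, 3], ["a1", "a2"])
group_b = assign_partitions([0, 1, 2, 3], ["b1"])

offsets = OffsetStore()
offsets.commit("group-a", "orders", 0, 5)
```

Each group sees every partition, but inside a group a partition has a single owner, and progress is recorded per group, so `group-b` is unaffected by what `group-a` has committed.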

Architecture Diagram

The following diagram visualizes the relationships among brokers, topics, partitions, producers, and consumers.

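In newer Kafka versions, consumers no longer communicate directly with ZooKeeper: Kafka-based offset storage arrived in the 0.8 line (0.8.2), and the consumer client introduced in 0.9 talks only to the brokers. The architecture diagram has been updated accordingly.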
