
Kafka Architecture Overview: Producers, Consumers, Partitions, Replication, and Transactions

This article provides a comprehensive overview of Apache Kafka's architecture: producer and consumer workflows, partition and replica management, leader election, offset handling, message delivery semantics, transaction support, and on-disk file organization. Together, these pieces show how Kafka achieves high performance and scalability.

Architect's Tech Stack

Introduction

Kafka is a distributed message queue offering high performance, persistence, replication, and horizontal scalability. Producers write messages to topics and consumers read from them; in architecture design, the system is commonly used for decoupling, peak-load buffering, and asynchronous processing.

Data Flow

Topics consist of multiple partitions; each partition can be replicated across brokers. Producers send messages to a specific topic and partition, while consumers read from assigned partitions. Partition leaders handle all writes, and followers replicate data.

Production Process

To produce a record, the client specifies the target topic, value, optional key, and partition. Records are serialized and batched before being sent. If no partition is provided, the producer uses either key‑based hashing or round‑robin selection.
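The partition-selection logic can be sketched as follows. This is an illustrative class, not Kafka's actual `DefaultPartitioner`: the real producer hashes keys with murmur2 and, since 2.4, uses sticky partitioning rather than strict round-robin for unkeyed records.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative partition selection: key-based hashing when a key is present,
// round-robin otherwise. Kafka's DefaultPartitioner uses murmur2 for the hash;
// Arrays.hashCode is used here only to keep the sketch dependency-free.
class SimplePartitioner {
    private final AtomicInteger counter = new AtomicInteger(0);

    int partition(byte[] keyBytes, int numPartitions) {
        if (keyBytes == null) {
            // No key: spread records across partitions in turn.
            return (counter.getAndIncrement() & 0x7fffffff) % numPartitions;
        }
        // Keyed record: the same key always lands in the same partition.
        return (Arrays.hashCode(keyBytes) & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        SimplePartitioner p = new SimplePartitioner();
        byte[] key = "order-42".getBytes();
        System.out.println(p.partition(key, 6) == p.partition(key, 6)); // true: keyed routing is deterministic
        System.out.println(p.partition(null, 3)); // 0 (round-robin counter starts at 0)
        System.out.println(p.partition(null, 3)); // 1
    }
}
```

Key-based routing is what lets Kafka guarantee per-key ordering: all records with the same key go to the same partition, and each partition is an ordered log.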

Producer API

Kafka historically offered two flavors of client API: a high-level API that abstracts broker routing and (on the consumer side) offset management, and a low-level "simple" API in which the user must locate partition leaders and track offsets manually.

Partition Management

Each partition has a leader broker; all client requests for that partition go through the leader and are then synchronized to followers. A controller (selected via ZooKeeper) is responsible for partition assignment and leader election.

Partition Assignment Algorithm

1. Sort all brokers and partitions.
2. Assign the leader of partition i to broker (i mod n), where n is the number of brokers.
3. Assign replica j of partition i to broker ((i + j) mod n).
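The steps above can be sketched as a small function. Real Kafka additionally randomizes the starting broker and applies rack-aware placement; this keeps only the core modulo scheme.

```java
import java.util.*;

// Round-robin replica assignment: partition i's leader goes to broker
// (i mod n), and its j-th replica to broker ((i + j) mod n). The first
// entry of each replica list is the leader.
class ReplicaAssignment {
    static Map<Integer, List<Integer>> assign(int numBrokers, int numPartitions, int replicationFactor) {
        Map<Integer, List<Integer>> assignment = new LinkedHashMap<>();
        for (int i = 0; i < numPartitions; i++) {
            List<Integer> replicas = new ArrayList<>();
            for (int j = 0; j < replicationFactor; j++) {
                replicas.add((i + j) % numBrokers); // j == 0 is the leader
            }
            assignment.put(i, replicas);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // 3 brokers, 3 partitions, replication factor 2:
        System.out.println(assign(3, 3, 2)); // {0=[0, 1], 1=[1, 2], 2=[2, 0]}
    }
}
```

Note how leaders end up evenly spread across brokers, which balances write load since all writes go through leaders.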

Leader Failover

The controller watches ZooKeeper for broker failures. When a broker goes down, the controller selects a new leader from the in‑sync replica (ISR) list and updates ZooKeeper, then notifies affected brokers.
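The election rule can be sketched as: pick the first replica, in assignment order, that is both in the ISR and still alive. The names below are illustrative, not Kafka's internal controller API.

```java
import java.util.*;

// Sketch of ISR-based leader election. Preferring assignment order keeps
// leadership close to the "preferred replica" and thus keeps load balanced.
class LeaderElection {
    static OptionalInt electLeader(List<Integer> assignedReplicas,
                                   Set<Integer> isr,
                                   Set<Integer> liveBrokers) {
        return assignedReplicas.stream()
                .filter(b -> isr.contains(b) && liveBrokers.contains(b))
                .mapToInt(Integer::intValue)
                .findFirst();
    }

    public static void main(String[] args) {
        // Broker 1 (the old leader) dies; broker 2 is the first live in-sync replica.
        OptionalInt leader = electLeader(
                Arrays.asList(1, 2, 3),
                new HashSet<>(Arrays.asList(2, 3)),
                new HashSet<>(Arrays.asList(2, 3)));
        System.out.println(leader); // OptionalInt[2]
    }
}
```

If the ISR is empty, an empty result is returned; real Kafka then either waits or, with unclean leader election enabled, accepts potential data loss by electing an out-of-sync replica.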

Replication and Acknowledgments

Followers pull batches of messages from the leader. Reliability is controlled by the producer's acks setting (request.required.acks in legacy clients). With acks=-1 (all), a write is acknowledged only after the full ISR has replicated it; if the ISR size falls below min.insync.replicas, the write is rejected.
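As a concrete configuration, the producer-side settings might look like the following. This is a minimal sketch: min.insync.replicas is a broker/topic setting, not a producer one, and is mentioned here only in the comments.

```java
import java.util.Properties;

// Producer reliability settings as a plain Properties object. "acks" is the
// modern key; "request.required.acks" is the legacy name. For durability,
// pair acks=all with a topic-level min.insync.replicas of at least 2.
class ReliabilityConfig {
    static Properties producerProps() {
        Properties props = new Properties();
        props.put("acks", "all");                // equivalent to -1: wait for the full ISR
        props.put("retries", "3");               // retry transient send failures
        props.put("enable.idempotence", "true"); // avoid duplicates introduced by retries
        return props;
    }
}
```

With acks=0 the producer does not wait at all, and with acks=1 it waits only for the leader; both trade durability for latency.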

Consumer Model

Consumers belong to consumer groups; each partition is consumed by only one member of the group, though multiple groups can read the same partition. Offsets were historically stored in ZooKeeper but are now kept in an internal __consumer_offsets topic.

__consumer_offsets partition = Math.abs(groupId.hashCode() % groupMetadataTopicPartitionCount)
// groupMetadataTopicPartitionCount is set by offsets.topic.num.partitions (default 50)
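The formula above, as runnable Java. One detail worth noting: Kafka's own utility uses hashCode() & 0x7fffffff rather than Math.abs, which sidesteps the Integer.MIN_VALUE edge case where Math.abs returns a negative number.

```java
// Maps a consumer group to its partition of the internal __consumer_offsets
// topic. All offset commits and group metadata for that group land in this
// one partition, whose leader broker acts as the group's coordinator.
class OffsetsPartition {
    static int partitionFor(String groupId, int groupMetadataTopicPartitionCount) {
        return (groupId.hashCode() & 0x7fffffff) % groupMetadataTopicPartitionCount;
    }

    public static void main(String[] args) {
        int p = partitionFor("my-consumer-group", 50); // default partition count is 50
        System.out.println(p >= 0 && p < 50); // true; the mapping is deterministic per group
    }
}
```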

Consumer Group Coordination

A coordinator (the leader of the offset‑storage partition) handles partition assignment and rebalancing. Consumers request metadata, send heartbeats, and participate in JoinGroup/SyncGroup protocols to receive partition assignments.

Rebalance Triggers

Rebalancing occurs when partitions are added, consumers join or leave, or when the coordinator fails.

Message Delivery Semantics

Kafka supports three delivery guarantees:

At most once – messages may be lost but never duplicated.

At least once – messages are never lost but may be duplicated.

Exactly once – no loss and no duplication, available from version 0.11 when downstream is also Kafka.
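On the consumer side, the difference between the first two guarantees comes down to when the offset is committed relative to processing. A minimal simulation (plain lists stand in for a real poll result; no Kafka client involved):

```java
import java.util.*;
import java.util.function.Consumer;

// Commit-ordering sketch: committing before processing gives at-most-once
// (a crash after the commit but before processing loses the record), while
// committing after processing gives at-least-once (a crash after processing
// but before the commit causes the record to be reprocessed on restart).
class DeliverySemanticsDemo {
    static long atMostOnce(List<String> records, Consumer<String> process, long committedOffset) {
        for (String r : records) {
            committedOffset++;     // 1. commit first
            process.accept(r);     // 2. then process: a crash here loses r
        }
        return committedOffset;
    }

    static long atLeastOnce(List<String> records, Consumer<String> process, long committedOffset) {
        for (String r : records) {
            process.accept(r);     // 1. process first: a crash after this duplicates r
            committedOffset++;     // 2. then commit
        }
        return committedOffset;
    }
}
```

Exactly-once layers idempotent producers and transactions on top of the at-least-once path, so that reprocessed records are deduplicated downstream.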

Transactions

Transactions enable atomic writes to multiple topics and coordinated offset commits. A user-supplied transactional ID (tid) identifies a logical producer across sessions, while an internally assigned producer ID (pid), together with an epoch, fences out zombie producers. A transaction coordinator records transaction state in a compacted internal log.

Transactional Flow

1. Begin the transaction (BEGIN).
2. Write records to the target partitions.
3. Record the transaction state (PREPARE COMMIT or PREPARE ABORT) in the coordinator log.
4. Write marker messages to the involved partitions.
5. Commit or abort the transaction, updating the coordinator log.
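The coordinator's side of this flow is essentially a state machine. The sketch below models it with names patterned after Kafka's internal transaction states; the transition table is simplified for illustration.

```java
import java.util.*;

// Simplified transaction state machine as kept by the coordinator.
// A transaction moves EMPTY -> ONGOING -> PREPARE_* -> COMPLETE_*,
// and a completed slot can host the producer's next transaction.
class TxnStateMachine {
    enum State { EMPTY, ONGOING, PREPARE_COMMIT, PREPARE_ABORT, COMPLETE_COMMIT, COMPLETE_ABORT }

    private static final Map<State, Set<State>> VALID = new EnumMap<>(State.class);
    static {
        VALID.put(State.EMPTY, EnumSet.of(State.ONGOING));
        VALID.put(State.ONGOING, EnumSet.of(State.PREPARE_COMMIT, State.PREPARE_ABORT));
        VALID.put(State.PREPARE_COMMIT, EnumSet.of(State.COMPLETE_COMMIT));
        VALID.put(State.PREPARE_ABORT, EnumSet.of(State.COMPLETE_ABORT));
        VALID.put(State.COMPLETE_COMMIT, EnumSet.of(State.ONGOING)); // next transaction may begin
        VALID.put(State.COMPLETE_ABORT, EnumSet.of(State.ONGOING));
    }

    static boolean canTransition(State from, State to) {
        return VALID.getOrDefault(from, EnumSet.noneOf(State.class)).contains(to);
    }
}
```

Because each transition is appended to a compacted log before it takes effect, a failed coordinator can rebuild this state machine on recovery and resume any in-flight PREPARE phase.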

File Organization

Kafka stores data as log segments on the filesystem. Each topic is split into partitions, and each partition is a sequence of segment files, each named after the smallest offset it contains. Index files (.index) provide sparse offset-to-position mappings for fast lookup.
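The naming scheme and sparse-index lookup can be sketched as follows. Sample offsets and byte positions are invented for illustration; a real .index file stores offsets relative to the segment's base offset.

```java
// Segment naming and sparse-index lookup. Segment files are named by their
// base offset, zero-padded to 20 digits. Because the index is sparse, a read
// finds the greatest indexed offset <= the target, seeks to its byte
// position, and scans forward from there.
class LogSegmentLookup {
    static String segmentFileName(long baseOffset) {
        return String.format("%020d.log", baseOffset);
    }

    // offsets must be sorted ascending; returns the file position of the
    // greatest indexed offset <= target, or -1 if target precedes the index.
    static long floorPosition(long[] offsets, long[] positions, long target) {
        int lo = 0, hi = offsets.length - 1, ans = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (offsets[mid] <= target) { ans = mid; lo = mid + 1; }
            else { hi = mid - 1; }
        }
        return ans < 0 ? -1 : positions[ans];
    }

    public static void main(String[] args) {
        System.out.println(segmentFileName(368769)); // 00000000000000368769.log
        long[] offs = {0, 100, 200};   // indexed offsets (sparse)
        long[] pos  = {0, 4096, 8210}; // corresponding byte positions
        System.out.println(floorPosition(offs, pos, 150)); // 4096: seek there, then scan
    }
}
```

Sparseness is the point: the index stays small enough to memory-map, while the forward scan it leaves behind is bounded by the index interval.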

Common Configuration

Broker and topic configurations control replication factor, log retention policies (time‑based or size‑based), and cleanup strategies (delete or compact). Offsets are stored in a compacted topic, and log cleanup respects the latest message timestamp.

Tags: distributed systems, big data, transactions, Kafka, message queue, consumer, producer
Written by

Architect's Tech Stack

Java backend, microservices, distributed systems, containerized programming, and more.
