
Understanding Kafka: Architecture, Topics, Partitions, Producers, Consumers, Offsets, Transactions, and Configuration

This article provides a comprehensive overview of Apache Kafka, explaining its distributed message‑queue architecture, the role of topics and partitions, producer and consumer workflows, leader election, offset management, consumer‑group rebalancing, delivery semantics, transaction processing, file organization, and key configuration settings.

IT Architects Alliance

Kafka is a distributed message queue that offers high throughput, durable storage, replication, and horizontal scalability. Producers write messages to topics, and consumers read them to run business logic. This decouples systems, smooths traffic spikes, and enables asynchronous processing.

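Each topic consists of one or more partitions, which is what allows a topic to scale horizontally across brokers. Ordering is guaranteed only within a partition, not across the whole topic. New messages are appended sequentially to the partition's log files, which gives high write throughput.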

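A producer sends records that specify a topic and a value, plus an optional key and partition. If no partition is given, the key (if present) is hashed to select one, so records with the same key land in the same partition. If there is no key either, partitions are chosen round-robin (since Kafka 2.4 the default partitioner instead fills a batch for one partition before moving to the next). Records are accumulated into batches before being sent to the broker.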
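The selection rules above can be sketched as a small class. This is only an illustration: the name SimplePartitioner is hypothetical, and zlib.crc32 stands in for the murmur2 hash that Kafka's Java client uses, so the partition numbers it produces will not match a real producer.

```python
import itertools
import zlib


class SimplePartitioner:
    """Toy partition chooser (hypothetical, not Kafka's real partitioner)."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._counter = itertools.count()  # drives round-robin for keyless records

    def choose(self, key=None, partition=None):
        # 1. An explicitly requested partition always wins.
        if partition is not None:
            return partition
        # 2. With a key, hash it so equal keys map to the same partition.
        #    crc32 is a stand-in for the hash a real client would use.
        if key is not None:
            return zlib.crc32(key.encode("utf-8")) % self.num_partitions
        # 3. Without a key, rotate through the partitions.
        return next(self._counter) % self.num_partitions
```

Because the key-based branch is deterministic, all records for one key stay in one partition and therefore keep their relative order.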

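Kafka has traditionally used ZooKeeper to store metadata about brokers, topics, and partitions, and to elect a Controller, the broker responsible for partition assignment and leader election. (Newer releases can run in KRaft mode, which replaces ZooKeeper with a built-in quorum; this article describes the ZooKeeper-based design.)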

Partition Assignment:

Sort all brokers and partitions.

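Assign partition i to broker (i mod n) as leader, where n is the number of brokers.

Assign replica j of partition i to broker ((i + j) mod n), so that replica 0 is the leader and the remaining replicas fall on the following brokers.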
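A minimal sketch of these three steps, assuming brokers are identified by integer IDs. The function name assign_replicas is illustrative, and real Kafka adds randomized starting offsets and rack awareness that are left out here.

```python
def assign_replicas(num_partitions, brokers, replication_factor):
    """Return {partition: [broker, ...]}; the first broker in each list is the leader."""
    brokers = sorted(brokers)          # step 1: sort the brokers
    n = len(brokers)
    if replication_factor > n:
        raise ValueError("replication factor cannot exceed the number of brokers")
    assignment = {}
    for i in range(num_partitions):    # partitions are visited in sorted order
        # replica j of partition i goes to broker (i + j) mod n; j = 0 is the leader
        assignment[i] = [brokers[(i + j) % n] for j in range(replication_factor)]
    return assignment
```

With brokers 101, 102, 103 and a replication factor of 2, partition 0 gets [101, 102], partition 1 gets [102, 103], and partition 2 wraps around to [103, 101].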

Leader Election and Failover:

Each partition has a leader that handles all read/write requests; followers replicate from the leader. When a broker fails, the Controller re‑elects leaders for affected partitions using the ISR (in‑sync replica) list stored in ZooKeeper.
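The failover step can be pictured with the following sketch. The data layout (a dict with leader, replicas, and isr per partition) and the function name fail_over are assumptions made for illustration. It picks the first surviving in-sync replica in assignment order and returns None when no in-sync replica survives.

```python
def fail_over(partitions, failed_broker):
    """Return {partition_name: new_leader} for partitions led by failed_broker."""
    new_leaders = {}
    for name, state in partitions.items():
        if state["leader"] != failed_broker:
            continue  # this partition's leader is unaffected
        # Only replicas that are still in sync may take over.
        surviving_isr = {b for b in state["isr"] if b != failed_broker}
        # Prefer candidates in the order the replicas were assigned.
        candidates = [b for b in state["replicas"] if b in surviving_isr]
        new_leaders[name] = candidates[0] if candidates else None
    return new_leaders
```

A None result corresponds to a partition that stays offline until an in-sync replica comes back (or until a less safe election is explicitly allowed).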

Consumer Groups and Rebalancing:

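Consumers subscribe to topics as members of a consumer group. Within a group, each partition is consumed by exactly one consumer, while different groups can consume the same partition independently. Each group has a group coordinator: the broker that leads the partition of the offsets topic to which the group maps. The coordinator manages group membership and triggers a rebalance when consumers join or leave, or when the number of partitions changes. A rebalance proceeds as follows: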

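Each consumer sends a JoinGroupRequest to the coordinator.

The coordinator selects one member as group leader, and that member computes the partition assignment.

The leader sends the assignment to the coordinator in its SyncGroupRequest, and the coordinator returns each member's share in the SyncGroupResponse.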
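The exchange above can be simulated in a few lines. This is a sketch under simplifying assumptions: the first member to join becomes the group leader, and the leader uses a plain round-robin strategy. The function names are hypothetical, and real clients choose among several pluggable assignors.

```python
def compute_assignment(members, partitions):
    """Assignment the group leader would compute (simple round-robin strategy)."""
    members = sorted(members)
    assignment = {member: [] for member in members}
    for index, partition in enumerate(sorted(partitions)):
        assignment[members[index % len(members)]].append(partition)
    return assignment


def rebalance(member_ids, partitions):
    """Simulate one rebalance; member_ids is in join order."""
    if not member_ids:
        raise ValueError("a group needs at least one member")
    # Join phase: the coordinator picks a leader (here: the first member to join).
    leader = member_ids[0]
    # The leader computes the plan; in the sync phase the coordinator
    # hands each member only its own entry from this plan.
    plan = compute_assignment(member_ids, partitions)
    return leader, plan
```

Calling rebalance again with an extra member shows why a membership change moves partitions between consumers: the plan is recomputed from scratch for the new member list.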

Offset Storage:

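Offsets were originally stored in ZooKeeper. Newer versions store them in the internal __consumer_offsets topic instead (available as an option since Kafka 0.8.2 and used by the new consumer since 0.9). The topic uses the compact cleanup policy, and each record is keyed by groupId, topic, and partition, so only the latest committed offset per key needs to be retained.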

The partition of __consumer_offsets that holds a given group's offsets is calculated as:

partition = Math.abs(groupId.hashCode() % groupMetadataTopicPartitionCount) // groupMetadataTopicPartitionCount defaults to 50 (offsets.topic.num.partitions)
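To try the formula outside the JVM, the sketch below re-implements Java's String.hashCode in Python. It assumes the group ID contains only characters from the Basic Multilingual Plane, where Python code points and Java UTF-16 units coincide. Both function names are illustrative.

```python
def java_string_hash(text):
    """Emulate Java's String.hashCode(): h = 31*h + char, wrapped to signed 32 bits."""
    h = 0
    for ch in text:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as a signed int, as Java would.
    return h - 0x100000000 if h >= 0x80000000 else h


def offsets_partition(group_id, partition_count=50):
    """Offsets-topic partition for a group, following the formula above."""
    # Java's % truncates toward zero, so Math.abs(h % n) equals abs(h) % n here.
    return abs(java_string_hash(group_id)) % partition_count
```

The broker leading the resulting partition is the one that acts as group coordinator for that group.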

Delivery Semantics:

At most once – messages may be lost but never duplicated.

At least once – messages are never lost but may be duplicated.

Exactly once – no loss and no duplication (available from Kafka 0.11, when the downstream system is also Kafka).
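The difference between the first two semantics comes down to whether the consumer commits its offset before or after processing a message. The toy simulation below (run_consumer is a hypothetical name, and the single injected crash is an assumption of the model) makes that visible.

```python
def run_consumer(messages, commit_before_processing, crash_at):
    """Process messages with one crash at index crash_at, then restart
    from the last committed offset. Returns what was actually processed."""
    committed = 0        # next offset to read after a restart
    processed = []
    crashed = False
    pos = 0
    while pos < len(messages):
        if commit_before_processing:
            committed = pos + 1                 # commit first ...
            if pos == crash_at and not crashed:
                crashed = True
                pos = committed                 # ... so a crash skips this message
                continue
            processed.append(messages[pos])
        else:
            processed.append(messages[pos])     # process first ...
            if pos == crash_at and not crashed:
                crashed = True
                pos = committed                 # ... so a crash replays this message
                continue
            committed = pos + 1
        pos += 1
    return processed
```

With messages a, b, c and a crash at index 1, committing first yields a, c (b is lost: at most once), while committing afterwards yields a, b, b, c (b is duplicated: at least once).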

Transactions:

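Kafka provides transactional guarantees using a transaction coordinator, transaction IDs (tid), and marker messages (prepare-commit, commit, abort). Within one transaction a producer can write to multiple topics and partitions and also commit consumer offsets. The messages become visible to consumers reading with the read_committed isolation level only after the commit marker is written; if the transaction is aborted, those consumers never see them.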
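A toy model of how markers gate visibility, assuming a log made of simple tuples (this layout is invented for illustration). Real Kafka delivers committed messages in offset order and tracks a last stable offset; this sketch releases a transaction's messages when its commit marker is reached.

```python
def read_committed(log):
    """Return the values a read-committed consumer would see.

    log entries are ("data", tid, value), ("commit", tid) or ("abort", tid).
    """
    pending = {}   # tid -> values held back until a marker arrives
    visible = []
    for entry in log:
        kind, tid = entry[0], entry[1]
        if kind == "data":
            pending.setdefault(tid, []).append(entry[2])
        elif kind == "commit":
            visible.extend(pending.pop(tid, []))   # release the whole transaction
        elif kind == "abort":
            pending.pop(tid, None)                 # discard the whole transaction
    return visible
```

Messages from a transaction that has neither committed nor aborted stay hidden, which is why a stalled transaction can delay read-committed consumers.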

File Organization:

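Each partition is stored as a directory named <topic>-<partition> that contains a series of log segments. Every segment consists of a .log file named after the base offset of its first message, an .index file for offset lookup, and a .timeindex file for timestamp lookup. The indexes are sparse: the offset index maps a subset of offsets (stored relative to the base offset) to byte positions in the .log file, so a lookup binary-searches for the right segment and the nearest preceding index entry, then scans forward from there.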
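The two-level lookup can be sketched with the standard bisect module. The in-memory lists stand in for the on-disk files and the function names are illustrative; the 20-digit zero-padded file name matches how segment files appear on disk.

```python
import bisect


def segment_file_name(base_offset):
    """Segment files are named after their base offset, zero-padded to 20 digits."""
    return f"{base_offset:020d}.log"


def find_segment(base_offsets, target_offset):
    """Binary-search the sorted base offsets for the segment holding target_offset."""
    pos = bisect.bisect_right(base_offsets, target_offset) - 1
    if pos < 0:
        raise ValueError("offset is older than the first segment")
    return base_offsets[pos]


def lookup_position(index_entries, base_offset, target_offset):
    """Return the file position to start scanning from.

    index_entries is a sorted list of (relative_offset, file_position) pairs,
    i.e. a sparse index covering only some of the segment's messages.
    """
    relative = target_offset - base_offset
    keys = [rel for rel, _ in index_entries]
    pos = bisect.bisect_right(keys, relative) - 1
    return index_entries[pos][1] if pos >= 0 else 0
```

For example, with segments starting at offsets 0, 1000, and 2000, offset 1050 lives in the segment with base offset 1000, and the reader starts scanning at the position of the closest index entry at or before relative offset 50.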

Common Configuration:

Broker settings (e.g., replication factor, log retention).

Topic settings (e.g., number of partitions, cleanup policy).

Log cleanup respects both size‑based and time‑based policies, with special handling to avoid deleting the active segment.
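A rough sketch of how the two retention policies might be combined, assuming each segment is described by a small dict and that the last segment in the list is the active one. The field names and the function segments_to_delete are hypothetical, and the broker's real cleanup logic has more conditions than this.

```python
def segments_to_delete(segments, now_ms, retention_ms=None, retention_bytes=None):
    """Return base offsets of segments eligible for deletion.

    segments is ordered oldest-first; each item has base_offset, size_bytes
    and last_modified_ms. The last item is the active segment and is never deleted.
    """
    closed = segments[:-1]
    doomed = set()
    # Time-based retention: drop closed segments older than retention_ms.
    if retention_ms is not None:
        for seg in closed:
            if now_ms - seg["last_modified_ms"] > retention_ms:
                doomed.add(seg["base_offset"])
    # Size-based retention: drop the oldest closed segments until the log fits.
    if retention_bytes is not None:
        total = sum(s["size_bytes"] for s in segments if s["base_offset"] not in doomed)
        for seg in closed:
            if total <= retention_bytes:
                break
            if seg["base_offset"] not in doomed:
                doomed.add(seg["base_offset"])
                total -= seg["size_bytes"]
    return sorted(doomed)
```

Because the active segment is excluded up front, a partition can temporarily exceed its configured size or age limit until that segment is rolled.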
