
Kafka Architecture Overview and Core Concepts

Kafka’s architecture consists of brokers grouped into clusters; producers that publish to topics, which are split into replicated partitions; consumers organized into groups that pull messages by offset; ZooKeeper, which manages cluster metadata; and log‑based storage built on append‑only files, sparse indexes, and zero‑copy transfer. Configurable acknowledgments, batching, and replication provide high throughput with fault tolerance.


Kafka is used for system decoupling, traffic smoothing, buffering, and asynchronous communication. It is suitable for activity tracking, message passing, metrics, logging, and stream processing. This article introduces the basic concepts of Kafka.

Overall Structure

Kafka’s overall structure involves many interacting parts. The main components are:

Broker – a Kafka instance; multiple brokers form a Kafka cluster.

Producer – the client that writes messages to a broker.

Consumer – the client that reads messages from a broker.

Consumer Group – a group of consumers that jointly consume a topic; each partition is read by at most one consumer in the group, while separate groups consume the same topic independently.

ZooKeeper – used to manage cluster metadata and controller election.

Topic – a logical classification of messages.

Partition – a topic is split into multiple partitions for scalability and reliability.

Replica – each partition can have multiple replicas for fault tolerance.

Leader and Follower – the leader handles reads/writes; followers replicate the leader.

Offset – a unique position of a message within a partition.

Logical Storage of Messages

Kafka stores data as an append‑only log on disk. Each message belongs to a topic and a partition; the offset uniquely identifies its position. Offsets guarantee order only within a partition.

Key points:

Write performance is high because data is only appended.

Read performance is improved by additional mechanisms such as indexing.
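The append‑only log with per‑partition offsets can be sketched in a few lines. This is an illustrative model, not Kafka’s actual on‑disk format: the offset is simply the position of the message within its partition’s log.

```python
class PartitionLog:
    """Append-only log for one partition; the offset is the list index."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        offset = len(self._messages)   # next offset to be written
        self._messages.append(message)
        return offset

    def read(self, offset):
        # Offsets are dense within a partition and start at 0.
        return self._messages[offset]

log = PartitionLog()
first = log.append("event-1")   # offset 0
second = log.append("event-2")  # offset 1
```

Because each partition is its own independent log, ordering holds within a partition but not across partitions of the same topic.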

How to Write Data

The producer workflow is:

Create a KafkaProducer and a message.

Producer interceptors may filter or modify the message.

Serializer converts the message to a byte array.

The partitioner selects the target partition, and the record is appended to a batch (ProducerBatch) in the RecordAccumulator; a batch fills up to batch.size or waits up to linger.ms.

The sender thread drains ready batches and builds requests.

The request is sent to the appropriate broker.

Response is received and resources are cleaned up.
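The partitioning step above can be sketched as follows. This is a simplification: Kafka’s default partitioner hashes keyed records with murmur2, and since Kafka 2.4 keyless records use a sticky strategy; here CRC32 stands in for the hash and plain round‑robin stands in for the keyless case.

```python
import zlib
from itertools import count

class SimplePartitioner:
    """Illustrative partitioner: keyed records hash to a fixed partition,
    keyless records rotate round-robin across partitions."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._counter = count()

    def partition(self, key):
        if key is None:
            # Keyless: spread load round-robin (real Kafka: sticky batching).
            return next(self._counter) % self.num_partitions
        # Keyed: same key always lands on the same partition,
        # which is what preserves per-key ordering.
        return zlib.crc32(key.encode()) % self.num_partitions
```

The important property is that a given key always maps to the same partition, so all messages for that key stay in order.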

Important producer parameters include buffer.memory, batch.size, linger.ms, max.block.ms, max.in.flight.requests.per.connection, and max.request.size.
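For reference, these parameters might appear in a producer configuration like the following. The values shown are the usual client defaults or common starting points, not recommendations for any particular workload.

```python
# Illustrative producer settings using Kafka's dotted config names.
producer_config = {
    "buffer.memory": 33554432,        # 32 MB RecordAccumulator buffer
    "batch.size": 16384,              # max bytes per ProducerBatch
    "linger.ms": 5,                   # wait up to 5 ms to fill a batch
    "max.block.ms": 60000,            # how long send() may block when the buffer is full
    "max.in.flight.requests.per.connection": 5,
    "max.request.size": 1048576,      # 1 MB cap per request
}
```

Raising linger.ms and batch.size trades a little latency for larger batches and higher throughput.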

Sending modes:

Fire‑and‑forget (highest throughput, lowest reliability).

Synchronous (wait for broker acknowledgment, highest reliability).

Asynchronous with callback (intermediate trade‑off).
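The three sending modes differ only in how the caller treats the result of send(). The sketch below mirrors the Java client’s pattern of returning a future and invoking an optional callback; FakeProducer is an invented stand‑in, not a real client API.

```python
from concurrent.futures import Future

class FakeProducer:
    """Stand-in producer: send() returns a Future resolving to an offset."""

    def __init__(self):
        self._next_offset = 0

    def send(self, message, callback=None):
        fut = Future()
        offset = self._next_offset      # pretend the broker acked at this offset
        self._next_offset += 1
        fut.set_result(offset)
        if callback:
            callback(offset, None)      # (result, error), as in callback-style clients
        return fut

producer = FakeProducer()

# 1. Fire-and-forget: ignore the returned future entirely.
producer.send("m1")

# 2. Synchronous: block on the future before sending the next message.
offset = producer.send("m2").result()

# 3. Asynchronous with callback: keep sending, handle results later.
acked = []
producer.send("m3", callback=lambda off, err: acked.append(off))
```

Blocking per message (mode 2) caps throughput at one in‑flight request, which is why the callback style is the usual middle ground.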

The acks setting controls reliability:

acks=1 – only the leader’s write is required (the longtime default; newer clients default to acks=all).

acks=0 – no acknowledgment (possible data loss).

acks=-1 or acks=all – all in‑sync replicas must acknowledge (strongest durability).
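The acks semantics can be captured as a small decision function. This is a simplification that ignores min.insync.replicas and request timeouts.

```python
def is_ack_satisfied(acks, leader_written, isr_acks, isr_size):
    """Decide when a produce request succeeds under each acks setting."""
    if acks == 0:
        return True                        # success assumed; data may be lost
    if acks == 1:
        return leader_written              # leader persisted the write
    # acks = -1 / "all": every in-sync replica must have the write
    return leader_written and isr_acks >= isr_size
```

Note that with acks=all, a shrinking ISR makes acknowledgment easier, which is why min.insync.replicas exists as a floor in real deployments.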

How to Read Messages

Consumers use a pull‑based model:

while (true) {
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
    for (ConsumerRecord<String, String> record : records) {
        // process each record
    }
}

After processing, consumers commit their offset so that the next poll starts from the correct position. Offsets can be committed automatically (default every 5 seconds) or manually. Automatic commits may cause duplicate processing or data loss if a consumer crashes before committing.
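The duplicate‑processing case can be simulated directly: if a consumer crashes after processing a record but before committing its offset, the restarted consumer re‑reads that record. This sketch models commits as simply advancing a stored offset.

```python
def run_consumer(log, committed_offset, crash_after=None):
    """Process from the last committed offset; commit after each record.
    A crash before the commit means the record is re-delivered on restart."""
    processed = []
    offset = committed_offset
    while offset < len(log):
        processed.append(log[offset])
        if crash_after is not None and len(processed) == crash_after:
            return processed, offset      # crashed before committing this record
        offset += 1                       # commit: advance the stored offset
    return processed, offset

log = ["a", "b", "c"]
first_run, committed = run_consumer(log, 0, crash_after=2)  # crash mid-way
second_run, _ = run_consumer(log, committed)                # restart from commit
```

Record "b" is processed in both runs, which is the at‑least‑once behavior the paragraph above describes; committing *before* processing flips the failure mode to potential data loss instead.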

Consumer group partition assignment strategies are configurable via partition.assignment.strategy:

Range – per topic, divides the partitions into contiguous ranges, one per consumer; when the count is uneven, the first consumers each get one extra partition.

RoundRobin – assigns partitions to consumers in a round‑robin fashion.

Sticky – tries to keep previous assignments while balancing load.
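The first two strategies are simple enough to implement directly. This sketch covers a single topic; the real assignors also handle multiple topics and subscription metadata.

```python
def range_assign(consumers, num_partitions):
    """Range strategy: contiguous chunks; the first (num_partitions %
    len(consumers)) consumers receive one extra partition."""
    consumers = sorted(consumers)
    per, extra = divmod(num_partitions, len(consumers))
    assignment, start = {}, 0
    for i, c in enumerate(consumers):
        n = per + (1 if i < extra else 0)
        assignment[c] = list(range(start, start + n))
        start += n
    return assignment

def round_robin_assign(consumers, num_partitions):
    """Round-robin strategy: deal partitions out one at a time."""
    consumers = sorted(consumers)
    assignment = {c: [] for c in consumers}
    for p in range(num_partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return assignment
```

With many topics, Range can concentrate the "extra" partitions on the same first consumers, which is one motivation for RoundRobin and Sticky.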

Rebalancing occurs when consumers join/leave, topics change, or the group coordinator changes. The steps are FindCoordinator → JoinGroup → SyncGroup → Heartbeat.

Physical Storage

Kafka writes to log files that are split into log segments. Two cleanup policies exist:

Log Retention – deletes segments based on age or size.

Log Compaction – retains only the latest value for each key.
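Log compaction’s "latest value per key" rule can be sketched over an in‑memory segment of (key, value) pairs. The real implementation works across segment files and preserves offsets; this model keeps each surviving record’s original offset.

```python
def compact(segment):
    """Compaction sketch: keep only the last record for each key,
    ordered by the offset of that last occurrence."""
    latest = {}
    for offset, (key, value) in enumerate(segment):
        latest[key] = (offset, value)   # later records overwrite earlier ones
    return sorted(
        (off, key, val) for key, (off, val) in latest.items()
    )
```

Compaction suits changelog-style topics where only the current value per key matters, such as a table of latest account balances.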

Indexes:

Offset index – a sparse mapping from message offset to file position.

Timestamp index – maps timestamps to offsets, enabling time‑based lookups.
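A sparse offset index works as the list above describes: only every Nth message is indexed, and a lookup binary‑searches for the nearest indexed entry at or below the target, from which a short forward scan finishes the job. This is a self‑contained model with fake byte positions, not Kafka’s index file format.

```python
import bisect

class SparseOffsetIndex:
    """Sparse index: one entry every `interval` messages, mapping an
    offset to a (simulated) byte position in the log segment."""

    def __init__(self, interval=4):
        self.interval = interval
        self.offsets = []     # indexed offsets, ascending
        self.positions = []   # parallel byte positions

    def maybe_index(self, offset, position):
        if offset % self.interval == 0:
            self.offsets.append(offset)
            self.positions.append(position)

    def lookup(self, target):
        # Largest indexed offset <= target; scan forward from there.
        i = bisect.bisect_right(self.offsets, target) - 1
        return (self.offsets[i], self.positions[i]) if i >= 0 else (0, 0)

index = SparseOffsetIndex(interval=4)
for offset in range(10):
    index.maybe_index(offset, position=offset * 100)  # pretend 100-byte messages
```

Sparsity keeps the index small enough to memory‑map while bounding the scan to at most `interval` messages.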

Zero‑copy is used to transfer data from disk to the network without copying between kernel and user space, improving throughput.
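On Linux, the underlying primitive is the sendfile() system call, exposed in Python as os.sendfile(): the kernel moves bytes between file descriptors without staging them in user space. The file‑to‑file copy below is a minimal demonstration; Kafka uses the same call to ship log segment bytes straight to consumer sockets.

```python
import os
import tempfile

payload = b"kafka log segment bytes"

# Source file standing in for a log segment.
src = tempfile.NamedTemporaryFile(delete=False)
src.write(payload)
src.flush()

# os.sendfile(out_fd, in_fd, offset, count): kernel-space transfer,
# no intermediate read() into a user-space buffer.
dst_path = src.name + ".copy"
with open(src.name, "rb") as fin, open(dst_path, "wb") as fout:
    sent = os.sendfile(fout.fileno(), fin.fileno(), 0, len(payload))
```

Avoiding the kernel‑to‑user‑space round trip removes two copies and two context switches per transfer, which is a large part of Kafka’s read throughput.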

Reliability Mechanisms

Key concepts:

AR (Assigned Replicas) – all replicas for a partition.

ISR (In‑Sync Replicas) – replicas that are up‑to‑date with the leader.

OSR (Out‑of‑Sync Replicas) – replicas lagging behind.

LEO (Log End Offset) – the offset of the next message to be written.

HW (High Watermark) – the smallest LEO among ISR; messages up to HW are visible to consumers.
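The HW definition above is a one‑liner once each replica’s LEO is known; the replica names here are illustrative.

```python
def high_watermark(isr_leos):
    """HW = minimum LEO across the in-sync replicas; consumers may only
    read offsets strictly below the HW."""
    return min(isr_leos.values())

# Example ISR state: the leader has written through offset 9 (LEO 10),
# but the slowest in-sync follower has only replicated through offset 7.
isr = {"leader": 10, "follower-1": 8, "follower-2": 9}
```

Here consumers can read offsets 0–7: anything at or beyond the HW might not survive a leader failover, so it stays invisible.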

Leader epoch is a version number that increments when the leader changes, preventing followers from truncating logs after a leader change and thus avoiding data loss.

This article has given a high‑level overview of Kafka’s architecture, storage model, producer/consumer workflows, and reliability features; mastering Kafka’s full complexity requires further study.

Tags: Distributed Systems, Kafka, Log Storage, Replication, Message Queue, Consumer, Producer
Written by

Tencent Cloud Developer

Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.
