
Kafka Storage Mechanism and Reliability Guarantees

This article explains Kafka's storage architecture—including topics, partitions, segments, and their naming rules—along with how data is read, and details the system's reliability features such as ISR/OSR replication, leader election, producer acknowledgment levels, and delivery guarantees.

IT Architects Alliance

1. Kafka Storage Mechanism

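Kafka stores data in topics, and each topic is divided into partitions. Each partition is a directory on disk, named <topic>-<partition>, and its data is further split into segments. A segment is rolled once it reaches a configured maximum size (log.segment.bytes), so segments share the same size limit but are not necessarily equal in size.

Each segment consists of a .log data file and a .index file that maps offsets to byte positions in the .log file. Newer Kafka versions also keep a .timeindex file per segment for timestamp lookups.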

1.1 Segment

Segments split partition data into manageable files. Because old data is removed one whole segment at a time, expiring it is a cheap file deletion rather than a rewrite of a large log.

.log

The .log file holds the actual record bytes.

.index

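The .index file maps record offsets to byte positions in the .log file. The index is sparse: it holds an entry for only some records, and each entry stores the offset relative to the segment's base offset together with the physical position of that record in the .log file.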

Naming rule

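Each segment file is named after its base offset, that is, the offset of the first record stored in that segment, zero‑padded to 20 digits. The first segment of a partition is therefore 00000000000000000000.log with its matching 00000000000000000000.index.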
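As an illustration of the naming rule, here is a minimal sketch that assumes the file name is simply the segment's base offset zero‑padded to 20 digits. The helper segment_file_names is hypothetical and is not part of any Kafka API.

```python
def segment_file_names(base_offset):
    """Return the (.log, .index) file names for a segment.

    Assumption: the name is the segment's base offset (the offset of
    its first record), zero-padded to 20 digits.
    """
    if base_offset < 0:
        raise ValueError("offsets are non-negative")
    stem = f"{base_offset:020d}"  # zero-pad to 20 digits
    return f"{stem}.log", f"{stem}.index"


print(segment_file_names(0))
print(segment_file_names(368769))
```

The second call uses an arbitrary example offset; a segment whose first record has offset 368769 would be stored as 00000000000000368769.log.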

1.2 Reading data

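To read a record at a given offset, Kafka first finds the segment that contains it, which is the segment with the largest base offset not greater than the target (a binary search over the segment file names). It then searches that segment's sparse index for the largest entry not greater than the target offset, which gives a byte position in the .log file. From that position it scans forward through the .log file, parsing records according to the on‑disk record format, until it reaches the requested offset.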
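The lookup can be sketched with two binary searches. This is a simplified in‑memory model, not Kafka's implementation: the function names, the list of base offsets, and the example index entries (relative offset, byte position) are all made up for illustration.

```python
import bisect


def find_segment(base_offsets, target):
    """Return the base offset of the segment that holds `target`.

    `base_offsets` must be sorted ascending (hypothetical input).
    """
    i = bisect.bisect_right(base_offsets, target) - 1
    if i < 0:
        raise KeyError(f"offset {target} is before the first segment")
    return base_offsets[i]


def find_scan_start(index_entries, base_offset, target):
    """Return the byte position in the .log file to start scanning from.

    `index_entries` is a sorted list of (relative_offset, position)
    pairs, modelling a sparse index. If no entry is <= the target,
    scanning starts at the beginning of the file (position 0).
    """
    relative = target - base_offset
    keys = [rel for rel, _ in index_entries]
    i = bisect.bisect_right(keys, relative) - 1
    return index_entries[i][1] if i >= 0 else 0


# Example data (invented): three segments and one sparse index.
segments = [0, 368769, 737337]
index = [(1, 0), (3, 497), (6, 1407), (8, 1686)]

base = find_segment(segments, 368776)
start = find_scan_start(index, base, 368776)
print(base, start)
```

Here offset 368776 falls in the segment with base offset 368769, its relative offset is 7, and the closest preceding index entry is (6, 1407), so the scan of the .log file would start at byte 1407.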

2. Reliability Guarantees

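2.1 AR (Assigned Replicas): ISR and OSR

For each partition, Kafka tracks the full set of replicas, AR (Assigned Replicas), which is split into ISR (In‑Sync Replicas) and OSR (Out‑of‑Sync Replicas), so AR = ISR + OSR. ISR replicas, which include the leader, have kept up with the leader within the allowed lag (replica.lag.time.max.ms); OSR replicas have fallen further behind.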

ISR

Only when all ISR replicas have replicated the record is it considered committed and visible to consumers.

OSR

OSR replicas do not affect commit; they try to catch up, and may be promoted to ISR if they become up‑to‑date.

LEO

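Log End Offset – the offset at which the next record will be written, that is, the offset of the last record in the log plus one. Every replica, leader and follower alike, has its own LEO.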

HW

High Watermark – the smallest LEO among the ISR replicas. Records below the HW have been replicated to every ISR replica, and only those records are visible to consumers.
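The relationship between LEO and HW can be sketched as follows, assuming LEO means "next offset to be written" and HW is the minimum LEO across the ISR. The broker names and LEO values are invented for the example.

```python
def high_watermark(leo_by_replica, isr):
    """Return the HW as the smallest LEO among the in-sync replicas."""
    if not isr:
        raise ValueError("ISR must contain at least the leader")
    return min(leo_by_replica[replica] for replica in isr)


# Hypothetical state: broker-3 has fallen out of the ISR.
leo = {"broker-1": 10, "broker-2": 8, "broker-3": 5}
isr = ["broker-1", "broker-2"]

hw = high_watermark(leo, isr)
visible_offsets = list(range(hw))  # consumers can read offsets below the HW
print(hw, visible_offsets)
```

In this example the leader has written up to offset 9, but consumers only see offsets 0 through 7, because broker-2 has not yet replicated offsets 8 and 9. The lagging broker-3 does not hold the HW back because it is not in the ISR.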

HW truncation

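If a leader fails and a new leader is elected, the remaining followers truncate their logs to the HW before fetching from the new leader. This discards any uncommitted records that may differ between replicas, so all replicas converge on the new leader's log. Since Kafka 0.11, truncation is driven by leader epochs rather than the HW alone, which closes some edge cases where HW‑based truncation could lose or diverge data.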
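A toy model of HW truncation, assuming a log is a Python list whose index is the record offset. The function resync_follower and the sample logs are hypothetical, and the model ignores leader epochs entirely.

```python
def resync_follower(follower_log, new_leader_log, hw):
    """Truncate a follower to the HW, then copy the rest from the new leader.

    Records at offsets >= hw on the follower were never committed and
    may conflict with the new leader, so they are dropped first.
    """
    truncated = follower_log[:hw]          # keep offsets 0 .. hw-1
    return truncated + new_leader_log[hw:]  # fetch everything from hw on


# Hypothetical state after the old leader failed with HW = 3.
follower = ["m0", "m1", "m2", "x3"]          # "x3" came from the old leader
new_leader = ["m0", "m1", "m2", "n3", "n4"]  # the new leader wrote different data

print(resync_follower(follower, new_leader, 3))
```

After the resync the follower holds exactly the new leader's log; the uncommitted record "x3" is gone.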

2.2 Producer Acknowledgment Levels

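Kafka provides three acks settings that control when the leader replies to the producer, trading durability against latency. With 0, the producer does not wait for any reply. With 1, the leader replies once it has written the record to its own log. With -1 (also written as all), the leader replies only after every replica in the ISR has the record.

Setting acks to -1 (all) together with min.insync.replicas ≥ 2 guarantees that a write is reported as successful only when at least two replicas, the leader included, have stored it. If the ISR shrinks below min.insync.replicas, the broker rejects such writes instead of accepting them with weaker durability. The setting is named acks in the Java producer; request.required.acks is the name used by the older Scala producer.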
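The interaction between acks and min.insync.replicas can be modelled with a small decision function. This is only a sketch of the rules described above; write_outcome and its return strings are invented names, not broker responses.

```python
def write_outcome(acks, isr_size, min_insync_replicas=1):
    """Describe what happens to a produce request in this simplified model.

    acks: 0, 1, or -1 / "all"
    isr_size: current number of in-sync replicas, leader included
    """
    if acks == 0:
        return "sent_without_waiting"
    if acks == 1:
        return "acked_after_leader_write"
    if acks in (-1, "all"):
        if isr_size < min_insync_replicas:
            # A real broker answers with a "not enough replicas" error here.
            return "rejected_not_enough_replicas"
        return "acked_after_isr_replication"
    raise ValueError(f"unsupported acks value: {acks!r}")


print(write_outcome("all", isr_size=3, min_insync_replicas=2))
print(write_outcome("all", isr_size=1, min_insync_replicas=2))
print(write_outcome(1, isr_size=1, min_insync_replicas=2))
```

Note that min.insync.replicas only takes effect with acks set to -1 (all); with acks of 1 the last call is still acknowledged even though only the leader is in sync.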

2.3 Leader Election

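If the current leader fails, the controller elects a follower from the ISR as the new leader. The unclean.leader.election.enable flag controls whether a non‑ISR replica may be chosen when no ISR replica is available. Enabling it keeps the partition available but can lose committed records, because an out‑of‑sync replica may be missing them; disabling it favors reliability, at the cost of the partition staying offline until an ISR replica returns.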
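The trade‑off can be illustrated with a simplified election function. It assumes the first live ISR replica in assignment order wins; the function elect_leader and the broker ids are hypothetical, and the real controller logic has more cases.

```python
def elect_leader(assigned, isr, alive, unclean_enabled=False):
    """Pick a new leader, or return None if the partition stays offline.

    assigned: replica ids in assignment (AR) order
    isr: set of in-sync replica ids
    alive: set of replica ids whose brokers are currently up
    """
    # Preferred case: a live replica that is in the ISR.
    for replica in assigned:
        if replica in isr and replica in alive:
            return replica
    # Fallback: any live replica, which may be missing committed records.
    if unclean_enabled:
        for replica in assigned:
            if replica in alive:
                return replica
    return None


# Broker 1 (the old leader) is down in every scenario.
print(elect_leader([1, 2, 3], isr={1, 2}, alive={2, 3}))
print(elect_leader([1, 2, 3], isr={1}, alive={2, 3}))
print(elect_leader([1, 2, 3], isr={1}, alive={2, 3}, unclean_enabled=True))
```

The second call returns None: with unclean election disabled, the partition is unavailable until broker 1 comes back. The third call trades that safety for availability by promoting broker 2.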

2.4 Delivery Guarantees

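Kafka can provide “at‑most‑once”, “at‑least‑once”, and “exactly‑once” semantics. With producer retries and offsets committed only after processing, the result is at‑least‑once delivery: no record is lost, but a record may be delivered more than once. Exactly‑once requires additional measures, either the idempotent producer and transactions available since Kafka 0.11, or deduplication logic in the application.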
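Application‑level deduplication on top of at‑least‑once delivery can be sketched as below. The class DedupingProcessor is a hypothetical example that remembers the highest offset processed per partition in memory; a real application would have to persist that state together with its results.

```python
class DedupingProcessor:
    """Skip records that were already processed (simplified, in-memory)."""

    def __init__(self):
        self.last_offset = {}  # partition -> highest processed offset
        self.results = []      # stands in for the application's side effects

    def handle(self, partition, offset, value):
        """Process a record once; return False if it is a duplicate."""
        if offset <= self.last_offset.get(partition, -1):
            return False  # redelivered record, already handled
        self.results.append(value)
        self.last_offset[partition] = offset
        return True


processor = DedupingProcessor()
# Offsets 1 and 2 of partition 0 are delivered twice, as can happen after a retry.
deliveries = [(0, 0, "a"), (0, 1, "b"), (0, 2, "c"), (0, 1, "b"), (0, 2, "c"), (0, 3, "d")]
for partition, offset, value in deliveries:
    processor.handle(partition, offset, value)

print(processor.results)
```

Each value ends up in results exactly once even though two records were delivered twice. This relies on offsets within a partition being delivered in increasing order.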
