Kafka Storage Mechanism and Reliability Guarantees
This article explains Kafka's storage architecture (topics, partitions, segments, and the segment naming rule) and how data is read, then covers the system's reliability features: ISR/OSR replication, leader election, producer acknowledgment levels, and delivery guarantees.
1. Kafka Storage Mechanism
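Kafka stores data in topics, and each topic is split into partitions. Each partition is a directory on disk, and its log is further divided into segments of roughly equal size: a new segment is rolled once the active one reaches the configured size limit (log.segment.bytes) or age.
A segment consists of a .log data file and a .index file that maps record offsets to byte positions in the .log file.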
1.1 Segment
Splitting a partition into segments keeps individual files manageable and makes deleting old data cheap: retention removes whole segment files instead of rewriting one large log.
.log
The .log file holds the actual record bytes.
.index
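The .index file stores the mapping from offset to byte position in the .log file. The index is sparse: it holds an entry only for every few kilobytes of log data, and each entry records the offset relative to the segment's base offset.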
Naming rule
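Each segment is named after its base offset, which is the offset of the first record it contains (equivalently, the last offset of the previous segment plus one). The first segment starts at 0, and the number is zero-padded to 20 digits, for example 00000000000000000000.log.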
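The padding rule can be sketched in a few lines of Python. This is only an illustration: `segment_file_names` and `base_offset` are hypothetical names, with `base_offset` standing for the offset that names the segment.

```python
def segment_file_names(base_offset: int) -> tuple[str, str]:
    """Return the (.log, .index) file names for the segment named by base_offset."""
    if base_offset < 0:
        raise ValueError("base offset must be non-negative")
    stem = f"{base_offset:020d}"  # zero-padded to 20 digits
    return f"{stem}.log", f"{stem}.index"


print(segment_file_names(0))
print(segment_file_names(170410))
```

Because the names are fixed-width, sorting the files of a partition directory by name also sorts them by offset.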
1.2 Reading data
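To read a record at a given offset, Kafka first finds the segment that contains it: the one with the largest base offset that is less than or equal to the target. It then searches that segment's index for the closest entry at or before the target offset, which gives a start position in the .log file. Because the index does not list every record, Kafka scans forward from that position, decoding records in their fixed format, until it reaches the requested offset.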
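The two lookups can be modeled with binary search. The sketch below is not Kafka's implementation: it assumes the segment base offsets and the index entries are already loaded into sorted Python lists, and that each index entry is a (relative offset, byte position) pair. The function names are hypothetical.

```python
from bisect import bisect_right


def find_segment(base_offsets: list[int], target: int) -> int:
    """Return the base offset of the segment that should contain `target`."""
    i = bisect_right(base_offsets, target) - 1
    if i < 0:
        raise KeyError(f"offset {target} is below the first segment")
    return base_offsets[i]


def index_lookup(index: list[tuple[int, int]], base_offset: int, target: int) -> int:
    """Return the byte position in the .log file from which to start scanning.

    `index` is a sorted list of (relative_offset, position) entries.
    """
    relative = target - base_offset
    keys = [rel for rel, _ in index]
    i = bisect_right(keys, relative) - 1
    # No entry at or before the target: scan from the start of the file.
    return index[i][1] if i >= 0 else 0


segments = [0, 1000, 2000]                        # base offsets of three segments
sparse_index = [(0, 0), (40, 4096), (85, 8192)]   # index of the segment based at 1000

base = find_segment(segments, 1050)
start = index_lookup(sparse_index, base, 1050)
print(base, start)  # 1000 4096
```

Offset 1050 falls in the segment based at 1000; its relative offset 50 lies between the entries for 40 and 85, so the scan starts at byte 4096 and moves forward to the record itself.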
2. Reliability Guarantees
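2.1 AR (Assigned Replicas: ISR and OSR)
For each partition, Kafka maintains an AR (Assigned Replicas) list, where AR = ISR + OSR. The ISR (in-sync replicas) consists of the leader plus the followers that have kept up with it within a configured time limit (replica.lag.time.max.ms); OSR (out-of-sync replicas) are the followers that have fallen behind.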
ISR
Only when all ISR replicas have replicated the record is it considered committed and visible to consumers.
OSR
OSR replicas do not affect commit; they try to catch up, and may be promoted to ISR if they become up‑to‑date.
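As a rough model of how the two groups are formed, the sketch below splits replicas by how long ago each one was last fully caught up with the leader. The function name, the input shape, and the 30-second threshold are illustrative assumptions; in a real cluster the threshold comes from the broker setting replica.lag.time.max.ms and the bookkeeping is done by the leader.

```python
def split_replicas(ms_since_caught_up: dict[int, int],
                   max_lag_ms: int) -> tuple[list[int], list[int]]:
    """Split broker ids into (ISR, OSR) by time since each was last caught up."""
    isr = sorted(b for b, lag in ms_since_caught_up.items() if lag <= max_lag_ms)
    osr = sorted(b for b, lag in ms_since_caught_up.items() if lag > max_lag_ms)
    return isr, osr


# Broker 1 is the leader (lag 0); broker 3 has been behind for 45 seconds.
isr, osr = split_replicas({1: 0, 2: 1200, 3: 45000}, max_lag_ms=30000)
print(isr, osr)  # [1, 2] [3]
```

Once broker 3 catches up again, its lag drops below the threshold and the same rule puts it back in the ISR.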
LEO
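Log End Offset: the offset at which the next record will be written, that is, the offset of the last record in the log plus one. Every replica, not only the leader, tracks its own LEO.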
HW
High Watermark: the smallest LEO among the ISR replicas, which marks how far every in-sync replica has replicated. Only records at offsets below the HW are visible to consumers.
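Treating each replica's LEO as a plain integer, the high watermark can be sketched as a minimum over the ISR. The function and variable names are hypothetical, and the sketch ignores how and when the leader propagates the value to followers.

```python
def high_watermark(leo_by_replica: dict[int, int], isr: set[int]) -> int:
    """Return the smallest LEO among the replicas currently in the ISR."""
    if not isr:
        raise ValueError("ISR must contain at least the leader")
    return min(leo_by_replica[broker] for broker in isr)


leos = {1: 120, 2: 118, 3: 95}            # broker id -> LEO
print(high_watermark(leos, {1, 2}))       # 118: broker 3 is out of sync and ignored
print(high_watermark(leos, {1, 2, 3}))    # 95: a slow ISR member holds the HW back
```

The second call shows why a lagging follower is eventually moved to the OSR: while it stays in the ISR, it limits what consumers can see.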
HW truncation
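If a leader fails, each follower truncates its log to the HW before synchronizing with the new leader. Records beyond the HW were never committed and may differ between replicas, so discarding them keeps all copies consistent. Newer Kafka versions refine this step by using leader epochs to decide where to truncate.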
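The effect of truncation can be shown with logs modeled as Python lists, where the list index is the record offset. The sketch assumes the HW is taken as the offset of the first record that is not yet on every ISR replica, so records at offsets below `hw` are kept. `resync_follower` is a hypothetical helper, not a Kafka API.

```python
def resync_follower(follower_log: list[str], hw: int,
                    new_leader_log: list[str]) -> list[str]:
    """Truncate a follower's log to the HW, then copy the rest from the new leader."""
    truncated = follower_log[:hw]                        # drop offsets >= hw
    truncated.extend(new_leader_log[len(truncated):])    # fetch from the new leader
    return truncated


# The old leader held [a, b, c, d, e] and crashed with HW = 3.
follower_log = ["a", "b", "c", "d"]       # had fetched the uncommitted record "d"
new_leader_log = ["a", "b", "c", "x"]     # new leader never saw "d" and accepted "x"

print(resync_follower(follower_log, 3, new_leader_log))  # ['a', 'b', 'c', 'x']
```

Without the truncation, the follower would keep "d" at offset 3 while the new leader has "x" there, and the two logs would silently diverge.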
2.2 Producer Acknowledgment Levels
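Kafka provides three acks settings (0, 1, -1) that control when a send counts as successful, trading durability against latency. With acks=0 the producer does not wait for any reply. With acks=1 the leader replies as soon as it has written the record to its own log. With acks=-1 (all) the leader replies only after every ISR replica has the record.
Setting acks (request.required.acks in the legacy producer) to -1 together with min.insync.replicas of at least 2 guarantees that a write is considered successful only when at least two replicas have stored it. If the ISR shrinks below min.insync.replicas, the leader rejects such writes rather than accepting them with weaker durability.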
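The interaction between the acknowledgment level and min.insync.replicas can be modeled as a small decision function. This is a simplified sketch, not broker code: `replicas_before_ack` and the `NotEnoughReplicas` exception class are hypothetical names chosen for the example.

```python
class NotEnoughReplicas(Exception):
    """Raised in this model when the ISR is smaller than min.insync.replicas."""


def replicas_before_ack(acks: int, isr_size: int, min_insync_replicas: int = 1) -> int:
    """Return how many replicas must hold a record before the producer sees success."""
    if acks == 0:
        return 0                      # fire and forget: no reply is awaited
    if acks == 1:
        return 1                      # the leader's own log is enough
    if acks == -1:
        if isr_size < min_insync_replicas:
            raise NotEnoughReplicas(
                f"ISR has {isr_size} replicas, need {min_insync_replicas}")
        return isr_size               # every in-sync replica must have the record
    raise ValueError("acks must be 0, 1, or -1")


print(replicas_before_ack(1, isr_size=3))                           # 1
print(replicas_before_ack(-1, isr_size=3, min_insync_replicas=2))   # 3
```

With acks=-1 and min.insync.replicas=1, an ISR that has shrunk to the leader alone would still accept writes, which is why the two settings are usually raised together.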
2.3 Leader Election
If the current leader fails, a follower from the ISR is elected as the new leader. The unclean.leader.election.enable flag controls whether a non-ISR replica may be chosen when no ISR replica is available. Enabling it keeps the partition available, but records the chosen replica never received are lost.
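The choice can be sketched as follows, assuming the controller prefers the first live ISR member in assignment order and falls back to any live replica only when unclean election is enabled. The function and parameter names are hypothetical and the real controller logic has more cases.

```python
from typing import Optional


def elect_leader(assigned: list[int], isr: set[int], alive: set[int],
                 unclean_enabled: bool) -> Optional[int]:
    """Pick a new leader, or return None if the partition must stay offline."""
    for broker in assigned:
        if broker in isr and broker in alive:
            return broker              # clean election: no committed data is lost
    if unclean_enabled:
        for broker in assigned:
            if broker in alive:
                return broker          # unclean election: may lose committed records
    return None


# Broker 1 (the old leader) is down; brokers 2 and 3 are alive and in sync.
print(elect_leader([1, 2, 3], isr={1, 2, 3}, alive={2, 3}, unclean_enabled=False))  # 2
# Only the out-of-sync broker 3 survives.
print(elect_leader([1, 2, 3], isr={1, 2}, alive={3}, unclean_enabled=False))        # None
print(elect_leader([1, 2, 3], isr={1, 2}, alive={3}, unclean_enabled=True))         # 3
```

The last two calls show the trade-off directly: with the flag off the partition stays offline until an ISR member returns, and with it on the partition comes back immediately on a replica that may be missing data.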
2.4 Delivery Guarantees
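Kafka can provide "at-most-once", "at-least-once", and "exactly-once" semantics. With producer retries enabled and consumer offsets committed after processing, the usual result is at-least-once delivery: no record is lost, but a record may be delivered more than once. Exactly-once requires more, either deduplication logic in the application or, since Kafka 0.11, the idempotent producer (enable.idempotence) and transactions.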
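Application-level deduplication can be as simple as remembering which (partition, offset) pairs have already been handled. The sketch below is a minimal in-memory model with hypothetical names. A real consumer would need to persist the set of seen keys atomically with the side effects of processing, which this example does not attempt.

```python
from typing import Callable, Iterable


def consume_deduplicated(deliveries: Iterable[tuple[int, int, str]],
                         handle: Callable[[str], None]) -> int:
    """Process each (partition, offset, value) once; return how many were skipped."""
    seen: set[tuple[int, int]] = set()
    skipped = 0
    for partition, offset, value in deliveries:
        key = (partition, offset)
        if key in seen:
            skipped += 1               # redelivery of a record already handled
            continue
        handle(value)
        seen.add(key)
    return skipped


processed: list[str] = []
deliveries = [
    (0, 10, "debit 5"),
    (0, 11, "credit 2"),
    (0, 11, "credit 2"),   # redelivered, e.g. after a restart before the offset commit
    (0, 12, "debit 1"),
]
skipped = consume_deduplicated(deliveries, processed.append)
print(processed, skipped)  # ['debit 5', 'credit 2', 'debit 1'] 1
```

The pair (partition, offset) works as a key because an offset identifies exactly one record within a partition.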
IT Architects Alliance