How Kafka Stores Messages: Partitions, Segments, and Sparse Indexes Explained

This article explains Kafka's internal message storage mechanism, detailing how topics are divided into partitions, how partitions are segmented into LogSegments with data and index files, and how sparse indexing enables efficient offset lookups.

ITFLY8 Architecture Home

Introduction

Kafka organizes messages by topic. Topics are independent of one another, and each topic can be split into one or more partitions, with each partition storing a portion of the topic's messages. The official Kafka documentation includes a diagram of the relationship between topics and partitions.

Each partition is stored as a directory of files on the filesystem. For example, a topic named page_visits with 5 partitions creates the directories page_visits-0 through page_visits-4 under the location configured in Kafka's log.dirs, each containing the data for that partition.
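The naming rule can be sketched in a few lines of Python. This is an illustration only: the helper partition_dirs and the log directory /tmp/kafka-logs are assumptions made for the example, not Kafka code or a setting taken from the article.

```python
from pathlib import PurePosixPath


def partition_dirs(log_dir, topic, num_partitions):
    """Return one directory path per partition, named <topic>-<partition>."""
    root = PurePosixPath(log_dir)
    return [str(root / f"{topic}-{p}") for p in range(num_partitions)]


# Hypothetical log directory; the real value comes from log.dirs.
dirs = partition_dirs("/tmp/kafka-logs", "page_visits", 5)
for d in dirs:
    print(d)
```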

This article analyzes the storage format of the files in a partition directory and points to the classes in the Kafka code that implement it.

Partition Data Files

Each message in a partition has an offset that logically identifies it. The offset is not the message's physical position in the file, but it uniquely identifies the message within its partition. A message as stored consists of three fields: offset (long, 8 bytes), MessageSize (int32, 4 bytes), and data (the payload, MessageSize bytes long). This format matches the MessageSet format in Kafka's protocol.
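A minimal sketch of this three-field layout, assuming big-endian byte order (the order Java's ByteBuffer uses by default) and treating the payload as opaque bytes. The names encode_entry and decode_entries are hypothetical helpers for this example, not Kafka APIs.

```python
import struct

# 8-byte offset followed by a 4-byte payload size, big-endian (assumed).
HEADER = struct.Struct(">qi")


def encode_entry(offset, payload):
    """Serialize one entry: offset, MessageSize, then the payload bytes."""
    return HEADER.pack(offset, len(payload)) + payload


def decode_entries(buf):
    """Yield (offset, file_position, payload) for each complete entry in buf."""
    pos = 0
    while pos + HEADER.size <= len(buf):
        offset, size = HEADER.unpack_from(buf, pos)
        start = pos + HEADER.size
        if start + size > len(buf):
            break  # trailing entry is incomplete, stop here
        yield offset, pos, buf[start:start + size]
        pos = start + size


data = encode_entry(0, b"a") + encode_entry(1, b"bc")
decoded = list(decode_entries(data))
```

Note that the file position of the second entry (13) differs from its offset (1), which is the distinction between physical position and logical offset drawn above.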

Partition data files contain many such messages ordered by offset. The implementation class is FileMessageSet, whose main methods are:

append: writes messages from a ByteBufferMessageSet to the file.

searchFor: starting from a given position, finds the first message with offset greater than or equal to the target offset, returning its file position.

slice (read): returns a new FileMessageSet representing a portion of the file; it does not guarantee completeness of the sliced data.

sizeInBytes: reports the file size in bytes.

truncateTo: truncates the file, without guaranteeing message completeness at the truncation point.

readInto: reads file content from a relative position into a ByteBuffer.

If a partition had only one data file, appending a new message at the end would still be O(1), but finding a message by offset would require a linear scan of the file, which is inefficient for large files.
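The cost of that linear scan is easy to see in a sketch modeled on the searchFor behavior described above. It assumes the 12-byte header layout (8-byte offset, 4-byte size, big-endian) and an in-memory byte buffer in place of a file; search_for is a hypothetical function, not the FileMessageSet method itself.

```python
import struct

HEADER = struct.Struct(">qi")  # assumed layout: 8-byte offset, 4-byte size


def search_for(buf, target_offset, start_position=0):
    """Scan forward from start_position and return (offset, position) of the
    first entry whose offset is >= target_offset, or None if there is none."""
    pos = start_position
    while pos + HEADER.size <= len(buf):
        offset, size = HEADER.unpack_from(buf, pos)
        if offset >= target_offset:
            return offset, pos
        pos += HEADER.size + size  # skip header and payload
    return None


# Five entries with 4-byte payloads, so each entry occupies 16 bytes.
log = b"".join(HEADER.pack(i, 4) + b"xxxx" for i in range(5))
print(search_for(log, 3))      # (3, 48)
print(search_for(log, 3, 32))  # (3, 48), starting closer costs fewer steps
```

Every entry before the target has to be visited, so the work grows with the file size unless the scan can start near the target.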

Segmenting Data Files

Kafka improves lookup efficiency by splitting a partition's data into segments (LogSegments). For example, 100 messages with offsets 0‑99 can be divided into five segments of 20 messages each, with each segment stored in a separate file named after the smallest offset in the segment (its base offset). Because the base offsets are sorted, a binary search over them locates the segment that contains a given offset.
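Using the five-segment example, the segment search can be sketched with Python's bisect module. The function names are hypothetical, and the 20-digit zero padding with a .log extension in segment_file_name is an assumption about the file naming convention that the article does not spell out.

```python
from bisect import bisect_right


def segment_file_name(base_offset):
    """Name a segment's data file after its base offset (padding assumed)."""
    return f"{base_offset:020d}.log"


def find_segment(base_offsets, target_offset):
    """Return the base offset of the segment that would hold target_offset,
    i.e. the greatest base offset <= target_offset, or None."""
    i = bisect_right(base_offsets, target_offset) - 1
    return base_offsets[i] if i >= 0 else None


# Offsets 0-99 split into five segments of 20 messages each.
bases = [0, 20, 40, 60, 80]
print(find_segment(bases, 57))                     # 40
print(segment_file_name(find_segment(bases, 57)))
```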

Indexing Data Files

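Each segment also has an index file with the same name but a .index extension. Each index entry stores a relative offset and the absolute position of the corresponding message in the segment's data file (both 4‑byte integers). The relative offset is the message's offset minus the segment's base offset; storing it instead of the full 8‑byte offset keeps each entry at 8 bytes and so reduces the index size.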

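The index is sparse: instead of one entry per message, it records an entry only after a certain number of bytes of messages has been written (controlled by the broker setting log.index.interval.bytes, 4096 by default). This keeps the index small enough to stay in memory. An offset that has no index entry is found by a short sequential scan starting from the nearest preceding entry.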
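A rough sketch of how such a sparse index could be built while messages are appended. It simplifies matters by always indexing the first message of the segment, which is an assumption of this example rather than a statement about Kafka's exact rule; build_sparse_index and the sample sizes are hypothetical.

```python
def build_sparse_index(entries, interval_bytes):
    """entries: iterable of (offset, position, size_in_bytes) in log order.
    Add an index entry once at least interval_bytes have been written
    since the previous index entry."""
    index = []
    bytes_since_last = 0
    for offset, position, size in entries:
        if not index or bytes_since_last >= interval_bytes:
            index.append((offset, position))
            bytes_since_last = 0
        bytes_since_last += size
    return index


# Ten 100-byte messages with a 250-byte interval: roughly every third
# message gets an index entry.
messages = [(i, i * 100, 100) for i in range(10)]
sparse = build_sparse_index(messages, 250)
print(sparse)  # [(0, 0), (3, 300), (6, 600), (9, 900)]
```

A larger interval gives a smaller index at the price of a longer sequential scan after each lookup.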

The index implementation class is OffsetIndex, whose key methods are:

append: adds a pair of offset and position to the index, converting the offset to a relative offset.

lookup: binary searches the index for the greatest indexed offset less than or equal to a given offset and returns it together with its file position.
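The two methods can be illustrated with a small in-memory stand-in. SparseOffsetIndex is a hypothetical class written for this example, not Kafka's OffsetIndex; it assumes big-endian 4-byte fields and returns (base offset, 0) when no entry qualifies, which is a choice made for the sketch.

```python
import struct

ENTRY = struct.Struct(">ii")  # 4-byte relative offset, 4-byte file position


class SparseOffsetIndex:
    def __init__(self, base_offset):
        self.base_offset = base_offset
        self._buf = bytearray()

    def __len__(self):
        return len(self._buf) // ENTRY.size

    def append(self, offset, position):
        """Store the offset relative to the segment's base offset."""
        self._buf += ENTRY.pack(offset - self.base_offset, position)

    def _entry(self, i):
        rel, pos = ENTRY.unpack_from(self._buf, i * ENTRY.size)
        return self.base_offset + rel, pos

    def lookup(self, target_offset):
        """Binary search for the greatest indexed offset <= target_offset."""
        lo, hi = 0, len(self) - 1
        best = (self.base_offset, 0)  # fallback: start of the segment
        while lo <= hi:
            mid = (lo + hi) // 2
            offset, pos = self._entry(mid)
            if offset <= target_offset:
                best = (offset, pos)
                lo = mid + 1
            else:
                hi = mid - 1
        return best


idx = SparseOffsetIndex(base_offset=1000)
idx.append(1003, 300)
idx.append(1006, 600)
print(idx.lookup(1005))  # (1003, 300)
```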

Summary

Kafka stores messages by topic, partition, segment, and sparse index. A partition consists of multiple LogSegments, each with a data file and a corresponding index file, enabling efficient appends (O(1)) and fast offset lookups via binary search and in‑memory sparse indexes.

A topic page_visits with 5 partitions results in five partition directories, page_visits-0 through page_visits-4.

Each partition is further divided into LogSegments, each containing a data file and an index file. The example partition in the original figure contains four LogSegments.

When looking up a message with absolute offset 7, Kafka first binary‑searches the LogSegments, then binary‑searches the corresponding index to find the nearest offset (6) and its file position (9807), and finally scans forward in the data file to locate offset 7.
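The three steps can be strung together in a toy model. All data here is made up for illustration (100-byte messages, an index entry for every third message) and does not reproduce the figures behind the numbers quoted above; find_message and make_segment are hypothetical helpers, and the data file is modeled as a list of records instead of bytes on disk.

```python
from bisect import bisect_right


def make_segment(base, count, size=100, every=3):
    """Build a toy segment: (base_offset, sparse index, records)."""
    records = [(base + i, i * size, f"m{base + i}".encode()) for i in range(count)]
    index = [(o, p) for k, (o, p, _) in enumerate(records) if k % every == 0]
    return base, index, records


def find_message(segments, target):
    """Return (position, payload) of the message with offset target, or None."""
    # Step 1: binary search the segments by base offset.
    bases = [base for base, _, _ in segments]
    s = bisect_right(bases, target) - 1
    if s < 0:
        return None
    _, index, records = segments[s]
    # Step 2: binary search the sparse index for the nearest entry <= target.
    i = bisect_right([o for o, _ in index], target) - 1
    start_pos = index[i][1] if i >= 0 else 0
    # Step 3: scan forward in the data from that position.
    for offset, position, payload in records:
        if position < start_pos:
            continue  # stands in for seeking to start_pos in the file
        if offset == target:
            return position, payload
        if offset > target:
            break
    return None


segments = [make_segment(0, 10), make_segment(10, 5)]
print(find_message(segments, 7))  # (700, b'm7'), reached via index entry (6, 600)
```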

Source: http://blog.csdn.net/jewes/article/details/42970799
