Why Kafka Stores Data the Way It Does: Inside Its Architecture

This article is an in-depth technical analysis of Kafka’s storage architecture. It covers the design goals, storage mechanisms, log segment layout, sparse indexing, log cleanup policies, and the performance techniques (sequential writes, page cache, and zero-copy) that enable high-throughput streaming.

ITPUB

Oct 26, 2022

1. Kafka Storage Scenario Analysis

Kafka was created at LinkedIn to handle real‑time log streams at a scale of billions of events per day, requiring high concurrency, high availability, and high performance. The storage system must efficiently persist massive message streams, support fast retrieval by offset or timestamp, and ensure durability across broker failures.

2. Kafka Storage Options

A traditional relational database with B+-tree indexes would be a poor fit: every write must also update the index, which costs extra space and CPU and causes write amplification. A hash index offers O(1) lookups but needs the whole hash table in memory, which is impractical at millions of messages per second. Kafka instead adopts a sparse index: rather than indexing every message, each log segment adds an index entry only after roughly every log.index.interval.bytes of appended data (4 KB by default). A lookup binary-searches the index for the nearest entry at or below the target offset and then scans the log forward from that position, which avoids the memory overhead of a dense index or a full hash table.
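The sparse-index lookup can be illustrated with a minimal sketch. It is not Kafka's implementation: the segment is modelled as an in-memory list, a "position" is a list index rather than a byte position in a file, and the hypothetical INDEX_INTERVAL counts records, whereas Kafka spaces index entries by bytes.

```python
import bisect

# Assumption for illustration: index every 4th record.
INDEX_INTERVAL = 4


def build_sparse_index(records, interval=INDEX_INTERVAL):
    """Return (offset, position) entries for every `interval`-th record."""
    return [
        (offset, pos)
        for pos, (offset, _payload) in enumerate(records)
        if pos % interval == 0
    ]


def lookup(records, index, target_offset):
    """Find a record by offset: binary search the index, then scan forward."""
    index_offsets = [entry[0] for entry in index]
    # Rightmost index entry whose offset is <= target_offset.
    slot = bisect.bisect_right(index_offsets, target_offset) - 1
    if slot < 0:
        return None  # target lies before the first record of this segment
    start = index[slot][1]
    # Short linear scan from the indexed position.
    for offset, payload in records[start:]:
        if offset == target_offset:
            return payload
        if offset > target_offset:
            break
    return None


records = [(100 + i, f"m{i}") for i in range(10)]
index = build_sparse_index(records)
print(index)
print(lookup(records, index, 106))
```

The index holds only a fraction of the offsets, so it stays small, and the cost of a lookup is one binary search plus a scan bounded by the index interval.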

3. Kafka Storage Architecture Design

Kafka organizes data as topics → partitions → log segments → index files. A topic is a logical stream; each topic is split into multiple partitions for horizontal scalability. Partitions are further divided into log segments to keep individual files manageable and to simplify cleanup. Ordering is guaranteed only within a partition, not across partitions.

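Each log segment consists of a .log file that holds the records and two index files: .index (offset index) and .timeindex (timestamp index). A partition directory may also contain .txnindex files (aborted-transaction index) and .snapshot files (producer state snapshots); these are not message indexes. The base offset of a segment (a 64-bit long, zero-padded to 20 digits) is used to name all of its files, e.g., 00000000000000001234.log. The offset index maps logical offsets to physical positions in the .log file, while the timestamp index maps timestamps to offsets, enabling time-based queries.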
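As a small illustration of the naming scheme, the sketch below derives file names from a base offset and picks the segment that should contain a given offset. The helper names segment_file_names and find_segment are hypothetical and not part of Kafka.

```python
import bisect


def segment_file_names(base_offset):
    """File names for one segment: the base offset zero-padded to 20 digits."""
    stem = f"{base_offset:020d}"
    return {ext: f"{stem}.{ext}" for ext in ("log", "index", "timeindex")}


def find_segment(base_offsets, target_offset):
    """Return the largest base offset <= target_offset, or None.

    `base_offsets` must be sorted in ascending order.
    """
    slot = bisect.bisect_right(base_offsets, target_offset) - 1
    return base_offsets[slot] if slot >= 0 else None


print(segment_file_names(1234)["log"])
print(find_segment([0, 1234, 5000], 3000))
```

Because the file name encodes the first offset in the segment, locating the right segment needs only the sorted list of names, with no file opened.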

4. Kafka Log System Architecture

Messages are appended sequentially to the active log segment; only the newest segment is writable. When the active segment reaches the configured size (log.segment.bytes) or age (log.roll.ms or log.roll.hours), Kafka rolls over to a new active segment. Consumers store their committed offsets in the internal __consumer_offsets topic.
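The append-and-roll behaviour can be sketched with a toy in-memory model. The class PartitionLog and its fields are hypothetical; the sketch rolls on size only and assigns one offset per payload.

```python
class PartitionLog:
    """Toy append-only log that rolls to a new segment when one fills up."""

    def __init__(self, max_segment_bytes):
        self.max_segment_bytes = max_segment_bytes
        self.next_offset = 0
        # The last element of this list is always the active segment.
        self.segments = [{"base_offset": 0, "size": 0, "records": []}]

    def append(self, payload: bytes) -> int:
        active = self.segments[-1]
        # Roll before writing if this payload would overflow the segment.
        if active["records"] and active["size"] + len(payload) > self.max_segment_bytes:
            active = {"base_offset": self.next_offset, "size": 0, "records": []}
            self.segments.append(active)
        offset = self.next_offset
        active["records"].append((offset, payload))
        active["size"] += len(payload)
        self.next_offset += 1
        return offset


log = PartitionLog(max_segment_bytes=10)
for _ in range(5):
    log.append(b"data")
print([seg["base_offset"] for seg in log.segments])
```

Older segments are never modified after a roll, which is what lets cleanup work on whole files.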

The directory layout follows the pattern <topic>-<partition>, e.g., topic-order-0, topic-order-1, etc. Within each partition directory, log segments and their index files are stored side‑by‑side.
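One detail of this pattern is that topic names may themselves contain hyphens, so a directory name has to be split at the last hyphen. A minimal sketch, with a hypothetical helper name:

```python
def parse_partition_dir(name):
    """Split '<topic>-<partition>' at the last hyphen."""
    topic, sep, partition = name.rpartition("-")
    if not sep or not topic or not partition.isdecimal():
        raise ValueError(f"not a partition directory name: {name!r}")
    return topic, int(partition)


print(parse_partition_dir("topic-order-0"))
```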

5. Log Cleanup Mechanisms

Kafka provides two primary cleanup strategies controlled by the broker configuration log.cleanup.policy:

Log Retention (delete): Periodically removes whole log segments that exceed a time-based threshold (log.retention.ms) or a size-based threshold (log.retention.bytes). Deletion first removes the segment from the in-memory skip list that tracks the partition's segments, then renames its files with a .deleted suffix, and finally deletes them after a configurable delay (log.segment.delete.delay.ms).

Log Compaction (compact): Retains only the latest record for each key, discarding older versions. This is useful when applications only need the most recent value per key.
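The effect of compaction can be shown with a minimal sketch over an in-memory list of (offset, key, value) records. It illustrates only the "latest record per key" rule and leaves out details of the real log cleaner such as delete markers and the handling of the active segment.

```python
def compact(records):
    """Keep only the newest record per key, preserving offsets and order.

    `records` is a list of (offset, key, value) tuples in offset order.
    """
    latest_offset = {}
    for offset, key, _value in records:
        latest_offset[key] = offset  # later records overwrite earlier ones
    return [rec for rec in records if latest_offset[rec[1]] == rec[0]]


records = [
    (0, "a", "1"),
    (1, "b", "2"),
    (2, "a", "3"),
    (3, "c", "4"),
    (4, "b", "5"),
]
print(compact(records))
```

Surviving records keep their original offsets, so the compacted log has gaps in its offset sequence but remains ordered.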

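Both policies can be combined by setting log.cleanup.policy=delete,compact. Retention checks run every log.retention.check.interval.ms (default 5 minutes). For size-based retention, log.segment.bytes sets the size of each segment (default 1 GB) and log.retention.bytes caps the log size per partition (default -1, meaning no limit). Because deletion works on whole segments, the log can temporarily exceed these thresholds until an entire segment becomes eligible.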
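The retention check can be approximated as follows. This is a simplified sketch, not the broker's code: segments are plain dicts with hypothetical field names, the active (last) segment is assumed never to be removed, and a negative threshold disables that rule.

```python
def segments_to_delete(segments, now_ms, retention_ms=-1, retention_bytes=-1):
    """Return base offsets of segments eligible for deletion, oldest first.

    `segments` is an oldest-first, non-empty list of dicts with keys
    'base_offset', 'size' and 'max_timestamp_ms'; the last one is active.
    """
    *closed, active = segments
    doomed = []
    idx = 0
    # Time-based rule: drop the oldest segments whose newest record has expired.
    if retention_ms >= 0:
        while idx < len(closed) and now_ms - closed[idx]["max_timestamp_ms"] > retention_ms:
            doomed.append(closed[idx]["base_offset"])
            idx += 1
    # Size-based rule: drop the oldest segments while the log stays at or above the cap.
    if retention_bytes >= 0:
        remaining = closed[idx:]
        excess = sum(seg["size"] for seg in remaining) + active["size"] - retention_bytes
        for seg in remaining:
            if excess - seg["size"] < 0:
                break
            doomed.append(seg["base_offset"])
            excess -= seg["size"]
    return doomed


segments = [
    {"base_offset": 0, "size": 100, "max_timestamp_ms": 1_000},
    {"base_offset": 10, "size": 100, "max_timestamp_ms": 5_000},
    {"base_offset": 20, "size": 100, "max_timestamp_ms": 9_000},
    {"base_offset": 30, "size": 50, "max_timestamp_ms": 10_000},
]
print(segments_to_delete(segments, now_ms=10_000, retention_ms=6_000, retention_bytes=100))
```

Both rules work from the oldest segment forward and stop at the first segment that must be kept, so deletion always removes a prefix of the log.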

6. Performance Techniques

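Kafka relies heavily on the operating system’s page cache: appends land in memory and are flushed to disk by the OS, and reads of recently written data are usually served from memory, so sequential writes to the active segment rarely wait on the disk. Kafka also uses zero-copy transfers (the sendfile system call, exposed in Java as FileChannel.transferTo) to send log data from the page cache to the network socket without first copying it into user-space buffers.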
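The zero-copy idea can be demonstrated in Python with socket.sendfile, which uses os.sendfile where the platform supports it and otherwise falls back to an ordinary read-and-send loop. This is only an illustration of the technique, not how the Java broker is written; the function name send_file_zero_copy is hypothetical.

```python
import os
import socket
import tempfile


def send_file_zero_copy(sock, path):
    """Send a whole file over a connected stream socket; return bytes sent."""
    with open(path, "rb") as f:
        # The kernel moves data from the page cache to the socket directly
        # when os.sendfile is available on this platform.
        return sock.sendfile(f)


# Small self-contained demo using a connected socket pair.
sender, receiver = socket.socketpair()
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"kafka" * 100)
    demo_path = tmp.name
print(send_file_zero_copy(sender, demo_path))
sender.close()
receiver.close()
os.unlink(demo_path)
```

The application never reads the file contents into its own buffers, which saves both copies and CPU time on the consumer-fetch path.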

7. Summary

Starting from the storage requirements of massive real‑time log streams, we examined Kafka’s design choices, including sequential append writes, sparse indexing, partitioned log segments, and configurable cleanup policies. These choices together enable Kafka to provide durable, high‑throughput storage while keeping resource usage efficient.
