How Kafka Achieves Billion-Message Throughput: Sequential Disk Writes, Page Cache, and Zero‑Copy

This article explains how Kafka sustains massive traffic with four techniques: appending logs to disk sequentially, writing through the operating system’s page cache so that writes complete in memory, using zero‑copy transfer (such as sendfile) to avoid copying data through user space, and batching messages to reduce network overhead. Together these deliver high‑throughput, low‑latency streaming.

Mike Chen's Internet Architecture

Sequential Disk Writes

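Kafka achieves high throughput in part by writing messages to disk sequentially. Each topic partition is an append‑only log stored on disk as a series of segment files. New messages are always appended to the end of the active segment and are assigned an offset that increases monotonically and is unique within the partition. Because writes never seek backward, disk head movement on spinning disks stays minimal, which keeps write speeds high.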
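The append‑only behavior described above can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not Kafka's actual log implementation: the class name AppendOnlyLog, the length‑prefixed record layout, and the in‑memory offset counter are all invented for the example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch of an append-only log; not Kafka's real on-disk format.
class AppendOnlyLog implements AutoCloseable {
    private final FileChannel channel;
    // Offset handed to the next record. Assumes the file starts out empty.
    private long nextOffset = 0;

    AppendOnlyLog(Path file) throws IOException {
        // APPEND makes every write land at the current end of the file.
        this.channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND);
    }

    // Appends one record as [4-byte length][payload] and returns its offset.
    long append(String message) throws IOException {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES + payload.length);
        buf.putInt(payload.length);
        buf.put(payload);
        buf.flip();
        while (buf.hasRemaining()) {
            channel.write(buf);
        }
        return nextOffset++;
    }

    long sizeInBytes() throws IOException {
        return channel.size();
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("partition-0-", ".log");
        try (AppendOnlyLog log = new AppendOnlyLog(file)) {
            System.out.println(log.append("first"));  // 0
            System.out.println(log.append("second")); // 1
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The point of the sketch is that the writer never repositions the file: each record goes to the end, and the offset is simply a counter.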

<ol>
<li>Partition-0
  <ul>
    <li>00000000000000000000.log</li>
    <li>00000000000000000000.index</li>
    <li>00000000000000000000.timeindex</li>
    <li>00000000000000000100.log</li>
    <li>...</li>
  </ul>
</li>
</ol>
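Each file name in the listing above is the segment's base offset (the offset of the first record it holds), zero‑padded to 20 digits. The following sketch shows one way a lookup from offset to segment could work; the class SegmentIndex and its methods are hypothetical and are not Kafka's internal API.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper illustrating segment naming and lookup.
class SegmentIndex {
    // Maps each segment's base offset to its log file name, sorted by offset.
    private final TreeMap<Long, String> segments = new TreeMap<>();

    // Formats a base offset as a 20-digit, zero-padded segment file name.
    static String fileName(long baseOffset) {
        return String.format("%020d.log", baseOffset);
    }

    void addSegment(long baseOffset) {
        segments.put(baseOffset, fileName(baseOffset));
    }

    // Returns the segment with the greatest base offset <= the requested offset.
    String segmentFor(long offset) {
        Map.Entry<Long, String> entry = segments.floorEntry(offset);
        if (entry == null) {
            throw new IllegalArgumentException("offset below first segment: " + offset);
        }
        return entry.getValue();
    }

    public static void main(String[] args) {
        SegmentIndex index = new SegmentIndex();
        index.addSegment(0);
        index.addSegment(100);
        System.out.println(index.segmentFor(42));  // 00000000000000000000.log
        System.out.println(index.segmentFor(100)); // 00000000000000000100.log
    }
}
```

A sorted map keeps the lookup logarithmic in the number of segments. Within a segment, the .index and .timeindex files then narrow the search to a position in the .log file.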

Page Cache

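When a producer sends a message, the broker writes it to the log file through the operating system’s page cache rather than forcing it to disk immediately. The write completes once the data is in memory, so it is fast; the OS later flushes the cached pages to the .log files asynchronously. Data that has not yet been flushed can be lost if the machine crashes, so this design trades some single‑node durability for throughput, and Kafka relies mainly on replication across brokers for durability.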
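In Java terms, the difference is between a plain FileChannel.write(), which returns once the OS has accepted the bytes, and a write followed by FileChannel.force(), which asks the OS to push them to the storage device. The sketch below contrasts the two; the class PageCacheWrite and its sync flag are assumptions made for illustration and do not mirror Kafka's flush policy.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch contrasting a cached write with an explicit flush.
class PageCacheWrite {
    // Writes the message at the channel's current position.
    // Without force(), the bytes typically sit in the page cache until the
    // kernel writes them back on its own schedule.
    static void write(FileChannel channel, String message, boolean sync) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8));
        while (buf.hasRemaining()) {
            channel.write(buf);
        }
        if (sync) {
            // Requests that file content (not metadata) reach the storage device.
            channel.force(false);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("page-cache-", ".log");
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
            write(ch, "fast path\n", false);   // no explicit flush
            write(ch, "durable path\n", true); // flushed before returning
            // Both records are readable right away, flushed or not,
            // because reads are served from the same page cache.
            System.out.print(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Skipping the flush on every write is what keeps the common path fast; the cost is that unflushed data depends on the machine staying up.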


Zero‑Copy Transfer

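When a broker delivers messages to a consumer, Kafka uses the operating system’s zero‑copy mechanism (for example, sendfile on Linux), which Java exposes through FileChannel.transferTo(). In a conventional read‑then‑write path, data would be copied from the page cache into a user‑space buffer and then back into a kernel socket buffer. With zero‑copy, data moves from the page cache to the network socket without passing through user space, which reduces CPU usage and context switches.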
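A minimal sketch of the transferTo() call pattern follows. The class ZeroCopySend and its send method are hypothetical. The demo writes to an in‑memory channel so that it runs anywhere; in that case the JVM falls back to an ordinary copy, and the zero‑copy fast path is only available when the target is something like a SocketChannel on a platform that supports it.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch of sending a slice of a log file with transferTo().
class ZeroCopySend {
    // Sends up to count bytes of the file, starting at position, to the target.
    // Returns the number of bytes actually sent. Assumes a blocking target.
    static long send(Path file, long position, long count, WritableByteChannel target)
            throws IOException {
        try (FileChannel source = FileChannel.open(file, StandardOpenOption.READ)) {
            // Never ask for bytes past the end of the file.
            long end = Math.min(position + count, source.size());
            long sent = 0;
            while (position + sent < end) {
                // transferTo may move fewer bytes than requested, so loop.
                long n = source.transferTo(position + sent, end - position - sent, target);
                if (n == 0) {
                    break; // no progress; stop rather than spin
                }
                sent += n;
            }
            return sent;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("segment-", ".log");
        try {
            Files.write(file, "hello kafka".getBytes(StandardCharsets.UTF_8));
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            long sent = send(file, 6, 5, Channels.newChannel(out));
            System.out.println(sent + " bytes: " + out.toString("UTF-8")); // 5 bytes: kafka
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Note that the application code never allocates a buffer for the payload; it only names a file region and a destination.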


Batch Sending

Producers batch multiple records into a single request, and consumers pull messages in batches. Batching reduces the number of network round‑trips and spreads the fixed per‑request overhead across many records, which further increases throughput.
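The producer‑side idea can be sketched as a small accumulator that buffers records and emits them as one request when the batch is full. This is an illustration only: the class BatchAccumulator is hypothetical, it limits batches by record count for simplicity, and a list of sent batches stands in for real network requests.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of producer-side batching; not the Kafka client's code.
class BatchAccumulator {
    private final int maxBatchSize; // records per batch (assumed limit)
    private final List<String> current = new ArrayList<>();
    // Each inner list stands in for one network request.
    private final List<List<String>> sent = new ArrayList<>();

    BatchAccumulator(int maxBatchSize) {
        if (maxBatchSize < 1) {
            throw new IllegalArgumentException("maxBatchSize must be >= 1");
        }
        this.maxBatchSize = maxBatchSize;
    }

    // Buffers a record and sends the batch as one request once it is full.
    void add(String record) {
        current.add(record);
        if (current.size() >= maxBatchSize) {
            flush();
        }
    }

    // Sends whatever is buffered, even if the batch is not full.
    void flush() {
        if (current.isEmpty()) {
            return;
        }
        sent.add(new ArrayList<>(current));
        current.clear();
    }

    List<List<String>> sentBatches() {
        return sent;
    }

    public static void main(String[] args) {
        BatchAccumulator acc = new BatchAccumulator(4);
        for (int i = 0; i < 10; i++) {
            acc.add("record-" + i);
        }
        acc.flush(); // send the partial final batch
        System.out.println(acc.sentBatches().size() + " requests for 10 records"); // 3 requests for 10 records
    }
}
```

The real Kafka producer bounds batches by bytes and by time rather than by record count, through the batch.size and linger.ms settings; the trade‑off is the same, with larger batches giving fewer requests at the cost of a little added latency.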

