How Kafka Achieves Billion-Message Throughput: Sequential Disk Writes, Page Cache, and Zero‑Copy
This article explains how Kafka sustains very high message volumes through four techniques: appending to logs sequentially on disk, writing through the operating system’s page cache instead of forcing every message to disk, using zero‑copy transfer (such as sendfile) to avoid copying data through user space, and batching messages to reduce network overhead. Together these deliver high‑throughput, low‑latency streaming.
Sequential Disk Writes
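Kafka achieves high throughput by writing messages sequentially to disk. Each topic partition is an append‑only log, stored on disk as a directory of segment files. New messages are always appended to the end of the active segment and receive an offset that is unique within the partition. Each segment file is named after the offset of its first message, so in the listing below 00000000000000000100.log starts at offset 100. Because the writes are sequential, they avoid the seek cost of random I/O (on spinning disks, head movement stays minimal), which keeps write speeds high.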
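The append‑only behavior described above can be sketched in a few lines of Java. This is a toy illustration, not Kafka's actual on‑disk record format: the class name SegmentAppendSketch, the length‑prefixed framing, and the in‑memory offset counter are all assumptions made for the example. The only detail taken from the article is the 20‑digit, zero‑padded file naming visible in the directory listing that follows.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Locale;

/** Hypothetical append-only segment; the record framing here is NOT Kafka's real format. */
final class SegmentAppendSketch implements AutoCloseable {
    private final FileChannel channel;
    private long nextOffset;

    SegmentAppendSketch(Path partitionDir, long baseOffset) throws IOException {
        Path file = partitionDir.resolve(segmentFileName(baseOffset));
        // APPEND makes every write land at the current end of the file.
        this.channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND);
        this.nextOffset = baseOffset;
    }

    /** Builds a name such as 00000000000000000100.log from the segment's first offset. */
    static String segmentFileName(long baseOffset) {
        return String.format(Locale.ROOT, "%020d.log", baseOffset);
    }

    /** Appends one message (4-byte length prefix + UTF-8 payload) and returns its offset. */
    long append(String message) throws IOException {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES + payload.length);
        buf.putInt(payload.length).put(payload).flip();
        while (buf.hasRemaining()) {
            channel.write(buf);
        }
        return nextOffset++;
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("Partition-0");
        try (SegmentAppendSketch segment = new SegmentAppendSketch(dir, 100L)) {
            System.out.println("offset of first append:  " + segment.append("first"));
            System.out.println("offset of second append: " + segment.append("second"));
        }
    }
}
```

The sketch never seeks or rewrites earlier bytes; every write goes to the tail of the file, which is the access pattern the section credits for fast disk writes.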
<ol>
<li>Partition-0
<ul>
<li>00000000000000000000.log</li>
<li>00000000000000000000.index</li>
<li>00000000000000000000.timeindex</li>
<li>00000000000000000100.log</li>
<li>...</li>
</ul>
</li>
</ol>
Page Cache
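When a producer sends a message, the broker writes it to the operating system’s page cache rather than forcing it to disk immediately. The write completes as soon as the data is in memory, so it is fast. The OS later flushes the dirty pages to the .log files asynchronously. This trades some single‑machine durability for throughput: data that has not yet been flushed can be lost if the machine loses power, which is why Kafka relies on replication across brokers for durability.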
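The difference between a write that stops at the page cache and one that is forced to storage can be shown with Java's FileChannel. This is a minimal sketch under stated assumptions: the class and method names are hypothetical, and it does not claim to mirror the broker's own code path. On typical operating systems, FileChannel.write() returns once the bytes have been handed to the OS, while FileChannel.force(true) blocks until the OS reports that data and metadata have reached the storage device.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper contrasting buffered writes with explicitly flushed writes. */
final class PageCacheWriteSketch {

    /** Hands the bytes to the OS; they may sit in the page cache for a while before hitting disk. */
    static void writeBuffered(FileChannel channel, byte[] data) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(data);
        while (buf.hasRemaining()) {
            channel.write(buf);
        }
    }

    /** Same write, followed by a request to flush data and metadata to the storage device. */
    static void writeDurably(FileChannel channel, byte[] data) throws IOException {
        writeBuffered(channel, data);
        channel.force(true);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("segment", ".log");
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.WRITE)) {
            writeBuffered(channel, "fast path\n".getBytes(StandardCharsets.UTF_8));
            // Readers already see the data, even though no flush was requested.
            System.out.println("visible bytes: " + Files.size(file));
            writeDurably(channel, "flushed path\n".getBytes(StandardCharsets.UTF_8));
        }
    }
}
```

Calling force() on every message would serialize each write behind a disk flush, which is the cost the page‑cache approach avoids.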
Zero‑Copy Transfer
When a broker delivers messages to a consumer, Kafka uses the operating system’s zero‑copy mechanism (sendfile on Linux), which Java exposes as FileChannel.transferTo(). Data moves from the page cache to the network socket without being copied into the application’s user‑space buffers, which reduces CPU usage and context switches.
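A minimal sketch of the FileChannel.transferTo() call mentioned above follows. The helper name transferAll is hypothetical, and a second file channel stands in for the consumer's socket so the example runs without a network; in a broker the destination would be a SocketChannel. Whether the JVM actually uses sendfile underneath depends on the operating system and the channel types involved, so treat this as an illustration of the API shape rather than a guarantee of zero‑copy.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper around FileChannel.transferTo(). */
final class ZeroCopySketch {

    /**
     * Transfers up to 'count' bytes starting at 'position' from the source file to the
     * destination channel, without reading them into a buffer owned by this code.
     * transferTo() may move fewer bytes than requested, so the call is repeated.
     */
    static long transferAll(FileChannel source, long position, long count,
                            WritableByteChannel destination) throws IOException {
        long sent = 0;
        while (sent < count) {
            long n = source.transferTo(position + sent, count - sent, destination);
            if (n <= 0) {
                break; // end of file reached, or the destination accepted nothing
            }
            sent += n;
        }
        return sent;
    }

    public static void main(String[] args) throws IOException {
        Path segment = Files.createTempFile("segment", ".log");
        Path socketStandIn = Files.createTempFile("socket-stand-in", ".bin");
        Files.write(segment, "hello kafka".getBytes(StandardCharsets.UTF_8));
        try (FileChannel in = FileChannel.open(segment, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(socketStandIn, StandardOpenOption.WRITE)) {
            long sent = transferAll(in, 0, in.size(), out);
            System.out.println("bytes transferred: " + sent);
        }
    }
}
```

Note that the application never allocates a byte array or ByteBuffer for the payload; it only tells the OS which byte range of the file to send.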
Batch Sending
Producers group multiple records into a single request, and consumers fetch messages in batches. Batching reduces the number of network round trips and spreads the fixed per‑request overhead across many records, further increasing throughput.
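The effect of batching on request count can be illustrated with a toy accumulator. Everything in this sketch is an assumption for illustration: the class BatchingSketch, its record‑count threshold, and the in‑memory list standing in for the network. The real Kafka producer batches by bytes and time (its batch.size and linger.ms settings) rather than by record count, and consumers have their own fetch settings such as max.poll.records and fetch.min.bytes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical accumulator that groups records into requests by record count. */
final class BatchingSketch {
    private final int maxBatchRecords;
    private final List<String> pending = new ArrayList<>();
    // Stands in for the network: each inner list represents one request on the wire.
    private final List<List<String>> sentRequests = new ArrayList<>();

    BatchingSketch(int maxBatchRecords) {
        if (maxBatchRecords <= 0) {
            throw new IllegalArgumentException("maxBatchRecords must be positive");
        }
        this.maxBatchRecords = maxBatchRecords;
    }

    /** Buffers the record and sends a request only once the batch is full. */
    void send(String record) {
        pending.add(record);
        if (pending.size() >= maxBatchRecords) {
            flush();
        }
    }

    /** Sends whatever is buffered as one request, even if the batch is not full. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sentRequests.add(new ArrayList<>(pending));
        pending.clear();
    }

    int requestCount() {
        return sentRequests.size();
    }

    public static void main(String[] args) {
        BatchingSketch producer = new BatchingSketch(4);
        for (int i = 0; i < 10; i++) {
            producer.send("record-" + i);
        }
        producer.flush();
        System.out.println("records: 10, requests: " + producer.requestCount());
    }
}
```

With a batch limit of 4, ten records need three requests instead of ten, which is the round‑trip saving the section describes.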
Mike Chen's Internet Architecture
