How Kafka Achieves Ultra‑High Throughput: Sequential Writes, Page Cache, Batching & Zero‑Copy
This article explains the core Kafka techniques—sequential disk writes, Linux page cache usage, producer/broker batching with compression, and zero‑copy data transfer via sendfile—that together enable massive concurrency and near‑memory write performance in large‑scale streaming architectures.
Kafka is a cornerstone of large‑scale architectures, and its ability to handle massive concurrency stems from several low‑level optimizations.
Sequential Write
Kafka writes messages to disk in an append‑only (sequential) fashion. Sequential I/O avoids costly seek operations, allowing the disk head to stay stationary or move smoothly, which maximizes throughput by fully utilizing the media’s bandwidth.
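To make the append-only layout concrete, here is a minimal sketch of a log segment that only ever writes at the end of its file. It is purely illustrative (names like `LogSegment` are mine, not Kafka's API; real Kafka segments are Java and also carry indexes, timestamps, and CRCs):

```python
import os

class LogSegment:
    """Toy append-only log segment: every write lands at the end of the file."""

    def __init__(self, path):
        self.path = path
        self.offsets = []                 # byte position of each record
        self.f = open(path, "ab")         # append-only: no seeks on the write path
        self.pos = os.path.getsize(path)  # current end-of-file position

    def append(self, record: bytes) -> int:
        self.offsets.append(self.pos)
        # Length-prefix each record so it can be read back independently.
        data = len(record).to_bytes(4, "big") + record
        self.f.write(data)
        self.pos += len(data)
        return len(self.offsets) - 1      # logical offset of the record

    def read(self, offset: int) -> bytes:
        self.f.flush()
        with open(self.path, "rb") as r:
            r.seek(self.offsets[offset])
            size = int.from_bytes(r.read(4), "big")
            return r.read(size)

seg = LogSegment("segment-00000.log")
for msg in (b"order-created", b"order-paid", b"order-shipped"):
    seg.append(msg)
print(seg.read(1))  # b'order-paid'
```

Because appends never reposition the write head, the write path is a straight stream of bytes, which is exactly the access pattern disks (and SSD firmware) handle fastest.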
Page Cache
Instead of synchronously flushing each write to disk, Kafka relies on the Linux page cache. Log data is written into the page cache and the operating system flushes dirty pages to disk asynchronously (Kafka additionally memory‑maps its index files). Because a write typically completes once the data sits in the page cache, write performance approaches that of pure memory writes.
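A rough illustration of write‑behind through the page cache, using Python's `mmap` module (chosen here for demonstration; Kafka itself writes log data through file channels and memory‑maps only its index files):

```python
import mmap

PATH = "pagecache-demo.log"
SIZE = 4096  # reserve one page; mmap cannot map an empty file

with open(PATH, "wb") as f:
    f.truncate(SIZE)

with open(PATH, "r+b") as f:
    mm = mmap.mmap(f.fileno(), SIZE)   # map the file into the process address space
    mm[:13] = b"hello, kafka!"         # a plain memory write: no syscall per message
    # The kernel writes the dirty page back on its own schedule;
    # mm.flush() forces an msync() only when durability is needed right now.
    mm.flush()
    mm.close()

with open(PATH, "rb") as f:
    print(f.read(13))                  # b'hello, kafka!'
```

The point of the sketch is the middle section: the "write" is an ordinary memory store into a cached page, and persistence happens asynchronously unless a flush is explicitly requested.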
Batched Sending and Fetching
Producers accumulate records into batches (governed by settings such as batch.size and linger.ms), and consumers fetch batches in a single request, reducing network round‑trips and per‑message I/O overhead. Kafka also supports batch‑level compression (gzip, snappy, lz4, zstd), which further lowers bandwidth consumption and boosts effective throughput.
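A toy demonstration of why batching and compression reinforce each other: compressing many similar records together as one batch is far more effective than compressing each record on its own (the records and sizes below are made up for illustration):

```python
import gzip

# 1000 records with the repetitive shape typical of event streams.
records = [f"user-{i}:clicked:product-42".encode() for i in range(1000)]

# One compression call per record, as if each were sent in its own request.
per_record_bytes = sum(len(gzip.compress(r)) for r in records)

# One compression call over the whole batch, as a batched producer would do.
batch = b"\n".join(records)
batched_bytes = len(gzip.compress(batch))

print(per_record_bytes, batched_bytes)  # the batch is dramatically smaller
```

Per‑record compression pays the codec's header overhead a thousand times and never sees the redundancy between records; the batch compresses the shared structure once, which is why Kafka applies compression at the batch level.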
Zero‑Copy Transfer
When a broker sends data to a consumer, it transfers log segments via Java NIO's FileChannel.transferTo, which on Linux is backed by the sendfile() system call: data moves from the page cache straight to the socket buffer without ever entering user space. This eliminates the extra copies and context switches of the traditional read()/write() cycle, dramatically reducing CPU usage and memory traffic.
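Python's `socket.sendfile()` wraps the same sendfile(2) syscall, so the broker‑to‑consumer path can be sketched as follows (a socketpair stands in for the network connection; this is an illustration, not Kafka code, and on platforms without sendfile Python silently falls back to plain send()):

```python
import socket

PATH = "zerocopy-demo.log"
payload = b"x" * 65536
with open(PATH, "wb") as f:
    f.write(payload)

# A connected pair of sockets stands in for the broker -> consumer link.
server, client = socket.socketpair()

with open(PATH, "rb") as f:
    # sendfile(2): the kernel moves file pages straight to the socket,
    # with no copy through a user-space buffer.
    sent = server.sendfile(f)
server.close()

received = bytearray()
while True:
    chunk = client.recv(65536)
    if not chunk:
        break
    received += chunk
client.close()
print(sent, len(received))  # 65536 65536
```

In the classic path the same transfer would cost two extra copies (kernel to user buffer, user buffer back to kernel) and two extra context switches per chunk; sendfile keeps the data kernel‑side for the whole journey.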
By combining sequential writes, page‑cache persistence, batching with compression, and zero‑copy data transfer, Kafka can sustain extremely high throughput while keeping CPU consumption low.
Architect Chen
Sharing over a decade of architecture experience from Baidu, Alibaba, and Tencent.
