Why Kafka Is Fast: Partition Parallelism, Sequential Disk Writes, Page Cache, Zero‑Copy, Batching and Compression
The article explains how Kafka achieves high throughput by using partition‑level parallelism, sequential disk writes with segment files, extensive use of the OS page cache, zero‑copy data paths, request batching and optional compression, while also discussing the underlying disk I/O principles.
Whether Kafka is used as a message queue or a storage layer, it essentially provides two functions: producers write data to brokers and consumers read data from brokers. Kafka's speed comes from optimizations on both the write path and the read path.
1. Partition Parallelism
Each Kafka topic consists of one or more partitions, which can reside on different broker nodes. This allows parallel processing across machines and, when partitions are placed on separate disks on the same node, parallel disk I/O, greatly improving throughput.
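As a rough illustration of how records spread across partitions, the Java sketch below maps a record key to a partition index by hashing. It is a minimal sketch under stated assumptions: the class name, the keys, and the partition count are hypothetical, and Arrays.hashCode is only a stand-in for whatever hash a real partitioner uses.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class PartitionSketch {
    // Maps a record key to a partition index in [0, numPartitions).
    // Assumption: Arrays.hashCode stands in for the hash function; it is
    // not the hash that Kafka's own partitioner uses.
    static int partitionFor(String key, int numPartitions) {
        int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        return (hash & 0x7fffffff) % numPartitions; // clear the sign bit first
    }

    public static void main(String[] args) {
        int numPartitions = 3; // hypothetical topic with three partitions
        for (String key : new String[] {"order-1", "order-2", "order-3", "order-1"}) {
            System.out.println(key + " -> partition " + partitionFor(key, numPartitions));
        }
    }
}
```

Because the mapping is deterministic, records with the same key always land in the same partition, while different keys can be written and read in parallel on different brokers or disks.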
2. Sequential Disk Writes
Each partition is an ordered, immutable log. New messages are always appended to the end of the partition, enabling pure sequential writes. Deletion is performed by removing whole segment files rather than updating existing files, avoiding random writes.
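The append-only, segment-based layout described above can be sketched in a few lines of Java. This is a simplified illustration, not Kafka's implementation: the class name, the newline-delimited record format, and the tiny 16-byte segment limit are assumptions chosen to keep the example short. Only the file naming (a zero-padded base offset plus ".log") mirrors what Kafka's data directories look like.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayDeque;
import java.util.Deque;

class SegmentedLogSketch {
    private final Path dir;
    private final long maxSegmentBytes;
    private final Deque<Path> segments = new ArrayDeque<>();
    private FileChannel active;
    private long nextOffset = 0;

    SegmentedLogSketch(Path dir, long maxSegmentBytes) throws IOException {
        this.dir = dir;
        this.maxSegmentBytes = maxSegmentBytes;
        roll();
    }

    // Starts a new segment file named after the offset of its first record.
    private void roll() throws IOException {
        if (active != null) active.close();
        Path segment = dir.resolve(String.format("%020d.log", nextOffset));
        active = FileChannel.open(segment, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.APPEND);
        segments.addLast(segment);
    }

    // Appends at the end of the active segment; existing bytes are never rewritten.
    long append(String record) throws IOException {
        if (active.size() >= maxSegmentBytes) roll();
        ByteBuffer buf = ByteBuffer.wrap((record + "\n").getBytes(StandardCharsets.UTF_8));
        while (buf.hasRemaining()) active.write(buf);
        return nextOffset++;
    }

    // Retention: drops the oldest segment as a whole file, never the active one.
    boolean deleteOldestSegment() throws IOException {
        if (segments.size() <= 1) return false;
        return Files.deleteIfExists(segments.removeFirst());
    }

    int segmentCount() { return segments.size(); }

    void close() throws IOException { active.close(); }

    // Returns the segment count before and after deleting the oldest segment.
    static int[] demo() {
        try {
            Path dir = Files.createTempDirectory("segments");
            SegmentedLogSketch log = new SegmentedLogSketch(dir, 16);
            for (int i = 0; i < 5; i++) log.append("record-" + i);
            int before = log.segmentCount();
            log.deleteOldestSegment();
            int after = log.segmentCount();
            log.close();
            return new int[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] counts = demo();
        System.out.println("segments before delete: " + counts[0] + ", after: " + counts[1]);
    }
}
```

Every write goes to the tail of one file, and cleanup is a single file deletion, so the disk never has to seek back into the middle of existing data.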
3. Page Cache Utilization
Kafka writes data first to the Linux page cache, which aggregates small writes into larger physical writes and can serve reads directly from memory, reducing disk latency. The topic-level settings flush.messages and flush.ms can force an explicit flush to disk, but setting them is generally discouraged because frequent forced flushes give up much of this benefit.
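The difference between writing into the page cache and forcing a flush can be seen in a small Java sketch. It is only an illustration under assumptions: the class name, the record format, and the forceEach flag are hypothetical, and FileChannel.force is used here as the Java-level way to request a flush to the storage device.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class PageCacheWriteSketch {
    // Writes `count` small records and returns the resulting file size in bytes.
    static long writeRecords(Path file, int count, boolean forceEach) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            for (int i = 0; i < count; i++) {
                ByteBuffer buf = ByteBuffer.wrap(("msg-" + i + "\n").getBytes(StandardCharsets.UTF_8));
                // A buffered write typically returns once the bytes are in the page cache.
                while (buf.hasRemaining()) ch.write(buf);
                // Forcing after every record makes each write wait for the storage device.
                if (forceEach) ch.force(false);
            }
            return ch.size();
        }
    }

    static long demo() {
        try {
            Path file = Files.createTempFile("pagecache", ".log");
            return writeRecords(file, 10, false); // leave flushing to the operating system
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("bytes written: " + demo());
    }
}
```

With forceEach set to false the operating system is free to merge the ten small writes into fewer, larger physical writes, which is the behavior the paragraph above relies on.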
4. Zero‑Copy Techniques
4.1 Producer‑to‑Broker Path
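Traditional I/O involves four copies and four context switches along the path network → kernel socket buffer → user buffer → kernel buffer (page cache) → disk. Memory‑mapped files (mmap) remove a copy by letting a process write straight into the page cache through a mapped region. Note that Kafka applies mmap to its index files; log segment data is appended with ordinary file-channel writes into the page cache, and Kafka does not use direct I/O, which would bypass the page cache it depends on.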
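To make the mmap idea concrete, here is a minimal Java sketch that writes fixed-size entries through a memory mapping and reads one back. The class name, the entry values, and the 8-byte layout (a 4-byte relative offset followed by a 4-byte file position, loosely modeled on an offset index) are illustrative assumptions, not code taken from Kafka.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MmapIndexSketch {
    static final int ENTRY_BYTES = 8; // assumed layout: 4-byte relative offset + 4-byte position

    // Writes `entries` index entries through a mapping, then returns the
    // position stored in slot `lookupSlot`.
    static int writeAndLookup(Path file, int entries, int lookupSlot) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Mapping past the current end of the file grows the file to fit.
            MappedByteBuffer index = ch.map(FileChannel.MapMode.READ_WRITE, 0,
                    (long) entries * ENTRY_BYTES);
            for (int i = 0; i < entries; i++) {
                index.putInt(i * 10);   // hypothetical: one entry every 10 records
                index.putInt(i * 4096); // hypothetical byte position in the segment
            }
            // Reads go through the same mapping, with no read() system call.
            return index.getInt(lookupSlot * ENTRY_BYTES + 4);
        }
    }

    static int demo() {
        try {
            Path file = Files.createTempFile("offset", ".index");
            return writeAndLookup(file, 4, 2);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("position stored in slot 2: " + demo());
    }
}
```

The puts and gets touch the mapped pages directly, so no separate user-space buffer is copied into the kernel. For comparison, the traditional copy-based write path looks like this in pseudocode: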
    // traditional path: every byte passes through a user-space buffer
    data = socket.read()      // read network data
    File file = new File()
    file.write(data)          // persist to disk
    file.flush()
4.2 Broker‑to‑Consumer Path
Instead of reading a file into user space and then sending it, Kafka uses the Linux sendfile system call (exposed in Java through NIO FileChannel.transferTo) to move data from the page cache to the NIC without passing through user space, cutting the operation down to two copies and two context switches.
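A minimal Java sketch of the transferTo call is shown below. The class and method names are hypothetical. To keep the example self-contained, the demo sends to an in-memory channel, where the JDK falls back to an ordinary copy; the assumption is that with a SocketChannel as the target on Linux the same call can be served by sendfile.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class TransferToSketch {
    // Sends the whole file to `target` without reading it into an
    // application-level buffer; returns the number of bytes transferred.
    static long sendFile(Path file, WritableByteChannel target) throws IOException {
        try (FileChannel src = FileChannel.open(file, StandardOpenOption.READ)) {
            long position = 0;
            long size = src.size();
            // transferTo may move fewer bytes than requested, so loop until done.
            while (position < size) {
                position += src.transferTo(position, size - position, target);
            }
            return position;
        }
    }

    static String demo() {
        try {
            Path file = Files.createTempFile("segment", ".log");
            Files.write(file, "hello kafka".getBytes(StandardCharsets.UTF_8));
            ByteArrayOutputStream received = new ByteArrayOutputStream();
            try (WritableByteChannel target = Channels.newChannel(received)) {
                sendFile(file, target);
            }
            return new String(received.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The copy-based approach that this replaces looks like the following pseudocode: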
    // traditional path: file contents are copied into user space, then back into the kernel
    buffer = File.read
    Socket.send(buffer)
5. Batching
Kafka batches multiple records before sending them over the network, reducing per‑message overhead and improving bandwidth utilization.
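The effect of batching can be sketched with a small size-based accumulator in Java. This is an illustration, not the producer's actual accumulator: the class name and the 40-byte limit are hypothetical, each flush stands in for one network request, and the time-based trigger that the producer settings batch.size and linger.ms combine with the size limit is left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class BatcherSketch {
    private final int batchSizeBytes;
    private final List<byte[]> current = new ArrayList<>();
    private int currentBytes = 0;
    // Number of records carried by each simulated network request.
    final List<Integer> sentBatchSizes = new ArrayList<>();

    BatcherSketch(int batchSizeBytes) {
        this.batchSizeBytes = batchSizeBytes;
    }

    // Buffers a record; sends the pending batch first if this record would overflow it.
    void add(String record) {
        byte[] bytes = record.getBytes(StandardCharsets.UTF_8);
        if (!current.isEmpty() && currentBytes + bytes.length > batchSizeBytes) {
            flush();
        }
        current.add(bytes);
        currentBytes += bytes.length;
    }

    // Stand-in for sending one request that carries every buffered record.
    void flush() {
        if (current.isEmpty()) return;
        sentBatchSizes.add(current.size());
        current.clear();
        currentBytes = 0;
    }

    static List<Integer> demo() {
        BatcherSketch batcher = new BatcherSketch(40); // hypothetical 40-byte batch limit
        for (int i = 0; i < 10; i++) {
            batcher.add("record-00" + i); // each record is 10 bytes
        }
        batcher.flush(); // send whatever is left
        return batcher.sentBatchSizes;
    }

    public static void main(String[] args) {
        System.out.println("records per request: " + demo());
    }
}
```

In this run ten records travel in three requests instead of ten, so fixed per-request costs such as headers and round trips are paid far less often.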
6. Compression
Producers can compress payloads using Snappy, Gzip, LZ4 or Zstandard. Compression is applied to whole batches rather than to individual messages, so it further reduces network traffic when combined with batching.
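Why compression works better together with batching can be shown with the JDK's built-in GZIP support. The sketch below is a rough comparison under assumptions: the class name and the JSON-like record contents are made up, and GZIPOutputStream stands in for whichever codec a producer is configured with.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

class BatchCompressionSketch {
    // Returns the size in bytes of `input` after GZIP compression.
    static int gzipSize(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.size(); // the stream is closed here, so the output is complete
    }

    // Returns {raw bytes, sum of per-record compressed sizes, compressed size of the whole batch}.
    static int[] demo() {
        StringBuilder batch = new StringBuilder();
        int perRecordTotal = 0;
        for (int i = 0; i < 100; i++) {
            // Hypothetical records that share most of their structure.
            String record = "{\"user\":\"u" + i + "\",\"event\":\"click\"}\n";
            perRecordTotal += gzipSize(record.getBytes(StandardCharsets.UTF_8));
            batch.append(record);
        }
        byte[] raw = batch.toString().getBytes(StandardCharsets.UTF_8);
        return new int[] {raw.length, perRecordTotal, gzipSize(raw)};
    }

    public static void main(String[] args) {
        int[] sizes = demo();
        System.out.println("raw=" + sizes[0] + " perRecord=" + sizes[1] + " batch=" + sizes[2]);
    }
}
```

Compressing each small record on its own pays the codec's fixed overhead every time and cannot exploit repetition between records, whereas compressing the batch as one unit can.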
Summary
Partition parallelism distributes load across nodes and disks.
Sequential writes leverage log‑structured storage and segment deletion.
The page cache lets most reads and writes be served from memory instead of disk.
Zero‑copy (mmap, sendfile) eliminates unnecessary data copies.
Batching and compression reduce network overhead.
Full-Stack Internet Architecture
Introducing full-stack Internet architecture technologies centered on Java
