Why Kafka Can Achieve Million‑Message‑Per‑Second Throughput: Disk Sequential Write, Zero‑Copy, Page Cache, and Memory‑Mapped Files
The article explains how Kafka attains ultra‑high write throughput by leveraging disk sequential writes, zero‑copy data transfer, operating‑system page cache, and memory‑mapped files, detailing each technique’s impact on latency, CPU usage, and overall performance.
Kafka can handle millions of messages per second, and this performance stems from four core techniques: disk sequential writes, zero‑copy transmission, page‑cache utilization, and memory‑mapped files.
Disk Sequential Write
Kafka writes logs sequentially, avoiding random I/O. Sequential writes reduce the costly seek and rotation phases of mechanical disks, and even on SSDs they outperform random writes due to lower flash‑block management overhead.
Zero‑Copy
Traditional data paths copy data from disk to application memory and then to the network buffer, incurring multiple copies. Kafka’s zero‑copy sends data directly from the OS page cache to the network interface, eliminating these extra copies, lowering CPU and memory‑bandwidth load, and boosting throughput.
Page Cache
When Kafka writes to a file, the data first lands in the operating system’s page cache. The OS asynchronously flushes dirty pages to disk, allowing Kafka to return from write calls quickly and increasing overall throughput while reducing latency.
Memory‑Mapped Files
Kafka maps log files into the process address space using mmap , enabling direct memory access to file contents without explicit read/write syscalls. This leverages the OS virtual‑memory subsystem for efficient large‑scale data handling.
Combined, these mechanisms dramatically cut I/O latency, reduce CPU overhead, and enable Kafka’s ability to sustain extremely high write rates.
Mike Chen's Internet Architecture
Over ten years of BAT architecture experience, shared generously!
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.