How Kafka Achieves Ultra‑High Throughput: Sequential I/O, Zero‑Copy, and More

This article explains how Kafka’s design—using sequential disk reads/writes, zero‑copy system calls, file segmentation, batch sending, and message compression—delivers massive throughput while minimizing performance loss and network load.

Java High-Performance Architecture

Kafka is a distributed messaging system designed to handle massive volumes of messages. It persists every message to disk, accepting a small performance cost in exchange for large storage capacity.

Sequential Read/Write

Kafka appends messages to the end of log files, so it benefits from the sequential read/write performance of disks. Sequential I/O avoids most seek time and incurs only minimal rotational delay, which makes it far faster than random I/O. The figures cited in Kafka's official documentation, for a RAID-5 array of six 7200 rpm SATA disks, are about 600 MB/s for linear writes versus about 100 KB/s for random writes, a difference of more than 6000×.
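
The append-only pattern can be illustrated with a minimal sketch. The class name AppendOnlyLog and the length-prefixed record layout are assumptions made for this example; they are not Kafka's actual log format or code. Every write lands at the current end of the file, so the disk is never asked to seek backwards.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical append-only log for illustration; not Kafka's real implementation.
final class AppendOnlyLog implements AutoCloseable {
    private final FileChannel channel;

    AppendOnlyLog(Path file) throws IOException {
        // APPEND makes every write go to the current end of the file.
        this.channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND);
    }

    /** Appends one length-prefixed record and returns the byte position where it starts. */
    long append(String message) throws IOException {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        ByteBuffer record = ByteBuffer.allocate(Integer.BYTES + payload.length);
        record.putInt(payload.length);
        record.put(payload);
        record.flip();
        long start = channel.size();
        while (record.hasRemaining()) {
            channel.write(record);
        }
        return start;
    }

    long sizeInBytes() throws IOException {
        return channel.size();
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("segment", ".log");
        try (AppendOnlyLog log = new AppendOnlyLog(file)) {
            System.out.println(log.append("first"));  // prints 0
            System.out.println(log.append("second")); // prints 9 (4-byte length + 5-byte payload)
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Because records are only ever added at the tail, a reader that remembers a byte position can also scan forward sequentially from there.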

Zero‑Copy

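In a conventional file-to-network transfer, data is read from disk into the kernel page cache, copied into a user-space buffer, copied back into a kernel socket buffer, and finally copied to the network card: four copies and four context switches between user and kernel mode. The zero-copy sendfile system call, available since Linux kernel 2.2, moves data from the page cache to the socket entirely inside the kernel, eliminating the user-buffer copies and reducing context switches to two, which can roughly double transfer performance. In Java, this path is exposed through FileChannel.transferTo.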
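
As a rough sketch of how a Java program can request this kind of transfer, the example below uses FileChannel.transferTo, which the JDK may implement with sendfile on Linux. The class and method names ZeroCopySend and sendFile are hypothetical. The demo writes to an in-memory channel so that it runs anywhere; the kernel-level zero-copy path only applies when the target is something like a socket channel and the operating system supports it.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration; Kafka's own network code is more involved.
final class ZeroCopySend {

    /** Sends the whole file to the target channel and returns the bytes transferred. */
    static long sendFile(Path file, WritableByteChannel target) throws IOException {
        try (FileChannel source = FileChannel.open(file, StandardOpenOption.READ)) {
            long position = 0;
            long size = source.size();
            while (position < size) {
                // transferTo may move fewer bytes than requested, so keep looping.
                // Assumes a blocking target; a non-blocking one could return 0 repeatedly.
                position += source.transferTo(position, size - position, target);
            }
            return position;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("segment", ".log");
        Files.write(file, "hello kafka".getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (WritableByteChannel target = Channels.newChannel(sink)) {
            System.out.println(sendFile(file, target)); // bytes transferred
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

In a server, the target would typically be a SocketChannel, so the file contents never pass through a buffer owned by the application.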

File Segmentation

Each Kafka topic is divided into partitions, and each partition's log is further split into segments, so messages are stored across many segment files. Because a read or write touches only one relatively small segment, file operations stay lightweight, old data can be removed by deleting whole segments, and partitions can be processed in parallel.
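
One way to picture how a reader finds the right segment is a sorted map keyed by each segment's base offset (the offset of its first message). The sketch below is an illustration under that assumption, with a hypothetical class SegmentIndex; the file naming follows Kafka's convention of a 20-digit, zero-padded base offset, but the lookup code is not taken from Kafka.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical segment lookup for illustration only.
final class SegmentIndex {
    // Base offset of each segment mapped to its file name, kept in sorted order.
    private final TreeMap<Long, String> segmentsByBaseOffset = new TreeMap<>();

    void addSegment(long baseOffset) {
        segmentsByBaseOffset.put(baseOffset, String.format("%020d.log", baseOffset));
    }

    /** Returns the segment file holding the offset, or null if it precedes every segment. */
    String segmentFor(long offset) {
        // floorEntry finds the segment with the greatest base offset <= the requested offset.
        Map.Entry<Long, String> entry = segmentsByBaseOffset.floorEntry(offset);
        return entry == null ? null : entry.getValue();
    }

    public static void main(String[] args) {
        SegmentIndex index = new SegmentIndex();
        index.addSegment(0L);
        index.addSegment(368769L);
        index.addSegment(737337L);
        System.out.println(index.segmentFor(5L));
        System.out.println(index.segmentFor(400000L));
    }
}
```

The lookup costs O(log n) in the number of segments, and only the one matching file has to be opened.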

Batch Sending

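Kafka accumulates messages in memory and sends them to the broker in a single request rather than one at a time. A producer can trigger a send once a certain number of messages has accumulated (e.g., 100 messages) or after a time interval has elapsed (e.g., every 5 seconds), which greatly reduces the number of network requests and server I/O operations. In the current Java producer, the corresponding thresholds are set with batch.size (measured in bytes) and linger.ms.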
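
The two triggers, a count threshold and a time limit, can be sketched as follows. BatchAccumulator and its methods are hypothetical names invented for this example, and the caller passes in the current time so the behavior is easy to follow; the real producer's accumulator works on byte sizes and per-partition queues.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical batching helper for illustration; not the Kafka producer's implementation.
final class BatchAccumulator {
    private final int maxMessages;
    private final long lingerMillis;
    private final Consumer<List<String>> sender;
    private final List<String> buffer = new ArrayList<>();
    private long firstAppendMillis;

    BatchAccumulator(int maxMessages, long lingerMillis, Consumer<List<String>> sender) {
        this.maxMessages = maxMessages;
        this.lingerMillis = lingerMillis;
        this.sender = sender;
    }

    /** Buffers a message and sends the batch as soon as it is full. */
    void append(String message, long nowMillis) {
        if (buffer.isEmpty()) {
            firstAppendMillis = nowMillis; // the linger clock starts with the first message
        }
        buffer.add(message);
        if (buffer.size() >= maxMessages) {
            flush();
        }
    }

    /** Called periodically; sends a non-empty batch that has waited long enough. */
    void tick(long nowMillis) {
        if (!buffer.isEmpty() && nowMillis - firstAppendMillis >= lingerMillis) {
            flush();
        }
    }

    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sender.accept(new ArrayList<>(buffer)); // hand over a copy, then reuse the buffer
        buffer.clear();
    }

    public static void main(String[] args) {
        BatchAccumulator acc = new BatchAccumulator(3, 5000L,
                batch -> System.out.println("sending " + batch));
        acc.append("a", 0L);
        acc.append("b", 10L);
        acc.append("c", 20L);   // third message fills the batch and triggers a send
        acc.append("d", 30L);
        acc.tick(5030L);        // linger time reached, so the partial batch is sent
    }
}
```

A larger batch or a longer linger time trades a little latency for fewer, bigger requests.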

Data Compression

Kafka supports compressing message batches with GZIP or Snappy; newer versions also support LZ4 and Zstandard. Compression reduces the amount of data transmitted, easing network load. Consumers must decompress the batches, but the CPU overhead is usually acceptable because network bandwidth, rather than CPU, tends to be the bottleneck in large-scale data processing.
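
To see why compressing a whole batch pays off, the sketch below compresses a batch of similar messages with the JDK's built-in GZIP streams and restores it again. The class BatchCompression and the sample messages are assumptions for illustration; in a real producer the codec is chosen with the compression.type setting rather than by hand.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration; Kafka applies compression inside its record batches.
final class BatchCompression {

    /** Compresses the raw bytes of a batch with GZIP. */
    static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        }
        return out.toByteArray();
    }

    /** Restores the original bytes, as a consumer would after fetching the batch. */
    static byte[] gunzip(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            int n;
            while ((n = gz.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Messages in one batch often share field names and values, so they compress well.
        StringBuilder batch = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            batch.append("{\"event\":\"click\",\"user\":").append(i).append("}\n");
        }
        byte[] raw = batch.toString().getBytes(StandardCharsets.UTF_8);
        byte[] packed = gzip(raw);
        System.out.println("raw bytes:        " + raw.length);
        System.out.println("compressed bytes: " + packed.length);
    }
}
```

Compressing the batch as a unit, rather than each message separately, lets the codec exploit the repetition between messages.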
