Kafka High‑Throughput Tricks: Sequential Writes, Zero‑Copy, Partitioning
The article explains how Kafka achieves high throughput: it writes messages sequentially to disk, leverages the OS page cache and zero‑copy system calls, uses partitioned topics for parallelism, batches and compresses records on both the producer and broker sides, and employs asynchronous replication with configurable persistence strategies.
Sequential Writes and Zero‑Copy
Kafka writes messages to disk in strictly sequential order per partition, avoiding random I/O; with purely sequential access, even spinning disks can sustain throughput that approaches memory‑level speeds.
Each partition corresponds to an ordered log file, and Kafka relies on the operating system’s page cache for buffering instead of managing its own cache.
Zero‑copy is achieved via the sendfile() system call, which transfers data directly from the file's kernel buffers to the network socket, skipping the copies into and out of user space; this reduces I/O overhead and CPU usage and increases read throughput.
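The read path above can be sketched with Python's socket.sendfile(), which delegates to the sendfile() syscall where the OS supports it (and falls back to ordinary copies otherwise). The socketpair here is just a stand-in for a consumer connection:

```python
import os
import socket
import tempfile

# Write a sample "log segment" to disk.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"message-" * 1024)
    path = f.name

# A socketpair stands in for a consumer's network connection.
server, client = socket.socketpair()

with open(path, "rb") as segment:
    # socket.sendfile() uses the sendfile() syscall where available,
    # moving bytes kernel-to-kernel without copying into user space.
    sent = server.sendfile(segment)
server.close()

received = bytearray()
while True:
    chunk = client.recv(65536)
    if not chunk:
        break
    received += chunk
client.close()
os.unlink(path)
```

On the JVM, Kafka gets the same effect through FileChannel.transferTo(): the payload never enters the application's address space.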
Partitioning and Parallel Processing
The core data model is Topic → Partition → Replica. Each partition is stored and consumed independently, and partitions can be distributed across multiple brokers.
Within a consumer group, each consumer can read from different partitions in parallel, allowing horizontal scaling. For example, a topic split into 10 partitions can be consumed by 10 consumers simultaneously, theoretically increasing throughput tenfold.
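A simplified sketch of both ideas: partition_for below is a stand-in for the client's default partitioner (the real one hashes keys with murmur2, not MD5), and assign mimics a round-robin partition assignment within a consumer group. Both function names are this sketch's own:

```python
import hashlib

def partition_for(key: bytes, num_partitions: int) -> int:
    # Deterministic stand-in for Kafka's default partitioner:
    # same key -> same partition, so per-key ordering is preserved.
    digest = hashlib.md5(key).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

def assign(partitions: list[int], consumers: list[str]) -> dict[str, list[int]]:
    # Round-robin assignment within a consumer group:
    # each partition is owned by exactly one consumer.
    assignment: dict[str, list[int]] = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

# 10 partitions spread over 10 consumers -> one partition each,
# so all 10 streams can be consumed in parallel.
plan = assign(list(range(10)), [f"consumer-{i}" for i in range(10)])
```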
Batch Transmission and Compression
Kafka improves network efficiency by batching messages and compressing them.
On the producer side, the batch.size and linger.ms settings aggregate multiple records into a single batch before sending.
On the broker side, batches are written to the log file in one operation, reducing disk I/O.
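The producer-side accumulator behavior can be modeled in a few lines. BatchAccumulator is a toy class invented for this sketch, with parameters mirroring the batch.size and linger.ms settings:

```python
class BatchAccumulator:
    """Toy model of the producer's record accumulator: a batch is sent
    when it reaches batch_size bytes or linger_ms has elapsed."""

    def __init__(self, batch_size: int, linger_ms: int):
        self.batch_size = batch_size
        self.linger_ms = linger_ms
        self.records: list[bytes] = []
        self.bytes = 0
        self.opened_at = 0
        self.sent_batches: list[list[bytes]] = []

    def append(self, record: bytes, now_ms: int) -> None:
        if not self.records:
            self.opened_at = now_ms  # batch opens with its first record
        self.records.append(record)
        self.bytes += len(record)
        self._maybe_send(now_ms)

    def poll(self, now_ms: int) -> None:
        # The sender thread calls this periodically to honor linger_ms.
        self._maybe_send(now_ms)

    def _maybe_send(self, now_ms: int) -> None:
        full = self.bytes >= self.batch_size
        lingered = bool(self.records) and now_ms - self.opened_at >= self.linger_ms
        if full or lingered:
            self.sent_batches.append(self.records)  # one network request
            self.records, self.bytes = [], 0

acc = BatchAccumulator(batch_size=32, linger_ms=5)
for _ in range(4):
    acc.append(b"x" * 10, now_ms=0)  # 40 bytes >= batch_size -> one send
```

Whichever trigger fires first wins: a large batch.size amortizes per-request overhead, while linger.ms caps how long any record waits.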
Supported compression codecs include gzip, snappy, lz4, and zstd; on typical batches of repetitive, structured records they can cut network bandwidth usage by 70% or more.
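A quick illustration of why whole-batch compression pays off, using the stdlib gzip codec on a synthetic batch of repetitive JSON records (the actual ratio depends entirely on the payload):

```python
import gzip
import json

# Batches of structured records are highly repetitive, which is
# why compressing the batch as a unit works so well on the wire.
records = [
    json.dumps({"user_id": i, "event": "page_view", "url": "/products/list"}).encode()
    for i in range(1000)
]
raw = b"\n".join(records)
compressed = gzip.compress(raw)  # Kafka tags the batch with its codec
saved = 1 - len(compressed) / len(raw)  # fraction of bandwidth saved
```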
Asynchronous Replication and Configurable Persistence
Kafka's network layer uses Java NIO (non‑blocking I/O) with a selector‑based reactor model, letting a small number of threads handle thousands of connections while minimizing context switches.
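The same reactor pattern can be demonstrated with Python's selectors module, the stdlib analogue of a Java NIO Selector; socketpairs stand in for client connections:

```python
import selectors
import socket

# Single-threaded reactor: one selector multiplexes many connections.
sel = selectors.DefaultSelector()
pairs = [socket.socketpair() for _ in range(3)]
for i, (_, reader) in enumerate(pairs):
    reader.setblocking(False)
    sel.register(reader, selectors.EVENT_READ, data=i)

# Traffic arrives on two of the three connections...
pairs[0][0].sendall(b"ping")
pairs[2][0].sendall(b"ping")

# ...and one select() call tells the single thread exactly which
# sockets are readable -- no per-connection thread, no busy-wait.
ready = {key.data for key, _ in sel.select(timeout=1)}

for a, b in pairs:
    a.close()
    b.close()
sel.close()
```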
Zero‑copy sendfile() is also used for replicating data between brokers, moving data directly from disk to the network interface.
Broker responses are batched as well, so multiple acknowledgments travel in a single network round trip, reducing per-request overhead and keeping CPU utilization low even under high concurrency.
Persistence policies can be tuned (e.g., producer acknowledgment levels, log flush intervals, log retention limits) to balance durability against performance.
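A sketch of the usual trade-off points: the setting names below are real Kafka configs, but the groupings and concrete values are illustrative only, not recommendations.

```python
# Producer-side durability profiles (illustrative groupings).
fast_but_lossy = {
    "acks": "0",        # fire-and-forget: no broker acknowledgment
    "retries": 0,
}
balanced = {
    "acks": "1",        # leader persists the write before acking
}
durable = {
    "acks": "all",              # all in-sync replicas must have the write
    "enable.idempotence": True, # no duplicates on producer retry
}

# Broker/topic-side knobs, tuned separately:
broker_side = {
    "min.insync.replicas": 2,      # floor for acks=all to succeed
    "log.retention.hours": 168,    # keep data for 7 days
    "log.segment.bytes": 1 << 30,  # roll log segments at 1 GiB
}
```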
Mike Chen's Internet Architecture
