How Kafka Achieves High Performance: Producer, Broker, and Consumer Optimizations

This article explains how Kafka can reportedly handle close to 20 million messages per second at 600 MB/s of throughput. It covers producer batching and a custom protocol format; the broker's use of the page cache, its file layout, and zero-copy transfer; and consumer groups for efficient message consumption.

Architect

Some users report that a well‑tuned Kafka node can process nearly 20 million messages per second with a throughput of 600 MB/s, prompting the question of how such performance is achieved.

The analysis is divided into three parts: the producer side, the broker side, and the consumer side.

1) Producer Optimizations

The producer improves performance mainly through two mechanisms:

Batch sending: instead of sending each message immediately, send() appends it to an in-memory buffer, and the producer transmits the buffered messages in batches. This reduces the number of requests the broker must handle.

Custom protocol format: a compact serialization format, combined with compression, shrinks the payload and saves network bandwidth.
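The batching idea can be sketched without a Kafka client. The class below is a simplified, hypothetical accumulator and not Kafka's actual implementation: send() only appends to a buffer, and a request is issued once the buffered bytes reach a threshold. The names BatchingProducer and transport are assumptions made for illustration, and the time-based flush that the real producer also performs (linger.ms) is left out.

```python
from typing import Callable, List


class BatchingProducer:
    """Toy accumulator: buffers messages and emits one request per batch."""

    def __init__(self, transport: Callable[[List[bytes]], None], batch_size: int = 16384):
        self._transport = transport    # hypothetical callback standing in for a network send
        self._batch_size = batch_size  # flush threshold in bytes
        self._buffer: List[bytes] = []
        self._buffered_bytes = 0

    def send(self, message: bytes) -> None:
        # Buffer only; nothing is handed to the transport until the batch is full.
        self._buffer.append(message)
        self._buffered_bytes += len(message)
        if self._buffered_bytes >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        # Hand the whole buffer over as a single request, then start a new batch.
        if self._buffer:
            self._transport(self._buffer)
            self._buffer = []
            self._buffered_bytes = 0


requests: List[List[bytes]] = []
producer = BatchingProducer(requests.append, batch_size=1000)
for _ in range(250):
    producer.send(b"x" * 100)  # 250 messages of 100 bytes each
producer.flush()               # send whatever is still buffered

print(f"messages sent: 250, requests issued: {len(requests)}")
```

With these illustrative sizes, ten messages fill one batch, so the broker sees a small fraction of the requests it would receive if every message were sent on its own.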

Compression algorithms are compared as follows:

Throughput: LZ4 > Snappy > zstd > GZIP

Compression ratio: zstd > LZ4 > GZIP > Snappy
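Batching and compression reinforce each other, because a compressor finds more redundancy in a whole batch than in a single small message. The sketch below illustrates this with GZIP only, since it ships with the Python standard library; LZ4, Snappy, and zstd need third-party packages, so the ranking above is not reproduced here. The sample records are invented for illustration, and real payloads will compress differently.

```python
import gzip
import json

# Hypothetical sample records; field names and values are made up.
messages = [
    json.dumps({"user_id": i, "event": "page_view", "url": "/products/list"}).encode("utf-8")
    for i in range(1000)
]

raw_size = sum(len(m) for m in messages)

# Compress each message on its own: little redundancy to exploit per message.
per_message_size = sum(len(gzip.compress(m)) for m in messages)

# Compress the whole batch at once: repeated field names and values are shared.
batch = b"\n".join(messages)
batch_compressed_size = len(gzip.compress(batch))

print(f"raw bytes:             {raw_size}")
print(f"gzip per message:      {per_message_size}")
print(f"gzip over whole batch: {batch_compressed_size}")
```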

2) Broker Optimizations

The broker's high performance stems from three key techniques:

PageCache usage: writes are first cached in memory, then flushed to disk in batches, reducing disk I/O; reads also come from the cache, benefiting from high hit rates for recently written data.

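Kafka's file layout: each topic is split into partitions, and each partition has its own directory. Writes to a partition are appended to the end of its log, so they are sequential, and multiple partitions can be written in parallel to make full use of disk I/O.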

Zero-copy (sendfile): data is transferred directly from PageCache to the socket buffer, bypassing user-space copying and using DMA for faster transmission.
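The file layout and zero-copy points can be illustrated together. The following is a minimal sketch, not broker code: it appends records to a segment file inside a per-partition directory, then hands the file to socket.sendfile(), which uses os.sendfile() on platforms that support it and falls back to ordinary sends elsewhere. The topic name, the plain-text record format, and the socket pair standing in for a consumer connection are all assumptions made for illustration.

```python
import os
import socket
import tempfile

# Hypothetical layout for illustration: <log_dir>/<topic>-<partition>/<segment>.log
log_dir = tempfile.mkdtemp()
partition_dir = os.path.join(log_dir, "orders-0")
os.makedirs(partition_dir)
segment_path = os.path.join(partition_dir, "00000000000000000000.log")

# Append-only writes: every record lands at the end of the segment file.
# The OS page cache absorbs these writes and flushes them to disk later.
records = [f"record-{i}\n".encode("utf-8") for i in range(100)]
with open(segment_path, "ab") as segment:
    for record in records:
        segment.write(record)

# A connected socket pair stands in for the broker-to-consumer connection.
broker_side, consumer_side = socket.socketpair()

with open(segment_path, "rb") as segment:
    # Where os.sendfile() is available, the kernel moves the file bytes to the
    # socket without copying them through a user-space buffer.
    sent = broker_side.sendfile(segment)
broker_side.close()

# Drain the receiving end until the sender's close is observed.
received = bytearray()
while True:
    chunk = consumer_side.recv(4096)
    if not chunk:
        break
    received.extend(chunk)
consumer_side.close()

print(f"bytes sent: {sent}, bytes received: {len(received)}")
```

The payload is kept small so that it fits in the socket buffer before the receiving side starts reading; a real transfer would read concurrently.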

3) Consumer Optimizations

Consumers pull messages in batches from the leader replica of each partition. To increase consumption speed, Kafka supports consumer groups, identified by group.id, in which multiple consumers share the load of a topic's partitions.

The following examples, for a topic with three partitions, illustrate different group configurations:

group.id = 1: one consumer handles all three partitions.

group.id = 2: two consumers split the partitions 2‑1.

group.id = 3: three consumers each handle one partition.

group.id = 4: four consumers; one remains idle because there are fewer partitions than consumers.
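The four configurations can be reproduced with a small assignment function. This is a sketch modeled loosely on range-style assignment, under the assumption that partitions are handed out in contiguous ranges to consumers sorted by name; Kafka's built-in assignors and its rebalance protocol are more involved. The function and the consumer names are hypothetical.

```python
from typing import Dict, List


def assign_partitions(partitions: List[int], consumers: List[str]) -> Dict[str, List[int]]:
    """Split partitions into contiguous ranges, one range per consumer."""
    ordered = sorted(consumers)
    base, extra = divmod(len(partitions), len(ordered))
    assignment: Dict[str, List[int]] = {}
    start = 0
    for index, consumer in enumerate(ordered):
        # The first `extra` consumers take one additional partition.
        count = base + (1 if index < extra else 0)
        assignment[consumer] = partitions[start:start + count]
        start += count
    return assignment


partitions = [0, 1, 2]
for group_size in (1, 2, 3, 4):
    members = [f"consumer-{n}" for n in range(1, group_size + 1)]
    print(group_size, assign_partitions(partitions, members))
```

Within a group, each partition goes to exactly one consumer, which is why a fourth consumer ends up with an empty assignment when there are only three partitions.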

Overall, Kafka’s high throughput is achieved by combining producer batching and compression, broker‑side page‑cache, sequential file layout, zero‑copy data transfer, and flexible consumer group coordination.
