Boost Kafka to Over 1 Million Messages per Second: Metrics and Tuning Tips

This article explains what high concurrency means for Kafka, outlines key performance metrics such as QPS, TPS, throughput and latency, and provides concrete configuration and architectural techniques—including broker optimization, horizontal scaling, network batching, and zero‑copy—to achieve write rates exceeding one million records per second.

Mike Chen's Internet Architecture

What Is High Concurrency in Distributed Systems?

High concurrency refers to a system's ability to handle a large number of requests at the same time. For Kafka, the practical measure is the aggregate number of messages processed per second across the cluster, not the speed of any single thread.

Key Performance Indicators

QPS (Queries Per Second): Number of requests handled per second.

TPS (Transactions Per Second): Number of committed transactions per second.

Throughput: Amount of data processed per second (e.g., MB/s, GB/s).

Latency: Time from issuing a request to receiving its response.
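To connect a message rate to byte throughput, multiply messages per second by the average message size. The sketch below is illustrative only: the function name and the 1 KB message size are assumptions, not figures from the article.

```python
def throughput_mb_per_s(msgs_per_s, avg_msg_bytes):
    """Convert a message rate into data throughput in MB/s (1 MB = 10**6 bytes)."""
    return msgs_per_s * avg_msg_bytes / 1_000_000


# Assumed example: 1,000,000 messages/s at 1 KB (1024 bytes) each.
rate = throughput_mb_per_s(1_000_000, 1024)
print(f"{rate:.0f} MB/s")
```

At that assumed message size, one million messages per second is about 1 GB/s of payload before replication, so network and disk bandwidth deserve as much attention as the message count.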

Typical Throughput Benchmarks

• Single broker (1 broker) – roughly 50,000 to 100,000 messages/s (sustained writes above 100,000 messages/s on one broker are considered high).

• Small cluster (3 brokers) – roughly 300,000 to 500,000 messages/s.

• Medium cluster (6‑10 brokers) – roughly 1 to 3 million messages/s.

• Large cluster (20+ brokers) – roughly 5 to 10 million messages/s, typical of “high‑concurrency Kafka” scenarios.

Core Techniques for Achieving High Concurrency

1. Broker‑Layer Optimisation

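Increase the partition count. A partition is Kafka's unit of parallelism, so more partitions let more producers, brokers, and consumers work on a topic at the same time.

Leverage sequential disk writes. Kafka appends to log segments sequentially and relies on the OS page cache, so writes approach raw disk throughput; zero copy then speeds up the read path when data is sent to consumers.

Enable batch writes with batch.size and linger.ms. These are producer-side settings that coalesce many records into fewer, larger requests to the broker.

Boost broker I/O parallelism by tuning num.io.threads (request processing and disk I/O) and num.network.threads (network request handling).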
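A commonly cited rule of thumb for choosing a partition count is to divide the target throughput by the measured per-partition throughput on the producer side and on the consumer side, then take the larger result. The sketch below assumes that rule; the function name and all numbers are hypothetical and should be replaced with measurements from your own cluster.

```python
import math


def partitions_needed(target_mb_s, producer_mb_s_per_partition, consumer_mb_s_per_partition):
    """Rule-of-thumb partition count: enough partitions for both the write and read side."""
    for_producers = math.ceil(target_mb_s / producer_mb_s_per_partition)
    for_consumers = math.ceil(target_mb_s / consumer_mb_s_per_partition)
    return max(for_producers, for_consumers)


# Assumed measurements: 10 MB/s per partition when producing, 20 MB/s when consuming.
print(partitions_needed(1000, 10, 20))
```

Treat the result as a starting point: very high partition counts add overhead of their own (open files, leader elections, end-to-end latency).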

2. Horizontal Scaling of Brokers

Adding more broker nodes distributes write load across the cluster, reducing pressure on any single machine.
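As a rough sizing aid, the per-broker figures from the benchmark list can be turned into a broker count. This is a minimal sketch that assumes throughput scales linearly with brokers, which real clusters only approximate; the function name and the `headroom` parameter are hypothetical.

```python
import math


def brokers_needed(target_msgs_per_s, per_broker_msgs_per_s, headroom=0.0):
    """Estimate broker count, keeping a fraction of each broker's capacity in reserve."""
    usable = per_broker_msgs_per_s * (1.0 - headroom)
    return math.ceil(target_msgs_per_s / usable)


# Using the article's single-broker range of 50,000 to 100,000 messages/s:
print(brokers_needed(1_000_000, 100_000))  # optimistic per-broker rate
print(brokers_needed(1_000_000, 50_000))   # conservative per-broker rate
```

Replication traffic, message size, and hardware all shift these numbers, so a load test on the actual cluster should confirm any estimate.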

3. Network and Batch Transmission

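Configure the producer for asynchronous, batched sends. Note that acks=1 waits only for the partition leader to acknowledge a write, trading some durability for lower latency: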

acks=1
linger.ms=10
batch.size=32768
compression.type=lz4
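To see how batch.size and linger.ms interact, the sketch below models a single partition: a batch is sent when it fills up or when the linger time expires, whichever comes first. It is a simplified model under stated assumptions (fixed message size, steady rate, compression and per-record overhead ignored), and the function name is hypothetical.

```python
def estimate_batching(batch_size_bytes, avg_msg_bytes, msgs_per_s, linger_ms):
    """Estimate messages per batch and batches per second for one partition."""
    max_msgs = max(1, batch_size_bytes // avg_msg_bytes)
    fill_time_ms = max_msgs / msgs_per_s * 1000
    if fill_time_ms <= linger_ms:
        # The batch fills before the linger timer fires.
        msgs_per_batch = float(max_msgs)
        trigger = "batch.size"
    else:
        # The linger timer fires first, so the batch goes out partially full.
        msgs_per_batch = max(1.0, msgs_per_s * linger_ms / 1000)
        trigger = "linger.ms"
    return {
        "msgs_per_batch": msgs_per_batch,
        "batches_per_s": msgs_per_s / msgs_per_batch,
        "trigger": trigger,
    }


# batch.size=32768 and linger.ms=10 from the settings above, with assumed 1 KB messages.
print(estimate_batching(32768, 1024, 10_000, 10))
print(estimate_batching(32768, 1024, 1_000, 10))
```

In this model a busy partition is limited by batch.size, while a quiet one is limited by linger.ms, which is why the two settings are usually tuned together.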

4. Zero‑Copy Support

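Kafka serves consumer fetches with zero‑copy transfers (the sendfile system call, reached on Linux through Java's FileChannel.transferTo), which move log data from the page cache to the network socket without copying it through the application. This is a critical factor in Kafka's high‑throughput capability. The zero‑copy path is used by default on plaintext connections; with TLS enabled the broker has to encrypt data in user space, so it no longer applies.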
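The idea behind zero copy can be illustrated outside Kafka. The sketch below is a Python analogy, not Kafka's own code: it uses the standard library's `socket.sendfile`, which calls `os.sendfile` where the platform supports it and falls back to ordinary `send()` otherwise. The helper names, the payload, and the use of a local socket pair are assumptions made for the demo.

```python
import os
import socket
import tempfile
import threading


def send_file_zero_copy(path, sock):
    """Send a whole file over a connected stream socket; returns bytes sent."""
    with open(path, "rb") as f:
        # The kernel moves file data to the socket without a user-space read loop.
        return sock.sendfile(f)


def demo(payload=b"kafka-log-segment" * 1024):
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    left, right = socket.socketpair()
    received = bytearray()

    def reader():
        # Drain the receiving end so the sender never blocks on a full buffer.
        while len(received) < len(payload):
            chunk = right.recv(65536)
            if not chunk:
                break
            received.extend(chunk)

    t = threading.Thread(target=reader)
    t.start()
    try:
        sent = send_file_zero_copy(path, left)
    finally:
        left.close()
        t.join()
        right.close()
        os.unlink(path)
    return sent, bytes(received)


sent, data = demo()
print(sent, len(data))
```

The application never reads the file into its own buffer, which is the property Kafka exploits when streaming log segments to consumers.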

Example Configuration Snippet

log.dirs=/data1/kafka,/data2/kafka,/data3/kafka

Listing multiple directories, ideally each on a separate physical disk, improves I/O performance because Kafka spreads partitions across them and writes to those disks in parallel.
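Kafka's documentation describes placing each new partition in the log directory that currently holds the fewest partitions. The sketch below simulates that policy to show how the three directories above end up balanced; it is an illustration rather than Kafka's implementation, and the topic name `orders` is hypothetical.

```python
def place_partitions(log_dirs, partitions):
    """Assign each partition to the directory that currently holds the fewest."""
    counts = {d: 0 for d in log_dirs}
    placement = {}
    for p in partitions:
        # min() returns the first directory on ties, so early partitions fill in list order.
        target = min(log_dirs, key=lambda d: counts[d])
        placement[p] = target
        counts[target] += 1
    return placement


dirs = ["/data1/kafka", "/data2/kafka", "/data3/kafka"]
parts = [f"orders-{i}" for i in range(6)]
for partition, directory in place_partitions(dirs, parts).items():
    print(partition, "->", directory)
```

Because placement is by partition count rather than by size, disks can still fill unevenly when partitions differ greatly in traffic.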

Illustrative Diagrams

[Figure: Kafka high concurrency diagram]

[Figure: Kafka performance scaling]
Tags: backend, distributed-systems, Kafka, Performance Tuning, High Concurrency
Written by

Mike Chen's Internet Architecture

Over ten years of BAT architecture experience, shared generously!
