Mastering Kafka High Concurrency: Practical Configurations for Million‑TPS Throughput
This guide explains what counts as high concurrency for Kafka, gives rough throughput thresholds, and walks through detailed configuration tips (partition planning, producer batching, broker storage optimization, and zero‑copy transfer) for scalable, low‑latency message processing.
High concurrency is a core performance metric for internet and distributed system architectures, referring to the ability to handle a large number of simultaneous requests.
What qualifies as high concurrency for Kafka
Single‑node write throughput of 100,000 messages/s is considered high concurrency.
Cluster throughput exceeding 1,000,000 messages/s represents a typical high‑concurrency Kafka scenario.
Cluster throughput in the tens of millions of messages/s is characteristic of large‑scale “ultra‑high” deployments.
Broker‑ and producer‑level configuration tips
Kafka achieves high concurrency through partition parallelism, sequential writes, zero‑copy transmission, and batch processing.
Partitions and replicas: Increase the partition count to raise parallelism for both writes and consumption, but avoid excessive partitions, which add metadata and leader‑election overhead. Keep the replication factor at 2 to 3 for availability. Tune the in‑sync replica requirement (min.insync.replicas) together with the producer's acks setting to trade latency against data safety.
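As a rough illustration of partition planning, the sketch below applies a commonly cited rule of thumb: provision enough partitions that both producers and consumers can reach the target throughput. The function name and the per‑partition throughput figures are hypothetical; real values have to be measured on your own hardware and message sizes.

```python
import math


def estimate_partitions(target_mb_s: float,
                        producer_mb_s_per_partition: float,
                        consumer_mb_s_per_partition: float) -> int:
    """Rule-of-thumb partition count (a starting point, not a guarantee)."""
    if min(target_mb_s,
           producer_mb_s_per_partition,
           consumer_mb_s_per_partition) <= 0:
        raise ValueError("all throughput figures must be positive")
    # Partitions needed so that producers alone can hit the target.
    for_producers = math.ceil(target_mb_s / producer_mb_s_per_partition)
    # Partitions needed so that consumers alone can keep up.
    for_consumers = math.ceil(target_mb_s / consumer_mb_s_per_partition)
    return max(for_producers, for_consumers)


# Assumed workload: 1,000,000 messages/s at roughly 1 KB each, about 1000 MB/s.
# Assumed measurements: 20 MB/s per partition when producing, 50 MB/s when consuming.
print(estimate_partitions(1000, 20, 50))  # 50
```

The result is a lower bound for throughput only; the metadata and leader‑election costs mentioned above are the reason not to overshoot it by a wide margin.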
Producer settings: Send asynchronously and in batches (tune batch.size and linger.ms) so that small messages are combined into fewer requests. Enable compression (compression.type set to snappy or lz4) to reduce bandwidth and disk I/O, and configure acks (0, 1, or all) and the retry policy to balance reliability against throughput.
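The producer trade‑off can be sketched as two configuration profiles. The keys follow the Java producer's configuration names; the helper function, the profile names, and every value are illustrative assumptions rather than recommended settings.

```python
def producer_config(profile: str) -> dict:
    """Return an illustrative producer configuration for the given profile."""
    common = {
        "batch.size": 64 * 1024,    # bytes per partition batch (Java client default: 16384)
        "linger.ms": 10,            # wait up to 10 ms for a batch to fill
        "compression.type": "lz4",  # compress whole batches, not single messages
    }
    if profile == "throughput":
        # Leader-only acknowledgement: lower latency, weaker durability.
        return {**common, "acks": "1", "enable.idempotence": False}
    if profile == "durability":
        # Wait for all in-sync replicas; idempotence requires acks=all.
        return {**common, "acks": "all", "enable.idempotence": True}
    raise ValueError(f"unknown profile: {profile!r}")


for name in ("throughput", "durability"):
    print(name, producer_config(name))
```

Larger batches and a non‑zero linger raise throughput at the cost of a few milliseconds of added latency, so both values are worth benchmarking against the actual message size.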
Broker storage optimization: Kafka writes its logs sequentially; prefer SSDs where possible, and make sure the disks provide sufficient bandwidth and low write latency. Increase num.io.threads and num.network.threads for parallel disk I/O and network handling. Set log.segment.bytes and the retention settings (for example log.retention.hours or log.retention.bytes) appropriately to avoid an excess of small files and file‑handle pressure.
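To see why segment size affects file‑handle pressure, the following back‑of‑the‑envelope sketch counts segment files on one broker. It assumes three files per segment (.log, .index, .timeindex) and that every retained segment counts toward the total; the function name and the workload numbers are hypothetical.

```python
FILES_PER_SEGMENT = 3  # .log, .index, .timeindex

GIB = 1024 ** 3
MIB = 1024 ** 2


def estimate_segment_files(partition_replicas: int,
                           retained_bytes_per_partition: int,
                           segment_bytes: int) -> int:
    """Approximate number of segment files hosted by a single broker."""
    if partition_replicas < 0 or retained_bytes_per_partition < 0:
        raise ValueError("counts and sizes must not be negative")
    if segment_bytes <= 0:
        raise ValueError("segment_bytes must be positive")
    # Ceiling division; every partition has at least one (active) segment.
    segments = max(1, -(-retained_bytes_per_partition // segment_bytes))
    return partition_replicas * segments * FILES_PER_SEGMENT


# Assumed broker: 2000 partition replicas, 50 GiB retained per partition.
print(estimate_segment_files(2000, 50 * GIB, 1 * GIB))    # 300000
print(estimate_segment_files(2000, 50 * GIB, 100 * MIB))  # 3072000
```

With the default 1 GiB log.segment.bytes the estimate is 300,000 files; shrinking segments to 100 MiB multiplies it roughly tenfold, which is the kind of growth the operating system's open‑file limit has to accommodate.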
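Zero‑copy: Kafka serves consumer fetches through the kernel’s sendfile mechanism, which avoids copying data between kernel and user space and improves network efficiency. This is not a broker switch to turn on; verify that the OS and JVM support sendfile, and note that TLS‑encrypted listeners cannot use this path because the data must be encrypted in user space. Tune network parameters such as socket.send.buffer.bytes, socket.receive.buffer.bytes, and socket.request.max.bytes to match message size and throughput requirements.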
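The sendfile path can be illustrated outside Kafka. The broker itself relies on Java's FileChannel.transferTo; the sketch below is only a Python analogue that uses socket.sendfile, which delegates to os.sendfile where the platform supports it and otherwise falls back to ordinary send calls. The function names, payload size, and socket pair are assumptions for the demonstration.

```python
import os
import socket
import tempfile


def send_file_zero_copy(path: str, sock: socket.socket) -> int:
    """Send a whole file over a stream socket; returns the bytes sent."""
    with open(path, "rb") as f:
        # No read() into a Python buffer: the kernel moves the pages itself
        # when os.sendfile is available.
        return sock.sendfile(f)


def demo() -> tuple:
    payload = os.urandom(16 * 1024)  # small enough to fit the socket buffer
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
    left, right = socket.socketpair()
    try:
        sent = send_file_zero_copy(tmp.name, left)
        left.close()  # signal end-of-stream to the reader
        received = bytearray()
        while chunk := right.recv(65536):
            received.extend(chunk)
        return sent, bytes(received) == payload
    finally:
        left.close()
        right.close()
        os.unlink(tmp.name)


sent, intact = demo()
print(sent, intact)
```

The same idea explains why encryption defeats zero‑copy: once the bytes have to be transformed in user space, the kernel can no longer hand file pages straight to the socket.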
These configurations, combined with proper hardware selection and network tuning, allow Kafka clusters to sustain high‑throughput, low‑latency workloads ranging from hundreds of thousands to tens of millions of messages per second.
Architect Chen
Sharing over a decade of architecture experience from Baidu, Alibaba, and Tencent.
