Big Data

Why Kafka Handles Millions of Messages Per Second: Inside Its High‑Performance Architecture

This article explains how Kafka achieves ultra‑high throughput and low latency despite being disk‑based, covering its Reactor I/O network model, zero‑copy techniques, partitioning strategies, segment logs with sparse indexes, sequential disk writes, page cache usage, compression, batch processing, and lock‑free offset management.

Sanyou's Java Diary

01 Kafka Reactor I/O Network Model

Kafka uses a non‑blocking I/O model based on the Reactor pattern, in which one or more Java Selectors monitor multiple channels and dispatch ready events to the appropriate handlers, allowing thousands of connections to be served by a few threads.

The model consists of three roles:

Reactor: dispatches I/O events to handlers.

Acceptor: handles new client connections.

Handler: processes read/write tasks.

Key components include:

SocketServer: manages all network connections and initializes the Acceptor and Processor threads.

Acceptor: listens for new connections with a Selector registered for OP_ACCEPT events, creates a SocketChannel for each new client, and hands it to a Processor.

Processor: performs the actual I/O, using its own Selector to handle OP_READ and OP_WRITE for multiple SocketChannels.

RequestChannel: buffers requests and responses between Processors and the request-handling threads.

KafkaRequestHandler: reads requests from the RequestChannel, invokes KafkaApis, and writes responses back; its thread-pool size is controlled by num.io.threads.
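The interplay of the Acceptor and Handler roles on a single Selector can be sketched with plain Java NIO. This is a toy single-threaded loop, not Kafka's actual SocketServer code; the port and payload are illustrative:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class MiniReactor {
    // One Selector multiplexes the listening socket and all client channels:
    // the Acceptor role handles OP_ACCEPT, the Handler role handles OP_READ.
    static String serveOneMessage() throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("localhost", 0));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        // A plain blocking client, only to drive the demo.
        SocketChannel client = SocketChannel.open(
                new InetSocketAddress("localhost", server.socket().getLocalPort()));
        client.write(ByteBuffer.wrap("ping".getBytes()));

        String received = null;
        while (received == null) {
            selector.select();
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {            // Acceptor role
                    SocketChannel ch = server.accept();
                    ch.configureBlocking(false);
                    ch.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {       // Handler role
                    ByteBuffer buf = ByteBuffer.allocate(64);
                    int n = ((SocketChannel) key.channel()).read(buf);
                    if (n > 0) received = new String(buf.array(), 0, n);
                }
            }
        }
        client.close();
        server.close();
        selector.close();
        return received;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(serveOneMessage()); // ping
    }
}
```

In Kafka the same idea is split across threads: Acceptor and Processors each own a Selector, so accepting connections never blocks I/O and vice versa.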

02 Zero‑Copy Technology

Zero‑copy avoids copying data between kernel and user space, reducing CPU load and memory bandwidth.

Kafka employs two main mechanisms:

sendfile() system call: transfers file data from the page cache directly to a network socket, without copying through user space.

Memory-Mapped Files (mmap): maps index files into memory so index lookups avoid per-access system calls and extra copies.

Traditional I/O involves four copies: DMA from disk into the kernel buffer, a copy into the user-space buffer, a copy back into the kernel's socket buffer, and DMA to the NIC. Zero‑copy eliminates the two CPU copies through user space, improving throughput.
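In the JVM, the sendfile() path is exposed through FileChannel.transferTo(), which Kafka uses for log transfer. A minimal sketch (file names and contents are illustrative):

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroCopyDemo {
    // Copies a file using transferTo(), which maps to sendfile() on Linux:
    // bytes move from the page cache to the destination channel without
    // passing through a user-space buffer.
    static long transfer(Path src, Path dst) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            long transferred = 0, size = in.size();
            // transferTo may move fewer bytes than requested, so loop.
            while (transferred < size) {
                transferred += in.transferTo(transferred, size - transferred, out);
            }
            return transferred;
        }
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("log", ".log");
        Path dst = Files.createTempFile("copy", ".log");
        Files.write(src, "hello kafka".getBytes());
        System.out.println(transfer(src, dst) + " bytes transferred"); // 11 bytes transferred
    }
}
```

In Kafka the destination is a SocketChannel rather than a file, but the call and the copy-avoidance are the same.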

03 Partition Concurrency and Load Balancing

Kafka’s architecture consists of Producers, Consumers, Brokers, Topics, Partitions, Replicas, and ZooKeeper.

Producer: publishes messages to a Topic.

Consumer: subscribes to a Topic; within a ConsumerGroup, each Partition is consumed by at most one Consumer, so the Partitions are divided among the group's members.

Broker: a server that stores partitions; a Topic's data can be spread across multiple Brokers.

Topic & Partition: a Topic is split into ordered Partitions; each Partition is an append-only log on disk.

Replica: each Partition has multiple copies for high availability; one replica is the leader and serves reads and writes.

ZooKeeper: coordinates cluster metadata.

Partition assignment strategies:

Round‑Robin: distributes messages across partitions in turn.

Random: assigns messages to random partitions.

Key‑based (message‑key) ordering: messages with the same key always go to the same partition, preserving per‑key order.

Geographic: selects partitions based on broker location (example code shown below).

<code>// Pick any partition whose leader broker is in the southern region
// (isSouth is an illustrative helper).
List&lt;PartitionInfo&gt; partitions = cluster.partitionsForTopic(topic);
return partitions.stream()
  .filter(p -> isSouth(p.leader().host()))
  .map(PartitionInfo::partition)
  .findAny()
  .orElseThrow(() -> new IllegalStateException("no southern partition available"));
</code>
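The key-based strategy can be sketched as below. Kafka's real DefaultPartitioner hashes the serialized key bytes with murmur2; this sketch substitutes Java's hashCode as a simplification:

```java
public class KeyPartitioner {
    // Maps a message key to a partition. The same key always yields the
    // same partition, so per-key ordering is preserved within it.
    // (Kafka's DefaultPartitioner uses murmur2 on the key bytes; hashCode
    // is a stand-in for illustration.)
    static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        int p1 = partitionFor("order-42", 6);
        int p2 = partitionFor("order-42", 6);
        System.out.println(p1 == p2); // true: identical keys land on one partition
    }
}
```

Note that changing the partition count changes the key-to-partition mapping, which is why adding partitions breaks per-key ordering guarantees for existing keys.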

04 Segment Log Files and Sparse Indexes

Each Partition is stored as a series of immutable log segments. A segment consists of several files:

.index: sparse offset index mapping offsets to byte positions in the log.

.timeindex: sparse timestamp index (added in 0.10.1.0).

.log: actual message data.

.snapshot: producer state snapshot used for idempotence and transactions.

.swap: temporary file used during segment replacement and recovery.

.txnindex: records aborted transactions.

Kafka writes messages sequentially to the active segment; when the segment reaches a configured size (log.segment.bytes) or age (log.roll.ms), a new segment is rolled. Offsets are never reused.

Lookup of a message by offset uses binary search on the segment index to find the nearest offset ≤ target, then reads the log file at the corresponding position.
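That floor lookup can be sketched as a binary search over (offset, byte position) pairs; the index entries here are illustrative:

```java
import java.util.Arrays;

public class SparseIndex {
    // A sparse index stores one entry every N messages: the message offset
    // and its byte position in the .log file.
    static final long[][] INDEX = {
            {0, 0}, {100, 4096}, {200, 8192}, {300, 12288}
    };

    // Binary search for the greatest indexed offset <= target; the reader
    // then scans the .log file forward from the returned byte position.
    static long[] floorEntry(long targetOffset) {
        int lo = 0, hi = INDEX.length - 1, ans = 0;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (INDEX[mid][0] <= targetOffset) { ans = mid; lo = mid + 1; }
            else hi = mid - 1;
        }
        return INDEX[ans];
    }

    public static void main(String[] args) {
        // Offset 150 is not indexed; start scanning from entry (100, 4096).
        System.out.println(Arrays.toString(floorEntry(150))); // [100, 4096]
    }
}
```

Because the index is sparse, it stays small enough to map fully into memory, at the cost of a short linear scan in the log after the binary search.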

04.2 mmap

Kafka maps its index files into memory with MappedByteBuffer, allowing fast binary search over the index without a system call per access.
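A minimal MappedByteBuffer sketch follows; the temporary file and sizes are illustrative and do not reproduce Kafka's index layout:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapDemo {
    // Maps a file into memory; subsequent reads and writes go through the
    // page cache without a read()/write() system call per access.
    static int readFirstInt(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // READ_WRITE mapping grows the file to the mapped size if needed.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, 8);
            buf.putInt(0, 42);        // write straight into the mapping
            return buf.getInt(0);     // read it back, no extra copies
        }
    }

    public static void main(String[] args) throws IOException {
        Path f = Files.createTempFile("demo", ".index");
        System.out.println(readFirstInt(f)); // 42
    }
}
```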

05 Sequential Disk Read/Write

Random I/O on spinning disks can be several orders of magnitude slower than sequential I/O, because every random access pays seek and rotational latency. By only appending messages to the end of the active log file, Kafka keeps its write pattern sequential and achieves much higher throughput.

06 PageCache

When a Producer writes, the broker appends the message via ordinary file writes that land in the OS page cache; the OS flushes them to disk asynchronously. Consumers receive data via sendfile(), which transfers bytes directly from the page cache to the network socket, leveraging zero‑copy. Leader‑follower replication fetches data through the same path.

07 Data Compression and Batch Processing

Supported compression algorithms: GZIP, Snappy, LZ4, and Zstandard (since 2.1.0). Producers set compression.type, accumulate messages into batches, and compress each batch; the broker stores the compressed batch in the log as‑is. Consumers decompress batches automatically.
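Batch-level compression can be sketched with java.util.zip; this illustrates the idea, not Kafka's actual record-batch format:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class BatchCompression {
    // Producer side: compress the whole batch once, not message by message,
    // so the compressor can exploit repetition across messages.
    static byte[] compress(byte[] batch) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(batch);
        }
        return out.toByteArray();
    }

    // Consumer side: decompress the fetched batch.
    static byte[] decompress(byte[] compressed) throws IOException {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return gz.readAllBytes();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] batch = "msg-1|msg-2|msg-3".repeat(100).getBytes();
        byte[] packed = compress(batch);
        System.out.println(packed.length < batch.length); // true: repetition compresses well
        System.out.println(new String(decompress(packed)).startsWith("msg-1")); // true
    }
}
```

Compressing per batch rather than per message is what makes the trade worthwhile: small individual messages often compress poorly on their own.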

Batching reduces network overhead and improves throughput. Producer processing pipeline:

Serialize key/value.

Determine the target Partition.

Accumulate records per Partition until the batch size (batch.size) or linger time (linger.ms) is reached.

Optionally compress the completed batch.

Send batches via a dedicated sender thread.
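The accumulate-and-flush step can be sketched as a per-partition buffer that flushes when either threshold is hit. Field names mirror batch.size and linger.ms, but the class is a simplification, not Kafka's RecordAccumulator:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchAccumulator {
    private final int batchSize;            // flush when this many records queue up
    private final long lingerMs;            // ...or when the oldest record is this old
    private final List<String> buffer = new ArrayList<>();
    private long firstAppendAt = -1;

    BatchAccumulator(int batchSize, long lingerMs) {
        this.batchSize = batchSize;
        this.lingerMs = lingerMs;
    }

    // Returns the batch to send when a flush condition is met, else null.
    List<String> append(String record, long nowMs) {
        if (buffer.isEmpty()) firstAppendAt = nowMs;
        buffer.add(record);
        boolean full = buffer.size() >= batchSize;
        boolean lingered = nowMs - firstAppendAt >= lingerMs;
        if (full || lingered) {
            List<String> batch = new ArrayList<>(buffer);
            buffer.clear();
            return batch;
        }
        return null;
    }

    public static void main(String[] args) {
        BatchAccumulator acc = new BatchAccumulator(3, 10);
        System.out.println(acc.append("a", 0)); // null: waiting for more
        System.out.println(acc.append("b", 1)); // null
        System.out.println(acc.append("c", 2)); // [a, b, c]: size threshold hit
    }
}
```

The linger threshold trades a little latency for larger batches, which is exactly the batch.size/linger.ms trade-off the producer configs expose.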

Consumers can fetch multiple messages in one request, tuned with fetch.min.bytes and fetch.max.wait.ms.

08 Lock‑Free Lightweight Offset

An offset is a monotonically increasing long that identifies a message's position within a Partition. Because each Partition's log has a single writer (its leader) and appends are strictly sequential, Kafka can assign and track offsets without per-message locking; combined with memory‑mapped indexes, zero‑copy reads, and batch processing, this enables high concurrency.

08.2 Consumer Offset Management Flow

<code>graph TD;
    A[Start Consumer] --> B[Read messages from partition];
    B --> C[Process message];
    C --> D{Processing successful?};
    D -->|Yes| E[Update Offset];
    D -->|No| F[Record failure, retry];
    E --> G[Commit Offset];
    G --> H[Proceed to next message];
    F --> B;
    H --> B;
</code>

Consumers commit offsets per Partition so that after a crash they can resume from the last committed position.
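Commit-and-resume can be sketched with a per-partition map of committed offsets; the map stands in for Kafka's __consumer_offsets topic:

```java
import java.util.HashMap;
import java.util.Map;

public class OffsetTracker {
    // Stand-in for the __consumer_offsets topic: partition -> next offset to read.
    private final Map<Integer, Long> committed = new HashMap<>();

    // Commit after a record is fully processed; commit offset+1 so a restart
    // resumes at the first unprocessed record, matching Kafka's convention.
    void commit(int partition, long processedOffset) {
        committed.put(partition, processedOffset + 1);
    }

    // Where to resume after a crash; 0 if nothing was ever committed.
    long resumeFrom(int partition) {
        return committed.getOrDefault(partition, 0L);
    }

    public static void main(String[] args) {
        OffsetTracker tracker = new OffsetTracker();
        tracker.commit(0, 41);                     // processed offset 41 on partition 0
        System.out.println(tracker.resumeFrom(0)); // 42
        System.out.println(tracker.resumeFrom(1)); // 0
    }
}
```

Committing after processing (as in the flow above) gives at-least-once delivery: a crash between processing and commit replays the last record rather than losing it.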

09 Summary

Kafka achieves high performance through a combination of a non‑blocking Reactor I/O model, sequential disk writes, memory‑mapped sparse indexes, zero‑copy transfers, efficient compression, batch processing, and lock‑free offset management.

Tags: performance, Big Data, Kafka, Zero Copy, Segment, Offset Management, Reactor I/O