Big Data 35 min read

Why Kafka’s Message System Is Essential for High‑Throughput Applications

This article explains why a message system like Kafka is crucial for decoupling services, handling asynchronous workflows such as e‑commerce flash sales, controlling traffic, and achieving high concurrency, high availability, and high performance through sequential disk writes, zero‑copy reads, replication, and careful resource planning.

Open Source Linux

Aug 30, 2021

Why Kafka’s Message System Is Essential for High‑Throughput Applications

1. Why Use a Message System

Decoupling of components.

Asynchronous processing – e.g., a flash‑sale flow: risk control, inventory lock, order generation, SMS notification, data update.

Flow control – the gateway enqueues requests, backend services consume them, limiting traffic but adding latency.

2. Core Kafka Concepts

Producer : sends data to a Kafka cluster. Consumer : pulls data from Kafka for processing. Topic : logical channel for messages. Partition : physical storage unit; a topic can have multiple partitions distributed across brokers.

3. Kafka Cluster Architecture

Each Kafka server is a broker . A Consumer Group is identified by a group‑id; members of the same group share the consumption of a topic’s partitions. The Controller node (selected via ZooKeeper) coordinates cluster metadata.

4. Sequential Disk Writes for Performance

Kafka writes logs sequentially (append‑only) to the OS page cache and then flushes them to disk, achieving write speeds comparable to memory writes.

Producer → OS cache → sequential disk write

5. Zero‑Copy Read Path

The consumer read flow:

Consumer sends request to Kafka.

Kafka reads data from OS cache (or disk if not cached).

Data is copied from OS cache to the Kafka process.

Kafka transfers the data to the socket cache.

Network card sends the data to the consumer.

6. Log Segmentation

A topic is divided into partitions; each partition consists of multiple log segment files (default 1 GB each). For example, creating topic_a with three partitions creates three directories on three brokers, each containing a series of .log files.

7. Binary Search for Data Location

Each message has an offset (relative) and a position (physical file offset). Kafka uses a sparse index: every 4 KB of log a new index entry is written, enabling binary search to locate messages quickly.

8. High‑Concurrency Network Design (NIO)

Kafka’s network stack is built on Java NIO with a reactor pattern, using multiple selectors, threads, and queues to achieve high throughput.

9. Replication for High Availability

Each partition has a leader and one or more followers. Writes go to the leader; once the data is replicated to the ISR (in‑sync replica) list, the write is considered committed. If a leader fails, a follower is elected.

10. Architecture Summary

Kafka achieves high concurrency, high availability, and high performance through:

Writing first to OS cache, then sequential disk writes.

Sparse index + binary search for fast reads.

Zero‑copy (sendfile) to avoid data copying between kernel and user space.

Multi‑selector, multi‑thread, queue‑based NIO network design.

Leader‑follower replication.

11. Production‑Environment Planning

11.1 Demand Analysis

10⁹ requests per day, 50 KB each → ~46 TB raw data, 3‑day retention → ~276 TB total storage.

Assuming 2 replicas, the required storage is 276 TB.

11.2 Physical Machine Count

Peak QPS ≈ 55 k req/s. Planning for 4× peak → 200 k req/s, which suggests about 5 physical servers (≈4 × 10⁴ req/s per server).

11.3 Disk Selection

Sequential writes dominate, so ordinary SAS disks (≈7 TB each) are sufficient. Five servers × 11 disks × 7 TB ≈ 385 TB raw capacity.

11.4 Memory Estimation

Kafka relies heavily on OS page cache; allocate ~10 GB JVM heap plus enough memory for the page cache (≈60 GB total per server).

11.5 CPU Estimation

Kafka brokers run dozens of threads; a safe baseline is 16 cores per server (32 cores preferred).

11.6 Network Requirements

Peak inbound traffic ≈ 1 GB/s per server; 10 GbE NICs are recommended.

11.7 Cluster Planning

Three ZooKeeper nodes manage metadata; Kafka brokers can share the same hosts when resources are limited.

11.8 Custom Partitioner Example

public class HotDataPartitioner implements Partitioner {
    private Random random;
    @Override
    public void configure(Map<String, ?> configs) { random = new Random(); }
    @Override
    public int partition(String topic, Object keyObj, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster) {
        String key = (String) keyObj;
        List<PartitionInfo> partitions = cluster.availablePartitionsForTopic(topic);
        int partitionCount = partitions.size();
        int hotDataPartition = partitionCount - 1;
        return !key.contains("hot_data") ? random.nextInt(partitionCount - 1) : hotDataPartition;
    }
}

Configure with

props.put("partitioner.class", "com.zhss.HotDataPartitioner")

12. Kafka Producer

12.1 Sending Principle

Producer buffers records in memory (default 32 MB) before sending them in batches.

12.2 Throughput Tuning Parameters

buffer.memory

: size of the producer’s buffer (default 32 MB). compression.type: e.g., lz4 to reduce payload size. batch.size: batch size before a request is sent (default 16 KB). linger.ms: delay to allow batches to fill (default 0 ms).

12.3 Exception Handling

LeaderNotAvailableException

: retry after leader election. NotControllerException: retry after controller election. NetworkException: configure retries or fallback to alternative storage.

12.4 Retry Mechanism

Retries can cause duplicate or out‑of‑order messages; set max.in.flight.requests.per.connection=1 to preserve order.

12.5 ACK Configuration

acks=0

: fire‑and‑forget (high throughput, possible loss). acks=1: wait for leader acknowledgment. acks=-1 (or all): wait for all ISR replicas (no loss).

13. Kafka Consumer

13.1 Consumer Group Concept

Consumers sharing the same group.id belong to one consumer group; each partition is assigned to only one consumer within the group. Different groups can consume the same topic independently (broadcast).

13.2 Offset Management

Offsets are stored in the internal __consumer_offsets topic (compact‑ed). Older versions stored offsets in ZooKeeper.

13.3 Rebalance Strategies

Range: contiguous partition ranges per consumer.

Round‑Robin: cyclic distribution of partitions.

Sticky: tries to keep existing assignments and only moves the minimum number of partitions.

13.4 Core Consumer Parameters

fetch.max.bytes

: max bytes per fetch request (default 1 MB). max.poll.records: max records returned per poll (default 500). enable.auto.commit / auto.commit.interval.ms: automatic offset commits. session.timeout.ms, heartbeat.interval.ms: liveness detection. max.poll.interval.ms: max processing time before the consumer is considered dead.

14. Broker Management

14.1 LEO and HW

LEO (Log End Offset) is the offset of the next record to be written. HW (High Watermark) is the highest offset that all in‑sync replicas have replicated; only records up to HW are visible to consumers.

14.2 Controller Role

The controller monitors broker registrations under /broker/ids/, handles topic creation, partition reassignment, and leader election.

14.3 Delayed Tasks

Kafka uses a custom time‑wheel scheduler for delayed operations such as acks=-1 timeouts and follower fetch delays, achieving O(1) insertion and removal.

14.4 Time‑Wheel Mechanism

A multi‑level wheel (e.g., 1 ms tick × 20 slots, then 20 ms wheel, etc.) provides efficient scheduling for thousands of delayed tasks.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

distributed systems Kafka High concurrency replication Message Queue Zero‑copy resource planning

Written by

Open Source Linux

Focused on sharing Linux/Unix content, covering fundamentals, system development, network programming, automation/operations, cloud computing, and related professional knowledge.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.