How Kafka’s Architecture and Memory Pool Reduce JVM GC for High Throughput
This article explains how Kafka’s design—its broker architecture, use of sequential disk I/O, PageCache, Sendfile, and a custom memory buffer pool—optimizes JVM garbage collection and achieves massive throughput in big‑data messaging scenarios.
Kafka Architecture
Kafka is a high‑throughput message queue widely used in big‑data scenarios; to understand how it handles massive data volumes, we first review its architecture and then discuss GC optimization techniques.
Topic : Logical grouping of messages; a topic can span multiple brokers.
Partition : The unit of horizontal scaling and parallelism; each topic has at least one partition.
Offset : Sequential number of a message within a partition.
Consumer : Retrieves messages from brokers.
Producer : Sends messages to brokers.
Replication : Redundant copies of a partition; each partition can have one or more replicas.
Leader : The sole replica that handles all read/write requests for a partition.
Broker : Handles producer and consumer requests and persists messages to local disk; one broker in the cluster acts as the controller.
ISR (In‑Sync Replica) : Subset of replicas that are alive and caught up with the leader.
These concepts form the core of Kafka’s design and are essential for effective tuning.
Broker
Unlike in‑memory queues such as Redis, Kafka writes all messages to high‑capacity disks. It mitigates the performance cost by performing only sequential I/O: on spinning disks, sequential access can reach roughly 600 MB/s, while random access manages only about 100 KB/s.
Kafka heavily relies on the operating system’s PageCache . Writes go to PageCache (marked dirty) and reads are served from it, avoiding unnecessary heap allocations and reducing GC pressure.
Additional optimizations include Sendfile (zero‑copy): instead of copying data up into user space and back down again, the kernel transfers it directly from the PageCache to the network socket, eliminating two redundant copies and two context switches per send.
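In the JVM, this zero‑copy path is exposed as FileChannel.transferTo, which maps to sendfile(2) on Linux; Kafka's log transfer uses this primitive. The sketch below is illustrative (file names and the temp‑file setup are ours, not Kafka's) and simply shows that the bytes never surface in the JVM heap as an application‑level buffer.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.*;

public class ZeroCopyDemo {
    // Copy a file through transferTo(), which uses sendfile(2) on Linux:
    // data moves kernel-to-kernel without an intermediate heap buffer.
    public static long transfer(Path src, Path dst) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.CREATE,
                                                StandardOpenOption.WRITE)) {
            long pos = 0, size = in.size();
            while (pos < size) {
                // transferTo may move fewer bytes than requested, so loop.
                pos += in.transferTo(pos, size - pos, out);
            }
            return pos;
        }
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("kafka-seg", ".log");
        Files.write(src, "batched message set".getBytes());
        Path dst = Files.createTempFile("sock-sim", ".out");
        long n = transfer(src, dst);
        System.out.println(n == Files.size(dst)); // prints true
    }
}
```

In Kafka the target channel is a socket rather than a file, but the call is the same.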
In production clusters (e.g., 20 brokers, 75 partitions per broker, 110 k msgs/s) write traffic reaches ~10 MB/s per partition replication, while read traffic remains under 50 KB/s thanks to PageCache. When data is evicted to disk, read rates can rise to 40 + MB/s without degrading performance.
Kafka does not recommend forcing disk flushes via log.flush.interval.messages or log.flush.interval.ms; reliability should rely on replication.
Performance can be tuned by adjusting /proc/sys/vm/dirty_background_ratio and /proc/sys/vm/dirty_ratio: when the dirty‑page ratio exceeds the background threshold, the kernel starts pdflush to write back asynchronously; past the hard threshold, writes block until pages are flushed. Adjust both ratios to match the workload.
Partition
Partitions enable horizontal scalability, high concurrency, and replication. They can be moved across brokers to balance load, and custom partitioning can route messages with the same key to the same partition.
Only one consumer in a consumer group can read a given partition at a time, which simplifies offset management and maintains high throughput. However, too many partitions increase leader election time and can cause LeaderNotAvailableException during broker failures.
Pre‑allocate partitions to avoid key‑partition mismatches.
Limit replica count and spread them across racks when possible.
Ensure clean broker shutdowns to avoid long recovery times.
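The key‑to‑partition routing mentioned above can be sketched as a stable hash of the key bytes modulo the partition count. Kafka's built‑in partitioner hashes the serialized key with murmur2; the simplified version below (class and method names are hypothetical, and it uses Arrays.hashCode instead of murmur2) shows the invariant that matters: equal keys always land on the same partition.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class KeyPartitioner {
    // Sketch of key-based routing: a deterministic hash of the key bytes,
    // masked to non-negative, modulo the partition count. Kafka's real
    // partitioner uses murmur2, but the same-key -> same-partition
    // guarantee is identical.
    public static int partitionFor(String key, int numPartitions) {
        byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
        return (Arrays.hashCode(bytes) & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        int p1 = partitionFor("order-42", 8);
        int p2 = partitionFor("order-42", 8);
        System.out.println(p1 == p2);          // same key, same partition
        System.out.println(p1 >= 0 && p1 < 8); // always within range
    }
}
```

This is also why pre‑allocating partitions matters: changing numPartitions changes the modulus, so existing keys may map to different partitions afterwards.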
Producer
Since version 0.8, the producer has been rewritten in Java for better performance. It batches messages into a MessageSet, reducing round trips and turning many small random writes into a single linear write.
End‑to‑End compression (GZIP or Snappy) is applied on the producer side; brokers store compressed data and only decompress on the consumer side, maximizing compression efficiency.
Acknowledgment settings control durability:
acks=0 : No acknowledgment; highest throughput, possible data loss.
acks=1 : Only the leader acknowledges, after writing locally.
acks=-1 (or all) : All ISR replicas must acknowledge, providing the strongest durability at the cost of latency.
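Batching, compression, and acknowledgments come together in the producer configuration. The sketch below uses the property names of the current Java client (the 0.8‑era producer spelled the ack setting request.required.acks); broker host and the specific values are illustrative, not recommendations.

```java
import java.util.Properties;

public class ProducerConfigSketch {
    // Illustrative producer settings covering the knobs discussed above:
    // batching, end-to-end compression, acknowledgment level, and the
    // client-side buffer pool size.
    public static Properties producerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092"); // illustrative host
        props.put("batch.size", "16384");        // per-partition batch buffer, bytes
        props.put("linger.ms", "5");             // wait up to 5 ms to fill a batch
        props.put("compression.type", "snappy"); // producer compresses, consumer decompresses
        props.put("acks", "all");                // wait for all ISR replicas
        props.put("buffer.memory", "33554432");  // 32 MB client-side buffer pool
        return props;
    }
}
```

Together with key/value serializer settings, these properties would be passed to the KafkaProducer constructor.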
Limit producer thread count to avoid message reordering.
Default request.required.acks in 0.8 is 0.
Consumer
Consumers can operate in two modes via Consumer Groups. The high‑level API depends on Zookeeper and is easier to use; the low‑level API is Zookeeper‑free, offers better performance, and allows custom error handling.
Prefer the low‑level API for fine‑grained error handling, especially when dealing with corrupted data after unclean shutdowns.
How Kafka Handles Massive Message Volumes
Kafka achieves high throughput through batch compression and sending. By keeping messages in memory and using efficient batching, it minimizes JVM GC pauses that would otherwise degrade performance.
Kafka Memory Pool
The client uses a single BufferPool (default 32 MB) that manages reusable ByteBuffer objects of size batch.size (default 16 KB). The pool tracks total, used, and available memory, and reuses buffers to avoid heap allocations.
Definition: BufferPool is the sole buffer manager in a KafkaProducer instance; totalMemory = used + available.
Allocation logic:
int size = Math.max(this.batchSize, Records.LOG_OVERHEAD + Record.recordSize(key, value));
ByteBuffer buffer = free.allocate(size, maxTimeToBlock);
Deallocation logic:
public void deallocate(ByteBuffer buffer, int size) {
    lock.lock();
    try {
        // Only buffers of exactly poolableSize are recycled into the free list.
        if (size == this.poolableSize && size == buffer.capacity()) {
            buffer.clear();
            this.free.add(buffer);
        } else {
            // Oversized buffers are left for the GC; only their byte count
            // is returned to the pool's accounting.
            this.availableMemory += size;
        }
        // Wake one thread blocked waiting for memory, if any.
        Condition moreMem = this.waiters.peekFirst();
        if (moreMem != null)
            moreMem.signal();
    } finally {
        lock.unlock();
    }
}
If messages fit within poolableSize and free buffers are available, the memory is recycled, greatly reducing GC activity. Otherwise, the buffer is allocated on (and later reclaimed from) the JVM heap, which may trigger GC.
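The recycling behavior can be condensed into a minimal, simplified pool. This is a sketch, not Kafka's class: it drops the blocking allocate, the waiter queue, and the locking discipline of the real BufferPool, and the class name is hypothetical. What it preserves is the core GC optimization: poolable‑sized buffers are reused by identity, while odd‑sized buffers fall back to plain accounting and, eventually, the garbage collector.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

public class MiniBufferPool {
    // Simplified sketch of Kafka's BufferPool: fixed-size ("poolable")
    // buffers are cached and reused; any other size becomes garbage and
    // only its byte count is returned to the accounting.
    private final int poolableSize;
    private long availableMemory;
    private final ArrayDeque<ByteBuffer> free = new ArrayDeque<>();

    public MiniBufferPool(long totalMemory, int poolableSize) {
        this.availableMemory = totalMemory;
        this.poolableSize = poolableSize;
    }

    public synchronized ByteBuffer allocate(int size) {
        if (size == poolableSize && !free.isEmpty()) {
            return free.pollFirst();       // reuse: no new heap allocation
        }
        availableMemory -= size;           // (the real pool blocks when exhausted)
        return ByteBuffer.allocate(size);
    }

    public synchronized void deallocate(ByteBuffer buffer) {
        if (buffer.capacity() == poolableSize) {
            buffer.clear();
            free.addFirst(buffer);         // keep it for the next batch
        } else {
            availableMemory += buffer.capacity(); // buffer itself becomes garbage
        }
    }
}
```

With batch.size left at its default, nearly all batches hit the reuse path, so a long‑running producer allocates buffers once and then cycles them indefinitely.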
Conclusion
By leveraging a memory buffer pool, sequential disk I/O, PageCache, and Sendfile, Kafka minimizes JVM garbage‑collection overhead, resulting in high‑throughput, low‑latency message delivery suitable for big‑data workloads.