Understanding Kafka Producer Memory Pool to Reduce Full GC

This article explains how Kafka's producer uses a memory pool to reuse the buffers behind RecordBatch objects, detailing the allocation and deallocation mechanisms, the lock and Condition handling, and how buffer reuse reduces full garbage collection pauses, thereby improving throughput and resource efficiency.

Big Data Technology & Architecture

May 6, 2021

In the previous article we described Kafka's eight-step message‑sending process and highlighted that the producer caches messages in memory by partition, storing each partition's messages as batches within a map structure.

The map's key is the partition and its value is a queue of small batches, allowing the producer to send many messages at once instead of one‑by‑one, which saves network resources.
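The per-partition cache described above can be sketched roughly as follows. This is a minimal illustration, not Kafka's RecordAccumulator: the class name PartitionBatches, the String records, and the record-count limit are all assumptions made for the example (the real producer bounds batches by bytes, not by record count).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical, simplified stand-in for the producer's per-partition cache.
class PartitionBatches {
    private final int maxBatchRecords; // assumption: batches are capped by record count
    private final ConcurrentMap<Integer, Deque<List<String>>> batches = new ConcurrentHashMap<>();

    PartitionBatches(int maxBatchRecords) {
        this.maxBatchRecords = maxBatchRecords;
    }

    // Append a record to the newest batch of the partition, opening a new batch when it is full.
    void append(int partition, String record) {
        Deque<List<String>> queue = batches.computeIfAbsent(partition, p -> new ArrayDeque<>());
        synchronized (queue) {
            List<String> last = queue.peekLast();
            if (last == null || last.size() >= maxBatchRecords) {
                last = new ArrayList<>();
                queue.addLast(last);
            }
            last.add(record);
        }
    }

    // Remove and return the oldest batch of the partition, or null if there is none.
    List<String> drainOldest(int partition) {
        Deque<List<String>> queue = batches.get(partition);
        if (queue == null) return null;
        synchronized (queue) {
            return queue.pollFirst();
        }
    }

    // Number of batches currently queued for the partition.
    int batchCount(int partition) {
        Deque<List<String>> queue = batches.get(partition);
        if (queue == null) return 0;
        synchronized (queue) {
            return queue.size();
        }
    }
}
```

Sending then amounts to draining the oldest batch of a partition in one request instead of issuing one request per record.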

However, this design can cause a problem: every batch becomes garbage once it has been sent, so when a large number of messages are produced the JVM must keep allocating and discarding batch buffers. This can trigger a full garbage collection (full GC), which pauses all producer threads and halts message delivery.

To mitigate this, Kafka's developers introduced a memory pool that reuses the memory buffers behind RecordBatch objects instead of allocating a new buffer for every batch, reducing the frequency of full GC events.

The memory pool works like a connection pool: it caches reusable buffers. By default the pool is capped at 32 MB (buffer.memory) and each batch buffer is 16 KB (batch.size), so at most 2,048 full-size batches fit in the pool at once. In the original diagram, the green area represents free (unallocated) memory, while the red area shows memory already allocated to batches.

This 32 MB default is defined in the ProducerConfig class under the buffer.memory setting (ProducerConfig.BUFFER_MEMORY_CONFIG).
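As a rough illustration of these settings, the sketch below writes the default values out explicitly using plain java.util.Properties and the string keys buffer.memory, batch.size, and max.block.ms. The class PoolSizing and its methods are hypothetical helpers for this example; no Kafka classes are involved.

```java
import java.util.Properties;

// Hypothetical helper that spells out the default pool-related producer settings.
class PoolSizing {
    // Build producer properties with the default sizes made explicit.
    static Properties producerProps() {
        Properties props = new Properties();
        props.put("buffer.memory", String.valueOf(32L * 1024 * 1024)); // total pool: 33554432 bytes
        props.put("batch.size", String.valueOf(16 * 1024));            // one batch buffer: 16384 bytes
        props.put("max.block.ms", "60000");                            // longest wait for pool memory
        return props;
    }

    // How many full-size batch buffers fit into the pool.
    static long batchesThatFit(Properties props) {
        long total = Long.parseLong(props.getProperty("buffer.memory"));
        long batch = Long.parseLong(props.getProperty("batch.size"));
        return total / batch;
    }

    public static void main(String[] args) {
        System.out.println(batchesThatFit(producerProps())); // prints 2048
    }
}
```

Raising batch.size without raising buffer.memory lowers the number of batches that can be in flight before allocation starts to block.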

When a producer needs memory, the process starts at step 7 of the overall send flow, entering the RecordAccumulator. If no queue exists, a new one is created and the allocator is invoked.

The allocation logic in BufferPool.allocate proceeds as follows:

(1) If the requested size exceeds the total pool size, an exception is thrown.

this.lock.lock();

(2) The method acquires a lock to ensure thread safety.

if (size == poolableSize && !this.free.isEmpty())
    return this.free.pollFirst();

(3) If the request matches the batch size (16 KB) and a free batch exists, it is returned directly.

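(4) Otherwise, if the unallocated memory plus the pooled free buffers can together cover the request, freeUp(size) releases pooled buffers back into unallocated memory as needed, the available-memory counter is decreased by the requested size, the lock is released, and a new buffer is allocated.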

int freeListSize = this.free.size() * this.poolableSize;
if (this.availableMemory + freeListSize >= size) {
    // we have enough unallocated or pooled memory to immediately satisfy the request
    freeUp(size);
    this.availableMemory -= size;
    lock.unlock();
    return ByteBuffer.allocate(size);
}
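Steps (1) to (4) can be pieced together into a small non-blocking pool. This is a sketch under stated assumptions, not the BufferPool source: FastPathPool, tryAllocate, and release are hypothetical names, the loop inside tryAllocate is a guess at what freeUp does, and for brevity the buffer is allocated while the lock is still held, whereas the excerpt above unlocks first.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical, simplified pool covering only steps (1) to (4); it never blocks.
class FastPathPool {
    private final long totalMemory;
    private final int poolableSize;
    private final ReentrantLock lock = new ReentrantLock();
    private final Deque<ByteBuffer> free = new ArrayDeque<>();
    private long availableMemory;

    FastPathPool(long totalMemory, int poolableSize) {
        this.totalMemory = totalMemory;
        this.poolableSize = poolableSize;
        this.availableMemory = totalMemory;
    }

    // Returns a buffer, or null at the point where a real allocator would start waiting.
    ByteBuffer tryAllocate(int size) {
        if (size > totalMemory)
            throw new IllegalArgumentException("Requested " + size + " bytes, pool limit is " + totalMemory);
        lock.lock();
        try {
            // Step (3): reuse a pooled buffer when the request is exactly the poolable size.
            if (size == poolableSize && !free.isEmpty())
                return free.pollFirst();
            // Step (4): pooled buffers count as memory that can be reclaimed.
            long freeListSize = (long) free.size() * poolableSize;
            if (availableMemory + freeListSize >= size) {
                // Drop pooled buffers until the byte counter alone covers the request.
                while (!free.isEmpty() && availableMemory < size)
                    availableMemory += free.pollLast().capacity();
                availableMemory -= size;
                return ByteBuffer.allocate(size);
            }
            return null;
        } finally {
            lock.unlock();
        }
    }

    // Minimal release: pool exact-size buffers, otherwise only credit the byte counter.
    void release(ByteBuffer buffer) {
        lock.lock();
        try {
            if (buffer.capacity() == poolableSize) {
                buffer.clear();
                free.addLast(buffer);
            } else {
                availableMemory += buffer.capacity();
            }
        } finally {
            lock.unlock();
        }
    }

    long availableMemory() {
        lock.lock();
        try {
            return availableMemory;
        } finally {
            lock.unlock();
        }
    }
}
```

A released buffer of the poolable size comes back as the very same object on the next same-size request, which is the reuse that keeps these buffers away from the garbage collector.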

(5) When the request cannot be satisfied from free memory, a Condition is created, added to a waiting list, and the thread sleeps until memory is released or a timeout occurs.

Condition moreMemory = this.lock.newCondition();
this.waiters.addLast(moreMemory);

(6) If the waiting time exceeds the configured maximum (max.block.ms, default 60 s), a TimeoutException is thrown.

if (waitingTimeElapsed) {
    this.waiters.remove(moreMemory);
    throw new TimeoutException("Failed to allocate memory within the configured max blocking time " + maxTimeToBlockMs + " ms.");
}

(7) When another thread releases memory, the waiting thread is signaled, removes itself from the waiters list, and attempts to allocate again, first checking the free list for a 16 KB batch.

if (accumulated == 0 && size == this.poolableSize && !this.free.isEmpty()) {
    // just grab a buffer from the free list
    buffer = this.free.pollFirst();
    accumulated = size;
}

(8) If the request is larger than 16 KB or the free list is empty, the allocator takes whatever memory is available, updates counters, and may loop until the request is satisfied.

int got = (int) Math.min(size - accumulated, this.availableMemory);
this.availableMemory -= got;
accumulated += got;

(9) After successful allocation, if there is remaining free memory, the allocator signals the first waiting thread.

if (this.availableMemory > 0 || !this.free.isEmpty()) {
    if (!this.waiters.isEmpty())
        this.waiters.peekFirst().signal();
}
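The waiting logic of steps (5) to (9) can be sketched with a plain byte counter in place of real buffers. BlockingBudget, reserve, and release are hypothetical names, and the sketch leaves out the free list. It also hands partially accumulated memory back when a wait times out or is interrupted, which the excerpts above do not show.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical byte budget that mimics the waiting logic of steps (5) to (9).
class BlockingBudget {
    private final ReentrantLock lock = new ReentrantLock();
    private final Deque<Condition> waiters = new ArrayDeque<>();
    private long available;

    BlockingBudget(long total) {
        this.available = total;
    }

    // Reserve `size` bytes, waiting up to maxWaitMs for other threads to release memory.
    void reserve(long size, long maxWaitMs) throws InterruptedException, TimeoutException {
        lock.lock();
        try {
            if (available >= size) {
                available -= size;
                return;
            }
            long accumulated = 0;
            long remainingNs = TimeUnit.MILLISECONDS.toNanos(maxWaitMs);
            boolean satisfied = false;
            Condition moreMemory = lock.newCondition();
            waiters.addLast(moreMemory);              // step (5): queue up and wait
            try {
                while (accumulated < size) {
                    remainingNs = moreMemory.awaitNanos(remainingNs);
                    // Step (8): take whatever is available and keep looping if it is not enough.
                    long got = Math.min(size - accumulated, available);
                    available -= got;
                    accumulated += got;
                    if (accumulated < size && remainingNs <= 0)  // step (6): give up on timeout
                        throw new TimeoutException("Failed to reserve " + size + " bytes within " + maxWaitMs + " ms");
                }
                satisfied = true;
            } finally {
                if (!satisfied) available += accumulated; // hand back a partial reservation
                waiters.remove(moreMemory);
                // Step (9): if memory is left over, wake the next waiter.
                if (available > 0 && !waiters.isEmpty())
                    waiters.peekFirst().signal();
            }
        } finally {
            lock.unlock();
        }
    }

    // Return bytes to the budget and wake the longest-waiting thread, if any.
    void release(long size) {
        lock.lock();
        try {
            available += size;
            Condition first = waiters.peekFirst();
            if (first != null) first.signal();
        } finally {
            lock.unlock();
        }
    }

    long available() {
        lock.lock();
        try {
            return available;
        } finally {
            lock.unlock();
        }
    }
}
```

Giving each waiter its own Condition and always signaling the head of the queue lets memory be handed out in arrival order, so a large request is not starved by a stream of small ones that are already waiting.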

The deallocation process is straightforward. If the released size equals the poolable size (16 KB) and also equals the buffer's capacity, the buffer is cleared and returned to the free list. Otherwise the buffer itself is left to the garbage collector and only its size is added back to the available memory counter. In both cases, the first waiting thread, if any, is signaled.

public void deallocate(ByteBuffer buffer, int size) {
    lock.lock();
    try {
        if (size == this.poolableSize && size == buffer.capacity()) {
            buffer.clear();
            this.free.add(buffer);
        } else {
            this.availableMemory += size;
        }
        Condition moreMem = this.waiters.peekFirst();
        if (moreMem != null)
            moreMem.signal();
    } finally {
        lock.unlock();
    }
}

The design returns only buffers of exactly the poolable size (16 KB) to the free list, which keeps memory utilization high; pooling larger buffers would waste space, because a reused oversized buffer would often be only partially filled before being sent.

In summary, Kafka's producer employs a memory pool guarded by a ReentrantLock and a Condition-based waiting mechanism to recycle the buffers behind RecordBatch objects, reduce full GC pauses, and maintain efficient memory usage in high-throughput scenarios.


Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
