Improving Kafka Consumer Design: From Poll‑Then‑Process Loop to a Better Model

This article analyzes the shortcomings of the traditional poll‑then‑process loop in Kafka consumers, explains configuration pitfalls such as max.poll.interval.ms, and proposes a more robust architecture using work queues, a poller, an executor, and an offset manager to achieve at‑most‑once, at‑least‑once, and exactly‑once processing guarantees.

IT Architects Alliance
IT Architects Alliance
IT Architects Alliance
Improving Kafka Consumer Design: From Poll‑Then‑Process Loop to a Better Model

Introduction

Almost all Kafka Consumer tutorials show the same basic code: creating a consumer, subscribing to topics, and entering an infinite poll‑then‑process loop.

KafkaConsumer<String, Payment> consumer = new KafkaConsumer<>(props);
// Subscribe to Kafka topics
consumer.subscribe(topics);
while (true) {
    // Poll Kafka for new messages
    ConsumerRecords<String, String> records = consumer.poll(100);
    // Processing logic
    for (ConsumerRecord<String, String> record : records) {
        doSomething(record);
    }
}

This simple model works but suffers from several issues that will be detailed next.

Problems

Developers may think poll is merely a request for data. If processing is slow, Kafka acts as a buffer, but the interval between polls grows, hitting the default max.poll.interval.ms (5 minutes). If poll isn’t called within this timeout, the consumer is considered dead and a rebalance occurs, adding latency.

max.poll.interval.ms Maximum delay between calls to poll() when using consumer groups. If the timeout expires, the consumer is deemed failed and a rebalance is triggered.

Increasing max.poll.interval.ms isn’t ideal because rebalance time can be up to twice this value. Setting a lower max.poll.records reduces the number of records per poll, shortening the interval, but the fundamental issue remains.

Message Processing Is Asynchronous

Kafka guarantees order only within a partition; messages from different partitions can be processed in parallel. Running as many consumers as partitions maximizes parallelism but adds overhead and more rebalance opportunities.

Using a thread pool to process each record works only when ordering and delivery guarantees are not required.

A Better Model

Overview

The poll‑then‑process loop mixes polling, processing, and offset committing. By separating these concerns we obtain a model that supports parallel processing and back‑pressure.

Work Queues

Work queues act as the communication channel between the poller and the executor:

Each assigned TopicPartition maps one‑to‑one to a work queue. After each poll, the poller pushes new records to the corresponding queue, preserving order. Queues have a configurable size; when full they apply back‑pressure to the poller.

Work queues are asynchronous, decoupling polling from processing, unlike the tightly coupled poll‑then‑process loop.

Poller

The poller encapsulates everything related to polling Kafka:

Monitors rebalance events via ConsumerRebalanceListener and coordinates other components.

Creates a new work queue for each newly assigned TopicPartition and tears down queues for revoked partitions.

Uses a short, configurable interval (e.g., 50 ms) to poll, far lower than max.poll.interval.ms, avoiding rebalance storms.

If an executor cannot keep up, its queue fills, the poller pauses that partition, and resumes it when the queue drains.

pause(Collection<TopicPartition> partitions) Pauses fetching from the given partitions until resume is called.

Executor

The executor is a thread‑pool‑like component that runs workers to process messages:

Number of workers is configurable to match CPU or I/O constraints.

Each queue is handled by a single worker, preserving order within a partition.

A worker may serve multiple queues, enabling parallel processing across partitions.

Offset Manager

Every Kafka record has an offset that serves as a checkpoint. The offset manager is responsible for storing and committing offsets, either to Kafka or an external store, and can operate synchronously or asynchronously.

Manual offset commits (disable enable.auto.commit) give precise control over delivery guarantees.

Commit strategies can be batch‑based, timer‑based, etc.

Automatic commits provide at‑least‑once delivery but no processing guarantees; therefore manual management is preferred.

Processing Guarantees

At‑Most‑Once

Commit the offset before processing the message. If processing fails, the message is lost, but duplicates cannot occur.

At‑Least‑Once

Commit the offset only after successful processing. If a failure occurs before the commit, the message may be re‑processed, ensuring no loss.

Exactly‑Once (External Offset Management)

Perform offset storage and message handling within a single transaction, requiring tight coordination between executor and offset manager.

public void seek(TopicPartition partition, long offset) Overrides the offset that the consumer will use on the next poll.

Summary

We examined the drawbacks of the poll‑then‑process loop and introduced a more suitable model consisting of work queues, a poller, an executor, and an offset manager. While the model is more complex, it provides clearer semantics for parallelism, back‑pressure, and processing guarantees, making it valuable for evaluating existing Kafka client libraries such as Alpakka Kafka, Spring for Kafka, or zio‑kafka.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

JavaKafkaConsumerPollingProcessing Guarantees
IT Architects Alliance
Written by

IT Architects Alliance

Discussion and exchange on system, internet, large‑scale distributed, high‑availability, and high‑performance architectures, as well as big data, machine learning, AI, and architecture adjustments with internet technologies. Includes real‑world large‑scale architecture case studies. Open to architects who have ideas and enjoy sharing.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.