Kafka Idempotent and Transactional Messaging Overview

This article explains how Kafka implements idempotent producers and transactional messaging to achieve exactly‑once semantics, detailing the producer session identifiers, sequence numbers, broker checks, two‑phase commit workflow, consumer isolation levels, and the limitations of atomic reads.

Big Data Technology & Architecture
Big Data Technology & Architecture
Big Data Technology & Architecture
Kafka Idempotent and Transactional Messaging Overview

Kafka introduces idempotent messages to solve duplicate and out‑of‑order issues caused by retries, ensuring that a producer’s writes to a partition within a single session are idempotent and enabling Exactly‑Once semantics.

The implementation is straightforward: each producer obtains a globally unique pid when it starts, and every batch of messages includes an incrementing sequence number. The broker maintains a (pid, seq) mapping in memory and checks the incoming sequence number to classify messages as normal, duplicate, or lost.

new_seq = old_seq + 1: normal message;</code><code>new_seq <= old_seq: duplicate message;</code><code>new_seq > old_seq + 1: message loss;

If a producer receives a clear loss acknowledgment or times out without an ack, it retries the send.

Transactional messaging addresses the need for atomic writes of multiple messages in stream‑processing scenarios, involving the producer, transaction coordinator, broker, group coordinator, and consumer.

Producer : Assigned a fixed TransactionalId that persists across restarts, uses an epoch to identify each “rebirth,” follows idempotent behavior, and adds transaction id and epoch to record batches.

Transaction Coordinator : Implements a two‑phase commit using a special transaction topic for logs, balances load by hashing TransactionalId to a broker, and coordinates commit/abort across participants.

Broker : Writes control messages (commit/abort) interleaved with normal messages, advances the high‑water mark, and may also write transactional offset messages.

Group Coordinator : Records transactional consumer offsets in the offset log; these offsets become visible only after the transaction commits.

Consumer : Filters uncommitted and control messages, making them invisible to the user. Two approaches exist:

Consumer‑side caching (set isolation.level=read_uncommitted) where the consumer buffers messages until a commit or abort is received.

Broker‑side filtering (set isolation.level=read_committed) where only committed messages are delivered.

The transaction flow includes:

FindCoordinatorRequest – locate the transaction coordinator.

InitPidRequest – request a pid and epoch.

Start transaction – local producer state change.

Register partitions (AddPartitionsToTxnRequest) – inform the coordinator of involved partitions.

ProduceRequest – send messages with tid, pid, epoch, and seq.

AddOffsetCommitsToTxnRequest – send consumer offsets to the coordinator.

TxnOffsetCommitRequest – write offsets to the offset log (visible after commit).

EndTxnRequest – coordinator writes PREPARE_COMMIT or PREPARE_ABORT, sends WriteTxnMarkerRequest to all involved brokers, and finally writes COMMITTED or ABORTED markers.

After all commit/abort markers are persisted, the transaction is considered complete and its state can be cleaned up.

While Kafka’s transactional messaging guarantees atomic writes across multiple topics, it does not guarantee atomic reads; failures can lead to partial visibility of transaction data, and stream applications may produce nondeterministic results.

In summary, Kafka provides powerful exactly‑once delivery mechanisms through idempotent producers and transactional messaging, but developers must be aware of its read‑side limitations.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

KafkaTransactional MessagingIdempotent Producer
Big Data Technology & Architecture
Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.