Kafka Producer Idempotency: PID, Sequence Numbers, and Broker Deduplication

Kafka ensures that a producer’s repeated message sends, caused by network glitches or broker failures, result in only one persisted record per partition by using a unique Producer ID, monotonically increasing sequence numbers, and broker-side tracking of the latest committed sequence for each PID‑partition pair.

Mike Chen's Internet Architecture

Introduction

Hello, I am mikechen. Kafka is essential middleware for large-scale architectures, and this article explains how Kafka producer idempotency works.

What Is Producer Idempotency?

Idempotency means that an operation yields the same result no matter how many times it is executed. For critical business systems such as payment, order, or inventory, duplicate writes are unacceptable; with idempotency, retrying a write does not create a second copy of the message.

Kafka Producer Idempotency

Kafka producer idempotency ensures that, even if a message is retried due to network jitter or temporary broker failures, the broker persists only a single copy of that message. In other words, sending the same message multiple times has the same effect as sending it once.
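As a rough illustration, the settings below are the standard Kafka producer configuration keys involved in idempotency. The dict form, the broker address, and the `check_idempotence_config` helper are assumptions made for this sketch and are not part of any Kafka client; the constraints the helper checks (acks set to all, retries above zero, at most five in-flight requests per connection) are the ones the producer documentation lists for `enable.idempotence=true`.

```python
# Standard Kafka producer configuration keys; the values are only an example.
producer_config = {
    "bootstrap.servers": "localhost:9092",  # assumed broker address
    "enable.idempotence": True,
    "acks": "all",
    "retries": 2147483647,
    "max.in.flight.requests.per.connection": 5,
}


def check_idempotence_config(config):
    """Hypothetical helper: list the settings that conflict with idempotence."""
    problems = []
    if not config.get("enable.idempotence", False):
        problems.append("enable.idempotence must be true")
    if str(config.get("acks")) not in ("all", "-1"):
        problems.append("acks must be all")
    if int(config.get("retries", 0)) <= 0:
        problems.append("retries must be greater than 0")
    if int(config.get("max.in.flight.requests.per.connection", 5)) > 5:
        problems.append("max.in.flight.requests.per.connection must be at most 5")
    return problems
```

Calling `check_idempotence_config(producer_config)` on the dict above returns an empty list; changing `acks` to `"1"` would add one entry to it.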

Key Components

Producer ID (PID) : Each producer instance obtains a unique identifier from the broker when it initializes. The PID stays constant for the lifetime of that producer session. A restarted producer is assigned a new PID unless it is configured with a transactional.id, so plain idempotency deduplicates only within a single session.

Sequence Number : For each partition, the producer maintains a monotonically increasing counter. Together with the PID, the sequence number uniquely identifies a message. The first message for a PID typically starts at 0, and the counter increments by 1 for each subsequent message, preserving the order of messages from the same producer.

Broker-Side Storage : For every <PID, Partition> pair, the broker keeps the sequence number of the most recently appended message. Kafka brokers hold this state in memory and can rebuild it after a restart, because the PID and sequence number are stored with each batch in the partition log.
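The producer-side half of these components can be sketched as a counter per partition. This is a minimal model, not the client's actual code: `RecordId` and `SequenceAssigner` are hypothetical names, the PID value is made up, and the sketch numbers individual messages, whereas the real client stamps a base sequence on each record batch.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordId:
    """The triple that identifies a message for deduplication."""
    pid: int
    partition: int
    sequence: int


class SequenceAssigner:
    """Producer-side sketch: one counter per partition, each starting at 0."""

    def __init__(self, pid):
        self.pid = pid
        self._next = defaultdict(int)  # partition -> next sequence to hand out

    def assign(self, partition):
        seq = self._next[partition]
        self._next[partition] = seq + 1
        return RecordId(self.pid, partition, seq)


assigner = SequenceAssigner(pid=7000)   # made-up PID
first = assigner.assign(partition=0)    # sequence 0 on partition 0
second = assigner.assign(partition=0)   # sequence 1 on partition 0
other = assigner.assign(partition=1)    # sequence 0 on partition 1
```

Because the counters are independent per partition, `first` and `other` share sequence 0 yet remain distinct identifiers, which is why the broker tracks state per <PID, Partition> pair rather than per PID alone.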

Implementation Principle

The idempotency mechanism relies on the coordinated work of the three components above. The overall flow is:

Assign a unique PID to each producer instance that enables idempotency.

Attach a monotonically increasing sequence number to each message to preserve order.

The broker maintains, for each <PID, Partition>, the most recent committed sequence number.

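When a message arrives, the broker compares its PID and sequence number with the stored value. A sequence number that is not greater than the stored one marks a duplicate: the broker does not append it again, but still acknowledges it so that the producer stops retrying.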

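The broker also checks sequence continuity: a new message must carry exactly the stored sequence number plus one. A larger value means an earlier message is missing, so the broker rejects the write, which prevents out-of-order appends and exposes possible message loss.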
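The broker-side checks in the steps above can be sketched for a single partition as follows. `PartitionState` and its return strings are hypothetical names for this illustration. The sketch assumes one stored sequence number per PID and ignores details of the real broker such as producer epochs, batching, and sequence wraparound.

```python
class PartitionState:
    """Broker-side sketch for one partition: last appended sequence per PID."""

    def __init__(self):
        self.last_seq = {}  # pid -> last sequence appended to the log
        self.log = []       # appended (pid, sequence, value) tuples

    def handle(self, pid, sequence, value):
        last = self.last_seq.get(pid, -1)  # -1 means nothing appended yet
        if sequence <= last:
            return "duplicate"     # already in the log: do not append again
        if sequence != last + 1:
            return "out_of_order"  # gap: an earlier message is missing
        self.log.append((pid, sequence, value))
        self.last_seq[pid] = sequence
        return "appended"


state = PartitionState()
results = [
    state.handle(42, 0, "a"),  # first message from PID 42
    state.handle(42, 1, "b"),  # next in sequence
    state.handle(42, 1, "b"),  # retry of the same message
    state.handle(42, 3, "d"),  # sequence 2 never arrived
]
# results == ["appended", "appended", "duplicate", "out_of_order"]
```

Only the first two calls change the log; the retry and the gapped message leave both the log and the stored sequence number untouched.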

Through this mechanism, even if a producer retries sending the same message because of network issues or other failures, the broker can recognize the duplicate and ensure that each message is written only once to the target partition.
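To make the retry case concrete, the sketch below simulates a producer whose acknowledgements are lost on the way back. Everything here is hypothetical: `Broker`, `send_with_retry`, and the `acks_lost` parameter exist only for this illustration, and the toy broker performs only the duplicate check, not the continuity check.

```python
class Broker:
    """Toy broker that performs only the duplicate check."""

    def __init__(self):
        self.log = []
        self.last_seq = {}  # (pid, partition) -> last appended sequence

    def produce(self, pid, partition, seq, value):
        key = (pid, partition)
        if seq <= self.last_seq.get(key, -1):
            return True  # duplicate: acknowledge again, append nothing
        self.log.append((partition, value))
        self.last_seq[key] = seq
        return True


def send_with_retry(broker, pid, partition, seq, value, acks_lost):
    """Resend the same (pid, partition, seq) until an acknowledgement arrives.

    acks_lost is how many acknowledgements the simulated network drops.
    Returns the number of send attempts.
    """
    attempts = 0
    while True:
        attempts += 1
        acked = broker.produce(pid, partition, seq, value)
        if acked and attempts > acks_lost:
            return attempts


broker = Broker()
attempts = send_with_retry(broker, pid=1, partition=0, seq=0,
                           value="order-1001", acks_lost=2)
# attempts == 3, and broker.log holds a single entry for "order-1001"
```

The message reaches the broker three times, but because every attempt carries the same PID, partition, and sequence number, only the first one is appended.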

Illustrations

Kafka idempotency components diagram
Kafka idempotency workflow diagram