How to Prevent Message Loss and Duplication in Kafka and RocketMQ

This article analyzes the root causes of message loss and duplication in modern message queues, explains how synchronous and asynchronous sending affect reliability, compares storage strategies like sync‑flush and broker clustering, and offers concrete idempotent handling techniques using database constraints and Redis.

Architect

Message loss

Message loss can occur at three points in a typical queue workflow: when the producer sends the message, while the broker stores it, and when the consumer acknowledges it.

Send loss

In synchronous sending, the producer blocks until it receives an ACK from the broker. If the ACK does not arrive within the configured timeout, the send is considered failed and must be retried. Asynchronous sending improves throughput, but the producer learns about a failure only through a callback, so the callback must handle errors rather than ignore them. Kafka’s asynchronous API is:

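// Uses org.apache.kafka.clients.producer.Callback and RecordMetadata.
producer.send(record, new Callback() {
    @Override
    public void onCompletion(RecordMetadata metadata, Exception exception) {
        if (exception != null) {
            // The broker did not acknowledge the record: log it, then retry or alert.
            logger.error("Send failed", exception);
        } else {
            // Recent clients pass non-null metadata even on failure, so only
            // a null exception is treated as success here.
            logger.info("Send succeeded, offset {}", metadata.offset());
        }
    }
});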

RabbitMQ, RocketMQ and other mainstream brokers provide equivalent callback‑based APIs.
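
The retry rule for synchronous sends can be sketched as a small helper. This is a minimal illustration, not part of any client library: RetryingSender, sendWithRetry, and the linear backoff are assumptions made for the example, and sendOnce stands in for whatever blocking send the producer exposes.

```java
import java.util.concurrent.Callable;

/** Illustrative helper (hypothetical name): retries a blocking send a bounded number of times. */
final class RetryingSender {

    /**
     * Calls sendOnce until it returns without throwing, at most maxAttempts times.
     * sendOnce stands in for a blocking send that throws on timeout or broker error.
     */
    static <T> T sendWithRetry(Callable<T> sendOnce, int maxAttempts, long backoffMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return sendOnce.call();              // success: the ACK arrived
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();  // preserve the interrupt and stop retrying
                throw e;
            } catch (Exception e) {
                last = e;                            // timeout or broker error
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMillis * attempt);  // linear backoff between attempts
                }
            }
        }
        throw last;  // every attempt failed; surface the last error to the caller
    }
}
```

With the Kafka client, sendOnce could be () -> producer.send(record).get(5, TimeUnit.SECONDS), since send returns a Future. Each retry may deliver a record whose earlier attempt actually succeeded but whose ACK was lost, so the helper trades possible loss for possible duplication.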

Message storage

Even after the producer receives an ACK, the broker may crash before persisting the record to durable storage, causing loss. Two mitigation techniques are commonly used.

Sync flush

Configure the broker to perform a synchronous disk flush so that the ACK is sent only after the message is safely written to disk. In RocketMQ this is achieved by setting flushDiskType=SYNC_FLUSH.

[Figure: sync flush illustration]

Broker cluster

Running a single broker creates a single point of failure. A cluster replicates each message to multiple nodes; the producer can be configured to wait for acknowledgments from two or more replicas before returning success. This guarantees that the loss of any single broker does not make the message unavailable.

[Figure: broker cluster diagram]
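
For Kafka, waiting on replica acknowledgments is a producer setting. The sketch below builds such a configuration with plain java.util.Properties. The class name ReliableProducerConfig and the chosen timeout are assumptions for illustration; the property keys are standard Kafka producer settings.

```java
import java.util.Properties;

/** Illustrative builder (hypothetical name) for durability-oriented producer settings. */
final class ReliableProducerConfig {

    static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Wait until all in-sync replicas have the record before the send counts as successful.
        props.put("acks", "all");
        // Let the broker discard duplicates caused by the producer's own internal retries.
        props.put("enable.idempotence", "true");
        // Upper bound on the total time spent sending and retrying one record (assumed value).
        props.put("delivery.timeout.ms", "120000");
        return props;
    }
}
```

The result can be passed to new KafkaProducer<>(props). How many replicas "all" means in practice is decided on the topic or broker side through min.insync.replicas, for example 2 on a topic with replication factor 3.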

Message consumption

Consumers must acknowledge a message only after the processing logic has completed. If the broker does not receive the ACK it will redeliver the message. Some systems pull a message, immediately ACK it, and process the payload asynchronously. To avoid loss in that pattern the consumer must persist the payload locally (e.g., insert into a database) before sending the ACK.
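
The "process first, acknowledge second" rule can be sketched independently of any broker. AckAfterProcessing and handle are hypothetical names; the handler and ack arguments stand in for the business logic and for the client's acknowledgment call.

```java
import java.util.function.Consumer;

/** Illustrative helper (hypothetical name): acknowledges only after successful processing. */
final class AckAfterProcessing {

    /**
     * Runs the handler, then acknowledges if and only if it completed normally.
     * Returns true when the message was acknowledged.
     */
    static <M> boolean handle(M message, Consumer<M> handler, Runnable ack) {
        try {
            handler.accept(message);
        } catch (RuntimeException e) {
            // No ACK is sent, so the broker will redeliver the message later.
            return false;
        }
        ack.run();  // reached only after the business logic has finished
        return true;
    }
}
```

With the Kafka consumer, this corresponds to setting enable.auto.commit to false and using consumer.commitSync() as the ack step once the polled records have been processed.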

Message duplication

Duplication originates from two situations: (1) the producer retries because it never received an ACK, and (2) the broker retries because it never received the consumer’s ACK. Duplicate consumption can cause severe business errors such as double payments.

Because retries make delivery at-least-once, no mainstream queue eliminates duplication end to end; the consumer must implement idempotent handling.

Database unique‑key constraint

If the message is stored in a relational database, define the message identifier as a unique key. An INSERT that violates the unique constraint is rejected, preventing duplicate processing. When the message is not stored directly, any business‑level attribute that uniquely identifies the event can be used as a unique key in a dedicated table.
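
A minimal JDBC sketch of the dedicated-table variant follows. The table consumed_message, its column message_id, and the class DedupInsert are hypothetical, and the sketch assumes the unique key is the only constraint on that table, so any integrity violation (SQLSTATE class 23) is read as "already consumed".

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Illustrative helper (hypothetical name) that records message IDs under a unique key. */
final class DedupInsert {

    /** SQLSTATE class 23 is the standard class for integrity constraint violations. */
    static boolean isDuplicateKey(SQLException e) {
        String state = e.getSQLState();
        return state != null && state.startsWith("23");
    }

    /**
     * Returns true if this call recorded the message ID first,
     * false if a row with the same ID already existed.
     */
    static boolean tryRecord(Connection conn, String messageId) throws SQLException {
        // Assumes: CREATE TABLE consumed_message (message_id VARCHAR(64) PRIMARY KEY)
        String sql = "INSERT INTO consumed_message (message_id) VALUES (?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, messageId);
            ps.executeUpdate();
            return true;
        } catch (SQLException e) {
            if (isDuplicateKey(e)) {
                return false;  // duplicate delivery: skip the business logic
            }
            throw e;           // any other failure is a real error
        }
    }
}
```

Running tryRecord in the same transaction as the business write keeps the two atomic: if the business logic rolls back, the recorded ID rolls back with it and a redelivery can be processed normally.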

Save consumption record in Redis

Another common pattern is to record the message ID in Redis and perform a SETNX (set if not exists) before processing. The Java example below demonstrates this approach:

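ValueOperations<String, String> ops = redisTemplate.opsForValue();
// setIfAbsent maps to SETNX: only the first caller for a given key gets true.
Boolean first = ops.setIfAbsent(messageId, messageId);
// Null-safe check: the result is null when called inside a pipeline or transaction.
if (Boolean.TRUE.equals(first)) {
    // business logic
} else {
    logger.warn("Message already consumed, skipping, id:{}", messageId);
}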

Important: If processing fails after the key has been set, the key must be deleted so that a later retry can succeed.
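
The claim-then-release rule can be sketched as follows. KeyStore is a hypothetical two-method view of Redis, and InMemoryKeyStore exists only so the example runs without a server; neither is part of Spring Data Redis.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Illustrative guard (hypothetical names): claim the ID, run the logic, release on failure. */
final class IdempotentConsumer {

    /** Minimal view of the key store; in production this would be backed by Redis. */
    interface KeyStore {
        boolean setIfAbsent(String key);  // true only for the first caller
        void delete(String key);
    }

    /** In-memory stand-in used only to make the sketch runnable. */
    static final class InMemoryKeyStore implements KeyStore {
        private final ConcurrentMap<String, Boolean> keys = new ConcurrentHashMap<>();

        public boolean setIfAbsent(String key) {
            return keys.putIfAbsent(key, Boolean.TRUE) == null;
        }

        public void delete(String key) {
            keys.remove(key);
        }
    }

    /** Returns true if the business logic ran, false if the message was a duplicate. */
    static boolean consumeOnce(KeyStore store, String messageId, Runnable businessLogic) {
        if (!store.setIfAbsent(messageId)) {
            return false;                // already claimed: skip
        }
        try {
            businessLogic.run();
            return true;
        } catch (RuntimeException e) {
            store.delete(messageId);     // release the claim so a redelivery can retry
            throw e;
        }
    }
}
```

Deleting the key only helps when the consumer is still alive to do it. With Spring Data Redis, the setIfAbsent(key, value, Duration) overload sets an expiry in the same call, which covers a consumer that crashes between claiming the key and finishing the work.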

Summary of mitigation strategies

Prevent loss: Use synchronous sends or reliable asynchronous callbacks, enable synchronous disk flush, and deploy a broker cluster with quorum‑based acknowledgments.

Handle retries: Retries are necessary for loss protection but inevitably introduce the possibility of duplicate delivery.

Prevent duplication: Apply idempotent processing on the consumer side—either via database unique constraints or a Redis SETNX guard.
