Mastering Message Idempotency: From Simple Checks to State‑Machine Solutions

This article explores the challenges of duplicate message consumption in distributed systems, explains why naive de‑duplication fails under high concurrency, and presents four progressively robust idempotency strategies—from database pessimistic locks and local message tables to a state‑machine approach with Redis or MySQL, highlighting their trade‑offs.

Architecture Digest

1. What is Message Idempotency?

In distributed systems, message middleware (MQ) handles asynchronous communication, decouples applications, and smooths traffic spikes. One of its core guarantees is “at least once” delivery: every message is delivered to the consumer at least once, which also means it may be delivered more than once.

This leads to a problem: if a consumer processes a message but crashes before acknowledging it, the MQ will consider the message unprocessed and retry delivery, causing possible duplicate consumption.

For non‑idempotent operations such as inserting order data or decrementing inventory, duplicate consumption can cause catastrophic issues like primary‑key conflicts or double inventory deduction. Therefore, ensuring the consumer logic is idempotent—producing the same result no matter how many times it runs—is essential.

2. Why Simple De‑duplication Fails

One intuitive solution is to check whether the business operation has already been performed before executing it. For order processing, the code might look like:

-- 1. Check if order exists
select * from t_order where order_no = 'THIS_ORDER_NO';

-- 2. If not exists, execute business logic
if(order == null) {
    insert into t_order values ...
    update t_inv set count = count-1 where good_id = 'good123';
}

This approach works under low concurrency but suffers from a race condition under high concurrency. If two identical messages arrive within milliseconds, both may see the select result as null and both proceed to execute the business logic, resulting in duplication.
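The race can be reproduced deterministically. The sketch below is only an illustration, not the article's code: an in-memory set and dict stand in for t_order and t_inv, and a threading.Barrier (an assumption added purely to force the unlucky interleaving) makes every consumer finish its check before any of them writes.

```python
import threading


def run_duplicates(n_consumers: int = 2, stock: int = 10) -> dict:
    """Deliver the same message to n_consumers using naive check-then-act."""
    orders = set()                      # stand-in for t_order
    inventory = {"good123": stock}      # stand-in for t_inv
    write_lock = threading.Lock()       # keeps each individual write atomic
    all_checked = threading.Barrier(n_consumers)  # forces the bad interleaving

    def consume(order_no: str) -> None:
        exists = order_no in orders     # step 1: "select ... where order_no = ?"
        all_checked.wait()              # every consumer has checked before any writes
        if not exists:                  # step 2: business logic
            with write_lock:
                orders.add(order_no)
                inventory["good123"] -= 1

    threads = [
        threading.Thread(target=consume, args=("ORDER-1",))
        for _ in range(n_consumers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return inventory


result = run_duplicates()
print(result)  # {'good123': 8}: one order, but inventory was deducted twice
```

A real t_order table with a unique key on order_no would reject the second insert, but only after both consumers had already passed the check, which is exactly the window described above.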

3. How to Achieve Robust Idempotency – Four Evolutionary Solutions

Solution 1: Database Pessimistic Lock (SELECT … FOR UPDATE)

Run the SELECT with FOR UPDATE inside a transaction, so that the matching row stays locked until the transaction ends.

-- 1. Begin transaction
-- 2. Lock record
select * from t_order where order_no = 'THIS_ORDER_NO' for update;

-- 3. Check status and execute
if(order.status == null) {
    // ... business logic ...
}
-- 4. Commit transaction

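The first transaction locks the row, so subsequent duplicate messages block on the SELECT FOR UPDATE until the first transaction commits, and then see the updated status. Note that this relies on the row already existing; if there is no row yet, there is nothing to lock, and the behavior depends on the database and isolation level. The approach solves the concurrency issue, but it enlarges the transaction scope: locks on business rows are held for as long as the business logic runs, which reduces throughput.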
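The blocking behavior can be imitated in a single process. The following is an analogue only, not real database locking: a per-key threading.Lock plays the role of the row lock taken by SELECT ... FOR UPDATE, and the names OrderTable, lock_row and the status value "PROCESSED" are hypothetical.

```python
import threading
from collections import defaultdict
from typing import Dict, Optional


class OrderTable:
    """In-memory stand-in for t_order with one lock per row."""

    def __init__(self) -> None:
        self.status: Dict[str, Optional[str]] = {}
        self._row_locks = defaultdict(threading.Lock)
        self._guard = threading.Lock()  # protects the lock registry itself

    def lock_row(self, order_no: str) -> threading.Lock:
        with self._guard:
            return self._row_locks[order_no]


def consume(table: OrderTable, inventory: Dict[str, int], order_no: str) -> bool:
    """Return True if this delivery executed the business logic."""
    with table.lock_row(order_no):              # duplicates wait here
        if table.status.get(order_no) is None:  # check status under the lock
            inventory["good123"] -= 1           # business logic
            table.status[order_no] = "PROCESSED"
            return True
        return False                            # already handled by an earlier delivery


table = OrderTable()
table.status["ORDER-1"] = None                  # the row exists, not yet processed
inventory = {"good123": 10}
results = []
threads = [
    threading.Thread(target=lambda: results.append(consume(table, inventory, "ORDER-1")))
    for _ in range(5)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Only one of the five deliveries runs the business logic; the other four wait for the lock and then skip. The cost is visible too: all duplicates are serialized behind whichever delivery holds the lock.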

Solution 2: Local Message Table (Bind Message Record with Business Transaction)

Create a msg_consumed table to store consumption records and bind “record consumption” and “execute business logic” within the same database transaction.

Begin transaction

Insert a message record into msg_consumed (using business ID or message ID as primary key).

Execute core business logic (e.g., update order table).

Commit transaction

If the transaction succeeds, both the message record and the business data are persisted. Subsequent duplicate messages fail to insert because of the primary‑key conflict, which achieves idempotency. If the transaction rolls back, the message record is rolled back with it, so the retried message can be processed normally.
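A minimal sketch of this flow, using Python's built-in sqlite3 as a stand-in for the relational database. The column name msg_key, the helper names, and the reduced t_inv schema are assumptions for illustration.

```python
import sqlite3
from typing import Callable


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE msg_consumed (msg_key TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE t_inv (good_id TEXT PRIMARY KEY, count INTEGER)")
    conn.execute("INSERT INTO t_inv VALUES ('good123', 10)")
    conn.commit()
    return conn


def stock(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT count FROM t_inv WHERE good_id = 'good123'").fetchone()[0]


def deduct_inventory(conn: sqlite3.Connection) -> None:
    conn.execute("UPDATE t_inv SET count = count - 1 WHERE good_id = 'good123'")


def consume(
    conn: sqlite3.Connection,
    msg_key: str,
    business: Callable[[sqlite3.Connection], None] = deduct_inventory,
) -> bool:
    """Return True if the message was processed, False if it was a duplicate."""
    with conn:  # one transaction: commits on success, rolls back on exception
        try:
            conn.execute("INSERT INTO msg_consumed (msg_key) VALUES (?)", (msg_key,))
        except sqlite3.IntegrityError:
            return False    # primary-key conflict: already consumed
        business(conn)      # same transaction as the message record
    return True
```

If business raises, the exception propagates and the with block rolls back the msg_consumed row together with any partial business writes, so a redelivery of the same message is accepted again.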

Limitations:

Strong reliance on relational‑database transactions: steps such as RPC calls or Redis writes do not take part in the database transaction, so they cannot be made atomic with it.

Single‑database constraint: the message record and the business data must live in the same database; cross‑database transactions are not covered.

Solution 3: State‑Machine Idempotency (Non‑transactional)

Design a non‑transactional solution based on a separate “idempotent table” (in MySQL or Redis) that records the processing state of each unique business key.

Core flow:

Query the idempotent record by unique key.

Determine state:

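No record → insert a CONSUMING record with a timeout, then execute business logic. The insert itself must be atomic (for example, a unique key in MySQL or SET with NX in Redis); otherwise two concurrent consumers could both see “no record” and both proceed, which is the same race as in the naive check.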

State CONSUMING → another message is already being processed; reject and schedule a delayed retry.

State COMPLETED → business already succeeded; acknowledge the message.

After business execution:

Success → update the record to COMPLETED.

Failure → delete the record so that a retry can re‑enter the flow.

Timeout handling is crucial: if a consumer crashes while a record is in CONSUMING, a periodic task or Redis TTL removes the stale record, allowing the message to be retried.
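The flow above can be sketched as follows. This is an in-memory illustration under stated assumptions, not a production component: a dict guarded by a lock stands in for the idempotent table, the clock is injectable so that timeouts can be exercised without sleeping, COMPLETED records never expire, and the names IdempotencyStore, Outcome and consume are hypothetical.

```python
import threading
import time
from enum import Enum
from typing import Callable, Dict, Optional, Tuple


class State(Enum):
    CONSUMING = "CONSUMING"
    COMPLETED = "COMPLETED"


class Outcome(Enum):
    PROCESSED = "processed"        # business logic ran and succeeded
    RETRY_LATER = "retry_later"    # another delivery is in progress
    ALREADY_DONE = "already_done"  # safe to acknowledge


class IdempotencyStore:
    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._lock = threading.Lock()
        self._records: Dict[str, Tuple[State, Optional[float]]] = {}

    def begin(self, key: str, ttl_seconds: float) -> Optional[State]:
        """Atomically claim key. Returns None if claimed, else the live state."""
        with self._lock:
            record = self._records.get(key)
            if record is not None:
                state, expires_at = record
                expired = expires_at is not None and self._clock() >= expires_at
                if not expired:
                    return state
            self._records[key] = (State.CONSUMING, self._clock() + ttl_seconds)
            return None

    def complete(self, key: str) -> None:
        with self._lock:
            self._records[key] = (State.COMPLETED, None)

    def release(self, key: str) -> None:
        with self._lock:
            self._records.pop(key, None)


def consume(
    store: IdempotencyStore,
    key: str,
    business: Callable[[], None],
    ttl_seconds: float = 60.0,
) -> Outcome:
    existing = store.begin(key, ttl_seconds)
    if existing is State.COMPLETED:
        return Outcome.ALREADY_DONE
    if existing is State.CONSUMING:
        return Outcome.RETRY_LATER
    try:
        business()
    except Exception:
        store.release(key)   # failure: let a retry re-enter the flow
        raise
    store.complete(key)
    return Outcome.PROCESSED
```

Treating an expired CONSUMING record as absent inside begin plays the role of the periodic cleanup task or Redis TTL: a consumer that crashed mid-flight blocks retries only until its timeout passes.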

Solution 4: Choosing the Storage Medium (Redis vs. MySQL)

The state‑machine approach does not depend on transactions, so the storage can be selected based on performance and durability needs.

MySQL: high reliability and persistence, but lower performance and requires extra tasks for timeout cleanup.

Redis: extremely fast and natively supports TTL, perfect for timeout handling, but offers weaker durability.

The choice depends on the specific trade‑off between performance and data consistency required by the business.
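To make the difference concrete, here is a hedged sketch of the “claim” step on each medium. The Redis side relies on SET with the NX and EX options, which redis-py exposes as set(name, value, nx=True, ex=seconds); the code only assumes a client object with that method, and FakeRedis is a tiny stand-in (it ignores expiry) so the example runs without a server. The key prefix idem:, the table t_idempotent and its columns are hypothetical.

```python
from typing import Dict, Optional, Protocol


class RedisLike(Protocol):
    def set(
        self, name: str, value: str, ex: Optional[int] = None, nx: bool = False
    ) -> Optional[bool]: ...


def claim_in_redis(client: RedisLike, biz_key: str, ttl_seconds: int = 60) -> bool:
    """SET idem:<key> CONSUMING NX EX <ttl>: insert and timeout in one atomic command."""
    return bool(client.set(f"idem:{biz_key}", "CONSUMING", nx=True, ex=ttl_seconds))


# MySQL equivalent: a unique key on biz_key makes the insert atomic,
# but stale CONSUMING rows need a periodic cleanup job.
MYSQL_CLAIM_SQL = (
    "INSERT INTO t_idempotent (biz_key, status, expire_at) "
    "VALUES (%s, 'CONSUMING', NOW() + INTERVAL %s SECOND)"
)
MYSQL_CLEANUP_SQL = (
    "DELETE FROM t_idempotent WHERE status = 'CONSUMING' AND expire_at < NOW()"
)


class FakeRedis:
    """Minimal in-memory stand-in for demonstration; expiry is not simulated."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}

    def set(
        self, name: str, value: str, ex: Optional[int] = None, nx: bool = False
    ) -> Optional[bool]:
        if nx and name in self.data:
            return None          # key exists: NX refuses to overwrite
        self.data[name] = value
        return True


client = FakeRedis()
first = claim_in_redis(client, "ORDER-1")    # True: this consumer owns the key
second = claim_in_redis(client, "ORDER-1")   # False: a duplicate is in flight
```

With Redis the timeout is part of the claim itself; with MySQL the same effect takes an extra column plus a scheduled job running the cleanup statement.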

4. Pros and Cons – Is This a Silver Bullet?

The state‑machine solution is not a silver bullet, but it covers a lot of ground.

It addresses most duplicate‑message scenarios, including broker‑induced retries, upstream duplicate sends, and high‑concurrency “window” issues. It also decouples idempotency logic from business code, so it can be packaged as a pluggable component.

However, it cannot guarantee idempotency for multi‑step processes that involve external RPC calls. For example, if a lock‑inventory RPC succeeds but the subsequent database insert fails or the service crashes, the idempotent record is removed (by the failure path or by the timeout). The retry then executes the whole flow again and locks inventory twice.
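The failure mode is easy to reproduce in miniature. In this sketch a plain set stands in for the idempotent table, and InventoryService is a hypothetical stand-in for a remote service whose lock call is not idempotent.

```python
from typing import Callable, Set


class InventoryService:
    """Stand-in for a remote, non-idempotent inventory service."""

    def __init__(self) -> None:
        self.locked = 0

    def lock(self, quantity: int) -> None:
        self.locked += quantity


def consume(
    key: str,
    idem_records: Set[str],
    inventory: InventoryService,
    insert_order: Callable[[], None],
) -> str:
    if key in idem_records:
        return "skipped"
    idem_records.add(key)            # record enters CONSUMING
    try:
        inventory.lock(1)            # step 1: RPC succeeds, no local rollback possible
        insert_order()               # step 2: local insert may fail
    except Exception:
        idem_records.discard(key)    # failure path removes the record
        return "retry"
    return "done"


def failing_insert() -> None:
    raise RuntimeError("insert failed")


def working_insert() -> None:
    pass


inventory = InventoryService()
records: Set[str] = set()
first = consume("ORDER-1", records, inventory, failing_insert)   # "retry"
second = consume("ORDER-1", records, inventory, working_insert)  # "done"
print(inventory.locked)  # 2, although only one order was placed
```

The idempotent record did its job for each attempt, yet the side effect of the first attempt's RPC survived the failure. This is why the remaining layers in the summary are needed.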

5. Summary – Building a Complete Idempotency Defense System

Achieving near‑100% idempotency requires a layered “combo”:

Core: adopt the state‑machine scheme to handle the majority of duplicate cases.

Make downstream services themselves idempotent (e.g., inventory lock).

Provide rollback or compensation mechanisms for consumption failures.

Implement graceful shutdown and consumption monitoring to avoid abrupt kills and to handle dead‑letter messages.
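As an illustration of the second layer, a downstream service can deduplicate on a caller-supplied request ID, so that a retried flow does not repeat its side effect. The class and parameter names below are hypothetical, and a real service would persist the seen IDs rather than keep them in memory.

```python
import threading
from typing import Set


class IdempotentInventoryService:
    """Inventory lock keyed by request ID: repeating a request is a no-op."""

    def __init__(self) -> None:
        self.locked = 0
        self._seen: Set[str] = set()
        self._guard = threading.Lock()

    def lock(self, request_id: str, quantity: int) -> bool:
        """Return True if inventory was locked now, False if already applied."""
        with self._guard:
            if request_id in self._seen:
                return False
            self._seen.add(request_id)
            self.locked += quantity
            return True


service = IdempotentInventoryService()
service.lock("ORDER-1", 1)   # first attempt of the flow
service.lock("ORDER-1", 1)   # retry after a failed insert: ignored
print(service.locked)        # 1
```

Using the order number (or another stable business key) as the request ID lets every retry of the same message map to the same downstream request.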

With this multi‑layered defense, we can truly tame the “message duplication” beast in complex distributed environments.
