Common Issues in Message Queues and Distributed Transaction Solutions
This article explains the typical problems encountered with message queues, such as message loss, duplicate delivery, and distributed transaction handling, and details various solutions including local message tables, MQ‑based transactions, and the specific mechanisms used by RocketMQ, Kafka, and RabbitMQ to ensure reliability and consistency.
Message Queue Common Issues
Distributed Transactions
What Is a Distributed Transaction
When a system evolves from a single machine to a multi‑node distributed architecture, its services must communicate over the network. A call that used to be a reliable in‑process method call can now fail or time out partway through, so keeping data synchronised across services becomes a classic distributed‑transaction problem.
In a distributed transaction, participants, transaction‑supporting servers, resource servers and the transaction manager reside on different nodes, and the goal is to guarantee data consistency across those nodes.
Common Distributed‑Transaction Solutions
1. 2PC (Two‑Phase Commit) – strong consistency
2. 3PC (Three‑Phase Commit)
3. TCC (Try‑Confirm‑Cancel) – eventual consistency
4. Saga – eventual consistency
5. Local message table – eventual consistency
6. MQ transaction – eventual consistency
The focus here is on using message queues to achieve distributed consistency; detailed designs of the above schemes are referenced at the end of the article.
Distributed Transaction Based on MQ
Local Message Table – eventual consistency
The producer not only executes its business logic but also writes a record into a local message table. Each record carries a status flag indicating whether the message has been successfully processed.
The business operation and the insertion into the message table are performed within a single local transaction. This rules out both inconsistent outcomes: the business operation succeeding while the message record fails, and the business operation failing while the message record succeeds.
Example: an order service and a cart service. The order service completes its order‑creation logic and sends a message to the cart service to clear the purchased items. The cart service listens to the queue, processes the message, and replies with a success or failure acknowledgment. If the acknowledgment is received, the order service marks the transaction as completed; otherwise it retries via a scheduled task. Both sides must implement idempotency to handle possible duplicate messages.
Key operations:
Both producer and consumer must be idempotent.
The producer should periodically scan the message table and resend unprocessed messages to avoid loss.
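The local message table can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a production design: the table names, the status values PENDING and DONE, and the functions open_db, create_order, and resend_pending are all hypothetical, and the send callback stands in for whatever MQ client the order service would really use.

```python
import sqlite3
import uuid


def open_db():
    """Create an in-memory database with a business table and a message table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, item TEXT NOT NULL)")
    db.execute(
        "CREATE TABLE local_message ("
        " msg_id TEXT PRIMARY KEY,"
        " payload TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'PENDING')"
    )
    return db


def create_order(db, item):
    """Write the order and its outgoing message in one local transaction."""
    order_id = str(uuid.uuid4())
    with db:  # commits on success, rolls back if an exception is raised
        db.execute("INSERT INTO orders (id, item) VALUES (?, ?)", (order_id, item))
        db.execute(
            "INSERT INTO local_message (msg_id, payload) VALUES (?, ?)",
            (str(uuid.uuid4()), "clear-cart:" + order_id),
        )
    return order_id


def resend_pending(db, send):
    """Scheduled task: resend every message that has not been acknowledged.

    `send(msg_id, payload)` is a hypothetical callback that returns True
    once the consumer has acknowledged the message.
    """
    rows = db.execute(
        "SELECT msg_id, payload FROM local_message WHERE status = 'PENDING'"
    ).fetchall()
    for msg_id, payload in rows:
        if send(msg_id, payload):
            with db:
                db.execute(
                    "UPDATE local_message SET status = 'DONE' WHERE msg_id = ?",
                    (msg_id,),
                )
    return len(rows)  # number of messages attempted in this scan
```

Because an unacknowledged message stays PENDING and is sent again on the next scan, the consumer can see the same msg_id more than once, which is why idempotency is required on the receiving side.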
MQ Transaction – eventual consistency
MQ transaction mechanisms ensure that the local transaction and the message send either both succeed or both fail.
How RocketMQ Handles Transactions
RocketMQ Transaction Overview
RocketMQ uses a two‑phase commit model with a transaction‑check mechanism to improve success rates and data consistency.
Normal transaction commit steps:
Send a half‑message (invisible to consumers).
The MQ server records the message and returns a response.
If the server confirms that the half‑message was stored, the producer executes the local transaction; otherwise it does not.
The producer sends a Commit or Rollback to the server according to the local transaction result; a Commit makes the message visible to consumers, a Rollback discards it.
If the server does not receive a Commit or Rollback, it initiates a compensation flow by querying the producer for the transaction status and acting accordingly.
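The flow above can be modelled with a small in-memory simulation. It is only a sketch of the protocol's shape and does not use the RocketMQ client API: ToyBroker, TxState, send_in_transaction, and check_pending are hypothetical names invented for illustration.

```python
from enum import Enum


class TxState(Enum):
    COMMIT = "commit"
    ROLLBACK = "rollback"
    UNKNOWN = "unknown"


class ToyBroker:
    """In-memory stand-in for a broker that supports half messages."""

    def __init__(self):
        self.half = {}     # msg_id -> body, invisible to consumers
        self.visible = []  # bodies that consumers may read

    def send_half(self, msg_id, body):
        self.half[msg_id] = body
        return True  # the broker's response to the half message

    def end_transaction(self, msg_id, state):
        if state is TxState.COMMIT:
            self.visible.append(self.half.pop(msg_id))
        elif state is TxState.ROLLBACK:
            self.half.pop(msg_id, None)
        # UNKNOWN: keep the half message and wait for a later check

    def check_pending(self, check_local_tx):
        """Compensation flow: ask the producer about unresolved half messages."""
        for msg_id in list(self.half):
            self.end_transaction(msg_id, check_local_tx(msg_id))


def send_in_transaction(broker, msg_id, body, run_local_tx):
    """Send a half message, run the local transaction, then report the result."""
    if not broker.send_half(msg_id, body):
        return TxState.ROLLBACK  # half message not stored: skip the local transaction
    try:
        state = run_local_tx()
    except Exception:
        state = TxState.ROLLBACK
    broker.end_transaction(msg_id, state)
    return state
```

In this model an UNKNOWN result leaves the half message in place, so the later check_pending call decides its fate by asking the producer what actually happened to the local transaction.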
How Kafka Handles Transactions
Kafka transactions guarantee that all messages sent within a transaction are either all committed or all aborted, providing exactly‑once semantics for the read‑process‑write pattern.
Transaction coordinator (part of the broker) records transaction IDs in a special log topic and manages the two‑phase commit:
Producer requests transaction start; coordinator logs the transaction ID.
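Producer sends messages; they are written to the log like normal messages, but consumers configured with isolation.level=read_committed do not see them until the transaction is committed.
Producer signals commit or abort. Commit: the coordinator records a PrepareCommit state in the transaction log and then writes commit markers to the partitions involved, after which read‑committed consumers can read the messages. Abort: the coordinator records a PrepareAbort state and writes abort markers; the messages remain in the log, but read‑committed consumers skip them.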
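The consumer-side effect of these markers can be illustrated with a simplified model of a single partition log. This sketch does not use the Kafka client API; the record layout (dictionaries with type, txn, and value keys) and the function read_committed are hypothetical, and it assumes every transaction uses a distinct id. It returns only data from committed transactions and stops at the first record of the earliest transaction that is still open.

```python
def read_committed(log):
    """Return the values a read-committed consumer would see, in log order."""
    outcome = {}       # txn id -> "commit" or "abort", taken from marker records
    first_offset = {}  # txn id -> offset of the transaction's first data record
    for offset, record in enumerate(log):
        if record["type"] == "data":
            first_offset.setdefault(record["txn"], offset)
        else:
            outcome[record["txn"]] = record["type"]

    # Nothing at or beyond the first record of an open transaction is readable.
    open_offsets = [off for txn, off in first_offset.items() if txn not in outcome]
    last_stable = min(open_offsets, default=len(log))

    return [
        record["value"]
        for record in log[:last_stable]
        if record["type"] == "data" and outcome[record["txn"]] == "commit"
    ]


log = [
    {"type": "data", "txn": "t1", "value": "a"},
    {"type": "data", "txn": "t2", "value": "b"},
    {"type": "commit", "txn": "t1"},
    {"type": "abort", "txn": "t2"},
    {"type": "data", "txn": "t3", "value": "c"},  # t3 is still open
]
visible = read_committed(log)  # only "a": t2 was aborted, t3 is unresolved
```

The aborted record "b" is never deleted from the log in this model; it is simply skipped on read.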
How RabbitMQ Handles Transactions
RabbitMQ transactions ensure that a producer’s message reaches the MQ server; they are less commonly used due to performance overhead.
Typical usage:
Enable transaction mode with channel.txSelect, send the messages, and on failure call channel.txRollback and retry; on success call channel.txCommit.
Alternatively, use publisher confirms (confirm mode) where each published message receives a unique delivery tag and the broker sends an acknowledgment (Basic.Ack) once the message is safely stored.
Confirm modes include synchronous, batch, and asynchronous confirms; asynchronous is generally preferred for higher throughput.
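With asynchronous confirms the producer has to remember which messages are still unconfirmed. The bookkeeping might look like the sketch below. It is a simulation rather than a RabbitMQ client: ConfirmTracker and its methods are hypothetical names, and the on_ack and on_nack methods would be called from the client library's confirm callbacks. The multiple flag mirrors the broker's ability to confirm every message up to and including a given delivery tag in one acknowledgment.

```python
class ConfirmTracker:
    """Tracks published messages until the broker confirms them."""

    def __init__(self):
        self.next_seq = 1      # confirm sequence numbers start at 1
        self.outstanding = {}  # sequence number -> message body

    def publish(self, body):
        """Record a message as published and return its sequence number."""
        seq = self.next_seq
        self.next_seq += 1
        self.outstanding[seq] = body
        return seq

    def on_ack(self, delivery_tag, multiple=False):
        """Broker stored the message(s): forget them."""
        self._clear(delivery_tag, multiple)

    def on_nack(self, delivery_tag, multiple=False):
        """Broker could not handle the message(s): return them for republishing."""
        return self._clear(delivery_tag, multiple)

    def _clear(self, delivery_tag, multiple):
        if multiple:
            tags = sorted(t for t in self.outstanding if t <= delivery_tag)
        else:
            tags = [delivery_tag]
        return [self.outstanding.pop(t) for t in tags if t in self.outstanding]
```

Anything still left in outstanding after a timeout is a candidate for resending, which again means the consumer may receive duplicates.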
Message Loss Prevention
Production Stage
Producer sends messages to the broker; network issues can cause loss.
RabbitMQ Loss‑Prevention Measures
Capture detectable errors and retry.
Use transactions (as described above) to roll back on failure.
Enable publisher confirms (channel.confirmSelect) to get acknowledgments from the broker.
Kafka Loss‑Prevention Measures
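Kafka relies on broker acknowledgments and replication: the producer waits for the broker's acknowledgment before considering a message safely persisted, and with acks=all that acknowledgment is sent only after all in‑sync replicas have the message.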
RocketMQ Loss‑Prevention Measures
Use synchronous send to wait for broker response.
Configure the broker cluster to replicate messages to at least two nodes before acknowledging the producer.
Storage Stage
Under normal operation a running broker does not lose messages, but a crash can cause loss. In RabbitMQ, persistence (durable exchanges, durable queues, and message delivery mode 2) mitigates this risk, though it cannot guarantee complete safety without replication (e.g., mirrored queues or, in newer RabbitMQ versions, quorum queues).
Consumption Stage
Consumers should acknowledge messages only after successful business processing to avoid premature loss.
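A minimal sketch of this rule, using an in-memory deque in place of a real queue: the function consume and the handle callback are hypothetical, and removing a message from the deque stands in for sending the acknowledgment. A message whose handler raises is never acknowledged, so it remains available for redelivery.

```python
from collections import deque


def consume(queue, handle):
    """Process each message; acknowledge (remove) it only after handle succeeds."""
    unacked = deque()
    while queue:
        message = queue[0]  # peek: the message stays queued while being processed
        try:
            handle(message)
        except Exception:
            unacked.append(queue.popleft())  # no ack: keep it for redelivery
        else:
            queue.popleft()  # ack only after business processing succeeded
    queue.extend(unacked)  # failed messages are back on the queue
```

Acknowledging before calling handle would reverse the trade-off: a crash during processing would then lose the message instead of causing a redelivery.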
Message Duplicate Delivery
Delivery Guarantees
Message delivery semantics are typically classified as:
At most once – may lose messages.
At least once – no loss but possible duplicates.
Exactly once – no loss and no duplicates (the highest guarantee).
Most MQs provide “at least once” semantics, so consumers must implement idempotency, e.g., using database unique keys, conditional updates, or attaching a unique ID to each message.
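One way to combine a unique message ID with a database unique key is sketched below with sqlite3. The schema and the names make_store and handle_once are hypothetical. The message ID is inserted into a table with a primary key in the same transaction as the business update, so a redelivered message violates the key and the whole transaction is rolled back.

```python
import sqlite3


def make_store():
    """Create an in-memory database with a dedup table and a business table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE processed (msg_id TEXT PRIMARY KEY)")
    db.execute(
        "CREATE TABLE account (user_id TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    return db


def handle_once(db, msg_id, user_id, amount):
    """Credit the account at most once per msg_id; return False for a duplicate."""
    try:
        with db:  # one transaction: dedup record and business update together
            db.execute("INSERT INTO processed (msg_id) VALUES (?)", (msg_id,))
            db.execute(
                "UPDATE account SET balance = balance + ? WHERE user_id = ?",
                (amount, user_id),
            )
    except sqlite3.IntegrityError:
        return False  # msg_id already recorded: duplicate delivery, nothing applied
    return True
```

Keeping the dedup insert and the update in a single transaction matters: if they were committed separately, a crash between the two could either lose the update or apply it twice.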