Ensuring Reliable, Ordered, and Duplicate‑Free Messaging in IM Systems
This article explains the stringent reliability requirements of instant messaging (ordered delivery, low latency, no loss, and deduplication), analyzes causes of out-of-order delivery such as multi-process and multi-thread architectures, and presents practical solutions including hash-based routing, sequential IDs, push-pull mechanisms, ACK optimization, and distributed ID generation.
This article introduces our implementation of message transmission in IM scenarios to meet the quality requirements of instant messaging.
IM Message Requirements
Overall, IM scenarios demand stricter reliability, higher real‑time performance, and strong ordering guarantees.
Message Ordering
Causes of unordered messages include:
Multi‑process deployment
Multi‑threaded services
Multiple socket channels between components
Because different processes, threads, and sockets may handle messages in varying order, the final delivery order cannot be guaranteed.
Signalling Transmission
Signalling requires extremely high real‑time performance, so the system must forward messages in order under high concurrency.
For unicast messages (A → B), the system applies a consistent hash so that A's messages always travel the same serial route (same service, same thread, same socket).
Broadcast messages to a group use a hash of the group ID, ensuring all members receive messages in the same order.
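The routing rule above can be sketched as a small consistent-hash ring. This is a minimal illustration, not the article's implementation: the names HashRing and routing_key, the choice of MD5 as a stable hash, and the number of virtual nodes are all assumptions made for the example.

```python
import bisect
import hashlib
from typing import Optional


def _hash(key: str) -> int:
    # Stable 64-bit hash. Python's built-in hash() is randomized per process,
    # so it cannot be used to agree on routes across processes.
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring mapping a routing key to one worker."""

    def __init__(self, workers, vnodes=64):
        # Each worker owns several virtual nodes to even out the distribution.
        self._ring = sorted(
            (_hash(f"{w}#{i}"), w) for w in workers for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def route(self, key: str) -> str:
        # Pick the first virtual node clockwise from the key, wrapping around.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


def routing_key(sender_id: str, group_id: Optional[str] = None) -> str:
    # Assumption: unicast is keyed on the sender, broadcast on the group ID.
    if group_id is not None:
        return f"group:{group_id}"
    return f"user:{sender_id}"
```

With a ring built as HashRing(["w0", "w1", "w2"]), every message from the same sender resolves to the same worker, and every message to the same group resolves to the same worker regardless of who sent it. The same lookup would be repeated at each hop (service, thread, socket) to keep the whole route serial.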
IM Chat
Chat messages must never be lost; users expect to see every message even after reconnection.
Each incoming message is assigned an auto-incrementing sequence ID (seqid) and stored in a cache in seqid order.
To deliver messages in order, the server does not push data immediately. Instead, it sends a notification prompting the client to pull messages starting from the client’s maximum received seqid, ensuring ordered retrieval.
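A minimal sketch of the notify-then-pull flow, assuming a single-process, in-memory cache per conversation. The class name Conversation and the limit parameter are hypothetical; a real deployment would keep this state in a shared cache.

```python
import itertools


class Conversation:
    """Hypothetical per-conversation message cache ordered by seqid."""

    def __init__(self):
        self._next_seqid = itertools.count(1)
        self._messages = []  # list of (seqid, payload), ascending by seqid

    def append(self, payload):
        # Assign the next seqid and store the message in order.
        seqid = next(self._next_seqid)
        self._messages.append((seqid, payload))
        return seqid

    def pull(self, after_seqid, limit=100):
        # Return messages newer than the client's maximum received seqid.
        newer = [m for m in self._messages if m[0] > after_seqid]
        return newer[:limit]
```

On receiving a notification, the client calls pull with the largest seqid it already holds. Because the result is always a contiguous, ascending slice, arrival order on the network no longer matters.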
Message Delivery Guarantee
Like TCP, the system uses ACKs to confirm receipt. If the client acknowledges each delivered message with a separate request, every delivery costs an extra round-trip.
Optimization: the next pull request bundles the ACK for the previous message, reducing interaction overhead.
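The piggybacked ACK might look like the sketch below. It assumes the ACK is cumulative (everything up to ack_seqid is confirmed), which the article does not state explicitly; Mailbox and its method names are hypothetical.

```python
class Mailbox:
    """Hypothetical per-client queue; messages stay until acknowledged."""

    def __init__(self):
        self._pending = {}  # seqid -> payload, not yet acknowledged

    def enqueue(self, seqid, payload):
        self._pending[seqid] = payload

    def pull(self, ack_seqid):
        # The pull request doubles as an ACK: drop everything the client
        # confirms it has received (assumed cumulative up to ack_seqid).
        for seqid in [s for s in self._pending if s <= ack_seqid]:
            del self._pending[seqid]
        # Anything still pending is delivered (or redelivered) in seqid order.
        return sorted(self._pending.items())
```

If a response is lost, the client simply pulls again with the same ack_seqid and receives the same messages, so no separate ACK request or retransmission timer is needed for this path.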
Message Deduplication
Duplication occurs when retransmission is triggered after an ACK failure. The solution splits sending into two phases:
Phase One – Data Transmission: The SDK receives data but does not expose it to the upper layer. Retries may cause duplicate data, but the operation is idempotent.
Phase Two – Consumption Notification: After the SDK confirms data receipt, it sends a consumption notice. Retries may duplicate the notice, but consumption is also idempotent, ensuring each message is processed only once.
Both phases are handled by the SDK, invisible to the business logic.
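One way to read the two phases, modeled on the receiving side, is sketched below. The class SdkInbox, its method names, and the use of msgid as the idempotency key are assumptions for illustration; the article does not describe the SDK's internals.

```python
class SdkInbox:
    """Hypothetical receiving-side SDK buffer with two idempotent phases."""

    def __init__(self, on_message):
        self._staged = {}        # msgid -> payload, received but not exposed
        self._consumed = set()   # msgids already handed to the application
        self._on_message = on_message

    def receive_data(self, msgid, payload):
        # Phase one: a retried transmission overwrites the same entry,
        # and data for an already consumed message is ignored.
        if msgid not in self._consumed:
            self._staged[msgid] = payload

    def notify_consume(self, msgid):
        # Phase two: a repeated notice finds nothing staged and does nothing.
        if msgid not in self._staged:
            return False
        payload = self._staged.pop(msgid)
        self._consumed.add(msgid)
        self._on_message(payload)
        return True
```

The application callback fires at most once per msgid no matter how many times either phase is retried, which is the property the two-phase split is meant to provide.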
To avoid duplicate delivery, the client caches the maximum received seqid and pulls newer messages based on it. If the local cache fails, the globally unique msgid can be used for deduplication.
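A sketch of the client-side check, assuming the set of seen msgids survives the loss of the cached seqid (for example because it is stored alongside the messages themselves). ClientDedup and reset_cache are hypothetical names, and the unbounded set would need pruning in practice.

```python
class ClientDedup:
    """Hypothetical client filter: max seqid first, msgid as a fallback."""

    def __init__(self):
        self.max_seqid = 0          # cached high-water mark
        self._seen_msgids = set()   # assumed to be stored more durably

    def accept(self, seqid, msgid):
        # Reject anything at or below the high-water mark, or already seen.
        if seqid <= self.max_seqid or msgid in self._seen_msgids:
            return False
        self.max_seqid = seqid
        self._seen_msgids.add(msgid)
        return True

    def reset_cache(self):
        # Simulates losing the locally cached maximum seqid.
        self.max_seqid = 0
```

After reset_cache, a pull from seqid 0 returns old messages again, but accept still rejects them because their msgids are known.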
Related Questions
Will IDs run out? Each message has a 64-bit unsigned msgid and seqid, which gives about 1.8 × 10^19 values. At a rate of 100,000 messages per second, that space lasts roughly 1.8 × 10^14 seconds, or about 5.8 million years, so exhaustion is not a practical concern.
Why not rely on TCP reliability? TCP guarantees stream reliability, whereas IM requires message‑level reliability and business‑specific guarantees.
How to ensure unique, auto‑incrementing seqid and unique msgid across clusters? A dedicated ID generation service (e.g., Redis‑based or distributed ID algorithms) provides globally unique, monotonically increasing identifiers.
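As one example of a distributed ID algorithm, the sketch below generates Snowflake-style 64-bit IDs for msgid. The bit layout (41-bit millisecond timestamp, 10-bit node ID, 12-bit counter), the epoch, and the class name are assumptions, not details from the article. IDs from one node are strictly increasing; across nodes they are unique and only roughly time-ordered, so a per-conversation seqid would more likely come from a single atomic counter, such as Redis INCR on a per-conversation key.

```python
import threading
import time


class SnowflakeLikeGenerator:
    """Hypothetical layout: 41-bit ms timestamp | 10-bit node | 12-bit counter."""

    def __init__(self, node_id, epoch_ms=1_600_000_000_000, clock=None):
        if not 0 <= node_id < 1024:
            raise ValueError("node_id must fit in 10 bits")
        self._node_id = node_id
        self._epoch_ms = epoch_ms
        # The clock is injectable so the generator can be tested deterministically.
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._last_ms = -1
        self._counter = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            # Never move backwards, even if the wall clock does.
            now = max(self._clock(), self._last_ms)
            if now == self._last_ms:
                self._counter = (self._counter + 1) & 0xFFF
                if self._counter == 0:
                    # Counter exhausted for this millisecond: use the next one.
                    now += 1
            else:
                self._counter = 0
            self._last_ms = now
            timestamp = now - self._epoch_ms
            return (timestamp << 22) | (self._node_id << 12) | self._counter
```

Each node is constructed with a distinct node_id, for example SnowflakeLikeGenerator(node_id=7), and calls next_id() for every outgoing message. Uniqueness across the cluster then depends only on node IDs being assigned without collisions.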
Summary
This article presented a simplified model for handling IM message reliability, covering ordering, loss‑prevention, ACK optimization, and deduplication. Implementing these techniques enables an IM system to meet the stringent reliability requirements of modern instant messaging applications.
