Ensuring Reliable, Ordered, and Duplicate‑Free Messaging in IM Systems

This article explains the stringent reliability requirements of instant messaging—ordered delivery, low latency, no loss, and deduplication—analyzes causes of disorder such as multi‑process and multi‑thread architectures, and presents practical solutions including hash‑based routing, sequential IDs, push‑pull mechanisms, ACK optimization, and distributed ID generation.

Seewo Tech Circle
This article introduces our implementation of message transmission in IM scenarios to meet the quality requirements of instant messaging.

IM Message Requirements

Overall, IM scenarios demand strict reliability, low latency, and strong ordering guarantees.

Message Ordering

Causes of unordered messages include:

Multi‑process deployment

Multi‑threaded services

Multiple socket channels between components

Message flow across processes and sockets

Because different processes, threads, and sockets may handle messages in varying order, the final delivery order cannot be guaranteed.

Signalling Transmission

Signalling requires extremely high real‑time performance, so the system must forward messages in order under high concurrency.

For unicast messages (A → B), the system ensures that A’s messages always travel the same serial route (same service, same thread, same socket) by applying a consistent hash, as shown below:

Consistent hash routing for unicast

Broadcast messages to a group use a hash of the group ID, ensuring all members receive messages in the same order.
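The routing idea can be sketched as a small consistent-hash ring. This is a minimal illustration, not the article's implementation: the lane names, the `HashRing` class, the number of virtual points, and the choice of the sender ID (for unicast) or group ID (for broadcast) as the routing key are all assumptions made for the example.

```python
import bisect
import hashlib


def stable_hash(key: str) -> int:
    """Hash that is identical across processes (unlike the built-in hash())."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Minimal consistent-hash ring mapping a routing key to one lane.

    A "lane" stands for one serial route (service, thread, socket); the
    name is hypothetical.
    """

    def __init__(self, lanes: list[str], replicas: int = 64) -> None:
        # Each lane gets several virtual points so keys spread evenly.
        self._points = sorted(
            (stable_hash(f"{lane}#{i}"), lane)
            for lane in lanes
            for i in range(replicas)
        )
        self._hashes = [h for h, _ in self._points]

    def route(self, key: str) -> str:
        # First virtual point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._hashes, stable_hash(key)) % len(self._points)
        return self._points[idx][1]


LANES = ["lane-0", "lane-1", "lane-2", "lane-3"]
ring = HashRing(LANES)

# Unicast: every message from sender "A" takes the same lane.
unicast_lane = ring.route("user:A")
# Broadcast: every message for group 42 takes the same lane.
group_lane = ring.route("group:42")
```

Because the mapping depends only on the key and the lane list, any process that builds the ring from the same lanes routes a given sender or group identically, which is what keeps that traffic on one serial path.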

IM Chat

Chat messages must never be lost; users expect to see every message even after reconnection.

Each incoming message is assigned an auto‑incrementing sequence ID (seqid) and stored in a cache in order, as illustrated:

Message cache with seqid

To deliver messages in order, the server does not push data immediately. Instead, it sends a notification prompting the client to pull messages starting from the client’s maximum received seqid, ensuring ordered retrieval.

Push‑pull mechanism for ordered delivery
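A minimal sketch of the seqid cache and the pull step, assuming one in-memory list per conversation. The `Conversation` and `Message` types and the `limit` parameter are hypothetical names for illustration; a real service would keep this cache in shared storage.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    seqid: int
    body: str


@dataclass
class Conversation:
    """Server-side cache for one conversation, ordered by seqid."""

    messages: list[Message] = field(default_factory=list)
    next_seqid: int = 1

    def append(self, body: str) -> int:
        # Assign the next seqid and store the message in arrival order.
        msg = Message(self.next_seqid, body)
        self.messages.append(msg)
        self.next_seqid += 1
        return msg.seqid  # the server would now notify the client, not push data

    def pull(self, after_seqid: int, limit: int = 100) -> list[Message]:
        # Return messages newer than the client's maximum received seqid.
        return [m for m in self.messages if m.seqid > after_seqid][:limit]


conv = Conversation()
for text in ["hi", "are you there?", "ping"]:
    conv.append(text)

client_max_seqid = 1  # the client has already received seqid 1
batch = conv.pull(client_max_seqid)
print([(m.seqid, m.body) for m in batch])  # [(2, 'are you there?'), (3, 'ping')]
```

Since the client always asks for "everything after my maximum seqid", the order it sees is the order of the cache, regardless of the order in which notifications arrive.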

Message Delivery Guarantee

Like TCP, the system uses ACKs to confirm receipt. In the straightforward design, the client sends a separate ACK after every pull response, which adds an extra round‑trip per pull:

ACK per pull request

Optimization: instead of sending a standalone ACK, the client bundles the ACK for the previously received messages into its next pull request, which removes the extra round‑trip.

ACK piggybacked on next pull
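One way to picture the piggybacked ACK is a pull request that carries a single `ack_seqid` field. The sketch below assumes a cumulative ACK (everything up to that seqid is confirmed) and a per-client pending queue; `PullRequest`, `Outbox`, and the field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PullRequest:
    ack_seqid: int  # highest seqid the client has safely received so far


@dataclass
class Outbox:
    """Server-side per-client queue of messages awaiting acknowledgement."""

    pending: dict[int, str] = field(default_factory=dict)  # seqid -> body

    def enqueue(self, seqid: int, body: str) -> None:
        self.pending[seqid] = body

    def handle_pull(self, req: PullRequest) -> list[tuple[int, str]]:
        # 1) The piggybacked ACK confirms everything up to ack_seqid.
        for seqid in [s for s in self.pending if s <= req.ack_seqid]:
            del self.pending[seqid]
        # 2) Whatever is still pending is (re)sent in seqid order.
        return sorted(self.pending.items())


outbox = Outbox()
for seqid, body in [(1, "a"), (2, "b"), (3, "c")]:
    outbox.enqueue(seqid, body)

first = outbox.handle_pull(PullRequest(ack_seqid=0))   # nothing acked yet
outbox.enqueue(4, "d")
second = outbox.handle_pull(PullRequest(ack_seqid=3))  # acks 1..3, fetches 4
```

If a pull request is lost, the server simply keeps the unacknowledged messages and returns them again on the next pull, so the client must be prepared to drop duplicates.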

Message Deduplication

Duplication occurs when retransmission is triggered after an ACK failure. The solution splits sending into two phases:

Phase One – Data Transmission: The SDK receives the data and stores it, but does not yet expose it to the upper layer. Retries may deliver the same data more than once, but storing it is idempotent.

Phase Two – Consumption Notification: Once the SDK has confirmed receipt of the data, a consumption notice is sent and the message is handed to the upper layer. Retries may duplicate the notice, but consumption is also idempotent, so each message is processed only once.

Two‑phase sending diagram

Both phases are handled by the SDK, invisible to the business logic.
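The two phases can be sketched on the receiving side as two idempotent handlers. This is an interpretation under stated assumptions, not the article's SDK: the class name `ReceiverSdk`, the handler names, the return strings, and the use of msgid as the idempotency key are all hypothetical.

```python
class ReceiverSdk:
    """Client-side SDK sketch: stage data first, expose it on a separate notice."""

    def __init__(self, deliver) -> None:
        self._deliver = deliver              # callback into the business layer
        self._staged: dict[int, str] = {}    # msgid -> payload, not yet exposed
        self._consumed: set[int] = set()     # msgids already handed upward

    def on_data(self, msgid: int, payload: str) -> str:
        # Phase one: a retried transmission just overwrites the same entry.
        if msgid not in self._consumed:
            self._staged[msgid] = payload
        return "DATA_ACK"

    def on_consume(self, msgid: int) -> str:
        # Phase two: a retried notice finds nothing staged and does nothing.
        payload = self._staged.pop(msgid, None)
        if payload is not None:
            self._consumed.add(msgid)
            self._deliver(msgid, payload)
        return "CONSUME_ACK"


seen = []
sdk = ReceiverSdk(lambda msgid, payload: seen.append((msgid, payload)))

sdk.on_data(7, "hello")
sdk.on_data(7, "hello")   # retry after a lost DATA_ACK
sdk.on_consume(7)
sdk.on_consume(7)         # retry after a lost CONSUME_ACK
sdk.on_data(7, "hello")   # late duplicate of phase one
sdk.on_consume(7)
print(seen)  # [(7, 'hello')]
```

In this sketch the `_consumed` set grows without bound; a production SDK would need to trim it, for example by discarding entries below a seqid watermark.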

To avoid duplicate delivery, the client caches the maximum seqid it has received and pulls only messages newer than it. If that local cache is lost or corrupted, the globally unique msgid can still be used for deduplication.
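A minimal sketch of the client-side filter, assuming the seqid watermark is the primary check and a set of known msgids (standing in for the locally stored message history) is the fallback. `ClientInbox` and its fields are hypothetical names.

```python
class ClientInbox:
    """Client-side filter: seqid watermark first, msgid set as a fallback."""

    def __init__(self) -> None:
        self.max_seqid = 0                   # highest seqid accepted so far
        self.seen_msgids: set[int] = set()   # stand-in for stored history
        self.accepted: list[str] = []

    def accept(self, seqid: int, msgid: int, body: str) -> bool:
        if seqid <= self.max_seqid or msgid in self.seen_msgids:
            return False                     # duplicate: drop it
        self.max_seqid = seqid
        self.seen_msgids.add(msgid)
        self.accepted.append(body)
        return True

    def next_pull_from(self) -> int:
        # The next pull asks only for messages newer than the watermark.
        return self.max_seqid


inbox = ClientInbox()
results = [
    inbox.accept(1, 9001, "a"),
    inbox.accept(2, 9002, "b"),
    inbox.accept(2, 9002, "b"),   # re-delivered after a lost ACK
]

inbox.max_seqid = 0               # simulate losing the cached watermark
after_reset = inbox.accept(1, 9001, "a")   # caught by the msgid fallback
```

The watermark check is cheap and covers the normal case; the msgid lookup only matters when the watermark can no longer be trusted.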

Related Questions

Will IDs run out? Each message has a 64‑bit unsigned msgid and seqid. A 64‑bit space holds 2^64 (about 1.8 × 10^19) values, so even at a rate of 100,000 messages per second it would take about 5.8 million years to exhaust the space.
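The exhaustion estimate is straightforward to check. The short calculation below assumes one ID consumed per message and a constant rate of 100,000 messages per second; it comes out to roughly 5.8 million years.

```python
ID_SPACE = 2 ** 64                      # distinct values of a 64-bit unsigned integer
RATE_PER_SECOND = 100_000               # message rate quoted in the text
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # about 31.6 million seconds

seconds_to_exhaust = ID_SPACE / RATE_PER_SECOND
years_to_exhaust = seconds_to_exhaust / SECONDS_PER_YEAR
print(f"{years_to_exhaust:,.0f} years")
```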

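Why not rely on TCP reliability? TCP guarantees reliable, ordered delivery of a byte stream only within a single connection. IM requires message‑level reliability that survives reconnections and spans multiple processes and sockets, along with business‑specific guarantees such as confirming that a message reached the application rather than just the socket buffer.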

How are unique, auto‑incrementing seqids and unique msgids ensured across a cluster? A dedicated ID generation service (e.g., Redis‑based or built on a distributed ID algorithm) provides globally unique, monotonically increasing identifiers.
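The article does not say which distributed ID algorithm is used. As one possible illustration for msgid, the sketch below uses a Snowflake-style bit layout (timestamp, worker ID, per-millisecond counter); the class name, the 41/10/12 bit split, and the injectable `clock` parameter are assumptions for the example.

```python
import threading
import time


class SnowflakeLikeIds:
    """64-bit IDs: 41-bit millisecond timestamp | 10-bit worker | 12-bit counter."""

    def __init__(self, worker_id: int, clock=lambda: int(time.time() * 1000)) -> None:
        if not 0 <= worker_id < 1024:
            raise ValueError("worker_id must fit in 10 bits")
        self._worker_id = worker_id
        self._clock = clock                  # injectable so the demo is repeatable
        self._lock = threading.Lock()
        self._last_ms = -1
        self._counter = 0

    def next_id(self) -> int:
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                now = self._last_ms          # clock went backwards: reuse last tick
            if now == self._last_ms:
                self._counter = (self._counter + 1) & 0xFFF
                if self._counter == 0:       # 4096 IDs used in this millisecond
                    now = self._last_ms + 1  # borrow the next millisecond
            else:
                self._counter = 0
            self._last_ms = now
            return (now << 22) | (self._worker_id << 12) | self._counter


# Frozen clock so the example is deterministic; a real service uses wall time.
gen = SnowflakeLikeIds(worker_id=1, clock=lambda: 1_700_000_000_000)
ids = [gen.next_id() for _ in range(5000)]
```

IDs from one worker are strictly increasing, and distinct worker IDs keep generators on different machines from colliding. A gap-free per-conversation seqid is a different requirement; an atomic counter per conversation (for example Redis `INCR`) would be a more natural fit there.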

Summary

This article presented a simplified model for handling IM message reliability, covering ordering, loss‑prevention, ACK optimization, and deduplication. Implementing these techniques enables an IM system to meet the stringent reliability requirements of modern instant messaging applications.
