How to Build a Reliable, Secure, and Scalable IM Server from Scratch
This article walks through constructing a lightweight instant‑messaging backend, covering version 1.0.0 features, reliability guarantees, application‑level ACK handling, security encryption, database schema for users, relations and offline messages, and storage strategies to prevent duplicate delivery.
Good news: IM version 1.0.0 has been released, with support for the following features:
Private chat sending text/files
Sent/delivered/read receipts
LDAP login support
Integration with external authentication systems
Client‑side JAR package for easy development
GitHub link: github.com/yuanrw/IM
This article shows how to build a lightweight IM server from zero, assuming the overall design and architecture have been covered elsewhere.
Reliability
Reliability for an IM system means no message loss, no duplicate messages, and no out-of-order delivery.
No Message Loss
The application layer must implement an ACK mechanism similar to TCP's, but with the message, rather than the byte, as the unit of acknowledgement.
Each sent message waits for an ACK that contains the message ID. The sender keeps the message in a waiting‑ACK queue with a timer; a background thread retries messages that time out.
If the maximum retry count is exceeded, the connection can be closed and the unsent messages stored as offline messages.
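The waiting-ACK queue described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class name `AckWaitQueue`, its methods, and the choice to pass the clock in as a parameter are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Hypothetical sketch: tracks messages that were sent but not yet acknowledged. */
class AckWaitQueue {
    /** One in-flight message plus its retry bookkeeping. */
    private static final class Pending {
        final String payload;
        long deadlineMillis;
        int retries;

        Pending(String payload, long deadlineMillis) {
            this.payload = payload;
            this.deadlineMillis = deadlineMillis;
        }
    }

    private final Map<Long, Pending> waiting = new ConcurrentHashMap<>();
    private final long timeoutMillis;
    private final int maxRetries;

    AckWaitQueue(long timeoutMillis, int maxRetries) {
        this.timeoutMillis = timeoutMillis;
        this.maxRetries = maxRetries;
    }

    /** Call right after writing the message to the connection. */
    void onSent(long msgId, String payload, long nowMillis) {
        waiting.put(msgId, new Pending(payload, nowMillis + timeoutMillis));
    }

    /** Call when an ACK carrying msgId arrives; false means unknown or repeated ACK. */
    boolean onAck(long msgId) {
        return waiting.remove(msgId) != null;
    }

    /**
     * Call periodically from a single background thread. Timed-out messages are
     * handed to resend; messages that already used maxRetries are removed and
     * returned so the caller can close the connection and store them offline.
     */
    List<String> sweep(long nowMillis, Consumer<String> resend) {
        List<String> givenUp = new ArrayList<>();
        for (Map.Entry<Long, Pending> e : waiting.entrySet()) {
            Pending p = e.getValue();
            if (p.deadlineMillis > nowMillis) {
                continue; // still within its timeout
            }
            if (p.retries >= maxRetries) {
                if (waiting.remove(e.getKey(), p)) {
                    givenUp.add(p.payload);
                }
            } else {
                p.retries++;
                p.deadlineMillis = nowMillis + timeoutMillis;
                resend.accept(p.payload);
            }
        }
        return givenUp;
    }

    int pendingCount() {
        return waiting.size();
    }
}
```

In a real connector the sweep would probably be driven by a scheduled executor, and the payload would be the protocol message rather than a string. Passing the current time in explicitly keeps the sketch easy to test.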
No Duplicate / No Out‑of‑Order
Each message carries an ID that is unique within its conversation. The receiver tracks the last processed ID (lastId) and keeps a temporary queue for messages that arrive out of order.
Duplicate detection: a message is new only if msgId > lastId && !queue.contains(msgId); anything else is a duplicate. When a duplicate arrives, the receiver does not process it again and simply re-sends the ACK.
Out‑of‑order handling: messages with IDs greater than lastId + 1 are placed in the temporary queue until missing messages arrive.
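To make the two rules concrete, here is a small sketch of the receiver-side bookkeeping for one conversation. It assumes that IDs start at 1 and increase by exactly one per message, which the text implies with lastId + 1 but does not state outright. The class `ReceiveWindow` and its method names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of duplicate and out-of-order detection for one conversation. */
class ReceiveWindow {
    enum Kind { DUPLICATE, IN_ORDER, OUT_OF_ORDER }

    // ID of the last message that was fully processed (0 means none yet; assumption).
    private final AtomicLong lastId = new AtomicLong(0);
    // Messages that arrived ahead of a missing predecessor, keyed by ID.
    private final Map<Long, String> parked = new ConcurrentHashMap<>();

    Kind classify(long msgId) {
        if (msgId <= lastId.get() || parked.containsKey(msgId)) {
            return Kind.DUPLICATE; // already processed, or already waiting in the queue
        }
        return msgId == lastId.get() + 1 ? Kind.IN_ORDER : Kind.OUT_OF_ORDER;
    }

    /** Holds an early message until the gap before it is filled. */
    void park(long msgId, String payload) {
        parked.put(msgId, payload);
    }

    /** Records that msgId has been processed and acknowledged. */
    void markProcessed(long msgId) {
        lastId.set(msgId);
    }

    /** Removes and returns the parked message that directly follows lastId, or null. */
    String takeNext() {
        return parked.remove(lastId.get() + 1);
    }
}
```

After each in-order message is processed, the receiver would call takeNext() in a loop to drain any parked messages that have become consecutive.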
Processing flow:
```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

// lastId, notConsistMsgMap, logger, isRepeat, isConsist, nextId, and sendAck belong to the enclosing class (not shown).
class ProcessMsgNode {
    private final Message message;
    private final Consumer<Message> consumer;
    ProcessMsgNode(Message message, Consumer<Message> consumer) {
        this.message = message;
        this.consumer = consumer;
    }
    Message getMessage() { return message; }
    Consumer<Message> getConsumer() { return consumer; }
}

public CompletableFuture<Void> offer(Long id, Message message, Consumer<Message> consumer) {
    if (isRepeat(id)) {          // duplicate: re-send the ACK, do not reprocess
        sendAck(id);
        return CompletableFuture.completedFuture(null);
    }
    if (!isConsist(id)) {        // arrived early: park it until the gap is filled
        notConsistMsgMap.put(id, new ProcessMsgNode(message, consumer));
        return CompletableFuture.completedFuture(null);
    }
    return process(id, message, consumer);
}

private CompletableFuture<Void> process(Long id, Message message, Consumer<Message> consumer) {
    return CompletableFuture.runAsync(() -> consumer.accept(message))
        .thenAccept(v -> sendAck(id))
        .thenAccept(v -> lastId.set(id))
        .thenComposeAsync(v -> {
            // Drain the next parked message, using the consumer it was parked with.
            Long nextId = nextId(id);
            ProcessMsgNode node = notConsistMsgMap.remove(nextId);
            if (node != null) {
                return process(nextId, node.getMessage(), node.getConsumer());
            }
            return CompletableFuture.<Void>completedFuture(null);
        })
        .exceptionally(e -> { logger.error("[process received msg] has error", e); return null; });
}
```
Security
All chat records and offline messages must be encrypted to protect user privacy.
Two core tables are used:
im_user: stores basic user information (username, password, etc.).
im_relation: stores friendship relations and an AES key for each pair.
```sql
CREATE TABLE `im_relation` (
  `id` bigint(20) COMMENT 'relation id',
  `user_id1` varchar(100) COMMENT 'user 1 id',
  `user_id2` varchar(100) COMMENT 'user 2 id',
  `encrypt_key` char(33) COMMENT 'aes key',
  `gmt_create` timestamp DEFAULT CURRENT_TIMESTAMP,
  `gmt_update` timestamp DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  UNIQUE KEY `USERID1_USERID2` (`user_id1`,`user_id2`)
);
```
When a client logs in, it retrieves all relations and their encryption keys into memory. Sending a message encrypts the payload with the corresponding key; receiving a message decrypts it using the same key.
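As an illustration of per-relation encryption, the sketch below uses the JDK's built-in AES support. The article does not say which cipher mode or key encoding the project uses, so AES-GCM with a random 12-byte IV prepended to the ciphertext, and a raw 16-byte key, are assumptions made for this example. The class name `RelationCipher` is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical sketch: encrypts and decrypts payloads with one relation's AES key. */
class RelationCipher {
    private static final int IV_BYTES = 12;   // IV size commonly used with GCM
    private static final int TAG_BITS = 128;  // authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    private final SecretKeySpec key;

    /** keyBytes must be 16, 24, or 32 bytes long (AES-128, AES-192, AES-256). */
    RelationCipher(byte[] keyBytes) {
        this.key = new SecretKeySpec(keyBytes, "AES");
    }

    /** Returns IV followed by ciphertext and tag. A fresh IV is drawn per message. */
    byte[] encrypt(String plaintext) {
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] body = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
            return ByteBuffer.allocate(iv.length + body.length).put(iv).put(body).array();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("encryption failed", e);
        }
    }

    /** Reads the IV from the first 12 bytes, then decrypts and verifies the rest. */
    String decrypt(byte[] wire) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, wire, 0, IV_BYTES));
            byte[] plain = cipher.doFinal(wire, IV_BYTES, wire.length - IV_BYTES);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("decryption failed", e);
        }
    }
}
```

A client following the flow above would keep one such object per relation in memory after login and pick it by the peer's user ID when sending or receiving.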
Login flow (illustrated below):
Client calls REST API to log in.
Client fetches all relations via REST.
Client sends a greet message to the connector to announce online status.
Connector pulls offline messages and pushes them to the client.
Connector updates the user session.
Sending offline messages before updating the session guarantees that offline messages are delivered before any new messages, preventing out‑of‑order delivery.
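The ordering argument can be illustrated with a toy, single-threaded connector. Everything here is hypothetical (the class `ConnectorSketch`, the in-memory maps standing in for the offline table and the session registry); it only shows why draining the backlog before registering the session keeps older messages ahead of newer ones.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Hypothetical, single-threaded sketch of the connector's greet handling. */
class ConnectorSketch {
    // Stand-in for the offline message table, keyed by recipient ID.
    private final Map<String, List<String>> offlineStore = new ConcurrentHashMap<>();
    // Stand-in for the session registry: user ID to a channel that pushes to the client.
    private final Map<String, Consumer<String>> sessions = new ConcurrentHashMap<>();

    /** Routes a new message: push it if the user has a session, otherwise store it offline. */
    void route(String toUserId, String msg) {
        Consumer<String> session = sessions.get(toUserId);
        if (session != null) {
            session.accept(msg);
        } else {
            offlineStore.computeIfAbsent(toUserId, k -> new ArrayList<>()).add(msg);
        }
    }

    /** Handles the greet message: push the offline backlog first, then register the session. */
    void onGreet(String userId, Consumer<String> channel) {
        List<String> backlog = offlineStore.remove(userId);
        if (backlog != null) {
            backlog.forEach(channel);
        }
        sessions.put(userId, channel); // only now can new messages reach the client directly
    }
}
```

If the session were registered first, a message routed while the backlog was still being pushed could overtake older offline messages.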
Storage Design
Offline Message Storage
An im_offline table stores messages for offline users.
```sql
CREATE TABLE `im_offline` (
  `id` int(11) COMMENT 'primary key',
  `msg_id` bigint(20) COMMENT 'message id',
  `msg_type` int(2) COMMENT 'message type (chat/ack)',
  `content` varbinary(5000) COMMENT 'encrypted content',
  `to_user_id` varchar(100) COMMENT 'recipient id',
  `has_read` tinyint(1) COMMENT 'read flag',
  `gmt_create` timestamp COMMENT 'creation time',
  PRIMARY KEY (`id`)
);
```
When a user comes online, the server queries this table with to_user_id = userId to retrieve the pending messages.
Preventing Duplicate Offline Pushes
In multi‑device scenarios, a CAS update ensures each offline message is pushed only once:
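```sql
update im_offline set has_read = true where id = ${msg_id} and has_read = false
```
If the update changes a row, this connection has claimed the message and pushes it; if it changes no rows, the message has already been claimed elsewhere and is skipped.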
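The same claim-then-push idea can be shown without a database. The sketch below is an in-memory analogue, not the project's code: an `AtomicBoolean` per message plays the role of the has_read column, and compareAndSet(false, true) plays the role of the conditional update. The class `OfflineClaims` and its methods are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical in-memory analogue of the has_read compare-and-set. */
class OfflineClaims {
    // Offline message ID mapped to its has_read flag.
    private final Map<Long, AtomicBoolean> hasRead = new ConcurrentHashMap<>();

    /** Stores a new offline message with has_read = false. */
    void store(long id) {
        hasRead.put(id, new AtomicBoolean(false));
    }

    /**
     * Mirrors "set has_read = true where id = ? and has_read = false":
     * returns true for exactly one caller per message, false for everyone else.
     */
    boolean tryClaim(long id) {
        AtomicBoolean flag = hasRead.get(id);
        return flag != null && flag.compareAndSet(false, true);
    }
}
```

Against a real database, the equivalent check with JDBC would be whether `PreparedStatement.executeUpdate()` returns 1 for the statement above.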
With these mechanisms, you can build a complete, reliable, and secure IM backend.
Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"
