Design and Implementation of Delayed Messaging in Distributed Systems
This article surveys common delayed-message solutions, including external stores (a database, RocksDB, Redis) and open-source MQs such as RocketMQ, Pulsar, and QMQ. It explains their architectures, advantages, drawbacks, and practical considerations for building reliable distributed asynchronous messaging.
Introduction
In distributed asynchronous messaging, a delayed (or scheduled) message is one that a producer sends but that should be consumed only after a specified delay or at a particular timestamp, rather than immediately.
Such functionality is typically provided at the middleware layer, often built into the MQ itself or offered as a shared infrastructure service.
This article explores common implementation approaches for delayed messages and evaluates their design trade‑offs.
Implementation Schemes
1. External Storage‑Based Solutions
External storage here means any storage system introduced in addition to the MQ's native store.
The basic pattern separates the MQ from a dedicated delayed‑message module, which persists messages in an external store until they expire, then forwards them to the MQ.
Database (e.g., MySQL)
This approach uses a relational database table to hold delayed messages:
CREATE TABLE `delay_msg` (
  `id` bigint unsigned NOT NULL AUTO_INCREMENT,
  `delivery_time` DATETIME NOT NULL COMMENT 'delivery time',
  `payloads` blob COMMENT 'message content',
  PRIMARY KEY (`id`),
  KEY `time_index` (`delivery_time`)
);
A scheduled thread scans the table at a configurable interval (which sets the minimum delay granularity) and delivers due messages.
Pros: Simple to implement.
Cons: B+Tree indexes, which MySQL's InnoDB engine uses, are not well suited to write-heavy message workloads.
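The scan-and-deliver step of the database approach can be sketched roughly as follows. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for MySQL, delivery_time is stored as epoch seconds rather than DATETIME, and the function names (create_store, add_message, scan_once) are hypothetical.

```python
import sqlite3
import time


def create_store():
    """In-memory SQLite stand-in for the MySQL delay_msg table (assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE delay_msg ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " delivery_time REAL NOT NULL,"  # epoch seconds instead of DATETIME
        " payloads BLOB)"
    )
    conn.execute("CREATE INDEX time_index ON delay_msg (delivery_time)")
    return conn


def add_message(conn, payload, delivery_time):
    conn.execute(
        "INSERT INTO delay_msg (delivery_time, payloads) VALUES (?, ?)",
        (delivery_time, payload),
    )
    conn.commit()


def scan_once(conn, deliver, now=None, batch_size=100):
    """Deliver up to batch_size due messages, delete them, return the count."""
    now = time.time() if now is None else now
    rows = conn.execute(
        "SELECT id, payloads FROM delay_msg"
        " WHERE delivery_time <= ? ORDER BY delivery_time, id LIMIT ?",
        (now, batch_size),
    ).fetchall()
    for msg_id, payload in rows:
        # Forward first, delete second: a crash in between causes a
        # redelivery on the next scan rather than a lost message.
        deliver(payload)
        conn.execute("DELETE FROM delay_msg WHERE id = ?", (msg_id,))
    conn.commit()
    return len(rows)
```

A scheduled thread would call scan_once at the configured interval, passing a deliver callback that publishes to the MQ. Because delivery happens before deletion, consumers should tolerate occasional duplicates.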
RocksDB
RocksDB provides an LSM‑tree storage engine better suited for heavy writes. Projects such as DDMQ’s Chronos layer use RocksDB to persist delayed messages, scanning them periodically and forwarding to RocketMQ.
Pros: LSM‑tree handles massive writes efficiently.
Cons: Heavier solution; requires custom replication and fault‑tolerance logic.
Redis
A popular design stores messages in a Redis hash (message pool) and uses multiple ZSETs as delayed queues, each ZSET representing a time slice. Workers periodically scan the ZSETs, pop expired IDs, retrieve the full message from the hash, and forward it.
Pros: ZSET naturally supports delayed queues; in‑memory operations give high performance.
Cons: Managing many ZSETs across nodes can cause uneven load and duplicate processing; distributed locking may be needed.
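A rough sketch of the hash-plus-ZSET layout described above is shown here. It assumes a client object `r` exposing redis-py style methods (hset, hget, hdel, zadd, zrangebyscore, zrem). The key names, the bucket count, and the rule that maps a delivery time to a bucket are hypothetical choices made for illustration only.

```python
import json

POOL_KEY = "delay:pool"  # hypothetical hash holding full message bodies
NUM_BUCKETS = 4          # hypothetical number of ZSET delay queues
SLICE_SECONDS = 60       # hypothetical width of one time slice


def bucket_key(deliver_at):
    # Map the delivery time to one of a fixed set of ZSETs by time slice.
    return f"delay:bucket:{int(deliver_at // SLICE_SECONDS) % NUM_BUCKETS}"


def enqueue(r, msg_id, payload, deliver_at):
    """Store the body in the pool and index the ID by delivery time."""
    r.hset(POOL_KEY, msg_id, json.dumps(payload))
    r.zadd(bucket_key(deliver_at), {msg_id: deliver_at})


def poll_due(r, now):
    """Pop every ID whose score is <= now and return (id, payload) pairs."""
    due = []
    for i in range(NUM_BUCKETS):
        key = f"delay:bucket:{i}"
        for msg_id in r.zrangebyscore(key, 0, now):
            # ZREM returns 1 only for the worker that actually removed the
            # member, so it doubles as a claim against duplicate processing.
            if r.zrem(key, msg_id) != 1:
                continue
            raw = r.hget(POOL_KEY, msg_id)
            r.hdel(POOL_KEY, msg_id)
            if raw is not None:
                due.append((msg_id, json.loads(raw)))
    return due
```

Using the ZREM result as a claim avoids a separate distributed lock for this step, but a worker that crashes after the pop and before forwarding would lose the message. A production design would need an acknowledgment or retry path that this sketch omits.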
Timer‑Thread Deficiencies and Improvements
All of the above approaches rely on periodic scanning, which wastes resources when few messages are pending and can deliver late when the scan cannot keep up under high load. An improvement is to use a wait-notify mechanism (similar to the JDK Timer): the thread waits until the earliest message's delivery time, is woken early if a message with a sooner delivery time arrives, and then repeats.
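The wait-notify idea can be sketched with a condition variable and a min-heap ordered by delivery time. This is a minimal in-process illustration, not the implementation of any of the systems discussed; the class name DelayScheduler and its methods are hypothetical.

```python
import heapq
import threading
import time


class DelayScheduler:
    """Worker thread that sleeps until the earliest message is due."""

    def __init__(self, deliver):
        self._deliver = deliver
        self._heap = []  # entries are (deliver_at, seq, message)
        self._seq = 0    # tie-breaker so messages are never compared
        self._cond = threading.Condition()
        self._closed = False
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def schedule(self, message, delay_seconds):
        deliver_at = time.monotonic() + delay_seconds
        with self._cond:
            seq = self._seq
            self._seq += 1
            heapq.heappush(self._heap, (deliver_at, seq, message))
            # Wake the worker only if this message became the new head.
            if self._heap[0][1] == seq:
                self._cond.notify()

    def close(self):
        with self._cond:
            self._closed = True
            self._cond.notify()
        self._thread.join()

    def _run(self):
        while True:
            with self._cond:
                while not self._closed:
                    if not self._heap:
                        self._cond.wait()  # nothing pending: sleep until notified
                        continue
                    remaining = self._heap[0][0] - time.monotonic()
                    if remaining <= 0:
                        break              # head is due
                    self._cond.wait(timeout=remaining)
                if self._closed:
                    return
                _, _, message = heapq.heappop(self._heap)
            self._deliver(message)         # called outside the lock
```

For example, scheduling "slow" with a 0.3 s delay and then "fast" with a 0.05 s delay wakes the worker early for "fast", so the callback sees "fast" before "slow". The worker uses no CPU while idle, and its accuracy is bounded by thread wake-up latency rather than a scan interval.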
2. Open‑Source MQ Implementations
RocketMQ
RocketMQ supports delayed messages via 18 configurable levels (e.g., 1 s, 5 s, …, 2 h). Messages are first stored in a special topic, SCHEDULE_TOPIC_XXXX, with one queue per level; the broker forwards each message to its target topic once the delay for its level has elapsed.
Pros: Fixed levels keep scheduling overhead low; same‑level messages stay in order; simple append‑only design.
Cons: Level configuration is inflexible; delayed messages increase CommitLog size.
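Because only fixed levels are available, a producer that wants an arbitrary delay has to round it to a level. The helper below sketches that rounding. The level string is RocketMQ's documented default for messageDelayLevel as far as I am aware, but treat it as an assumption and check your broker configuration. The function names are hypothetical and not part of any RocketMQ client.

```python
import bisect

# Assumed default broker setting (messageDelayLevel); brokers may override it.
DEFAULT_LEVELS = "1s 5s 10s 30s 1m 2m 3m 4m 5m 6m 7m 8m 9m 10m 20m 30m 1h 2h"
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_levels(spec=DEFAULT_LEVELS):
    """Turn '1s 5s ... 2h' into a list of delays in seconds."""
    return [int(tok[:-1]) * UNIT_SECONDS[tok[-1]] for tok in spec.split()]


def level_for(delay_seconds, levels=None):
    """Return the 1-based level with the smallest delay >= delay_seconds.

    Assumes the levels are sorted in ascending order, as in the default.
    """
    levels = parse_levels() if levels is None else levels
    idx = bisect.bisect_left(levels, delay_seconds)
    if idx == len(levels):
        raise ValueError("delay exceeds the largest configured level")
    return idx + 1
```

With the assumed defaults, a requested delay of 7 seconds rounds up to level 3 (10 s), which illustrates the inflexibility noted above: the message arrives 3 seconds later than asked, and anything beyond 2 hours cannot be expressed with a single level.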
Pulsar
Pulsar allows arbitrary‑time delayed messages. It stores the message in the target topic and maintains a time‑based priority queue in off‑heap memory. The broker checks the queue during consumption and delivers messages when their delay expires.
Cons: High memory consumption (one queue per subscription group), costly recovery after failures, and storage overhead because delayed messages keep the underlying topic data for the whole delay span.
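The dispatch-time check described above can be modeled loosely as follows. This is a simplified sketch and not Pulsar's actual code: the DelayIndex class, the dispatch function, and the tuple layout of an entry are hypothetical. The point it illustrates is that the index keeps only a timestamp and a position per delayed message, while the payload stays in the topic.

```python
import heapq


class DelayIndex:
    """Per-subscription priority queue of (deliver_at_ms, ledger_id, entry_id)."""

    def __init__(self):
        self._heap = []

    def add(self, deliver_at_ms, ledger_id, entry_id):
        heapq.heappush(self._heap, (deliver_at_ms, ledger_id, entry_id))

    def pop_due(self, now_ms):
        """Return positions whose delay has expired, earliest first."""
        due = []
        while self._heap and self._heap[0][0] <= now_ms:
            _, ledger_id, entry_id = heapq.heappop(self._heap)
            due.append((ledger_id, entry_id))
        return due

    def __len__(self):
        return len(self._heap)


def dispatch(entries, index, now_ms):
    """Split freshly read entries into deliverable positions and delayed ones.

    Each entry is (ledger_id, entry_id, deliver_at_ms or None, payload).
    """
    ready = []
    for ledger_id, entry_id, deliver_at_ms, _payload in entries:
        if deliver_at_ms is not None and deliver_at_ms > now_ms:
            # Not due yet: remember where it lives; the data stays in the topic.
            index.add(deliver_at_ms, ledger_id, entry_id)
        else:
            ready.append((ledger_id, entry_id))
    return ready
```

Since each subscription needs its own index and the index lives only in memory, a restart has to re-read the topic to rebuild it, which corresponds to the recovery cost listed among the drawbacks.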
QMQ
QMQ provides flexible arbitrary‑time delayed/scheduled messages using a two‑level hierarchical time wheel: a disk‑based wheel with hour‑level slots and an in‑memory wheel with 500 ms slots. Only the nearest‑hour wheel is loaded into memory, reducing RAM usage.
Design Highlights:
O(1) insertion/deletion via time‑wheel algorithm.
Supports very large delay spans (up to two years) through multi‑level wheels.
Delayed‑load mechanism keeps only imminent messages in memory.
Separate schedule log isolates delayed messages from normal traffic, avoiding storage interference.
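The two-level wheel can be sketched in a few lines. This is a toy model under stated assumptions, not QMQ's implementation: a dictionary stands in for the on-disk hour segments, the class and method names are hypothetical, and a slot fires only after its 500 ms window has fully passed, so messages are never early but may be up to one tick late.

```python
from collections import defaultdict

HOUR_MS = 3_600_000
TICK_MS = 500
SLOTS = HOUR_MS // TICK_MS  # 7200 in-memory slots cover one hour


class TwoLevelWheel:
    def __init__(self, start_ms):
        self.disk = defaultdict(list)            # hour index -> [(deliver_at_ms, msg)]
        self.wheel = [[] for _ in range(SLOTS)]  # in-memory wheel for the current hour
        self.cursor_ms = start_ms - start_ms % TICK_MS  # start of the next slot to fire

    def add(self, deliver_at_ms, msg):
        """O(1) insert: future hours go to 'disk', the current hour to memory."""
        if deliver_at_ms // HOUR_MS > self.cursor_ms // HOUR_MS:
            self.disk[deliver_at_ms // HOUR_MS].append((deliver_at_ms, msg))
        else:
            # Overdue messages are placed in the next slot that will fire.
            self._place(max(deliver_at_ms, self.cursor_ms), msg)

    def _place(self, deliver_at_ms, msg):
        self.wheel[(deliver_at_ms % HOUR_MS) // TICK_MS].append(msg)

    def advance(self, now_ms):
        """Fire every slot whose window has ended; load the next hour on rollover."""
        due = []
        while self.cursor_ms + TICK_MS <= now_ms:
            slot = (self.cursor_ms % HOUR_MS) // TICK_MS
            due.extend(self.wheel[slot])
            self.wheel[slot] = []
            self.cursor_ms += TICK_MS
            if self.cursor_ms % HOUR_MS == 0:
                # Delayed load: only now does the next hour's data enter memory.
                for deliver_at_ms, msg in self.disk.pop(self.cursor_ms // HOUR_MS, []):
                    self._place(deliver_at_ms, msg)
        return due
```

Memory use is bounded by the number of messages due within the current hour, regardless of how far ahead other messages are scheduled, which is the benefit the delayed-load highlight describes.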
Conclusion
The article consolidates prevalent delayed‑message designs, compares their strengths and weaknesses, and offers practical guidance for selecting an appropriate solution in distributed systems.
Architect's Guide
Dedicated to sharing programmer-architect skills—Java backend, system, microservice, and distributed architectures—to help you become a senior architect.