Implementation Schemes and Design Trade‑offs of Delayed Messages in Distributed Messaging Systems
This article surveys common delayed‑message solutions in distributed messaging, compares implementations based on external storage (MySQL, RocksDB, Redis) and built‑in MQ features (RocketMQ, Pulsar, QMQ), and analyzes their advantages, disadvantages, and design considerations.
Delayed (or scheduled) messages refer to messages sent by a producer that should be consumed only after a specified delay or at a particular timestamp, rather than immediately. In distributed asynchronous messaging scenarios, this functionality is usually provided by the middleware layer, often as a built‑in feature of the MQ or as a separate service.
The article examines several common implementation schemes and evaluates their pros and cons.
1. Schemes Based on External Storage
Database (e.g., MySQL)
The delayed‑message module is separated from the MQ and stores messages in a relational table until they expire, after which a scanning thread delivers them back to the MQ. Example table definition:
CREATE TABLE `delay_msg` (
`id` bigint unsigned NOT NULL AUTO_INCREMENT,
`delivery_time` DATETIME NOT NULL COMMENT 'delivery time',
`payloads` blob COMMENT 'message content',
PRIMARY KEY (`id`),
KEY `time_index` (`delivery_time`)
);
Advantages: simple to implement.
Disadvantages: the B+Tree index must be updated on every insert and delete, so throughput degrades under the heavy, continuous write load typical of messaging.
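One pass of the scanning thread can be sketched as below. This is a minimal illustration, not a production design: it assumes Python's built-in sqlite3 as a stand-in for MySQL so that it runs without a server, stores delivery_time as epoch seconds, and uses the hypothetical helper names create_store, add_message, and deliver_due.

```python
import sqlite3

def create_store():
    # In-memory SQLite stands in for MySQL here (an assumption for runnability).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE delay_msg ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " delivery_time INTEGER NOT NULL,"  # epoch seconds in this sketch
        " payloads BLOB)"
    )
    conn.execute("CREATE INDEX time_index ON delay_msg (delivery_time)")
    return conn

def add_message(conn, delivery_time, payload):
    conn.execute(
        "INSERT INTO delay_msg (delivery_time, payloads) VALUES (?, ?)",
        (delivery_time, payload),
    )

def deliver_due(conn, now, publish, batch_size=100):
    """One scan pass: publish expired rows in time order, then delete them."""
    rows = conn.execute(
        "SELECT id, payloads FROM delay_msg"
        " WHERE delivery_time <= ? ORDER BY delivery_time, id LIMIT ?",
        (now, batch_size),
    ).fetchall()
    for msg_id, payload in rows:
        publish(payload)  # hand the message back to the MQ
        conn.execute("DELETE FROM delay_msg WHERE id = ?", (msg_id,))
    conn.commit()
    return len(rows)
```

Because each row is published before it is deleted, a crash between the two steps would redeliver the message on the next pass, so consumers in such a design would need to tolerate duplicates.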
RocksDB
RocksDB provides an LSM‑tree storage engine that handles massive writes efficiently. Projects such as DDMQ’s Chronos module use RocksDB to persist delayed messages, scanning them periodically and forwarding them to RocketMQ.
Advantages: high write performance due to LSM‑tree.
Disadvantages: heavier solution; developers must implement their own replication and fault‑tolerance logic.
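An ordered key-value store iterates keys in byte order, so one plausible way to make the periodic scan cheap is to prefix each key with the delivery time in big-endian form. The sketch below illustrates that idea only. The key layout is an assumption rather than Chronos's actual format, and OrderedStore is a hypothetical in-memory stand-in for a RocksDB iterator.

```python
import bisect
import struct

def make_key(delivery_ms, seq):
    # Big-endian unsigned integers sort bytewise in numeric order.
    return struct.pack(">QQ", delivery_ms, seq)

class OrderedStore:
    """Tiny in-memory stand-in for an ordered key-value store."""

    def __init__(self):
        self._keys = []     # kept sorted, like an LSM iterator would see them
        self._values = {}

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def pop_due(self, now_ms):
        # Every key below this bound has delivery_ms <= now_ms.
        upper = struct.pack(">QQ", now_ms + 1, 0)
        cut = bisect.bisect_left(self._keys, upper)
        due, self._keys = self._keys[:cut], self._keys[cut:]
        return [self._values.pop(k) for k in due]
```

With this layout a scan is a range read from the smallest key up to the current time, and it never touches messages that are not yet due.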
Redis
A Redis‑based design stores messages in a hash (message pool) and uses multiple sorted‑set (ZSET) delayed queues, each representing a time slice. Workers periodically scan the ZSETs for expired messages.
Advantages: ZSET naturally supports delayed queues; in‑memory operations give low latency.
Disadvantages: requires careful concurrency control (e.g., distributed locks) and may face duplicate processing in multi‑node deployments.
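The hash-plus-ZSET layout and one way to limit duplicate processing can be sketched as follows. The code assumes `client` is a redis-py `redis.Redis` instance (or any object exposing the same hset, hget, hdel, zadd, zrangebyscore, and zrem methods). The key names and the functions enqueue and poll_due are hypothetical, and only a single ZSET is shown.

```python
POOL_KEY = "delay:pool"      # hash: message id -> payload (hypothetical key name)
QUEUE_KEY = "delay:queue:0"  # one ZSET delayed queue (hypothetical key name)

def enqueue(client, msg_id, payload, deliver_at):
    client.hset(POOL_KEY, msg_id, payload)
    client.zadd(QUEUE_KEY, {msg_id: deliver_at})  # score = delivery timestamp

def poll_due(client, now, limit=100):
    """Return payloads of expired messages that this worker managed to claim."""
    claimed = []
    for msg_id in client.zrangebyscore(QUEUE_KEY, 0, now, start=0, num=limit):
        # ZREM returns the number of removed members, so only one worker
        # sees 1 for a given id; the others skip it.
        if client.zrem(QUEUE_KEY, msg_id) == 1:
            payload = client.hget(POOL_KEY, msg_id)
            client.hdel(POOL_KEY, msg_id)
            claimed.append(payload)
    return claimed
```

Claiming through the ZREM return value avoids a distributed lock for this step, but a worker that crashes after the ZREM and before delivery would lose the message, so a real deployment would likely need an acknowledgement or retry path on top of this.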
2. Issues with Periodic Scanning Threads
Scanning threads waste resources when message volume is low and can cause inaccurate delays when the scan interval is too coarse. A more efficient approach uses a wait‑notify mechanism similar to JDK Timer: the thread waits until the next message’s delivery time, waking early only when a sooner message arrives.
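The wait-notify idea can be sketched with a heap and a condition variable. This is a minimal single-process illustration, not the JDK Timer implementation. The class name DelayScheduler and its methods are hypothetical, and messages still pending when close() is called are simply dropped.

```python
import heapq
import threading
import time

class DelayScheduler:
    """Single dispatcher thread that sleeps until the earliest deadline."""

    def __init__(self, deliver):
        self._deliver = deliver  # callback invoked with each due message
        self._heap = []          # entries are (deadline, seq, message)
        self._seq = 0
        self._cond = threading.Condition()
        self._closed = False
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def schedule(self, delay_seconds, message):
        with self._cond:
            seq = self._seq
            self._seq += 1
            deadline = time.monotonic() + delay_seconds
            heapq.heappush(self._heap, (deadline, seq, message))
            # Wake the dispatcher only if this message became the new head.
            if self._heap[0][1] == seq:
                self._cond.notify()

    def close(self):
        with self._cond:
            self._closed = True
            self._cond.notify()
        self._thread.join()

    def _run(self):
        while True:
            with self._cond:
                while not self._closed:
                    if not self._heap:
                        self._cond.wait()  # nothing scheduled: sleep
                        continue
                    remaining = self._heap[0][0] - time.monotonic()
                    if remaining <= 0:
                        break
                    self._cond.wait(remaining)  # sleep until the head is due
                if self._closed:
                    return
                message = heapq.heappop(self._heap)[2]
            self._deliver(message)  # deliver outside the lock
```

The dispatcher consumes no CPU while idle, and delivery precision depends on the wake-up latency of the thread rather than on a fixed scan interval.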
3. Built‑in Delayed‑Message Implementations in Open‑Source MQs
RocketMQ
RocketMQ supports 18 predefined delay levels (e.g., 1 s, 5 s, …, 2 h). Messages are stored in a special topic SCHEDULE_TOPIC_XXXX with a queue per level, ensuring ordering within the same level. A broker periodically moves due messages to their real topics.
Advantages: low overhead, level‑based ordering, simple scheduling.
Disadvantages: fixed levels limit flexibility; delay messages increase the size of the CommitLog.
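The reason fixed levels keep scheduling simple is that every message in a level's queue has the same delay, so arrival order equals due order and only the head of each queue has to be checked. The sketch below illustrates this with RocketMQ's default messageDelayLevel string. The LevelQueues class is a hypothetical in-memory model, not broker code, and it assumes timestamps passed to put never go backwards.

```python
from collections import deque

# RocketMQ's default messageDelayLevel setting (18 levels).
DEFAULT_LEVELS = "1s 5s 10s 30s 1m 2m 3m 4m 5m 6m 7m 8m 9m 10m 20m 30m 1h 2h"
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

def parse_levels(spec=DEFAULT_LEVELS):
    """Map each 1-based level number to its delay in seconds."""
    return {
        level: int(token[:-1]) * UNIT_SECONDS[token[-1]]
        for level, token in enumerate(spec.split(), start=1)
    }

class LevelQueues:
    """One FIFO queue per delay level (in-memory model)."""

    def __init__(self, levels=None):
        self.levels = levels or parse_levels()
        self.queues = {level: deque() for level in self.levels}

    def put(self, level, now, topic, body):
        # Same delay for the whole queue, so append order equals due order.
        self.queues[level].append((now + self.levels[level], topic, body))

    def pop_due(self, now):
        due = []
        for queue in self.queues.values():
            while queue and queue[0][0] <= now:  # only the head needs checking
                due.append(queue.popleft()[1:])  # (real topic, body)
        return due
```

No sorting is ever needed, which is where the low overhead comes from. The cost is that a delay such as 45 seconds cannot be expressed without reconfiguring the level table.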
Pulsar
Pulsar allows arbitrary delay times. The broker indexes delayed messages in an off‑heap priority queue, one per subscription. When a consumer requests messages, the broker checks the queue for due messages and delivers them.
Disadvantages: high off‑heap memory consumption (one queue per subscription), a costly index rebuild after broker failure, and storage overhead, because the topic's underlying data cannot be cleaned up until the delay period ends.
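The per-subscription index can be modeled as a priority queue that holds only delivery times and log positions, while the payloads stay in the topic's storage. The sketch below is a simplified model under that assumption. DelayedIndexTracker is a hypothetical name, and a Python heap is of course not off-heap memory.

```python
import heapq

class DelayedIndexTracker:
    """Per-subscription index of delayed entries; payloads stay in the topic log."""

    def __init__(self):
        self._heap = []  # entries are (deliver_at_ms, ledger_id, entry_id)

    def add(self, deliver_at_ms, ledger_id, entry_id):
        heapq.heappush(self._heap, (deliver_at_ms, ledger_id, entry_id))

    def due_positions(self, now_ms, max_entries):
        """Pop up to max_entries positions whose delivery time has passed."""
        positions = []
        while (self._heap and len(positions) < max_entries
               and self._heap[0][0] <= now_ms):
            _, ledger_id, entry_id = heapq.heappop(self._heap)
            positions.append((ledger_id, entry_id))
        return positions

    def __len__(self):
        return len(self._heap)
```

The model makes the trade-offs visible: each subscription carries its own heap, the heap holds one entry per pending message, and because it lives only in memory it has to be rebuilt by re-reading the topic after a failure.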
QMQ
QMQ provides truly arbitrary delay times (up to two years by default) using a two‑level hierarchical time wheel: a disk‑based hour wheel stores schedule logs, and an in‑memory 500 ms wheel loads the nearest hour’s index for fast dispatch.
Advantages: O(1) insertion/deletion, support for long time spans, memory‑friendly delayed loading, and separate storage for delayed messages that does not affect normal message cleanup.
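The two-level structure can be sketched as an hour-keyed bucket map plus a 7200-slot wheel. This is a simplified model, not QMQ's code: a dict stands in for the on-disk schedule logs, the class and method names are hypothetical, messages fire with 500 ms granularity, and messages added for an hour that has already passed are not handled.

```python
from collections import defaultdict

TICK_MS = 500
HOUR_MS = 3_600_000
SLOTS = HOUR_MS // TICK_MS  # 7200 slots cover one hour

class TwoLevelWheel:
    def __init__(self):
        self.hour_buckets = defaultdict(list)  # stand-in for on-disk schedule logs
        self.wheel = [[] for _ in range(SLOTS)]  # in-memory 500 ms wheel
        self.loaded_hour = None

    def add(self, deliver_at_ms, msg):
        hour = deliver_at_ms // HOUR_MS
        if hour == self.loaded_hour:
            self._insert(deliver_at_ms, msg)  # O(1): straight into the wheel
        else:
            # O(1) append to the bucket for that hour; nothing held in the wheel.
            self.hour_buckets[hour].append((deliver_at_ms, msg))

    def load_hour(self, hour):
        """Move one hour's bucket from 'disk' into the in-memory wheel."""
        self.loaded_hour = hour
        for deliver_at_ms, msg in self.hour_buckets.pop(hour, []):
            self._insert(deliver_at_ms, msg)

    def _insert(self, deliver_at_ms, msg):
        slot = (deliver_at_ms % HOUR_MS) // TICK_MS
        self.wheel[slot].append(msg)

    def tick(self, now_ms):
        """Drain the slot that contains now_ms and return its messages."""
        slot = (now_ms % HOUR_MS) // TICK_MS
        due, self.wheel[slot] = self.wheel[slot], []
        return due
```

Only the current hour occupies memory, which is why a two-year maximum delay costs no more memory than a two-minute one in this kind of design.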
Conclusion
The article aggregates prevalent delayed‑message designs, discusses their trade‑offs, and aims to help readers choose an appropriate solution for their distributed systems.