Designing Scalable Delayed Queues: Kafka, RocketMQ, Redis & More

This article surveys delayed‑queue implementations, comparing Kafka, RocketMQ, and Redis (Redisson) designs, outlining their architectures, pros and cons, integration details, timing‑wheel mechanisms, and practical considerations for building a reliable distributed delayed‑queue service.

Java Interview Crash Guide
Java Interview Crash Guide
Java Interview Crash Guide
Designing Scalable Delayed Queues: Kafka, RocketMQ, Redis & More

Project Background

A delayed queue is a message queue with built‑in delay functionality, needed for several use‑cases in the author’s project.

Technology Options

Kafka

Assumption: The project already uses Kafka for business communication, so a native Kafka delayed‑queue is considered.

Idea: Follow RocketMQ’s delayed‑queue design by creating multiple topics (e.g., delay‑minutes‑1) for different delay intervals. Producers send delayed messages to these topics, and a consumer periodically pulls messages, forwarding them to the real target topic when the delay expires.

Send delayed messages to a dedicated delay topic instead of the target topic.

Write a consumer that polls the delay topic, checks timestamps, and forwards eligible messages.

Solution: Use KafkaConsumer.pause() and resume() to hold consumption until the delay condition is met, moving the offset back as needed.

Drawbacks: High internal complexity, need for health checks, limited flexibility in setting delay times.

RocketMQ

Assumption: Underlying code is fully encapsulated, allowing direct use without worrying about low‑level details.

Principle: RocketMQ defines 18 delay levels; messages are sent with a delayLevel and stored in corresponding queues. A timer periodically scans these queues and moves expired messages to the target queue.

Drawbacks: Requires deep knowledge of the middleware source code, single‑threaded timer may become a bottleneck under high load.

Redis (Redisson)

Assumption: Redisson provides a ready‑made delayed‑queue API ( getBlockingQueue(), getDelayQueue()).

Core Structures:

Delay queue: stores enqueued data.

Blocking queue: holds expired data ready for consumption.

Timeout ZSET: uses scores as expiration timestamps.

Timer Implementation: Uses Redis pub/sub to publish a key when data is added; a client runs a HashedWheelTimer that polls the ZSET, moves expired items to the blocking queue.

Drawbacks: Potential duplicate timer tasks due to pub/sub, concurrency safety concerns, and the need for a Redis cluster when data volume grows.

Redisson‑Based Refactor

Key changes include removing the original sub/pub timer, switching to polling the ZSET head, adding a thread‑pool to reduce timing errors, and using a simple distributed lock (e.g., SETNX) for high‑availability in cluster mode.

Additional modifications:

Provide a unified push topic and pull topic for generic service use.

Overall Execution Flow

Business services send tasks to an entry Kafka topic, which creates a delayed job and stores it in a bucket.

A timer continuously scans buckets; when a job’s delay expires, it is sent to a common Kafka output topic.

Consumers read from the output topic and execute business logic.

The output topic acknowledges each message to guarantee at‑least‑once delivery.

Other Delayed‑Queue Ideas

Reduce timing error by using a thread‑pool for faster bucket checks.

Cluster mode with high‑availability design and timer routing.

Distributed lock for timer code in cluster mode.

Message reliability: ensure at least one consumption; on failure, the message is re‑delivered.

Netty Time Wheel (HashedWheelTimer)

Key parameters:

tickDuration: time span of each slot.

ticksPerWheel: size of the wheel array.

Core loop checks the current bucket, executes tasks whose deadline has passed, then sleeps until the next tick.

long deadline = tickDuration * (tick + 1);
for (;;) {
    final long currentTime = System.nanoTime() - startTime;
    long sleepTimeMs = (deadline - currentTime + 999999) / 1000000;
    if (sleepTimeMs <= 0) {
        return currentTime;
    }
    if (PlatformDependent.isWindows()) {
        sleepTimeMs = sleepTimeMs / 10 * 10;
    }
    try {
        Thread.sleep(sleepTimeMs);
    } catch (InterruptedException ignored) {
        if (WORKER_STATE_UPDATER.get(HashedWheelTimer.this) == WORKER_STATE_SHUTDOWN) {
            return Long.MIN_VALUE;
        }
    }
}

Kafka Time Wheel

Combines a DelayQueue (which internally uses a PriorityQueue) to store buckets; the head bucket is the next task to execute. The underlying API blocks until the bucket expires.

private[this] val reinsert = (timerTaskEntry: TimerTaskEntry) => addTimerTaskEntry(timerTaskEntry)

def advanceClock(timeoutMs: Long): Boolean = {
    var bucket = delayQueue.poll(timeoutMs, TimeUnit.MILLISECONDS)
    if (bucket != null) {
        writeLock.lock()
        try {
            while (bucket != null) {
                timingWheel.advanceClock(bucket.getExpiration())
                bucket.flush(reinsert)
                bucket = delayQueue.poll()
            }
        } finally {
            writeLock.unlock()
        }
        true
    } else {
        false
    }
}

XXL_JOB

Two main threads: scheduleThread scans the DB for tasks due in 5 seconds and puts them into the time‑wheel; ringThread processes tasks whose pointer reaches them, delegating execution via RPC. Persistence and distributed scheduling are handled via a traditional time‑wheel.

Summary

The two core challenges when designing a delayed queue are (1) sorting all delayed tasks and (2) accurately detecting when a task’s execution time arrives. The rest of the design can be adapted to specific business requirements.

Comparison of Implementations

RocketMQ: Uses delay levels (bucket sort) and a for‑loop to find expired jobs.

HashedWheelTimer: Array‑based bucket sort, also iterated with a for‑loop.

Kafka Time Wheel: PriorityQueue (heap sort) with low‑level API Condition.awaitNanos()parkNanos().

Redisson Delayed Queue: ZSET skip‑list implementation, combined with sub/pub and HashedWheelTimer for expiration.

Based on Youzan Delayed Queue: ZSET skip‑list, for‑loop traversal, multiple threads each handling a bucket.

References

https://juejin.cn/post/6845166891225317384

https://juejin.cn/post/6910068006244581390

https://juejin.cn/post/6976412313981026318

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

Distributed SystemsKafkadelayed queueTime Wheel
Java Interview Crash Guide
Written by

Java Interview Crash Guide

Dedicated to sharing Java interview Q&A; follow and reply "java" to receive a free premium Java interview guide.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.