How to Build a Scalable Distributed Timer with Redis and Time Wheel
This article explains the design of a distributed timer service built on a time-wheel data structure stored in Redis. It covers application scenarios, required features, the architecture components (access layer, scheduler, worker, and management center), and the techniques used for reliability and performance.
Timer Application Scenarios
Timers are widely used in the Qidian system. Example scenarios include:
TRTC service events may arrive out of order or be lost; a timer per event ensures timeout handling.
Intelligent voice recognition for call recordings requires a timer to wait until the recording is ready.
Features Required for a Distributed Timer
Most developers have built single‑node timers using ordered structures such as STL map. A distributed timer must provide the following capabilities:
Basic operations: support adding, updating, and deleting timers.
Event callback reliability: ensure each timeout event is delivered at least once.
Data reliability: persist timer events so that none are lost.
Real-time: fire timers with second-level latency and handle large bursts of expirations.
High performance: serve massive concurrent business traffic.
High availability: survive instance failures without service interruption.
Multi-tenant: allow multiple business units to share the timer service.
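For contrast with the requirements above, the single-node baseline can be sketched in a few lines. This is a minimal illustration, not code from the Qidian system: it uses a binary heap with lazy deletion instead of an ordered map, and the class and method names (SingleNodeTimer, pop_expired) are hypothetical.

```python
import heapq
import itertools


class SingleNodeTimer:
    """In-process timer: a min-heap ordered by expiry, with lazy deletion."""

    def __init__(self):
        self._heap = []                 # entries: (expire_at, seq, timer_id)
        self._live = {}                 # timer_id -> (expire_at, seq, payload)
        self._seq = itertools.count()   # tie-breaker and version marker

    def add(self, timer_id, expire_at, payload):
        seq = next(self._seq)
        self._live[timer_id] = (expire_at, seq, payload)
        heapq.heappush(self._heap, (expire_at, seq, timer_id))

    # Updating re-registers the timer; the stale heap entry is skipped later.
    update = add

    def delete(self, timer_id):
        """Return True if the timer existed and was removed."""
        return self._live.pop(timer_id, None) is not None

    def pop_expired(self, now):
        """Return (timer_id, payload) for every live timer due at or before now."""
        fired = []
        while self._heap and self._heap[0][0] <= now:
            _, seq, timer_id = heapq.heappop(self._heap)
            current = self._live.get(timer_id)
            if current is None or current[1] != seq:
                continue  # deleted, or superseded by a later update
            del self._live[timer_id]
            fired.append((timer_id, current[2]))
        return fired
```

Everything here lives in one process's memory, so it offers none of the persistence, availability, or multi-tenant properties listed above.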
Industry Implementations
1) Redis ZSET provides natural ordering and can be used as a timer structure together with a distributed lock. This simple approach cannot scale horizontally under massive timeout loads.
2) Some distributed message queues offer delayed queues but lack delete/modify capabilities.
Open‑source solutions do not fully meet Qidian’s requirements, so a custom distributed timer was designed.
Qidian Distributed Timer Design
Timer Data Structure
The design adopts the time‑wheel concept. A clock cycle and step size are defined; the pointer moves one step each tick and triggers all events on the current slot. Each slot stores the IDs of timer events in a Redis zset. The timer ID maps to business data stored as a Redis string.
Storing the time wheel in Redis avoids the complexity of building a custom persistent structure.
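The mapping from an expiry time to a slot can be sketched as follows. This is an illustration under stated assumptions, not the Qidian implementation: the cycle length, step size, and key layout are invented for the example, and FakeRedis is an in-memory stand-in for the Redis zset and string types so the snippet runs without a server. Using the absolute expiry time as the zset score is one possible way to keep timers that are more than one cycle away from firing early; the source does not say how the score is used.

```python
from collections import defaultdict

WHEEL_SECONDS = 3600   # assumed clock cycle: one hour
STEP_SECONDS = 1       # assumed step size: one slot per second


def slot_key(app_id, expire_at):
    """Map an absolute expiry time (seconds) to the key of its wheel slot."""
    slot = (int(expire_at) % WHEEL_SECONDS) // STEP_SECONDS
    return f"tw:{app_id}:slot:{slot}"


class FakeRedis:
    """Tiny in-memory stand-in for the two Redis types the design uses."""

    def __init__(self):
        self.zsets = defaultdict(dict)   # key -> {member: score}
        self.strings = {}                # key -> value

    def zadd(self, key, member, score):
        self.zsets[key][member] = score

    def zrem(self, key, member):
        self.zsets[key].pop(member, None)

    def zrangebyscore(self, key, lo, hi):
        items = self.zsets.get(key, {})
        due = [m for m, s in items.items() if lo <= s <= hi]
        return sorted(due, key=lambda m: (items[m], m))


def add_timer(store, app_id, timer_id, expire_at, payload):
    # Business data goes into a string; the slot zset only holds the timer ID.
    store.strings[f"tw:{app_id}:data:{timer_id}"] = payload
    store.zadd(slot_key(app_id, expire_at), timer_id, expire_at)


def due_timers(store, app_id, now):
    """Timer IDs in the current slot whose expiry is not in a later cycle."""
    return store.zrangebyscore(slot_key(app_id, now), 0, now)
```

With these assumed values, timers expiring at 1000 and 4600 land in the same slot, but only the first is returned when the pointer reaches 1000.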
Architecture
1. Access Layer
Parses the timer protocol and exposes the basic operations (add, delete, update).
2. Scheduler
Coordinates multiple tenants, fetches expired slots from Redis, splits large batches into sub‑tasks, and pushes them to a task queue. It also handles machine‑clock drift by looking back up to five seconds.
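One scheduler tick might look like the sketch below. It assumes that "looking back" means rescanning the slots of the previous five seconds on every tick, and it models a sub-task as an offset/limit range within a slot. The batch size, the task fields, and the function names are hypothetical; only the five-second window comes from the description above.

```python
LOOKBACK_SECONDS = 5   # look-back window named in the article
BATCH_SIZE = 100       # assumed maximum number of timers per sub-task


def split_into_subtasks(slot, timer_count, batch_size=BATCH_SIZE):
    """Describe one slot as sub-tasks, each covering `limit` timers from `offset`."""
    return [
        {"slot": slot, "offset": off, "limit": min(batch_size, timer_count - off)}
        for off in range(0, timer_count, batch_size)
    ]


def schedule_tick(now, slot_sizes, queue):
    """Scan the current slot plus the look-back window and enqueue sub-tasks.

    slot_sizes maps a slot (here simply a second) to the number of timers
    still stored in it; empty slots produce no sub-tasks.
    """
    for slot in range(now - LOOKBACK_SECONDS, now + 1):
        count = slot_sizes.get(slot, 0)
        queue.extend(split_into_subtasks(slot, count))
```

Rescanning recent slots means the same slot can be enqueued more than once, which is why the workers need an exclusive claim on each task.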
3. Worker
Consumes tasks, retrieves timer IDs and user data from Redis, and publishes timeout events to Kafka. It ensures at-least-once delivery by producing with Kafka's acks=-1 (every in-sync replica must acknowledge the write) and confirming each send in an asynchronous callback. It uses Redis pipelines to keep latency low and Lua scripts to guarantee that each task is claimed by exactly one worker.
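The two guarantees can be illustrated without Redis or Kafka. In this sketch a lock-protected dictionary stands in for the atomic check-and-set that a Lua script would perform inside Redis, and `publish` is a hypothetical function that reports success or failure through a callback, the way an asynchronous producer would. All names are assumptions made for the example.

```python
import threading


class ClaimTable:
    """Stand-in for the atomic claim a Redis Lua script would provide."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owners = {}   # task_id -> worker_id

    def try_claim(self, task_id, worker_id):
        """Return True for exactly one caller per task_id."""
        with self._lock:
            if task_id in self._owners:
                return False
            self._owners[task_id] = worker_id
            return True


def process_task(task_id, timer_ids, worker_id, claims, pending, publish):
    """Publish each timer's payload; forget a timer only after its ack.

    pending maps timer_id -> payload and plays the role of the stored timer
    data. Returns False if another worker already owns the task.
    """
    if not claims.try_claim(task_id, worker_id):
        return False
    for timer_id in timer_ids:
        if timer_id not in pending:
            continue  # already delivered or deleted by the business side

        def on_ack(ok, timer_id=timer_id):
            if ok:
                pending.pop(timer_id, None)  # safe to drop: broker confirmed

        publish(timer_id, pending[timer_id], on_ack)
    return True
```

A timer whose publish fails stays in `pending`, so a later pass can deliver it again. That is what makes delivery at-least-once rather than exactly-once, and consumers should therefore tolerate duplicates.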
4. Management Center
Manages clusters, app IDs, Kafka topics, and other configuration. Provides a UI for monitoring and administration.
Tencent Qidian Tech Team
Official account of Tencent Qidian R&D team, dedicated to sharing and discussing technology for enterprise SaaS scenarios.
