Implementing a Lightweight Distributed Scheduling Solution to Replace TBSchedule
To improve stability and reduce costs during high‑traffic events, we replaced the ZooKeeper‑dependent TBSchedule framework with a lightweight, Redis‑based distributed scheduler that decentralizes task execution, uses thread pools instead of timers, and supports dynamic scaling and seamless degradation for reliable order processing.
Our project originally used the open‑source distributed scheduling framework TBSchedule, which depends on ZooKeeper. Because of ZooKeeper instability and a lack of maintenance, tasks occasionally stopped, were lost, or failed to execute, which was especially problematic during high‑traffic events such as the 618 promotion.
To achieve stable operation with minimal modification cost, we designed a lightweight scheduling solution that supports seamless degradation.
Features of the original TBSchedule we leveraged:
Cluster and distributed support
Flexible task sharding
Dynamic service scaling and resource reclamation
Configurable and adjustable thread count per task
Our implementation goals:
Decentralize scheduling so each subsystem manages its own tasks, improving stability.
Replace ZooKeeper with Redis as the registration center and for dynamic sharding.
Use a thread pool instead of a timer for thread control.
Automatically invoke TBSchedule internal methods to keep existing interfaces unchanged, avoiding code changes in business systems.
Support dynamic thread‑count adjustment to increase processing capacity.
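The thread‑pool and dynamic thread‑count goals above can be illustrated with the standard java.util.concurrent.ThreadPoolExecutor. This is only a minimal sketch under stated assumptions, not the actual implementation: the class name ResizableTaskPool and its methods are hypothetical, and the source does not say how the new thread count is delivered (a configuration listener calling resize is one possibility).

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical per-task worker pool whose thread count can be changed at runtime. */
final class ResizableTaskPool {
    private final ThreadPoolExecutor executor;

    ResizableTaskPool(int threads) {
        // Fixed-size pool (core == max); the unbounded queue holds pending work items.
        this.executor = new ThreadPoolExecutor(
                threads, threads, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
    }

    /** Applies a new thread count, for example after a configuration change. */
    synchronized void resize(int threads) {
        if (threads < 1) {
            throw new IllegalArgumentException("threads must be >= 1");
        }
        if (threads > executor.getMaximumPoolSize()) {
            // Growing: raise the maximum first so the core size never exceeds it.
            executor.setMaximumPoolSize(threads);
            executor.setCorePoolSize(threads);
        } else {
            // Shrinking (or unchanged): lower the core size first, then the maximum.
            executor.setCorePoolSize(threads);
            executor.setMaximumPoolSize(threads);
        }
    }

    int threadCount() {
        return executor.getCorePoolSize();
    }

    void submit(Runnable work) {
        executor.execute(work);
    }

    void shutdown() {
        executor.shutdown();
    }
}
```

The order of the two setter calls matters because ThreadPoolExecutor rejects a core size larger than the maximum, and a maximum smaller than the core size. Shrinking does not interrupt running work; surplus threads exit once they become idle.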
Solution implementation: (image omitted)
Results:
Quick integration via XML configuration and jar dependency.
Core TBSchedule functionalities are fully replicated with no migration cost.
Combined with ducc configuration and an emergency plan, the system supports manual or automatic degradation, ensuring seamless operation during promotions.
Deployed online with stable dynamic sharding, timely heartbeat checks, and on‑demand degradation, preventing order backlog caused by ZooKeeper fluctuations.
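How heartbeat checks and dynamic sharding can fit together is sketched below. The source does not describe the actual sharding algorithm or the Redis data layout, so everything here is an assumption made for illustration: an in-memory map stands in for the Redis registry, the class ShardRegistry and its methods are hypothetical, and task items are assigned by a simple modulo over the sorted list of live nodes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

/** In-memory stand-in for a heartbeat registry (assumed to live in Redis in production). */
final class ShardRegistry {
    private final Map<String, Long> lastBeatMillis = new ConcurrentHashMap<>();
    private final long ttlMillis;

    ShardRegistry(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    /** Records that a node was alive at the given time. */
    void heartbeat(String nodeId, long nowMillis) {
        lastBeatMillis.put(nodeId, nowMillis);
    }

    /** Returns nodes whose last heartbeat is within the TTL, sorted so all nodes agree on the order. */
    List<String> liveNodes(long nowMillis) {
        TreeSet<String> live = new TreeSet<>();
        for (Map.Entry<String, Long> entry : lastBeatMillis.entrySet()) {
            if (nowMillis - entry.getValue() <= ttlMillis) {
                live.add(entry.getKey());
            }
        }
        return new ArrayList<>(live);
    }

    /** Task items (0..itemCount-1) owned by nodeId: item i belongs to live node i % liveCount. */
    List<Integer> itemsFor(String nodeId, int itemCount, long nowMillis) {
        List<String> live = liveNodes(nowMillis);
        List<Integer> items = new ArrayList<>();
        int index = live.indexOf(nodeId);
        if (index < 0) {
            return items; // node has expired or never registered
        }
        for (int i = index; i < itemCount; i += live.size()) {
            items.add(i);
        }
        return items;
    }
}
```

With two live nodes and five task items, the first node in sorted order owns items 0, 2, and 4 and the second owns 1 and 3. When one node stops sending heartbeats and its entry expires, the remaining node picks up all five items on its next recalculation.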
Future plans:
Adopt Quartz (or EasyJob) as the trigger to support richer scheduling configurations, fully replacing TBSchedule.
Provide a visual monitoring interface for task execution status.
JD Retail Technology
Official platform of JD Retail Technology, delivering insightful R&D news and a deep look into the lives and work of technologists.
