Implementing a Lightweight Distributed Scheduling Solution to Replace TBSchedule

To improve stability and reduce costs during high-traffic events, we replaced the ZooKeeper-dependent TBSchedule framework with a lightweight, Redis-based distributed scheduler. It decentralizes task execution, uses thread pools instead of timers, and supports dynamic scaling and seamless degradation for reliable order processing.

JD Retail Technology

Our project originally used the open-source distributed scheduling framework TBSchedule, which depends on ZooKeeper. Because of ZooKeeper instability and a lack of maintenance, tasks occasionally stopped, were lost, or failed to execute. This was especially problematic during high-traffic events such as the 618 promotion.

To achieve stable operation with minimal modification cost, we designed a lightweight scheduling solution that can be degraded seamlessly.

Features of the original TBSchedule we leveraged:

Cluster and distributed support

Flexible task sharding

Dynamic service scaling and resource reclamation

Configurable and adjustable thread count per task

Our implementation goals:

Decentralize scheduling so each subsystem manages its own tasks, improving stability.

Replace ZooKeeper with Redis as the registration center and for dynamic sharding.

Use a thread pool instead of a timer for thread control.

Automatically invoke TBSchedule internal methods to keep existing interfaces unchanged, avoiding code changes in business systems.

Support dynamic thread‑count adjustment to increase processing capacity.
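The thread-pool and dynamic thread-count goals can be illustrated with a minimal sketch. It is not the authors' implementation: the class name ResizableTaskPool and its methods are hypothetical, and only standard java.util.concurrent calls are used. A single scheduled-executor thread takes the place of a timer, and a ThreadPoolExecutor runs the work and can be resized at runtime.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical class for illustration; not part of TBSchedule or the authors' code.
final class ResizableTaskPool {
    // One scheduler thread decides when to fire; it stands in for java.util.Timer.
    private final ScheduledExecutorService trigger =
            Executors.newSingleThreadScheduledExecutor();
    // Worker threads that run the task; their number can change at runtime.
    private final ThreadPoolExecutor workers;

    ResizableTaskPool(int threads) {
        workers = new ThreadPoolExecutor(threads, threads, 60L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>());
    }

    // Fire the job repeatedly, handing each run to the worker pool.
    void start(Runnable job, long delayMillis) {
        trigger.scheduleWithFixedDelay(() -> workers.execute(job),
                0L, delayMillis, TimeUnit.MILLISECONDS);
    }

    // Change the worker count without restarting the pool.
    synchronized void resize(int threads) {
        if (threads < 1) {
            throw new IllegalArgumentException("threads must be >= 1");
        }
        if (threads > workers.getMaximumPoolSize()) {
            workers.setMaximumPoolSize(threads); // growing: raise the ceiling first
            workers.setCorePoolSize(threads);
        } else {
            workers.setCorePoolSize(threads);    // shrinking: lower the core first
            workers.setMaximumPoolSize(threads);
        }
    }

    int threadCount() {
        return workers.getCorePoolSize();
    }

    void shutdown() {
        trigger.shutdownNow(); // stop firing before the workers stop accepting jobs
        workers.shutdown();
    }

    public static void main(String[] args) throws InterruptedException {
        ResizableTaskPool pool = new ResizableTaskPool(2);
        pool.start(() -> System.out.println(
                "run on " + Thread.currentThread().getName()), 100L);
        pool.resize(8); // e.g. raised ahead of a traffic peak
        Thread.sleep(350L);
        pool.shutdown();
    }
}
```

The order of the two setter calls in resize matters because ThreadPoolExecutor rejects a maximum pool size below the core size, and newer JDKs also reject a core size above the maximum.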

Solution implementation: (image omitted)
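Since the diagram is not available, the following sketch shows one way heartbeat-based registration and dynamic sharding could fit together. Everything in it is an assumption made for illustration: the class ShardRegistry, the TTL, and the round-robin assignment rule are hypothetical, and an in-memory map stands in for Redis so the example runs on its own. In a Redis-backed version the heartbeat would presumably be a key written with an expiry.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for a Redis-backed registry.
final class ShardRegistry {
    private final Map<String, Long> lastHeartbeat = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final int shardCount;

    ShardRegistry(long ttlMillis, int shardCount) {
        this.ttlMillis = ttlMillis;
        this.shardCount = shardCount;
    }

    // Each node calls this periodically to announce that it is alive.
    void heartbeat(String nodeId, long nowMillis) {
        lastHeartbeat.put(nodeId, nowMillis);
    }

    // Nodes whose last heartbeat is within the TTL, in a stable sorted order.
    List<String> liveNodes(long nowMillis) {
        List<String> live = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastHeartbeat.entrySet()) {
            if (nowMillis - e.getValue() <= ttlMillis) {
                live.add(e.getKey());
            }
        }
        Collections.sort(live);
        return live;
    }

    // Round-robin assignment: shard s belongs to live node number (s mod n).
    List<Integer> shardsFor(String nodeId, long nowMillis) {
        List<String> live = liveNodes(nowMillis);
        List<Integer> owned = new ArrayList<>();
        int index = live.indexOf(nodeId);
        if (index < 0) {
            return owned; // an expired or unknown node owns nothing
        }
        for (int shard = index; shard < shardCount; shard += live.size()) {
            owned.add(shard);
        }
        return owned;
    }

    public static void main(String[] args) {
        ShardRegistry registry = new ShardRegistry(3000L, 8);
        registry.heartbeat("node-a", 0L);
        registry.heartbeat("node-b", 0L);
        registry.heartbeat("node-c", 0L);
        System.out.println("three nodes: " + registry.shardsFor("node-a", 1000L));
        // node-c stops sending heartbeats; the others keep going.
        registry.heartbeat("node-a", 2000L);
        registry.heartbeat("node-b", 2000L);
        System.out.println("two nodes:   " + registry.shardsFor("node-a", 4000L));
    }
}
```

Because every node sorts the same live set and applies the same rule, each one can compute its own shards without a central coordinator. Shards from a node that stops sending heartbeats are picked up by the survivors on their next recalculation.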

Results:

Quick integration via XML configuration and jar dependency.

Core TBSchedule functionalities are fully replicated with no migration cost.

Combined with ducc configuration and an emergency plan, the system supports manual or automatic degradation, ensuring seamless operation during promotions.

Deployed online with stable dynamic sharding, timely heartbeat checks, and on‑demand degradation, preventing order backlog caused by ZooKeeper fluctuations.
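Manual and automatic degradation can be pictured as a small switch, sketched below under stated assumptions. The class DegradeSwitch, the consecutive-failure threshold, and the recovery rule are all hypothetical. The source does not describe how degradation is triggered or which scheduler acts as the fallback, so the sketch only tracks the decision itself. The manual flag is the kind of value a configuration center could push.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical switch for illustration; the trigger rule is an assumption.
final class DegradeSwitch {
    private final int failureThreshold;
    private final AtomicInteger consecutiveFailures = new AtomicInteger();
    // Manual override, e.g. flipped by an operator through a config center.
    private volatile boolean manualOverride;

    DegradeSwitch(int failureThreshold) {
        if (failureThreshold < 1) {
            throw new IllegalArgumentException("failureThreshold must be >= 1");
        }
        this.failureThreshold = failureThreshold;
    }

    void setManualOverride(boolean on) {
        manualOverride = on;
    }

    // A successful health check clears the automatic trigger.
    void recordSuccess() {
        consecutiveFailures.set(0);
    }

    // A failed health check moves the switch closer to automatic degradation.
    void recordFailure() {
        consecutiveFailures.incrementAndGet();
    }

    // Degraded if an operator forced it or failures reached the threshold.
    boolean isDegraded() {
        return manualOverride || consecutiveFailures.get() >= failureThreshold;
    }

    public static void main(String[] args) {
        DegradeSwitch sw = new DegradeSwitch(3);
        for (int i = 0; i < 3; i++) {
            sw.recordFailure(); // e.g. three missed health checks in a row
        }
        String path = sw.isDegraded() ? "fallback scheduler" : "primary scheduler";
        System.out.println("tasks run on the " + path);
    }
}
```

Callers would consult isDegraded() before each scheduling round and route the task to whichever path is healthy, so switching needs no restart.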

Future plans:

Adopt Quartz (or EasyJob) as the trigger to support richer scheduling configurations, fully replacing TBSchedule.

Provide a visual monitoring interface for task execution status.
