Redesign of the Signal System for Task Scheduling and Dependency Management
This article explains the shortcomings of the legacy signal design in a scheduling platform, outlines four major dependency problems, and presents a newly engineered signal system with modular functions, instance ID generation, competitive priority rules, and state management to reliably support complex cross‑period and parallel job dependencies.
Background
Signals represent information flows, often expressed as functions of time or location, and are crucial in the 58 scheduling system to control task execution order and ensure correct data flow.
Problems with the Old Signal Design
The previous design generated a single signal for any task regardless of execution time, leading to four key issues:
Tasks spanning multiple days cause dependency chaos.
Re‑running historical tasks interferes with current workflows.
Historical dependency chains cannot run in parallel.
Cross‑period or fine‑grained (hour/minute) dependencies are unsupported.
New Signal System
The redesigned system introduces modular signal generation that creates distinct signals for different business time windows and improves the generation workflow.
Signal System Functional Modules
Illustrated by diagrams (images omitted), the system consists of five components that together manage signal creation, storage, and consumption.
Underlying Data Structure
Field               Description
job_id              Task identifier
instance_id         Signal instance identifier
execute_id          Execution identifier
compete_priority    Competitive priority
status              Signal status
The key fields, each discussed in its own section below, are instance_id, compete_priority, and status.
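As a rough illustration, the record above could be modeled as a small Python dataclass. The field names come from the table; the field types, the default status, and the `key()` helper are assumptions made for this sketch, not details confirmed by the source.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class SignalStatus(Enum):
    """The three statuses named in the article."""
    INITED = "INITED"
    OK = "OK"
    FAILED = "FAILED"


@dataclass
class SignalRecord:
    job_id: str            # task identifier
    instance_id: str       # signal instance identifier, e.g. "20240115"
    execute_id: str        # execution identifier
    compete_priority: str  # letter prefix plus timestamp, e.g. "C1705276800"
    status: SignalStatus = SignalStatus.INITED  # assumed default for a new record

    def key(self) -> Tuple[str, str]:
        # Hypothetical helper: one signal per (job, business time window).
        return (self.job_id, self.instance_id)
```

Keying on both job_id and instance_id is what lets the same task carry separate signals for separate business time windows.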
Signal Instance ID
Generated from the scheduling time and period, instance IDs follow formats such as yyyyMMdd for daily cycles, yyyyMMddHH for hourly cycles, and yyyyMMddHHmm for minute cycles, enabling clear separation of cross‑day and cross‑period dependencies.
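A minimal sketch of the formats described above, using Python's `strftime`. The function name `make_instance_id` and the period labels "day", "hour", and "minute" are hypothetical; only the three output formats come from the article.

```python
from datetime import datetime

# yyyyMMdd / yyyyMMddHH / yyyyMMddHHmm expressed as strftime patterns.
INSTANCE_ID_FORMATS = {
    "day": "%Y%m%d",
    "hour": "%Y%m%d%H",
    "minute": "%Y%m%d%H%M",
}


def make_instance_id(schedule_time: datetime, period: str) -> str:
    """Build an instance ID from the scheduling time and the job's period."""
    try:
        fmt = INSTANCE_ID_FORMATS[period]
    except KeyError:
        raise ValueError(f"unsupported period: {period!r}") from None
    return schedule_time.strftime(fmt)


when = datetime(2024, 1, 15, 9, 5)
print(make_instance_id(when, "day"))     # 20240115
print(make_instance_id(when, "hour"))    # 2024011509
print(make_instance_id(when, "minute"))  # 202401150905
```

Because two runs of the same job on different days get different IDs, a re‑run for an earlier date no longer shares a signal with today's run.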
Competitive Priority
Priority determines which job instance is retained when multiple instances compete for the execution queue. The hierarchy is: Re‑run > System schedule > Single run, encoded as a letter prefix plus timestamp (e.g., X+timestamp for re‑run, C+timestamp for system schedule).
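The hierarchy can be sketched as a comparison on a (rank, timestamp) pair. The prefixes X and C come from the article; the single‑run prefix "A" is a placeholder invented for this example, and the rule that a later timestamp wins within the same run type is also an assumption.

```python
from typing import Tuple

RE_RUN = "X"           # from the article
SYSTEM_SCHEDULE = "C"  # from the article
SINGLE_RUN = "A"       # hypothetical prefix; the article does not give one

# Re-run > System schedule > Single run
_RANK = {SINGLE_RUN: 0, SYSTEM_SCHEDULE: 1, RE_RUN: 2}


def encode_priority(prefix: str, timestamp: int) -> str:
    """Encode a priority as a letter prefix followed by a timestamp."""
    if prefix not in _RANK:
        raise ValueError(f"unknown priority prefix: {prefix!r}")
    return f"{prefix}{timestamp}"


def priority_key(compete_priority: str) -> Tuple[int, int]:
    """Split an encoded priority into a comparable (rank, timestamp) pair."""
    prefix, timestamp = compete_priority[0], compete_priority[1:]
    return (_RANK[prefix], int(timestamp))


def winner(a: str, b: str) -> str:
    """Return the encoded priority that is retained when two instances compete."""
    return a if priority_key(a) >= priority_key(b) else b


rerun = encode_priority(RE_RUN, 1700000000)
scheduled = encode_priority(SYSTEM_SCHEDULE, 1800000000)
print(winner(rerun, scheduled))  # X1700000000
```

Comparing by explicit rank rather than by raw string order keeps the hierarchy independent of which letters are chosen as prefixes.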
Signal Status
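A signal is in one of three statuses: INITED, OK, or FAILED. Transitions depend on whether an instance record already exists and on how the incoming job's priority compares with the recorded one: a lower‑priority job is discarded, and a status update is applied only when the two priorities are equal.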
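One way to read these rules is as a small in‑memory store, sketched below. The class `SignalStore`, its method names, and the choice to model priorities as plain (rank, timestamp) tuples are all assumptions for illustration. The article also does not say what happens when a higher‑ or equal‑priority job arrives for an existing record; this sketch assumes it takes over the record and resets it to INITED.

```python
from typing import Dict, Optional, Tuple

Priority = Tuple[int, int]  # (rank, timestamp); higher compares greater
Key = Tuple[str, str]       # (job_id, instance_id)


class SignalStore:
    def __init__(self) -> None:
        self._records: Dict[Key, dict] = {}

    def register(self, job_id: str, instance_id: str, priority: Priority) -> bool:
        """Create or take over the signal record; return False if discarded."""
        key = (job_id, instance_id)
        current = self._records.get(key)
        if current is not None and priority < current["priority"]:
            return False  # lower-priority job is discarded
        # No record yet, or the incoming job has equal/higher priority (assumed).
        self._records[key] = {"priority": priority, "status": "INITED"}
        return True

    def update_status(self, job_id: str, instance_id: str,
                      priority: Priority, status: str) -> bool:
        """Apply OK/FAILED only when the caller's priority equals the record's."""
        if status not in ("OK", "FAILED"):
            raise ValueError(f"unexpected status: {status!r}")
        current = self._records.get((job_id, instance_id))
        if current is None or current["priority"] != priority:
            return False
        current["status"] = status
        return True

    def status_of(self, job_id: str, instance_id: str) -> Optional[str]:
        record = self._records.get((job_id, instance_id))
        return record["status"] if record else None
```

Under these assumptions, a scheduled run that finishes after a re‑run has claimed the same instance cannot overwrite the re‑run's status, because its priority no longer matches the record.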
Practical Implementation
Dependency checks now incorporate dynamically generated upstream instance IDs based on the current job’s schedule, crontab expression, and custom dependency settings. The UI provides default and custom dependency configurations, with custom options allowing selection of specific hours or days for upstream jobs.
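The sketch below shows how upstream instance IDs might be derived from the current job's scheduling time and a custom dependency setting. It is deliberately simplified: crontab parsing is left out, and the function names, the `day_offsets` and `hours` parameters, and the default of "same day, same hour" are assumptions rather than the platform's actual behavior.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Optional, Sequence, Set


def upstream_instance_ids(schedule_time: datetime,
                          upstream_period: str,
                          day_offsets: Sequence[int] = (0,),
                          hours: Optional[Iterable[int]] = None) -> List[str]:
    """List the upstream instance IDs the current job must wait for.

    day_offsets: days relative to the current job's schedule (e.g. -1 = yesterday).
    hours: for hourly upstream jobs, the hours selected in the custom setting;
           None falls back to the current job's own hour (assumed default).
    """
    selected_hours = [schedule_time.hour] if hours is None else list(hours)
    ids: List[str] = []
    for offset in day_offsets:
        day = schedule_time + timedelta(days=offset)
        if upstream_period == "day":
            ids.append(day.strftime("%Y%m%d"))
        elif upstream_period == "hour":
            for hour in selected_hours:
                ids.append(day.replace(hour=hour).strftime("%Y%m%d%H"))
        else:
            raise ValueError(f"unsupported period: {upstream_period!r}")
    return ids


def dependencies_met(required_ids: Iterable[str], ok_ids: Set[str]) -> bool:
    """True when every required upstream instance has an OK signal."""
    return all(instance_id in ok_ids for instance_id in required_ids)


run_at = datetime(2024, 1, 15, 6, 0)
needed = upstream_instance_ids(run_at, "hour", day_offsets=(-1,), hours=[0, 23])
print(needed)                                    # ['2024011400', '2024011423']
print(dependencies_met(needed, {"2024011400"}))  # False
```

Here a daily job scheduled for 2024‑01‑15 waits on two specific hourly instances from the previous day, which is the kind of cross‑period dependency the old single‑signal design could not express.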
Key Considerations
As cross‑departmental dependencies and data re‑runs increase, re‑runs need higher‑priority handling to avoid conflicts with regular schedules.
Instance IDs isolate re‑run data from normal schedules.
Parallel‑enabled tasks allow simultaneous execution of different instances; non‑parallel tasks enforce serial execution.
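The parallel versus serial rule in the last point can be sketched as a single admission check. The function `can_start` and the representation of running work as a set of (job_id, instance_id) pairs are hypothetical; the sketch also assumes that the same instance is never started twice at once.

```python
from typing import Set, Tuple


def can_start(job_id: str,
              instance_id: str,
              running: Set[Tuple[str, str]],
              parallel_enabled: bool) -> bool:
    """Decide whether a new instance of a job may start now."""
    running_instances = {inst for (jid, inst) in running if jid == job_id}
    if instance_id in running_instances:
        return False  # assumed: the same instance never runs twice at once
    if parallel_enabled:
        return True   # different instances of the job may run side by side
    return not running_instances  # serial: wait until the job is idle


running = {("job_a", "20240114")}
print(can_start("job_a", "20240115", running, parallel_enabled=True))   # True
print(can_start("job_a", "20240115", running, parallel_enabled=False))  # False
```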
Conclusion
The new signal system resolves the four identified problems, ensuring accurate dependency handling for re‑runs and cross‑period tasks, while highlighting the need for careful configuration to prevent deadlocks or unintended job cancellations.
58 Tech
Official tech channel of 58, a platform for tech innovation, sharing, and communication.
