Design and Implementation of an Online‑Offline Task Scheduling System for Baidu’s Mobile Operations Promotion Platform
The authors redesign Baidu’s Mobile Operations Promotion Platform by separating online business logic from offline warehouse calculations and implementing a custom three‑step online‑offline scheduler that logs operations, orchestrates batch tasks, and dispatches them via TDS, delivering consistent, timely settlement data, reduced errors, and lower maintenance costs.
The Mobile Operations Promotion Platform (OPS) at Baidu handles the end‑to‑end online settlement and budget control for internal mobile app and search user‑growth businesses, serving multiple business lines for over a decade.
As the user‑growth business expanded, settlement data volume and approval workload grew and exposed the limits of the legacy architecture: outdated data‑layer technology, no offline data warehouse, and reliance on PHP/Java scripts that used MySQL for intermediate storage. The result was data‑tampering risk, difficult audits, and frequent settlement data anomalies that caused business losses.
To address these issues, the authors propose an online‑offline hybrid task‑scheduling solution that separates platform concerns (business data entry and state transitions) from data‑warehouse responsibilities (settlement metric and amount calculations).
The redesign involves domain‑level field separation, a unified dispatch entry via a dedicated scheduling system, and classification of triggering events into platform‑driven and external‑data‑driven categories, each routed through the appropriate scheduler.
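The event classification described above can be pictured as a small routing table. The following is a minimal sketch, not the platform's actual code: the enum values, the `TriggerEvent` type, and both scheduler names are hypothetical, and which scheduler serves which category is an assumption made only for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class TriggerSource(Enum):
    PLATFORM = "platform"            # an action taken inside the platform
    EXTERNAL_DATA = "external_data"  # upstream data becoming available


@dataclass(frozen=True)
class TriggerEvent:
    source: TriggerSource
    name: str


# Hypothetical mapping; the source does not name the schedulers.
ROUTES: Dict[TriggerSource, str] = {
    TriggerSource.PLATFORM: "online_offline_scheduler",
    TriggerSource.EXTERNAL_DATA: "warehouse_scheduler",
}


def route(event: TriggerEvent) -> str:
    """Return the name of the scheduler configured for the event's category."""
    try:
        return ROUTES[event.source]
    except KeyError:
        raise ValueError(f"no scheduler configured for {event.source}") from None
```

Keeping the mapping in one table gives every trigger a single dispatch entry, which is the property the redesign aims for.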
The core of the solution is a custom scheduling system that coordinates offline calculation tasks, ensuring data consistency and timeliness while preserving platform flexibility. It operates in three steps: (1) operation logging into MySQL, (2) batch‑oriented task orchestration that generates executable instances and associated offline data files, and (3) task dispatch via TDS OpenAPI with status write‑back to the platform.
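The three steps can be sketched roughly as follows. This is an illustration under stated assumptions rather than the authors' implementation: an in-memory list stands in for the MySQL operation log, grouping by business group is an assumed batching rule, and the `submit` callable is a hypothetical placeholder for the TDS OpenAPI call, whose real interface is not described in the source.

```python
from dataclasses import dataclass
from itertools import count
from typing import Callable, Dict, List


@dataclass
class Operation:
    op_id: int
    business_group: str
    action: str
    status: str = "PENDING"


class OperationLog:
    """In-memory stand-in for the MySQL operation log (step 1)."""

    def __init__(self) -> None:
        self._ids = count(1)
        self.rows: List[Operation] = []

    def record(self, business_group: str, action: str) -> Operation:
        op = Operation(next(self._ids), business_group, action)
        self.rows.append(op)
        return op

    def pending(self) -> List[Operation]:
        return [op for op in self.rows if op.status == "PENDING"]


def build_batch(ops: List[Operation]) -> Dict[str, List[Operation]]:
    """Step 2: group pending operations, one batch entry per business group (assumed rule)."""
    batch: Dict[str, List[Operation]] = {}
    for op in ops:
        batch.setdefault(op.business_group, []).append(op)
    return batch


def dispatch(batch: Dict[str, List[Operation]],
             submit: Callable[[str, List[int]], bool]) -> None:
    """Step 3: submit each entry and write the resulting status back to the log rows."""
    for group, ops in batch.items():
        ok = submit(group, [op.op_id for op in ops])  # hypothetical dispatch call
        for op in ops:
            op.status = "DONE" if ok else "FAILED"


log = OperationLog()
log.record("app_growth", "approve_settlement")
log.record("app_growth", "adjust_budget")
log.record("search_growth", "approve_settlement")

# The stub pretends every submission for "search_growth" fails.
dispatch(build_batch(log.pending()), submit=lambda group, ids: group != "search_growth")
statuses = [(op.op_id, op.status) for op in log.rows]
```

Because status is written back onto the logged operations, a failed batch stays visible on the platform side and can be retried or escalated.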
Task orchestration includes operation aggregation, task‑instance construction using TaskContext (data range, task type, task tuple), and dependency management. Dependencies are either implicit (prerequisite sync tasks) or explicit (business‑group ordering), and resolving them produces a deterministic execution order.
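One way to obtain a deterministic order from such dependencies is a topological sort with a fixed tie-break. The sketch below assumes a shape for `TaskContext` based only on the three fields named above; the field types, the task names, and the choice of Kahn's algorithm with a min-heap are illustrative assumptions, not details confirmed by the source.

```python
import heapq
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class TaskContext:
    data_range: Tuple[str, str]  # assumed to be (start_date, end_date)
    task_type: str               # e.g. "sync" or "calc" (hypothetical values)
    task_tuple: Tuple[str, ...]  # assumed identifier of the business object


def execution_order(tasks: Dict[str, TaskContext],
                    deps: Dict[str, Set[str]]) -> List[str]:
    """Kahn's algorithm; a min-heap breaks ties by task name so the order is repeatable.

    `deps` maps a task name to the names it must wait for. Every name in
    `deps` must also be a key of `tasks`.
    """
    indegree = {name: 0 for name in tasks}
    children: Dict[str, List[str]] = {name: [] for name in tasks}
    for name, prereqs in deps.items():
        for prereq in prereqs:
            indegree[name] += 1
            children[prereq].append(name)

    ready = [name for name, degree in indegree.items() if degree == 0]
    heapq.heapify(ready)
    order: List[str] = []
    while ready:
        name = heapq.heappop(ready)
        order.append(name)
        for child in children[name]:
            indegree[child] -= 1
            if indegree[child] == 0:
                heapq.heappush(ready, child)

    if len(order) != len(tasks):
        raise ValueError("dependency cycle detected")
    return order


day = ("2024-01-01", "2024-01-01")
tasks = {
    "calc_group_b": TaskContext(day, "calc", ("group_b",)),
    "calc_group_a": TaskContext(day, "calc", ("group_a",)),
    "sync_orders": TaskContext(day, "sync", ("orders",)),
}
deps = {
    "calc_group_a": {"sync_orders"},                  # implicit: sync runs first
    "calc_group_b": {"sync_orders", "calc_group_a"},  # explicit: group a before group b
}
order = execution_order(tasks, deps)
```

Here `order` is `["sync_orders", "calc_group_a", "calc_group_b"]` regardless of the insertion order of `tasks`, and a cyclic dependency raises an error instead of producing a partial plan.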
The scheduling component handles task execution and state synchronization, task management through batch views or the TDS UI, and monitoring, with alerts for anomalies that require manual intervention.
By decoupling online and offline logic, migrating calculation logic to the data warehouse, and unifying task scheduling through TDS, the platform achieves accurate, version‑controlled settlement data, improved timeliness, simplified operations, and reduced maintenance costs, effectively supporting the rapid growth of Baidu’s online user‑growth settlement business.
Baidu Geek Talk