Design and Implementation of an Online/Offline Integrated Task Scheduling System for Baidu's Mobile Operations Promotion Platform (OPS)
This article describes the redesign of Baidu's Mobile Operations Promotion Platform (OPS). The new online-offline integrated task-scheduling architecture assigns settlement-related fields to the data warehouse, records every job in a unified MySQL operation table, orchestrates jobs through Turing Data Studio (TDS), and manages task dependencies explicitly, with the goal of consistent, auditable settlement processing at billion-scale volumes.
The Mobile Operations Promotion Platform (OPS) at Baidu handles the end‑to‑end online settlement and budget control for internal mobile app and search services. Rapid business growth, an aging technical stack, and the lack of an offline data warehouse caused data anomalies, audit risks, and performance bottlenecks.
To address these issues, the article proposes a hybrid online‑offline task scheduling solution. The new architecture separates business‑level data entry and status management (online side) from settlement calculations (offline data‑warehouse side), leveraging Turing Data Studio (TDS) for unified task orchestration.
Key redesign points include:
1. Domain-level field partitioning: The data-warehouse layer owns settlement-related fields, while the platform layer manages presentation and status fields.
2. Unified scheduling entry: Both routine and event-driven jobs are recorded as business operations and dispatched by a dedicated scheduling subsystem.
3. Logical separation of triggers: Business-driven changes trigger platform-initiated tasks; external data changes trigger warehouse-initiated tasks via TDS's back-tracking capabilities.
The workflow consists of three stages:
• Operation Insertion: Business actions (e.g., anti-fraud recalculation, bill recompute) are written to a MySQL operation table. The volume is low (fewer than 100 operations per month), so no caching layer is introduced.
• Task Orchestration: A batch job aggregates operations by data-time and task type, creates task instances, and generates a globally incrementing batch ID. Each task context defines the data range, the task type (fixed vs. business-specific), and the task tuples used for TDS integration.
• Task Scheduling: Orchestrated tasks are transformed into TDS schemas and launched via the TDS OpenAPI. Task status is written back to the platform for monitoring.
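The orchestration stage above can be illustrated with a minimal Python sketch. It is not the platform's actual code: the `Operation` and `TaskInstance` classes, their field names, and the in-process counter standing in for the globally incrementing batch ID are all assumptions made for illustration, as is the choice to issue one batch ID per orchestration run.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Operation:
    """One row of the operation table (hypothetical field names)."""
    op_id: int
    data_time: str   # data period the operation affects, e.g. "2024-05"
    task_type: str   # e.g. "anti_fraud_recalc" or "bill_recompute"


@dataclass
class TaskInstance:
    """One orchestrated task covering all operations of a (data_time, task_type) pair."""
    batch_id: int
    data_time: str
    task_type: str
    op_ids: list


# Assumption: a process-local counter stands in for the real ID source,
# which would have to be durable and shared (for example a database sequence).
_batch_ids = count(1)


def orchestrate(operations):
    """Aggregate pending operations into one task instance per (data_time, task_type)."""
    groups = defaultdict(list)
    for op in operations:
        groups[(op.data_time, op.task_type)].append(op.op_id)

    batch_id = next(_batch_ids)  # assumed: one batch ID per orchestration run
    return [
        TaskInstance(batch_id, data_time, task_type, sorted(op_ids))
        for (data_time, task_type), op_ids in sorted(groups.items())
    ]


if __name__ == "__main__":
    pending = [
        Operation(1, "2024-05", "bill_recompute"),
        Operation(2, "2024-05", "bill_recompute"),
        Operation(3, "2024-04", "anti_fraud_recalc"),
    ]
    for task in orchestrate(pending):
        print(task)
```

Grouping by data-time and task type means that two bill recomputes requested for the same period collapse into a single task instance, so the warehouse job runs once per period rather than once per request.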
Dependency management distinguishes between implicit (system-required) and explicit (business-required) dependencies. Together they form a dependency tree that enforces the correct execution order. Multi-upstream dependencies are not supported, so each task has at most one upstream, which keeps the system simple.
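Because every task has at most one upstream, the dependency structure is a tree (or forest), and an execution order can be derived with a simple breadth-first walk from the roots. The sketch below assumes a hypothetical `upstream_of` mapping and made-up task names; it does not reflect how the platform or TDS stores dependencies.

```python
from collections import defaultdict, deque


def execution_order(upstream_of):
    """Return tasks in an order where every task runs after its upstream.

    `upstream_of` maps each task name to its single upstream task,
    or to None when the task has no upstream (a root of the tree).
    """
    children = defaultdict(list)
    roots = []
    for task, parent in upstream_of.items():
        if parent is None:
            roots.append(task)
        elif parent not in upstream_of:
            raise ValueError(f"unknown upstream {parent!r} for task {task!r}")
        else:
            children[parent].append(task)

    order = []
    queue = deque(sorted(roots))
    while queue:
        task = queue.popleft()
        order.append(task)
        queue.extend(sorted(children[task]))

    # Tasks caught in a cycle are never reachable from a root.
    if len(order) != len(upstream_of):
        raise ValueError("dependency cycle detected")
    return order


if __name__ == "__main__":
    # Hypothetical task names, for illustration only.
    deps = {
        "anti_fraud_recalc": None,
        "bill_recompute": "anti_fraud_recalc",
        "bill_publish": "bill_recompute",
    }
    print(execution_order(deps))
```

Restricting each task to one upstream is what makes this walk sufficient; allowing several upstreams would turn the tree into a general DAG and require a full topological sort with in-degree tracking.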
Benefits of the redesign include improved data consistency, auditability, reduced operational overhead, and the ability to handle billion‑scale monthly settlement bills with reliable accuracy.
Baidu Tech Salon
Baidu Tech Salon, organized by Baidu's Technology Management Department, is a monthly offline event that shares cutting‑edge tech trends from Baidu and the industry, providing a free platform for mid‑to‑senior engineers to exchange ideas.