Building a Unified Scheduling Center with Apache DolphinScheduler: Lenovo’s Practice

This article details Lenovo’s implementation of a unified scheduling center using Apache DolphinScheduler, covering background requirements, reasons for choosing the platform, architectural evolution, feature enhancements, and practical deployments such as HTTP task parameter passing, Java task plugins, global parameters, and future roadmap.

DataFunSummit

Mar 6, 2023

1. Background Requirements

Lenovo needed a unified scheduling center to manage thousands of scheduled tasks of many kinds across backend development, data analysis, and operations. The center had to support a rich set of task types, task lifecycle management, SLA guarantees, and isolation between business lines.

2. Why Choose Apache DolphinScheduler

Compared with XXL-Job and Apache Airflow, DolphinScheduler offers a cloud-native, distributed architecture, visual DAG orchestration, broad task type support (about 20 types), easy API integration, and an active community, which made it the preferred choice.

3. DolphinScheduler 2.x Feature Overview

Core features include visual DAG editing, logical and physical tasks, workflow definitions and instances, task instance monitoring, and status management. New in 2.x are passing task results as parameters, workflow lineage, data synchronization components, splitting of the workflow-task relationship, and version control.

4. Architecture Evolution

The platform evolved from 1.2 (Master/Worker with a ZooKeeper queue) to 1.3 (Netty communication, reduced latency) and then to 2.x (refactored master, SPI plugin design, direct API communication). The 2.0 refactor removed distributed locks and optimized the master thread pools with event-driven task handling.

5. Community Development

DolphinScheduler has grown rapidly within the Apache ecosystem, with rising numbers of contributors and committers and improving community metrics. It also supports Kubernetes deployment for cloud-native scalability.

6. Lenovo’s Practical Deployments

HTTP task parameter passing: added OUT parameters to enable downstream tasks to consume previous task results.

Java task plugin: provided an SDK and annotation‑based executor to migrate XXL‑Job tasks into DolphinScheduler.

Project‑wide global parameters: defined a hierarchy of task custom parameters > workflow global parameters > project global parameters, with launch‑time overrides.

Internal authentication integration: implemented two internal auth mechanisms.
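The parameter hierarchy described above (task custom parameters > workflow global parameters > project global parameters) can be illustrated with a minimal Python sketch. This is not DolphinScheduler's or Lenovo's actual code: the function `resolve_parameters` and all parameter names and values are hypothetical, and the sketch assumes that launch-time overrides take precedence over every other scope, which the summary does not state explicitly.

```python
from collections import ChainMap


def resolve_parameters(task_params, workflow_params, project_params, launch_overrides=None):
    """Merge parameter scopes; on key conflicts the earlier scope in the chain wins."""
    # Assumption for this sketch: launch-time overrides sit above all other scopes.
    chain = ChainMap(
        launch_overrides or {},  # values supplied when the workflow is started
        task_params,             # task custom parameters
        workflow_params,         # workflow global parameters
        project_params,          # project global parameters
    )
    return dict(chain)


# Hypothetical values, chosen only to show each scope winning once.
resolved = resolve_parameters(
    task_params={"batch_size": "500"},
    workflow_params={"batch_size": "100", "region": "cn"},
    project_params={"region": "global", "owner": "data-platform"},
    launch_overrides={"owner": "on-call"},
)
print(resolved)
```

Here `batch_size` comes from the task, `region` from the workflow, and `owner` from the launch-time override, so a project-level default only applies when no narrower scope defines the same key.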

7. 3.x New Features & Roadmap

Version 3.x introduced a UI redesign, built-in data quality checks, task groups for concurrency control, and multiple workflow execution strategies (parallel, serial-wait, etc.). The roadmap includes further pluginization of triggers, removal of ZooKeeper, deeper DataOps integration, and enhanced cloud-native capabilities.
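The idea behind task groups for concurrency control, capping how many tasks from one group run at the same time, can be sketched with a semaphore. This is a conceptual illustration in Python, not DolphinScheduler's implementation: the `TaskGroup` class, its `capacity` argument, and `fake_task` are hypothetical names introduced for the example.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class TaskGroup:
    """Hypothetical task group that limits how many of its tasks run at once."""

    def __init__(self, name, capacity):
        self.name = name
        self._slots = threading.BoundedSemaphore(capacity)
        self._lock = threading.Lock()
        self._running = 0
        self.peak_running = 0  # highest concurrency observed, for inspection

    def run(self, task, *args):
        with self._slots:  # blocks until one of the group's slots is free
            with self._lock:
                self._running += 1
                self.peak_running = max(self.peak_running, self._running)
            try:
                return task(*args)
            finally:
                with self._lock:
                    self._running -= 1


def fake_task(task_id):
    time.sleep(0.05)  # stand-in for real work such as a database write
    return task_id


group = TaskGroup("etl-db-writes", capacity=2)
with ThreadPoolExecutor(max_workers=6) as pool:
    # Six tasks are submitted at once, but the group admits at most two at a time.
    results = list(pool.map(lambda i: group.run(fake_task, i), range(6)))

print(results, group.peak_running)
```

Even though the thread pool has six workers, the group's semaphore keeps the observed concurrency at or below its capacity, which is the kind of protection a task group gives a shared resource.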

8. Q&A

The Q&A covered Kubernetes support, containerized tasks, and future directions for real-time streaming workloads.
