Big Data 38 min read

Inside Alibaba’s Fuxi DAG 2.0: Boosting Big Data Workloads with Dynamic Scheduling

Alibaba’s Fuxi DAG 2.0 redesign separates logical and physical graphs, introduces dynamic scheduling, unified offline and near‑real‑time execution, and a flexible bubble mode, enabling massive big‑data jobs to run up to five times faster while dramatically reducing resource waste.

Alibaba Cloud Developer
Alibaba Cloud Developer
Alibaba Cloud Developer
Inside Alibaba’s Fuxi DAG 2.0: Boosting Big Data Workloads with Dynamic Scheduling

Preface

In Alibaba’s massive big‑data ecosystem, the Fuxi system manages tens of thousands of physical machines and millions of CPU/GPU cores, handling tens of millions of daily jobs that process exabyte‑scale data.

Background

1 Fuxi DAG/AM Component

The DAG component (also called Application Master, AM) is the central management point of each distributed job. It coordinates execution, interacts with the Resource Manager for resource allocation, and communicates with the compute engine to collect execution feedback.

Fuxi DAG architecture overview
Fuxi DAG architecture overview

2 Logical and Physical Graphs

Distributed jobs can be described by two graph layers: a logical graph that reflects the user‑desired data‑flow, and a physical graph that maps this logic onto the actual cluster resources (concurrency, data transfer methods, etc.).

Logical vs Physical graph
Logical vs Physical graph

3 Why Upgrade to DAG 2.0?

DAG 1.0, built on a Map‑Reduce‑like model, lacks clear separation between logical and physical graphs, making it hard to support dynamic execution and multiple compute modes. Upgrading to DAG 2.0 enables dynamic adjustments during job execution, better resource utilization, and support for new workloads such as Parameter‑Server‑based AI training.

DAG 2.0 Architecture and Overall Design

1 Dynamic Job Execution

Traditional static plans fix concurrency before submission. DAG 2.0 can adjust concurrency on‑the‑fly based on intermediate data size, reducing wasted CPU cycles and avoiding long‑tail stages.

Dynamic concurrency adjustment
Dynamic concurrency adjustment

2 Unified AM/DAG Execution Framework

DAG 2.0 unifies offline batch jobs and near‑real‑time (smode) jobs on a single code base, allowing mixed “bubble” execution where parts of a job run with gang scheduling while others run on‑demand.

Unified execution model
Unified execution model

1) Unified Offline and Near‑Real‑Time Framework

Offline jobs request resources per node; near‑real‑time jobs use gang scheduling with network‑directed data pipelines. The unified model enables a hybrid “bubble” mode that balances performance and resource usage.

Bubble execution example
Bubble execution example

2) Supporting New Compute Modes

For Parameter‑Server (PS) workloads, DAG 2.0 introduces concurrent edges that allow PS and workers to run simultaneously while still being managed by the AM.

Dynamic PS DAG
Dynamic PS DAG

Integration with Upper‑Level Compute Engines

1 Dynamic Adjustments During Execution

Examples include dynamic concurrency scaling for Map‑Reduce style jobs, predicate push‑down to shrink join input size, and early termination for LIMIT queries.

SELECT * FROM tpch_lineitem WHERE l_orderkey > 0 LIMIT 5;

Dynamic LIMIT optimization reduces mapper/reducer count dramatically, achieving >5× speedup and hundreds‑fold resource savings.

LIMIT optimization performance
LIMIT optimization performance

2 Dynamic Logical Graph Adjustments

Conditional join allows the optimizer to emit two alternative logical plans (map‑join vs sorted‑merge join) and let the AM pick the best at runtime based on data statistics.

Conditional join example
Conditional join example

3 Dynamic Resource Configuration

For PAI TensorFlow jobs, a control node predicts the optimal GPU type and can request the AM to adjust resources on‑the‑fly, improving GPU utilization by over 40%.

GPU dynamic allocation
GPU dynamic allocation

Engineering and Rollout

DAG 2.0’s event‑driven scheduler and state‑machine design deliver superior agility for large‑scale jobs. Benchmarks on MAP‑REDUCE, TPC‑DS, and TPCH show significant reductions in CPU‑time and improved stability during peak traffic (e.g., Double‑11/12).

Performance comparison
Performance comparison

Outlook

DAG 2.0 lays the foundation for the next decade of Alibaba’s compute platform, enabling dual dynamic logical/physical graphs, bubble execution, and seamless integration with emerging engines such as TensorFlow, PyTorch, and new streaming models.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

Distributed SystemsBig DataDAGResource ManagementDynamic Scheduling
Alibaba Cloud Developer
Written by

Alibaba Cloud Developer

Alibaba's official tech channel, featuring all of its technology innovations.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.