Mastering Data Workflows with DAGs: Scheduling, Configurable UI, and Visual Design
This article explains how to abstract repetitive data‑report tasks into a standardized workflow, describes the core capabilities of scheduling and configuration, shows how to implement DAG‑based visual editors, and compares similar platforms such as n8n and Orange, offering practical code examples and design insights.
Background
In many daily jobs you need to extract data from a database, transform it into a report, and email it to different leaders. Because each leader may need a different view, the repeated steps can be abstracted into a workflow where each step is a functional node, visualized as a DAG and executed on a schedule.
What Is a Workflow? What Is a DAG?
The following sections introduce the concepts of workflow and Directed Acyclic Graph (DAG) as used in the Universe platform, an intelligent data development platform of Guanyuan.
1. Workflow
Universe provides a workflow engine that supports two core capabilities: scheduling and configuration (node definition).
1.1 Scheduling
The platform supports Cron‑style timed scheduling and event‑driven scheduling based on input data changes. Timed scheduling uses the Quartz distributed scheduler.
Ease of use: DAGs are composed visually, without learning a complex platform language, and scheduling works out of the box.
Supports sequential, success, and failure scheduling strategies.
Supports minute, hour, day, week, month intervals and can push results to DingTalk, Enterprise WeChat, etc.
High reliability: multi‑master and multi‑worker architecture avoids single‑point failures.
High scalability: custom task types can be added via SDK.
1.1.1 Timed Scheduling
Schedules can be set to run every day at specific times (e.g., 7 am and 9 pm) or at fixed minute intervals.
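As an illustration of the daily case, the sketch below computes the next firing time for a schedule that runs at fixed hours each day. The function name nextDailyRun is hypothetical, the hour list is assumed to be non-empty, and the calculation is done in UTC to keep it deterministic; it is not how Quartz works internally. In Quartz cron syntax, the 7 am and 9 pm schedule would be written 0 0 7,21 * * ? and a run every 15 minutes 0 0/15 * * * ?.

```javascript
// Hypothetical helper: next firing time for a "daily at fixed hours" schedule.
// Uses UTC throughout; a real scheduler would apply the workflow's time zone.
function nextDailyRun(now, hours) {
  const sorted = [...hours].sort((a, b) => a - b);
  for (const hour of sorted) {
    const candidate = new Date(Date.UTC(
      now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate(), hour, 0, 0));
    if (candidate > now) return candidate;
  }
  // Every slot for today has passed: use the first slot of the next day.
  // Date.UTC normalizes an out-of-range day, so month ends are handled.
  return new Date(Date.UTC(
    now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate() + 1, sorted[0], 0, 0));
}

const next = nextDailyRun(new Date('2024-01-01T08:30:00Z'), [7, 21]);
console.log(next.toISOString()); // 2024-01-01T21:00:00.000Z
```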
1.1.2 Event Scheduling
If a workflow depends on a data source, it can automatically trigger when the source is fully updated.
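A minimal sketch of such a trigger follows. It generalizes to several upstream sources, which is an assumption beyond the single-source case described above; the class name EventTrigger and the notifyUpdated callback are hypothetical and do not reflect Universe's actual API.

```javascript
// Hypothetical trigger: fires the workflow once every upstream source it
// depends on has reported a completed update since the previous run.
class EventTrigger {
  constructor(sourceIds, onFire) {
    this.sourceIds = [...sourceIds];
    this.pending = new Set(sourceIds); // sources still awaited this round
    this.onFire = onFire;
  }

  // Assumed to be called by the data layer when a source finishes updating.
  // Returns true when this notification caused the workflow to fire.
  notifyUpdated(sourceId) {
    if (!this.pending.delete(sourceId)) return false; // unknown or duplicate
    if (this.pending.size > 0) return false;          // still waiting
    this.pending = new Set(this.sourceIds);           // re-arm for next round
    this.onFire();
    return true;
  }
}

let runs = 0;
const trigger = new EventTrigger(['orders', 'users'], () => { runs += 1; });
trigger.notifyUpdated('orders'); // waits for 'users'
trigger.notifyUpdated('users');  // fires the workflow
console.log(runs); // 1
```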
1.2 Configuration
Based on a declarative configuration object, the platform generates an interactive UI to build target objects. The process reads the configuration, renders corresponding components, and aggregates their values into a final object.
1.2.1 Basic Capability
Target object:
{
  name: '',
  description: ''
}
Corresponding configuration description:
[
  {
    fieldName: 'name',
    label: 'Name',
    type: 'STRING',
    defaultValue: ''
  },
  {
    fieldName: 'description',
    label: 'Description',
    type: 'TEXT',
    defaultValue: ''
  }
]
[Figure: generated UI]
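The read, render, and aggregate steps described in 1.2 can be sketched without any UI framework. The widget names and the functions renderPlan and buildTarget are hypothetical stand-ins, not the platform's implementation.

```javascript
// Definitions mirroring the name/description example.
const definitions = [
  { fieldName: 'name', label: 'Name', type: 'STRING', defaultValue: '' },
  { fieldName: 'description', label: 'Description', type: 'TEXT', defaultValue: '' }
];

// Hypothetical mapping from a field type to the widget that renders it.
const widgetByType = { STRING: 'input', TEXT: 'textarea', NUMBER: 'number-input' };

// Read the configuration and pick a widget for each field.
function renderPlan(defs) {
  return defs.map(d => ({
    fieldName: d.fieldName,
    label: d.label,
    widget: widgetByType[d.type] ?? 'unknown'
  }));
}

// Aggregate user input into the target object, falling back to defaults.
function buildTarget(defs, userValues = {}) {
  const target = {};
  for (const d of defs) {
    target[d.fieldName] =
      d.fieldName in userValues ? userValues[d.fieldName] : d.defaultValue;
  }
  return target;
}

console.log(buildTarget(definitions, { name: 'Daily report' }));
// { name: 'Daily report', description: '' }
```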
1.2.2 Dynamic Capability
The UI can change dynamically based on selected values. For example, selecting a shape (circle or square) shows a different property field: a radius for a circle, a side length for a square.
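One way such conditional visibility could be evaluated is sketched below. It assumes that a field is shown when every key in its dependsOnMap currently holds one of the listed values; that reading of dependsOnMap, and the function names, are assumptions rather than documented behavior.

```javascript
// A field is visible when it has no dependsOnMap, or when every field named
// in dependsOnMap currently holds one of the allowed values (assumed rule).
function isVisible(definition, values) {
  const deps = definition.dependsOnMap;
  if (!deps) return true;
  return Object.entries(deps).every(
    ([fieldName, allowed]) => allowed.includes(values[fieldName]));
}

function visibleFieldNames(definitions, values) {
  return definitions.filter(d => isVisible(d, values)).map(d => d.fieldName);
}

// Trimmed-down version of the shape example.
const shapeDefinitions = [
  { fieldName: 'shape' },
  { fieldName: 'radius', dependsOnMap: { shape: ['circle'] } },
  { fieldName: 'side', dependsOnMap: { shape: ['square'] } }
];

console.log(visibleFieldNames(shapeDefinitions, { shape: 'circle' }));
// [ 'shape', 'radius' ]
```

The full configuration for the shape example reads: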
[
  {
    fieldName: 'shape',
    label: 'Shape',
    type: 'MODEL',
    model: {
      modelType: 'SELECT',
      labels: ['Circle', 'Square'],
      values: ['circle', 'square']
    }
  },
  {
    fieldName: 'radius',
    label: 'Radius',
    type: 'NUMBER',
    dependsOnMap: { shape: ['circle'] },
    defaultValue: 4
  },
  {
    fieldName: 'side',
    label: 'Side length',
    type: 'NUMBER',
    dependsOnMap: { shape: ['square'] },
    defaultValue: 8
  }
]
1.2.3 Complex Capability
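Components can share data through a top-level shared context: Component 3 writes a value under a shared field name, and Component 1 reads it. The configuration that declares these components (targetSharedFieldName on Component 3, from on Component 1) is the one listed under 1.2.5.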
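import React, { createContext, useContext, useEffect, useState } from 'react';
// Top-level context through which sibling components exchange field values.
const SharedContext = createContext({
  updateFieldValue: () => {},
  getFieldValue: () => {}
});
// Component 3 (writer): publishes its value under the shared field name.
function Comp3({ definition }) {
  const { targetSharedFieldName } = definition.model;
  const { updateFieldValue } = useContext(SharedContext);
  // Local state stands in for the component's own value (illustrative).
  const [value, setValue] = useState(false);
  useEffect(() => {
    updateFieldValue(targetSharedFieldName, value);
  }, [targetSharedFieldName, value, updateFieldValue]);
  return <input type="checkbox" checked={value} onChange={e => setValue(e.target.checked)} />;
}
// Component 1 (reader): reads the value that Component 3 published.
function Comp1({ definition }) {
  const { from } = definition.model;
  const { getFieldValue } = useContext(SharedContext);
  // from is an object such as { fieldName: 'disabledFieldName' }.
  const disabled = Boolean(getFieldValue(from.fieldName));
  return <select disabled={disabled} />;
}
1.2.4 Service Capability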
List‑type components render array objects as tables.
1.2.5 Registration Capability
Users can register custom component types to extend the configuration UI.
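A registry of this kind can be sketched as a map from type name to render function. The names registerComponent and renderField and the COLOR type are hypothetical, and the renderers return strings instead of React elements so that the example runs on its own.

```javascript
// Hypothetical registry keyed by the type name used in field definitions.
const componentRegistry = new Map();

function registerComponent(type, render) {
  if (componentRegistry.has(type)) {
    throw new Error(`Component type already registered: ${type}`);
  }
  componentRegistry.set(type, render);
}

// Looks up the renderer for a definition and delegates to it.
function renderField(definition, value) {
  const render = componentRegistry.get(definition.type);
  if (!render) throw new Error(`Unknown component type: ${definition.type}`);
  return render(definition, value);
}

// A built-in type and a user-registered custom type.
registerComponent('NUMBER', (def, value) => `${def.label}: <number ${value}>`);
registerComponent('COLOR', (def, value) => `${def.label}: <swatch ${value}>`);

console.log(renderField({ type: 'COLOR', label: 'Fill' }, '#f00'));
// Fill: <swatch #f00>
```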
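The definitions below declare the three components of the shared-context example in 1.2.3: Component 1 reads disabledFieldName through from, and Component 3 writes it through targetSharedFieldName.
[
  {
    fieldName: 'fieldName1',
    label: 'Component 1',
    type: 'MODEL',
    model: {
      modelType: 'SELECT',
      labels: ['Circle', 'Square'],
      values: ['circle', 'square'],
      from: { fieldName: 'disabledFieldName' }
    }
  },
  {
    fieldName: 'fieldName2',
    label: 'Component 2',
    type: 'NUMBER'
  },
  {
    fieldName: 'fieldName3',
    label: 'Component 3',
    type: 'MODEL',
    model: {
      modelType: 'BOOLEAN',
      targetSharedFieldName: 'disabledFieldName'
    }
  }
];
2. DAG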
A Directed Acyclic Graph consists of vertices and directed edges where no cycle exists. It is used to visualize workflow dependencies.
To render a DAG we need node information, node positions, edge information, and edit/read state.
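The sketch below shows one possible shape for that input, together with a check that the edges really form a DAG. The property names (id, from, to, editable) are assumptions for illustration, and every edge is assumed to reference an existing node. The check uses Kahn's algorithm, which also yields a valid execution order.

```javascript
// Illustrative input: nodes, positions, edges, and edit/read state.
const dag = {
  nodes: [{ id: 'extract' }, { id: 'transform' }, { id: 'email' }],
  edges: [{ from: 'extract', to: 'transform' }, { from: 'transform', to: 'email' }],
  location: {
    extract: { x: 0, y: 0 },
    transform: { x: 160, y: 0 },
    email: { x: 320, y: 0 }
  },
  editable: false
};

// Kahn's algorithm: returns node ids in dependency order, throws on a cycle.
function topologicalOrder(nodes, edges) {
  const indegree = new Map(nodes.map(n => [n.id, 0]));
  const successors = new Map(nodes.map(n => [n.id, []]));
  for (const { from, to } of edges) {
    successors.get(from).push(to);
    indegree.set(to, indegree.get(to) + 1);
  }
  const ready = nodes.filter(n => indegree.get(n.id) === 0).map(n => n.id);
  const order = [];
  while (ready.length > 0) {
    const id = ready.shift();
    order.push(id);
    for (const next of successors.get(id)) {
      indegree.set(next, indegree.get(next) - 1);
      if (indegree.get(next) === 0) ready.push(next);
    }
  }
  // Nodes on a cycle never reach indegree 0, so they are missing from order.
  if (order.length !== nodes.length) throw new Error('Graph contains a cycle');
  return order;
}

console.log(topologicalOrder(dag.nodes, dag.edges));
// [ 'extract', 'transform', 'email' ]
```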
Playground (Edit) vs Renderer (Read)
Playground: node dragging, edge add/remove, node add/duplicate, batch operations, auto‑layout, undo.
Renderer: zoom, pan, node click.
Additional abilities include style configuration, auto‑size, custom node/edge rendering, and extensions such as annotations.
Architecture
|- ConfigContext --- configuration layer
|- Playground --- edit layer
|- ResponsiveProvider --- adaptive size
|- Renderer --- read‑only layer
|- Nodes
|- Edges
Usage Examples
Read‑only rendering:
<ConfigContext.Provider value={{ node: { width: 56, height: 56 } }}>
  <ResponsiveProvider>
    <Renderer nodes={nodes} location={location} edges={edges} />
  </ResponsiveProvider>
</ConfigContext.Provider>
Edit (Playground) rendering:
<ConfigContext.Provider value={{ node: { width: 60, height: 60 } }}>
  <Playground nodes={nodes} location={location} edges={edges} />
</ConfigContext.Provider>
Custom node/edge rendering:
<Renderer nodes={nodes} location={location} edges={edges}>
  <Nodes>{props => <CustomNode />}</Nodes>
  <Edges>{props => <CustomEdge />}</Edges>
</Renderer>
SVG Rendering
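The canvas is an svg element; nodes are rendered with foreignObject and edges with path. Each edge path consists of two quadratic Bézier segments defined by four points (P0, P1, P2, P4): Q draws the first segment from P0 to P2 with P1 as the control point, and T continues smoothly to P4, using the reflection of P1 about P2 as its implicit control point.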
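Assembling the path string can be sketched as follows, assuming each point is an object with x and y. The function names are hypothetical, and how the four points are derived from node positions is not covered here.

```javascript
// Builds the d attribute for an edge running from P0 through P2 to P4,
// with P1 as the control point of the first quadratic segment.
function edgePath(p0, p1, p2, p4) {
  return `M ${p0.x} ${p0.y} Q ${p1.x} ${p1.y} ${p2.x} ${p2.y} T ${p4.x} ${p4.y}`;
}

// The control point that the T command implies: P1 mirrored through P2.
function impliedControlPoint(p1, p2) {
  return { x: 2 * p2.x - p1.x, y: 2 * p2.y - p1.y };
}

console.log(edgePath({ x: 0, y: 0 }, { x: 0, y: 50 }, { x: 50, y: 50 }, { x: 100, y: 100 }));
// M 0 0 Q 0 50 50 50 T 100 100
```

The template it fills in is the following.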
d = M P0x P0y Q P1x P1y P2x P2y T P4x P4y
Layout Algorithms
Automatic layout is powered by dagre, which offers three ranking strategies through its ranker option: network-simplex, tight-tree, and longest-path.
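To give a feel for what ranking means, the sketch below assigns each node a layer equal to the length of the longest path leading to it. It is a simplified layering written for illustration, not dagre's implementation; the function name and the from/to edge shape are assumptions, and the input is assumed to be acyclic.

```javascript
// Simplified longest-path layering: nodes without predecessors get rank 0,
// every other node gets 1 + the highest rank among its predecessors.
// Assumes the graph is acyclic; a cycle would recurse without end.
function longestPathRanks(nodeIds, edges) {
  const predecessors = new Map(nodeIds.map(id => [id, []]));
  for (const { from, to } of edges) predecessors.get(to).push(from);

  const ranks = new Map();
  const visit = id => {
    if (ranks.has(id)) return ranks.get(id); // memoized result
    const rank = predecessors.get(id)
      .reduce((max, p) => Math.max(max, visit(p) + 1), 0);
    ranks.set(id, rank);
    return rank;
  };
  nodeIds.forEach(visit);
  return ranks;
}

const ranks = longestPathRanks(
  ['a', 'b', 'c', 'd'],
  [
    { from: 'a', to: 'b' }, { from: 'a', to: 'c' },
    { from: 'b', to: 'd' }, { from: 'c', to: 'd' }, { from: 'a', to: 'd' }
  ]);
console.log(Object.fromEntries(ranks)); // { a: 0, b: 1, c: 1, d: 2 }
```

Nodes that share a rank are drawn in the same layer, so b and c sit side by side between a and d.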
3. Other Workflows
n8n
n8n is an open‑source workflow automation platform supporting event‑driven and Cron‑based scheduling. It integrates over 200 apps but lacks some Chinese services, so custom nodes are often needed.
Orange
Orange is an open‑source visual programming tool for machine learning and data visualization. It builds data‑analysis workflows with a rich toolbox.
4. Summary and Thoughts
Workflows abstract business processes, making them clear and repeatable. Using DAGs to visualize data‑development workflows provides an intuitive view of dependencies. Extending the platform with custom configuration nodes would further increase flexibility, allowing users to design richer pipelines beyond the built‑in components.