
Implementing Task Scheduling Dependencies and Workflow with Go and DAG

This article explains the concepts of task scheduling dependencies and workflow, introduces graph theory basics such as vertices, edges, and DAGs, and provides a complete Go implementation—including graph structures, BFS traversal, topological sorting, and concurrent execution—to efficiently manage dependent tasks in distributed systems.

High Availability Architecture

He Peng, an architect and development manager at an internet finance company specializing in distributed systems and risk control, shares the design and implementation of a task-scheduling middleware.

Task scheduling often requires handling complex dependencies where a task can only run after its prerequisite tasks finish. Modeling these dependencies as a directed acyclic graph (DAG) enables clear visualization and systematic execution ordering.

The article reviews essential graph concepts: a graph consists of vertices (here, tasks) and edges (dependency relations). Directed edges express precedence, while undirected edges denote bidirectional connections. It contrasts adjacency matrices with adjacency lists for storing graph structures: a matrix suits dense graphs and gives constant-time edge lookups, while a list is more space-efficient for sparse graphs, which task DAGs usually are.
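To make the storage trade-off concrete, here is a small sketch (with three hypothetical tasks, not from the article) showing both representations of the chain build → test → deploy:

```go
package main

import "fmt"

// Three tasks: 0=build, 1=test, 2=deploy, with edges build→test→deploy.

// matrixRepr stores edges in a V×V boolean matrix: O(V²) space,
// O(1) edge lookup — a fit for dense graphs.
func matrixRepr() [3][3]bool {
	var m [3][3]bool
	m[0][1] = true // build → test
	m[1][2] = true // test → deploy
	return m
}

// listRepr stores, per vertex, only its outgoing edges: O(V+E) space —
// a fit for sparse graphs, which task DAGs usually are.
func listRepr() map[int][]int {
	return map[int][]int{0: {1}, 1: {2}}
}

func main() {
	fmt.Println(matrixRepr()[0][1], listRepr()[0]) // true [1]
}
```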

In Go, the graph is represented with the following structures:

type DAG struct {
    Vertexs []*Vertex // all vertices in the graph
}

type Vertex struct {
    Key      string      // unique task identifier
    Value    interface{} // task payload or result
    Parents  []*Vertex   // tasks that must finish before this one
    Children []*Vertex   // tasks unlocked when this one finishes
}

Utility methods add vertices and edges:

func (dag *DAG) AddVertex(v *Vertex) {
    dag.Vertexs = append(dag.Vertexs, v)
}

// AddEdge records that from must complete before to can start.
func (dag *DAG) AddEdge(from, to *Vertex) {
    from.Children = append(from.Children, to)
    to.Parents = append(to.Parents, from)
}
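Wiring these together, a short self-contained usage sketch (task names are illustrative) builds the diamond graph a → {b, c} → d and checks the parent/child links:

```go
package main

import "fmt"

type Vertex struct {
	Key      string
	Value    interface{}
	Parents  []*Vertex
	Children []*Vertex
}

type DAG struct {
	Vertexs []*Vertex
}

func (dag *DAG) AddVertex(v *Vertex) {
	dag.Vertexs = append(dag.Vertexs, v)
}

func (dag *DAG) AddEdge(from, to *Vertex) {
	from.Children = append(from.Children, to)
	to.Parents = append(to.Parents, from)
}

// buildDiamond constructs a → {b, c} → d and returns the DAG and sink d.
func buildDiamond() (*DAG, *Vertex) {
	dag := &DAG{}
	a := &Vertex{Key: "a"}
	b := &Vertex{Key: "b"}
	c := &Vertex{Key: "c"}
	d := &Vertex{Key: "d"}
	for _, v := range []*Vertex{a, b, c, d} {
		dag.AddVertex(v)
	}
	dag.AddEdge(a, b)
	dag.AddEdge(a, c)
	dag.AddEdge(b, d)
	dag.AddEdge(c, d)
	return dag, d
}

func main() {
	dag, d := buildDiamond()
	fmt.Println(len(dag.Vertexs), len(d.Parents)) // 4 2
}
```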

A breadth‑first search (BFS) traverses the graph to derive a topological execution order. The basic BFS implementation returns a flat slice of vertices, while BFSNew groups vertices by layer to support parallel execution of independent tasks.

func BFS(root *Vertex) []*Vertex {
    q := queue.New() // FIFO queue, e.g. github.com/eapache/queue
    q.Add(root)
    visited := make(map[string]*Vertex)
    all := make([]*Vertex, 0)
    for q.Length() > 0 {
        v := q.Remove().(*Vertex) // pop the front vertex
        if _, ok := visited[v.Key]; ok {
            continue // reachable via several parents; visit once
        }
        visited[v.Key] = v
        all = append(all, v)
        for _, child := range v.Children {
            q.Add(child) // enqueue children for later rounds
        }
    }
    return all
}

func BFSNew(root *Vertex) [][]*Vertex {
    q := queue.New()
    q.Add(root)
    visited := make(map[string]*Vertex)
    all := make([][]*Vertex, 0)
    for q.Length() > 0 {
        level := make([]*Vertex, 0)
        for n := q.Length(); n > 0; n-- { // drain one layer at a time
            v := q.Remove().(*Vertex)
            if _, ok := visited[v.Key]; ok {
                continue
            }
            visited[v.Key] = v
            level = append(level, v)
            for _, child := range v.Children {
                q.Add(child)
            }
        }
        if len(level) > 0 {
            all = append(all, level)
        }
    }
    return all
}

After obtaining the layered order, tasks are executed. A sequential version simply sleeps for five seconds per task and prints the result. To improve performance, the article introduces a concurrent executor that runs all tasks in the same layer simultaneously using sync.WaitGroup:

func doTasksNew(vertexs []*Vertex) {
    var wg sync.WaitGroup
    for _, v := range vertexs {
        wg.Add(1)
        go func(v *Vertex) {
            defer wg.Done()
            time.Sleep(5 * time.Second)
            fmt.Printf("do %v, result is %v \n", v.Key, v.Value)
        }(v)
    }
    wg.Wait()
}

Benchmarking shows the concurrent approach reduces total execution time from 45 seconds (serial) to about 20 seconds, demonstrating the benefit of layered parallelism while still respecting dependency constraints.
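The arithmetic behind those timings can be sketched, assuming (inferred from the reported numbers, not stated explicitly) nine five-second tasks spread over four dependency layers:

```go
package main

import "fmt"

// serialSeconds: every task runs one after another.
func serialSeconds(tasks, perTask int) int { return tasks * perTask }

// layeredSeconds: each layer costs one task duration, since its
// tasks run concurrently.
func layeredSeconds(layers, perTask int) int { return layers * perTask }

func main() {
	fmt.Println(serialSeconds(9, 5))  // 45
	fmt.Println(layeredSeconds(4, 5)) // 20
}
```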

The complete source code is available at https://github.com/skyhackvip/dag for further exploration.

Tags: DAG, workflow, concurrency, Go, task scheduling, graph
Written by High Availability Architecture

Official account for High Availability Architecture.