Unlocking High‑Performance Delayed Tasks with Sogou’s C++ Workflow Framework

Workflow, Sogou’s open‑source C++ asynchronous framework, tackles the core challenge of efficiently processing massive delayed tasks in distributed systems by integrating a time‑wheel based scheduler, non‑blocking I/O, and lightweight APIs, delivering millisecond‑level precision and dramatically reduced resource consumption.

Deepin Linux

Sep 28, 2025

Overview

In distributed systems, message middleware, and scheduled jobs, handling massive numbers of delayed tasks efficiently is a core challenge. Traditional solutions suffer from thread blocking, inefficient linear traversal, and high resource consumption, which degrades timing precision or creates throughput bottlenecks.

The C++ asynchronous framework WorkFlow offers a lightweight, high‑performance solution. It tightly integrates delayed tasks with the global task scheduler and asynchronous chains (SeriesWork) using a time‑wheel data structure, avoiding linear scans and supporting million‑level tasks with millisecond precision while eliminating dedicated timer threads.

WorkFlow Framework Details

What Is WorkFlow?

WorkFlow is an open‑source C++ server engine from Sogou, based on a task‑flow model for asynchronous programming. It encapsulates six asynchronous resources—CPU, GPU, network, disk I/O, timers, and counters—and exposes them via callback interfaces, allowing developers to build complex applications without dealing with low‑level details.

Installation

git clone https://github.com/sogou/workflow.git
cd workflow
sudo apt install -y cmake libssl-dev
mkdir build
cd build
cmake ..
make
sudo make install
sudo ldconfig

HTTP Client Example

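#include <cstring>
#include <iostream>
#include <string>
#include "workflow/HttpMessage.h"
#include "workflow/WFFacilities.h"
#include "workflow/WFTaskFactory.h"
using namespace protocol;
static WFFacilities::WaitGroup wait_group(1);
void onHttpResponse(WFHttpTask *task) {
    // A network-level failure leaves no usable response, so check the task state first.
    if (task->get_state() != WFT_STATE_SUCCESS) {
        std::cout << "Failed, state: " << task->get_state() << ", error: " << task->get_error() << std::endl;
        wait_group.done();
        return;
    }
    HttpResponse *resp = task->get_resp();
    // get_status_code() returns a C string such as "200", not an integer.
    if (strcmp(resp->get_status_code(), "200") == 0) {
        const void *body; size_t len;
        resp->get_parsed_body(&body, &len);
        std::cout << "Content: " << std::string((const char*)body, len) << std::endl;
    } else {
        std::cout << "Failed, code: " << resp->get_status_code() << std::endl;
    }
    wait_group.done();
}
int main() {
    // Arguments: URL, maximum redirects, maximum retries, callback.
    WFHttpTask *task = WFTaskFactory::create_http_task("http://www.example.com", 1, 0, onHttpResponse);
    task->start();
    wait_group.wait();  // start() is asynchronous; block until the callback has run
    return 0;
}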

Unique Features of WorkFlow

Asynchronous Resource Encapsulation

WorkFlow provides unified interfaces for CPU, GPU, network, disk I/O, timers, and counters. Example: a CPU task that performs quicksort:

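#include <algorithm>
#include <iterator>
#include "workflow/WFFacilities.h"
#include "workflow/WFTaskFactory.h"
int main() {
    WFFacilities::WaitGroup wg(1);
    // Computation runs through a "go" task; the first argument names the compute queue.
    WFGoTask *t = WFTaskFactory::create_go_task("sort", [] {
        int data[] = {5,3,8,1,2};
        std::sort(std::begin(data), std::end(data));  // stands in for a hand-written quicksort
    });
    t->set_callback([&](WFGoTask *) { wg.done(); });
    t->start();
    wg.wait();  // keep main alive until the task's callback has run
    return 0;
}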

GPU tasks can be created similarly, e.g., a CUDA matrix multiplication kernel wrapped in a GPU task.

Efficient Task Scheduling

WorkFlow introduces sub‑tasks that can be arranged serially or in parallel, enabling fine‑grained control over complex business logic. Example: a user registration flow with sequential username check and database insertion.

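// Serial sub-tasks expressed as a series of go tasks (fragment, to be placed inside a function).
// Needs <memory>, "workflow/Workflow.h" and "workflow/WFTaskFactory.h".
// usernameExists() and insertUser() are application-defined.
auto taken = std::make_shared<bool>(false);
WFGoTask *check = WFTaskFactory::create_go_task("register", [taken] { *taken = usernameExists(); });
check->set_callback([taken](WFGoTask *t) {
    if (*taken) series_of(t)->cancel();  // name already in use: skip the insert
});
WFGoTask *insert = WFTaskFactory::create_go_task("register", [] { insertUser(); });
SeriesWork *series = Workflow::create_series_work(check, nullptr);
series->push_back(insert);  // runs only after check finishes and the series was not canceled
series->start();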

Parallel sub‑tasks are used to fetch multiple URLs concurrently and process their content.

Timer Nodes and State‑Driven Execution

Delayed tasks are modeled as "Timer" nodes within the workflow engine. Fixed‑delay nodes (e.g., auto‑confirm after 7 days) and dynamic‑condition nodes (e.g., settlement delay based on user level) are supported. The engine uses a time‑wheel algorithm to achieve O(1) insertion and trigger complexity.

Core Scheduling Algorithms

Time‑wheel algorithm: tasks are placed into circular slots, reducing trigger cost from O(n) to O(1). Multi‑level wheels and dynamic resizing further improve memory usage and scalability.
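To make the slot idea concrete, here is a minimal single-level time wheel using only the standard library. It is an illustrative sketch, not code from the framework: the class name `TimeWheel`, its methods, and the "rounds" counter for delays longer than one revolution are all assumptions made for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical single-level time wheel: one slot per tick, cursor moves one slot per tick.
class TimeWheel {
public:
    using Task = std::function<void()>;

    explicit TimeWheel(std::size_t slot_count) : slots_(slot_count) {}

    // Schedule `task` to fire after `delay_ticks` ticks (0 is treated as 1).
    // Insertion is O(1): compute the target slot and append.
    void add(std::uint64_t delay_ticks, Task task) {
        if (delay_ticks == 0) delay_ticks = 1;
        const std::size_t n = slots_.size();
        const std::size_t slot = static_cast<std::size_t>((cursor_ + delay_ticks) % n);
        // Number of times the cursor must pass this slot before the task is due.
        const std::uint64_t rounds = (delay_ticks - 1) / n;
        slots_[slot].push_back(Entry{rounds, std::move(task)});
    }

    // Advance one tick and run every entry that is due in the new slot.
    // Returns how many tasks fired.
    std::size_t tick() {
        cursor_ = (cursor_ + 1) % slots_.size();
        std::vector<Entry> due, pending;
        for (Entry &e : slots_[cursor_]) {
            if (e.rounds == 0) {
                due.push_back(std::move(e));
            } else {
                --e.rounds;
                pending.push_back(std::move(e));
            }
        }
        // Store the survivors before running tasks, so a task may call add() safely.
        slots_[cursor_] = std::move(pending);
        for (Entry &e : due) e.task();
        return due.size();
    }

private:
    struct Entry {
        std::uint64_t rounds;
        Task task;
    };
    std::vector<std::vector<Entry>> slots_;
    std::size_t cursor_ = 0;
};
```

With 8 slots, a task added with a delay of 3 fires on the third `tick()`, and one added with a delay of 10 fires on the tenth. Entries that still have rounds left are touched on every pass of the cursor, which is the overhead that multi-level wheels are meant to remove.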

Sharding (hash‑based distribution): tasks are assigned to engine nodes by business ID modulo node count, achieving >95% load balance and fast failover via distributed locks.
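The modulo assignment can be sketched in a few lines. The function names below are hypothetical. The string variant assumes business IDs may be non-numeric, and uses FNV-1a only because it gives the same result on every node, which `std::hash` does not guarantee.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Numeric business ID: plain modulo over the node count.
std::size_t shard_for_id(std::uint64_t business_id, std::size_t node_count) {
    return static_cast<std::size_t>(business_id % node_count);
}

// FNV-1a 64-bit hash, chosen here because its output is identical across processes.
std::uint64_t fnv1a(const std::string &key) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    for (unsigned char c : key) {
        h ^= c;
        h *= 1099511628211ULL;  // FNV prime
    }
    return h;
}

// String business ID: hash first, then modulo.
std::size_t shard_for_key(const std::string &business_id, std::size_t node_count) {
    return static_cast<std::size_t>(fnv1a(business_id) % node_count);
}
```

One trade-off of plain modulo is that changing the node count remaps most IDs, so pending tasks have to be redistributed when the cluster is resized.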

Persistence and Reliability

Two main tables store workflow state: wf_execution (process instance metadata) and wf_delay_task (pending delayed tasks). For small‑scale workloads MySQL InnoDB suffices; for million‑level tasks a sharded MySQL + Redis secondary index architecture is recommended.

Concurrency Control

WorkFlow prevents duplicate execution using database row locks (SELECT … FOR UPDATE) or Redis atomic operations (SETNX). Idempotent APIs (e.g., unique request IDs) ensure safe retries.
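The "claim before executing" pattern behind SETNX can be illustrated in-process. In this sketch a mutex-protected set stands in for the Redis key space, and `ExecutionGuard` and `run_once` are hypothetical names; a real deployment would make the claim in Redis or the database so that it is visible to every node.

```cpp
#include <mutex>
#include <string>
#include <unordered_set>

// In-process stand-in for an atomic "set if not exists" operation.
class ExecutionGuard {
public:
    // Returns true only for the first caller that presents a given request ID.
    bool try_claim(const std::string &request_id) {
        std::lock_guard<std::mutex> lock(mutex_);
        return claimed_.insert(request_id).second;
    }

private:
    std::mutex mutex_;
    std::unordered_set<std::string> claimed_;
};

// Runs `action` at most once per request ID; returns whether it ran.
template <typename Action>
bool run_once(ExecutionGuard &guard, const std::string &request_id, Action action) {
    if (!guard.try_claim(request_id)) return false;  // duplicate delivery or retry
    action();
    return true;
}
```

A second call with the same request ID returns `false` without running the action, which is what makes a retry of the same delayed task safe.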

Failure Retry Mechanism

Multiple retry strategies are provided: fixed‑interval retries for transient errors, exponential back‑off for rate‑limited services, and manual escalation after max attempts. Context (request body, SQL) is saved for each retry to avoid re‑building state.
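The fixed-interval and exponential strategies differ only in how the next delay is computed. This sketch shows one way to express that; the type and function names are assumptions for illustration, and the doubling factor and delay cap are choices made here rather than values taken from the text.

```cpp
#include <algorithm>
#include <chrono>
#include <cstdint>

enum class RetryPolicy { FixedInterval, ExponentialBackoff };

struct RetryConfig {
    RetryPolicy policy;
    std::chrono::milliseconds base_delay;
    std::chrono::milliseconds max_delay;  // cap, used by exponential back-off only
    unsigned max_attempts;
};

// Computes the delay before retry number `attempt` (1-based).
// Returns false when attempts are exhausted, i.e. the task should be escalated manually.
bool next_retry_delay(const RetryConfig &cfg, unsigned attempt,
                      std::chrono::milliseconds &delay) {
    if (attempt == 0 || attempt > cfg.max_attempts) return false;
    if (cfg.policy == RetryPolicy::FixedInterval) {
        delay = cfg.base_delay;
        return true;
    }
    // Double the delay on each attempt; clamp the shift so the product cannot overflow.
    const unsigned shift = std::min(attempt - 1, 20u);
    const std::chrono::milliseconds grown = cfg.base_delay * (std::int64_t{1} << shift);
    delay = std::min(grown, cfg.max_delay);
    return true;
}
```

With a 100 ms base, a 1000 ms cap, and five attempts, the exponential policy yields 100, 200, 400, 800, and 1000 ms, and the sixth call reports exhaustion.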

Practical Case Studies

E‑commerce Order Fulfillment

A workflow automates the order lifecycle: after payment, a logistics listener updates the order status; a timer node auto-confirms receipt after 7 days, and settlement follows after 15 days. With a 1-minute time wheel and 4 shards, order confirmation latency drops from 120 s to 5 s, CPU and memory usage fall by 35%, and operations effort drops by 60%.

Government Approval Automation

Flowable is chosen for moderate-scale approval processes. Timer nodes enforce work-day deadlines (e.g., a 3-day initial review). Redis caches approver data, while a time wheel triggers overdue notifications via email or SMS. Work-day calculations exclude weekends and holidays, achieving 100% on-time compliance for supervised follow-ups.
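A work-day deadline of this kind can be computed by stepping forward one calendar day at a time and counting only working days. The sketch below assumes days are represented as integers counted from 1970-01-01 (a Thursday) and that holidays arrive as a set of such day numbers. The function names are hypothetical.

```cpp
#include <set>

// Day numbers count from 1970-01-01, which was a Thursday.
// Returns 0 = Sunday, 1 = Monday, ..., 6 = Saturday.
int weekday_of(long day) {
    return static_cast<int>(((day % 7) + 7 + 4) % 7);
}

bool is_workday(long day, const std::set<long> &holidays) {
    const int wd = weekday_of(day);
    return wd != 0 && wd != 6 && holidays.count(day) == 0;
}

// Returns the day on which a deadline of `workdays` working days expires.
// The start day itself is not counted.
long add_workdays(long start_day, int workdays, const std::set<long> &holidays) {
    long day = start_day;
    while (workdays > 0) {
        ++day;
        if (is_workday(day, holidays)) --workdays;
    }
    return day;
}
```

Starting on a Friday (day 1), a 3-day review is due the following Wednesday (day 6). If that Tuesday (day 5) is a holiday, the deadline moves to Thursday (day 7).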

Selection Guide

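| Business Scale       | Task Volume | Recommended Engine     | Core Advantage                           |
|----------------------|-------------|------------------------|------------------------------------------|
| Small-to-medium      | ≤10k/day    | Flowable               | Lightweight, fast rollout                |
| Mid-large enterprise | 10k-1M/day  | Activiti 7             | Distributed, high-performance scheduling |
| Massive distributed  | >1M/day     | Custom WorkFlow engine | Deeply tuned sharding & storage          |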

Best practices include batch processing of same‑slot timers, asynchronous I/O conversion, and periodic archiving of historic workflow data to keep query latency under 50 ms.
