Step‑by‑Step Design of a Complex Workflow Engine with Multiple Node Types
The article narrates the progressive design of a customizable workflow engine, starting from a simple linked‑list approver chain and evolving through countersign, parallel, conditional, delegation, timeout, progress‑percentage, and script‑hook features, illustrating each stage with node classifications, state definitions, and tree‑based structures.
Level 1
The boss asks for a simple workflow engine. The initial implementation is a linked list of approvers that ends with a terminal node.
Add any number of approvers in order to form a chain, then add an end node.
Record the current approver; after approval, move to the next approver.
When the current position reaches the end node, the workflow finishes.
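The three steps above can be sketched as follows. This is a minimal illustration, not the author's code; the names ChainNode, ChainWorkflow, and approve are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChainNode:
    approver: Optional[str]              # None marks the end node
    next: Optional["ChainNode"] = None


class ChainWorkflow:
    def __init__(self, approvers):
        # Build the chain back to front so each node links to its successor.
        node = ChainNode(approver=None)  # terminal node
        for name in reversed(approvers):
            node = ChainNode(approver=name, next=node)
        self.current = node

    @property
    def finished(self) -> bool:
        return self.current.approver is None

    def approve(self, who: str) -> None:
        if self.finished:
            raise RuntimeError("workflow already finished")
        if who != self.current.approver:
            raise PermissionError(f"{who} is not the current approver")
        self.current = self.current.next


wf = ChainWorkflow(["Zhang", "Li"])
wf.approve("Zhang")
wf.approve("Li")
print(wf.finished)  # prints True
```

The only mutable state is the pointer to the current node, which is why this version cannot express several approvers acting at once.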
Boss: "Too simplistic."
Level 2
The boss now requires a countersign node. A countersign node is a large node containing many approvers; all must approve before moving on.
The design is changed from a linked list to a tree:
Two node categories: simple nodes (rectangles) and complex nodes (circles).
The whole process is represented by a tree; leaf nodes are simple nodes.
Each simple node contains exactly one approver.
Complex nodes contain several child nodes.
Countersign node: when activated, all of its child nodes become approvable at once; the countersign node completes when every child is approved.
Serial node: child nodes must be approved left‑to‑right; the node completes after the last child is approved.
The outermost layer of any workflow is a serial node; its completion means the whole workflow is finished.
Node states are introduced:
Ready – a simple node that can be approved.
Complete – an already approved node.
Future – a node that has not been reached yet.
Waiting – only complex nodes have this state, indicating they are waiting for child approvals.
An example approval flow with a countersign node is shown.
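The tree and state rules of this level might look roughly like the sketch below. It assumes that every complex node has at least one child and that activation starts at the root; the class and method names (Node, activate, child_completed) are illustrative, not taken from the original system.

```python
from enum import Enum


class State(Enum):
    FUTURE = "future"
    READY = "ready"
    WAITING = "waiting"
    COMPLETE = "complete"


class Node:
    def __init__(self, kind, approver=None, children=()):
        self.kind = kind                  # "simple", "serial" or "countersign"
        self.approver = approver          # set on simple nodes only
        self.children = list(children)
        self.parent = None
        self.state = State.FUTURE
        for child in self.children:
            child.parent = self

    def activate(self):
        if self.kind == "simple":
            self.state = State.READY
            return
        self.state = State.WAITING
        # A serial node opens only its first child; a countersign node opens all.
        opened = self.children[:1] if self.kind == "serial" else self.children
        for child in opened:
            child.activate()

    def approve(self):
        if self.kind != "simple" or self.state is not State.READY:
            raise ValueError("only a Ready simple node can be approved")
        self.state = State.COMPLETE
        if self.parent is not None:
            self.parent.child_completed(self)

    def child_completed(self, child):
        if self.kind == "serial":
            idx = self.children.index(child)
            if idx + 1 < len(self.children):
                self.children[idx + 1].activate()
                return
        elif any(c.state is not State.COMPLETE for c in self.children):
            return                        # countersign still waiting on someone
        self.state = State.COMPLETE
        if self.parent is not None:
            self.parent.child_completed(self)


a, b, c, d = (Node("simple", approver=n) for n in "ABCD")
root = Node("serial", children=[a, Node("countersign", children=[b, c]), d])
root.activate()
a.approve()
c.approve()   # B and C may approve in either order
b.approve()
d.approve()
print(root.state)  # prints State.COMPLETE
```

Completion propagates upward from the leaves, so the root serial node finishing is what marks the whole workflow as done.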
Level 3
The boss now wants parallel nodes. A parallel node is a complex node in which any child can be approved; the node finishes as soon as any one child reaches the Complete state.
A new state, Skip, is added: once a child of a parallel node has moved past Ready or Waiting (that is, it has completed), all of its sibling nodes and their descendants are set to Skip.
An illustrative example with images is provided.
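A minimal sketch of the Skip rule, assuming that "not Ready or Waiting" means the child has completed. The helper names skip_subtree and complete_parallel_child are hypothetical.

```python
from enum import Enum


class State(Enum):
    FUTURE = "future"
    READY = "ready"
    WAITING = "waiting"
    COMPLETE = "complete"
    SKIP = "skip"


class Node:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)   # empty for simple nodes
        self.state = State.FUTURE


def skip_subtree(node):
    """Mark a node and all of its descendants as Skip."""
    node.state = State.SKIP
    for child in node.children:
        skip_subtree(child)


def complete_parallel_child(parallel, winner):
    """Complete one child of a parallel node and skip its siblings."""
    if winner not in parallel.children:
        raise ValueError("winner is not a child of this parallel node")
    winner.state = State.COMPLETE
    for sibling in parallel.children:
        if sibling is not winner:
            skip_subtree(sibling)
    parallel.state = State.COMPLETE


# Example: a parallel node over Zhang and a nested group of Li and Wang.
zhang = Node("Zhang")
li, wang = Node("Li"), Node("Wang")
group = Node("group", [li, wang])
parallel = Node("parallel", [zhang, group])
parallel.state = State.WAITING
group.state = State.WAITING
zhang.state = li.state = wang.state = State.READY

complete_parallel_child(parallel, zhang)
print(group.state, li.state, wang.state)
# prints State.SKIP State.SKIP State.SKIP
```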
Level 4
The boss asks for nesting capability, e.g., a countersign node containing a parallel node, which itself contains a complex node, with unlimited nesting depth.
The solution is an infinitely expandable tree structure that can represent arbitrarily complex processes.
Level 5
Conditional nodes are required. The workflow carries a form, and the next branch is chosen based on form values.
A conditional node behaves like a parallel node but only the child nodes whose conditions are satisfied become active.
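One way to picture branch selection is a predicate per child that is evaluated against the form. The form fields (amount, has_contract) and branch names below are invented for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Form = Dict[str, Any]


@dataclass
class Branch:
    name: str
    condition: Callable[[Form], bool]   # evaluated against the workflow's form


def active_branches(branches: List[Branch], form: Form) -> List[str]:
    """Return the names of the branches whose condition holds for the form."""
    return [b.name for b in branches if b.condition(form)]


branches = [
    Branch("finance_review", lambda f: f["amount"] >= 10_000),
    Branch("team_lead_review", lambda f: f["amount"] < 10_000),
    Branch("legal_review", lambda f: f.get("has_contract", False)),
]

print(active_branches(branches, {"amount": 25_000, "has_contract": True}))
# prints ['finance_review', 'legal_review']
```

Children that are not selected would never become Ready; whether they are then marked Skip is not stated in the text and is left open here.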
Level 6
The boss wants two additional approver types: one selected from the form, and another determined by a mapping function based on the initiator (e.g., get_manager("Qian") → "Li").
Simple nodes are therefore divided into three categories: fixed approver, form‑derived approver, and function‑derived approver.
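The three categories can be treated as three ways of resolving an approver name at run time. In this sketch only get_manager("Qian") returning "Li" comes from the text; the MANAGERS table, the resolver helpers, and the reviewer form field are assumptions.

```python
from typing import Callable, Dict

# Hypothetical reporting lines; only the Qian -> Li pair comes from the text.
MANAGERS: Dict[str, str] = {"Qian": "Li"}

Resolver = Callable[[dict, str], str]   # (form, initiator) -> approver name


def get_manager(name: str) -> str:
    return MANAGERS[name]


def fixed(name: str) -> Resolver:
    return lambda form, initiator: name


def from_form(field: str) -> Resolver:
    return lambda form, initiator: form[field]


def from_function(fn: Callable[[str], str]) -> Resolver:
    return lambda form, initiator: fn(initiator)


form = {"reviewer": "Zhao"}
resolvers = [fixed("Wang"), from_form("reviewer"), from_function(get_manager)]
print([resolve(form, "Qian") for resolve in resolvers])
# prints ['Wang', 'Zhao', 'Li']
```

Giving all three the same signature lets a simple node store a single resolver and stay unaware of which category it belongs to.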
Level 7
The boss asks whether a workflow can be rejected backwards, i.e., from later nodes back to earlier ones.
Implementation: only Ready nodes can reject, mirroring the approval rule.
Level 8
Now the boss wants a "reject to previous approver" feature. Because nodes can be nested arbitrarily, determining the previous approvers is complex.
The solution walks back up the tree until it finds a Ready node that contains the target node.
Level 9
The boss requests a generic "reject to any node" capability.
Implementation: repeatedly reject to the previous level until a Ready node that includes the desired target is found.
Level 10
A time limit is added to ordinary nodes; if a node is not completed within the allotted time, it is marked as timed out.
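A small sketch of the timeout check, assuming that "ordinary nodes" means single-approver nodes and that the clock starts when the node becomes approvable. The text does not say what happens after a node is marked as timed out, so the sketch only reports the condition; TimedNode and its methods are hypothetical names.

```python
from datetime import datetime, timedelta
from typing import Optional


class TimedNode:
    def __init__(self, approver: str, limit: timedelta):
        self.approver = approver
        self.limit = limit
        self.started_at: Optional[datetime] = None
        self.completed_at: Optional[datetime] = None

    def start(self, now: datetime) -> None:
        self.started_at = now

    def approve(self, now: datetime) -> None:
        self.completed_at = now

    def is_timed_out(self, now: datetime) -> bool:
        """True if the node was started and did not complete within its limit."""
        if self.started_at is None:
            return False
        finished = self.completed_at if self.completed_at is not None else now
        return finished - self.started_at > self.limit


t0 = datetime(2024, 1, 1, 9, 0)
node = TimedNode("Li", limit=timedelta(hours=24))
node.start(t0)
print(node.is_timed_out(t0 + timedelta(hours=2)))    # prints False
print(node.is_timed_out(t0 + timedelta(hours=25)))   # prints True
```

Passing the current time in as an argument keeps the check deterministic and easy to test.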
Level 11
A delegation feature is introduced: if an approver is unsure, they can delegate the task to another person.
Delegation is modeled by replacing the approver's node with a new parallel node whose children are the original node and a new sibling node for the delegate, so either person can approve; delegation can be nested indefinitely.
Level 12
The boss wants a "cancel delegation" function.
Canceling delegation is the inverse operation of delegation.
If the delegate has already approved, the delegation cannot be cancelled.
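Delegation and its inverse can be sketched as two tree rewrites. The sketch assumes that the delegated node is not the root and that nested delegations are cancelled innermost first; delegate, cancel_delegation, and the approved flag are illustrative names.

```python
class Node:
    def __init__(self, kind, approver=None, children=()):
        self.kind = kind                 # "simple", "parallel", "serial", ...
        self.approver = approver
        self.children = list(children)
        self.parent = None
        self.approved = False
        for child in self.children:
            child.parent = self


def delegate(node, delegate_name):
    """Wrap `node` in a new parallel node next to a node for the delegate."""
    parent = node.parent
    if parent is None:
        raise ValueError("the root node cannot be delegated")
    slot = parent.children.index(node)
    helper = Node("simple", approver=delegate_name)
    wrapper = Node("parallel", children=[node, helper])
    wrapper.parent = parent
    parent.children[slot] = wrapper
    return helper


def cancel_delegation(helper):
    """Undo `delegate`: put the original node back in the wrapper's place."""
    if helper.approved:
        raise ValueError("the delegate has already approved")
    wrapper = helper.parent
    original = next(c for c in wrapper.children if c is not helper)
    parent = wrapper.parent
    parent.children[parent.children.index(wrapper)] = original
    original.parent = parent


li = Node("simple", approver="Li")
root = Node("serial", children=[li])
zhao = delegate(li, "Zhao")
print(root.children[0].kind)      # prints parallel
cancel_delegation(zhao)
print(root.children[0].approver)  # prints Li
```

Because the wrapper is an ordinary parallel node, the existing parallel rule already covers "whoever approves first wins", and no new node behavior is needed.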
Level 13
Pre‑ and post‑conditions are added to each node: a node can only be entered when its pre‑condition is satisfied, and it can only finish when its post‑condition is satisfied.
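A hedged sketch of the two guards, modeling each condition as a predicate over a context dictionary. The context keys and the GuardedNode class are assumptions, and plain strings stand in for the node states.

```python
from typing import Any, Callable, Dict

Context = Dict[str, Any]
Condition = Callable[[Context], bool]


class GuardedNode:
    def __init__(self, name: str, pre: Condition, post: Condition):
        self.name = name
        self.pre = pre
        self.post = post
        self.state = "future"

    def enter(self, ctx: Context) -> bool:
        """Move to Ready only if the pre-condition holds."""
        if self.state == "future" and self.pre(ctx):
            self.state = "ready"
        return self.state == "ready"

    def finish(self, ctx: Context) -> bool:
        """Move to Complete only if the post-condition holds."""
        if self.state == "ready" and self.post(ctx):
            self.state = "complete"
        return self.state == "complete"


node = GuardedNode(
    "payment",
    pre=lambda ctx: ctx.get("invoice_attached", False),
    post=lambda ctx: ctx.get("amount_confirmed", False),
)
ctx = {"invoice_attached": True}
print(node.enter(ctx))    # prints True
print(node.finish(ctx))   # prints False: the post-condition is not met yet
ctx["amount_confirmed"] = True
print(node.finish(ctx))   # prints True
```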
Level 14
The boss asks for a progress indicator showing the percentage of a running workflow.
The percentage is calculated as the distance from the leftmost node to the rightmost Ready node divided by the total distance from the leftmost to the rightmost node in the tree.
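The text does not define "distance", so the sketch below takes one possible reading: leaves are listed left to right and distance is counted in leaf positions. The fallback when no leaf is Ready is also an assumption, as is the name progress_percent.

```python
from typing import List


def progress_percent(leaf_states: List[str]) -> float:
    """Progress from leaf states listed left to right.

    Distance is measured in leaf positions, which is an assumption.
    """
    if not leaf_states:
        raise ValueError("the workflow has no leaf nodes")
    last = len(leaf_states) - 1
    ready = [i for i, s in enumerate(leaf_states) if s == "ready"]
    if not ready:
        # No Ready leaf: treat an all-finished tree as 100%, anything else as 0%.
        done = all(s in ("complete", "skip") for s in leaf_states)
        return 100.0 if done else 0.0
    if last == 0:
        return 0.0
    return 100.0 * max(ready) / last


print(progress_percent(["complete", "complete", "ready", "future", "future"]))
# prints 50.0
```

Under this reading the rightmost Ready leaf sits at position 2 of 4, which gives 50%.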
Level 15
Two executable scripts can be attached to each node: one runs when the node starts approval, the other runs after the node is approved.
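The two hooks might be wired in as shown below. The text speaks of executable scripts; this sketch swaps in plain Python callables to stay self-contained, and the names on_start and on_approved are assumptions.

```python
from typing import Callable, List, Optional

Hook = Callable[["HookedNode"], None]


class HookedNode:
    def __init__(self, approver: str,
                 on_start: Optional[Hook] = None,
                 on_approved: Optional[Hook] = None):
        self.approver = approver
        self.on_start = on_start
        self.on_approved = on_approved
        self.state = "future"

    def start(self) -> None:
        self.state = "ready"
        if self.on_start:
            self.on_start(self)          # runs when approval starts

    def approve(self) -> None:
        if self.state != "ready":
            raise ValueError("only a Ready node can be approved")
        self.state = "complete"
        if self.on_approved:
            self.on_approved(self)       # runs after the node is approved


log: List[str] = []
node = HookedNode(
    "Li",
    on_start=lambda n: log.append(f"notify {n.approver}"),
    on_approved=lambda n: log.append(f"{n.approver} approved"),
)
node.start()
node.approve()
print(log)   # prints ['notify Li', 'Li approved']
```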
The author notes that implementing all these features dramatically increased code size.
Afterword
The author reflects on the intense development effort, the eventual sale of the workflow system to large companies, and wishes fellow developers a bug‑free, healthy career.