
Implementation and Practice of a Lightweight Service Orchestration Engine for AIGC Video Production

This article walks through the design and implementation of an editing‑based AIGC video production workflow. It covers the background, service‑orchestration approaches such as state‑machine scheduling, the decomposition into modules and components with bit‑mask slot management, the configuration file format, and flow scheduling in practice using message queues and locks.

Architect

With the rapid growth of short‑video consumption in China, the need for efficient, semi‑automated video creation tools has become critical; leveraging Baidu's AI capabilities, the authors propose converting article text into video scripts to lower production costs.

The proposed editing‑based video production pipeline consists of five key stages: text processing (content understanding, style identification, subtitle generation), material processing (online retrieval, clipping, and cleaning of video and image assets), voice processing (speech synthesis), addition of auxiliary video elements (watermarks, effects, background music), and final video synthesis.

To orchestrate the numerous micro‑services involved, the authors evaluate common service‑orchestration patterns, highlighting a state‑machine based approach that records each service's execution status in a MySQL table and drives the workflow via scheduled tasks or message queues.
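The state‑machine pattern described above can be sketched in a few lines. This is only an illustration under stated assumptions: an in‑memory SQLite table stands in for the MySQL status table, the `init` and `generating` statuses are borrowed from the configuration excerpt later in the article, and the `done` and `failed` statuses, the `task` table layout, and the `advance` function are hypothetical names chosen for the example.

```python
import sqlite3

# Status chain: "init" and "generating" appear in the article's config excerpt;
# "done" and "failed" are hypothetical terminal states added for this sketch.
NEXT_STATUS = {"init": "generating", "generating": "done"}


def create_store():
    """Create an in-memory SQLite table that stands in for the MySQL status table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE task (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
    return conn


def advance(conn, task_id, run_step):
    """Run the step for the task's current status and persist the resulting status.

    run_step(status) is a caller-supplied function that returns True on success.
    A scheduled job or a queue consumer would call advance() repeatedly.
    """
    row = conn.execute("SELECT status FROM task WHERE id = ?", (task_id,)).fetchone()
    if row is None:
        raise KeyError(task_id)
    status = row[0]
    if status not in NEXT_STATUS:
        return status  # terminal state: nothing left to do
    new_status = NEXT_STATUS[status] if run_step(status) else "failed"
    # Matching on the old status keeps a second scheduler from applying
    # the same transition twice.
    conn.execute(
        "UPDATE task SET status = ? WHERE id = ? AND status = ?",
        (new_status, task_id, status),
    )
    conn.commit()
    # Return whatever is actually stored after the update.
    return conn.execute("SELECT status FROM task WHERE id = ?", (task_id,)).fetchone()[0]
```

Each call moves a task forward by at most one status, so a timer or a queue message only needs to trigger `advance` again until a terminal status is reached.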

They also review mature orchestration engines such as Cadence, Temporal, and Conductor, illustrating Cadence's programming model as an example of how workflow definitions can be expressed.

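For their own solution, the system is split into high‑level modules (e.g., script assignment, video generation) and fine‑grained components (e.g., TextProcessor, FootageGenerator, MaterialSearch). Each component is assigned a slot index within a 64‑bit integer, and each slot occupies two bits: one marks success and the other marks failure. In the configuration excerpt below, slot_index 2 maps to the masks 16 and 32 (bits 4 and 5), which is consistent with a success mask of 1 << (2 × slot_index) and a failure mask one bit higher. Bitwise operations update a component's slot without affecting the bits of other components.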
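A minimal sketch of the slot arithmetic follows. The mask formulas are inferred from the values in the article's configuration excerpt (slot 2 gives 16 and 32, slot 3 gives 64 and 128, slot 4 gives 256 and 512) rather than stated by the authors, and the function names are hypothetical.

```python
MASK_64 = (1 << 64) - 1  # keep values inside a 64-bit integer
MAX_SLOT = 31            # 64 bits / 2 bits per slot = 32 slots (0..31)


def _check(slot_index):
    if not 0 <= slot_index <= MAX_SLOT:
        raise ValueError(f"slot_index must be in 0..{MAX_SLOT}")


def success_mask(slot_index):
    """Lower bit of the slot's two-bit pair."""
    _check(slot_index)
    return 1 << (2 * slot_index)


def fail_mask(slot_index):
    """Upper bit of the slot's two-bit pair."""
    _check(slot_index)
    return 1 << (2 * slot_index + 1)


def mark(slots, slot_index, ok):
    """Return slots with this component's pair rewritten; other pairs are untouched."""
    pair = success_mask(slot_index) | fail_mask(slot_index)
    cleared = slots & ~pair & MASK_64
    return cleared | (success_mask(slot_index) if ok else fail_mask(slot_index))


def state(slots, slot_index):
    """Read a component's state back out of the packed integer."""
    if slots & success_mask(slot_index):
        return "success"
    if slots & fail_mask(slot_index):
        return "failed"
    return "pending"
```

For example, `mark(0, 2, True)` yields 16, and marking slot 3 as failed on top of that yields 144 (16 | 128) while leaving slot 2 reported as successful.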

Process configuration is expressed as a JSON description file that enumerates modules, their statuses, and component details (name, slot_index, success/failure bit masks, dependencies). An excerpt of this configuration is shown below:

{
    "module_name":"ScriptAssign",
    "status":"init",
    "next_status":"generating",
    "components":[
        {
            "component_name":"TextProcessor",
            "slot_index":2,
            "slot_num_success":16,
            "slot_num_fail":32,
            "depends":["TextUnderstanding","WidgetInit"]
        },
        {
            "component_name":"FootageGenerator",
            "slot_index":3,
            "slot_num_success":64,
            "slot_num_fail":128,
            "depends":["TextUnderstanding","WidgetInit","TextProcessor"]
        },
        {
            "component_name":"MaterialSearch",
            "slot_index":4,
            "slot_num_success":256,
            "slot_num_fail":512,
            "depends":["TextUnderstanding","WidgetInit","TextProcessor"]
        }
    ]
}
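To illustrate how such a description file might be consumed, the sketch below loads a configuration and lists the components whose dependencies have all succeeded. It rests on assumptions: the excerpt does not show entries for TextUnderstanding and WidgetInit, so they are given hypothetical slots 0 and 1 here, and the `runnable` function is an invented name, not the authors' API.

```python
import json

# TextProcessor, FootageGenerator and MaterialSearch are copied from the excerpt.
# TextUnderstanding and WidgetInit are hypothetical entries (slots 0 and 1)
# added so that every dependency resolves.
CONFIG = json.loads("""
{
  "module_name": "ScriptAssign",
  "status": "init",
  "next_status": "generating",
  "components": [
    {"component_name": "TextUnderstanding", "slot_index": 0,
     "slot_num_success": 1, "slot_num_fail": 2, "depends": []},
    {"component_name": "WidgetInit", "slot_index": 1,
     "slot_num_success": 4, "slot_num_fail": 8, "depends": []},
    {"component_name": "TextProcessor", "slot_index": 2,
     "slot_num_success": 16, "slot_num_fail": 32,
     "depends": ["TextUnderstanding", "WidgetInit"]},
    {"component_name": "FootageGenerator", "slot_index": 3,
     "slot_num_success": 64, "slot_num_fail": 128,
     "depends": ["TextUnderstanding", "WidgetInit", "TextProcessor"]},
    {"component_name": "MaterialSearch", "slot_index": 4,
     "slot_num_success": 256, "slot_num_fail": 512,
     "depends": ["TextUnderstanding", "WidgetInit", "TextProcessor"]}
  ]
}
""")


def runnable(config, slots):
    """Return names of components that have no result yet and whose
    dependencies all have their success bit set in slots."""
    success_of = {
        c["component_name"]: c["slot_num_success"] for c in config["components"]
    }
    ready = []
    for comp in config["components"]:
        has_result = slots & (comp["slot_num_success"] | comp["slot_num_fail"])
        deps_ok = all(slots & success_of[dep] for dep in comp["depends"])
        if not has_result and deps_ok:
            ready.append(comp["component_name"])
    return ready
```

With these assumed slots, a fresh task (`slots == 0`) can start TextUnderstanding and WidgetInit, and once TextProcessor has also succeeded, FootageGenerator and MaterialSearch become runnable in parallel. The sketch does not track components that are dispatched but not yet finished; a real scheduler would need that to avoid dispatching them twice.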

The workflow execution follows three main steps: (1) task creation – after validation, the task is persisted and a message is sent to the scheduling queue; (2) component execution – the scheduler scans the description file, computes the current slot values, identifies executable components whose dependencies are satisfied, and dispatches them; (3) asynchronous callbacks – micro‑service responses update their respective slot bits and re‑enqueue the task.
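The three steps can be imitated in a single process to show the control flow. This is a simplified sketch, not the authors' scheduler: `queue.Queue` stands in for the message queue, a dictionary stands in for the persisted task table, the micro‑service call is synchronous where the real callbacks are asynchronous, and the reduced dependency lists and the `run_task` and `call_service` names are hypothetical.

```python
import queue

# name -> (success_mask, fail_mask, dependencies)
# Masks match the article's excerpt; the dependency lists are shortened
# for this sketch.
COMPONENTS = {
    "TextProcessor": (16, 32, []),
    "FootageGenerator": (64, 128, ["TextProcessor"]),
    "MaterialSearch": (256, 512, ["TextProcessor"]),
}


def run_task(task_id, call_service):
    """Drive one task to completion and return (final_slots, dispatch_order).

    call_service(name) is supplied by the caller and returns True on success.
    """
    tasks = {task_id: 0}   # stands in for the persisted task record
    mq = queue.Queue()     # stands in for the scheduling queue
    dispatched = set()
    order = []

    mq.put(task_id)        # step 1: task creation enqueues the task
    while not mq.empty():
        tid = mq.get()
        slots = tasks[tid]  # slot values as seen when the message is consumed
        for name, (ok_mask, fail_mask, deps) in COMPONENTS.items():
            if name in dispatched:
                continue
            if all(slots & COMPONENTS[dep][0] for dep in deps):
                dispatched.add(name)          # step 2: dispatch the component
                order.append(name)
                ok = call_service(name)
                # step 3: the "callback" sets the slot bit and re-enqueues the task
                tasks[tid] |= ok_mask if ok else fail_mask
                mq.put(tid)
    return tasks[task_id], order
```

When every service succeeds, TextProcessor runs first and the other two components are dispatched on the next pass, leaving the slot value at 336 (16 | 64 | 256). If TextProcessor fails, nothing else is dispatched and the value stays at 32.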

Message‑queue‑driven decoupling, combined with per‑component slot updates protected by a distributed lock, ensures correct parallel execution and state consistency.
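The reason a lock is needed is that setting a slot bit is a read‑modify‑write on a shared integer, so two callbacks arriving together could overwrite each other's bits. The sketch below shows the idea with `threading.Lock` as a single‑process stand‑in for the distributed lock. The article does not say which lock service is used, and `SlotStore` and `run_callbacks` are hypothetical names.

```python
import threading


class SlotStore:
    """Holds one task's packed slot value and serializes updates to it."""

    def __init__(self):
        self._slots = 0
        # Stand-in for the distributed lock; a real deployment would acquire
        # a lock shared across processes before touching the stored value.
        self._lock = threading.Lock()

    def set_bits(self, mask):
        """Set the bits in mask, holding the lock for the whole read-modify-write."""
        with self._lock:
            current = self._slots          # read
            self._slots = current | mask   # modify and write back
            return self._slots

    def value(self):
        with self._lock:
            return self._slots


def run_callbacks(masks):
    """Simulate several micro-service callbacks arriving concurrently."""
    store = SlotStore()
    threads = [threading.Thread(target=store.set_bits, args=(m,)) for m in masks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store.value()
```

Because each callback only sets its own component's bits and the update is serialized, the final value does not depend on arrival order: `run_callbacks([16, 64, 256])` ends at 336.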

Since its launch in May 2022, the system has supported five distinct production flows, handling tens of thousands of videos daily; ongoing work focuses on further optimizing the orchestration layer and evaluating mature workflow engines for future stability and performance improvements.

Tags: microservices, workflow, State Machine, AIGC, video production, service orchestration