Building a Serverless Workflow Platform with Knative: Architecture and Implementation
The article describes how the iQIYI team built a production‑grade serverless workflow platform on Knative by creating four modules—a dashboard, API, operator, and syncer—that generate Kubernetes resources, automate CI/CD, and monitor via Prometheus, aiming to cut boilerplate while supporting future extensions such as richer workflow constructs, multi‑language support, and synchronous invocations.
Background: Microservice adoption has brought many benefits but also introduced new complexities such as service discovery, fault tolerance, distributed transactions, and long call chains. While individual services remain simple, overall business complexity shifts to the interactions between services, making the system harder to understand and maintain.
Problem Statement: Developers need to reduce boilerplate code and manage the growing operational overhead introduced by service meshes and other infrastructure components.
Knative Overview: Knative is a Kubernetes‑based serverless framework released by Google in 2018 and now maintained by multiple vendors (Red Hat, Google, IBM, Pivotal, etc.). It abstracts common cloud‑native concerns (deployment, scaling, routing) so developers can focus on business logic. Knative originally consisted of three main components:
Build: Handled CI/CD pipelines (since deprecated in favor of Tekton).
Serving: Manages serverless workloads. Key concepts include Service, Route, Configuration, and Revision.
Eventing: Provides an event‑driven architecture based on CloudEvents, supporting brokers, triggers, and channels.
Workflow Module Composition: To turn Knative into a production‑grade serverless workflow platform, the team built four modules:
Workflow‑Dashboard – front‑end UI for workflow design (drag‑and‑drop, list view).
Workflow‑API – back‑end service that generates the required YAML and interacts with Kubernetes.
Workflow‑Operator – watches custom Workflow resources, translates them into Knative Eventing components, and orchestrates the execution.
Workflow‑Syncer – monitors Workflow status, persists it to a database, and updates the UI.
Workflow Execution Flow (example):
Developer creates a workflow via the UI; the definition is sent to Workflow‑API.
Workflow‑API scaffolds a Git project, creates the directory structure (cmd, config, etc.), and generates the initial YAML files.
Developers implement business logic in the generated Go source files (e.g., main.go in the display directory).
CI/CD (GitLab CI + ko) builds Docker images for each function, pushes them to a registry, and substitutes image references in the YAML.
Operator applies the final Workflow YAML to the cluster, creating the corresponding Knative Eventing resources.
Syncer watches the Workflow resource, records its status in the database, and reflects the result in the dashboard.
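The image‑substitution step above is what ko performs: manifests reference Go import paths with a `ko://` prefix, and `ko resolve` swaps them for the digests of freshly built and pushed images. A simplified sketch of that substitution (the `resolveImages` helper and the example paths are our own illustration, not ko's internals):

```go
package main

import (
	"fmt"
	"strings"
)

// resolveImages replaces ko-style "ko://<import-path>" placeholders in a
// YAML manifest with built image references, mimicking in simplified
// form what `ko resolve` does during CI.
func resolveImages(yaml string, built map[string]string) string {
	for importPath, imageRef := range built {
		yaml = strings.ReplaceAll(yaml, "ko://"+importPath, imageRef)
	}
	return yaml
}

func main() {
	manifest := "image: ko://example.com/workflow/display"
	built := map[string]string{
		// hypothetical build result: import path -> pushed image digest
		"example.com/workflow/display": "registry.example.com/display@sha256:abc123",
	}
	fmt.Println(resolveImages(manifest, built))
}
```

Pinning images by digest rather than tag keeps each Knative Revision immutable, so a workflow run always executes the exact code that was built for it.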
Monitoring & Alerting: The platform uses Prometheus to scrape HTTP metrics exposed at /metrics from each service. Rancher‑integrated Prometheus collects these metrics, which are visualized in Grafana. Alerts are forwarded via webhook to the company's unified alerting system, providing health visibility for both services and the underlying workflow.
Summary & Outlook: The serverless workflow platform aims to reduce repetitive code and simplify complex business logic through low‑code, configuration‑driven development. Future work includes:
Supporting richer workflow constructs (choices, loops, etc.) beyond Knative’s Sequence and Parallel.
Implementing detailed workflow‑level monitoring and traceability.
Adding multi‑language support (Java, Python, etc.) beyond Go.
Providing synchronous invocation capabilities for use‑cases that require immediate responses.
iQIYI Technical Product Team