Xianyu's Complex Event Processing (CEP) System Design and Implementation
Xianyu's Complex Event Processing (CEP) system runs on Alibaba's Blink (Flink) and a custom SQL-like DSL. It standardizes event input and output, lets users define sequence, window, and aggregation rules, and combines an interactive rule service, an SLS source, a parser, a job manager, and a MetaQ sink. Early results are about 100k QPS, latency of about one second, fault tolerance inherited from Blink, and a rule-to-production turnaround of about thirty minutes.
Xianyu built the CEP system to serve a growing number of scenarios, such as security governance and marketing, that require high throughput, low latency, fault tolerance, and rapid development.
The system standardizes event input/output, uses a custom SQL‑like DSL for rule definition, and relies on Alibaba’s Blink (Flink) as the computing engine.
Key components include an Interactive Service for rule submission, a Stream Source that reads from Alibaba Cloud SLS, an EPL Parser that translates the DSL into either a Blink CEP Pattern or Blink SQL, a Job Manager that handles the task lifecycle, and a Sink that writes matching events to MetaQ.
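To make the data flow between these components concrete, here is a minimal in-memory sketch. It is not Xianyu's code: the `Job`, `JobManager`, and `run` names, the event fields, and the use of a plain predicate in place of a parsed rule are all assumptions made for illustration, and Python lists stand in for the SLS source and the MetaQ sink.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List

Event = Dict[str, object]  # hypothetical standardized event record


@dataclass
class Job:
    rule_name: str
    predicate: Callable[[Event], bool]  # stand-in for the output of the EPL Parser
    state: str = "CREATED"


@dataclass
class JobManager:
    """Tracks the job lifecycle; a toy stand-in for the real Job Manager."""
    jobs: Dict[str, Job] = field(default_factory=dict)

    def submit(self, job: Job) -> None:
        job.state = "RUNNING"
        self.jobs[job.rule_name] = job

    def stop(self, rule_name: str) -> None:
        self.jobs[rule_name].state = "STOPPED"


def run(manager: JobManager, source: Iterable[Event], sink: List[Event]) -> None:
    """Feed each source event to every running job; matches are appended to the sink."""
    for event in source:
        for job in manager.jobs.values():
            if job.state == "RUNNING" and job.predicate(event):
                sink.append({"rule": job.rule_name, **event})


# "Interactive Service" step: a user submits a rule (here already turned into a predicate).
manager = JobManager()
manager.submit(Job("big_order", lambda e: e["type"] == "order" and e["amount"] > 100))

source = [  # in-memory stand-in for the SLS stream source
    {"type": "order", "user": "u1", "amount": 250},
    {"type": "click", "user": "u2", "amount": 0},
    {"type": "order", "user": "u3", "amount": 20},
]
sink: List[Event] = []  # in-memory stand-in for the MetaQ sink
run(manager, source, sink)
print(sink)
```

Only the first event satisfies the hypothetical `big_order` rule, so it is the only record that reaches the sink.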
The DSL supports event sequences, windows, filtering, aggregation, and pattern matching, with keywords similar to SQL (RULENAME, EVENT, WHERE, REPEAT, WITHIN, RETURN) and built‑in functions like SUM, COUNT, MAX, MIN, DISTINCT.
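The summary lists the keywords but not the exact grammar, so the following is only a sketch of what a REPEAT plus WITHIN rule might mean at runtime: report a key when a given event type occurs a set number of times inside a time window. The function name `match_repeat_within`, the event fields (`type`, `user`, `ts`), and the choice to reset the window after a match are assumptions, not the system's documented semantics.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Iterable, List, Tuple

# A rule in the spirit of the DSL might read (hypothetical syntax):
#   RULENAME login_abuse  EVENT login_failed  REPEAT 3  WITHIN 60  RETURN user


def match_repeat_within(
    events: Iterable[dict],
    event_type: str,
    repeat: int,
    within_s: float,
    key: str = "user",
) -> List[Tuple[str, float]]:
    """Return (key, timestamp) whenever `repeat` events of `event_type`
    for the same key fall within `within_s` seconds.
    Events are assumed to arrive ordered by their `ts` field."""
    recent: Dict[str, Deque[float]] = defaultdict(deque)
    hits: List[Tuple[str, float]] = []
    for e in events:
        if e["type"] != event_type:  # EVENT / WHERE-style filtering
            continue
        window = recent[e[key]]
        window.append(e["ts"])
        # Evict timestamps that are older than the window allows.
        while window and e["ts"] - window[0] > within_s:
            window.popleft()
        if len(window) >= repeat:
            hits.append((e[key], e["ts"]))
            window.clear()  # assumption: start counting afresh after a match
    return hits


events = [
    {"type": "login_failed", "user": "u1", "ts": 0},
    {"type": "login_failed", "user": "u2", "ts": 5},
    {"type": "login_failed", "user": "u1", "ts": 10},
    {"type": "login_ok", "user": "u1", "ts": 15},
    {"type": "login_failed", "user": "u1", "ts": 20},
    {"type": "login_failed", "user": "u2", "ts": 100},
    {"type": "login_failed", "user": "u2", "ts": 170},
]
hits = match_repeat_within(events, "login_failed", repeat=3, within_s=60)
print(hits)
```

Here `u1` fails three times within 20 seconds and is reported, while the three failures of `u2` are spread too far apart to share one 60-second window. In the real system this kind of stateful matching is delegated to Blink rather than hand-written.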
Implementation flow: parse the DSL with a custom Calcite-based parser, validate the syntax, generate an AST, translate the AST to Blink SQL when only one event type is involved and to the CEP Pattern API otherwise, attach the standard input and output streams, set runtime parameters, and submit the job through the Blink API.
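The parse, validate, and route steps of this flow can be sketched as follows. This is a toy: it uses a regular-expression split instead of the Calcite-based parser the system actually relies on, the clause layout and the set of required clauses are assumptions, and `choose_backend` only returns a label rather than generating Blink SQL or a CEP Pattern.

```python
import re
from typing import Dict, List

KEYWORDS = ["RULENAME", "EVENT", "WHERE", "REPEAT", "WITHIN", "RETURN"]
REQUIRED = {"RULENAME", "EVENT", "RETURN"}  # assumption: minimal valid rule


def parse_rule(text: str) -> Dict[str, List[str]]:
    """Split rule text into a toy AST of {keyword: [clause bodies]}."""
    pattern = r"\b(" + "|".join(KEYWORDS) + r")\b"
    # With a capturing group, re.split keeps the keywords:
    # [prefix, kw1, body1, kw2, body2, ...]
    parts = re.split(pattern, text)
    ast: Dict[str, List[str]] = {}
    for keyword, body in zip(parts[1::2], parts[2::2]):
        body = body.strip()
        if not body:
            raise ValueError(f"empty {keyword} clause")
        ast.setdefault(keyword, []).append(body)
    missing = REQUIRED - ast.keys()
    if missing:
        raise ValueError(f"missing clauses: {sorted(missing)}")
    return ast


def choose_backend(ast: Dict[str, List[str]]) -> str:
    """One event type -> SQL; several event types -> CEP Pattern API."""
    event_types = {body.split()[0] for body in ast["EVENT"]}
    return "SQL" if len(event_types) == 1 else "CEP_PATTERN"


single = "RULENAME r1 EVENT order WHERE amount > 100 RETURN user"
sequence = "RULENAME r2 EVENT view EVENT order WITHIN 600 RETURN user"

print(choose_backend(parse_rule(single)))    # one event type
print(choose_backend(parse_rule(sequence)))  # two event types
```

The routing rule is the one stated in the flow: a rule over a single event type can be expressed as a streaming SQL query, while a rule that relates several event types needs sequence matching and therefore goes to the CEP Pattern API.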
Early results show that the DSL speeds up development (a rule reaches production in about 30 minutes), that the system handles about 100k QPS on 3 CU (compute units) with about 1 s latency, and that it relies on Blink's fault tolerance for high reliability.
Xianyu Technology
Official account of the Xianyu technology team