Big Data 18 min read

Achieving Sub‑Second Real‑Time Product Selection with Xianyu’s Mach and Blink

Xianyu’s Mach system tackles the e‑commerce challenge of instantly selecting high‑quality items from billions of products by leveraging Blink’s low‑latency stream computing, detailing its architecture—including state, windows, custom UDX functions, data merging, rule execution, and SQL‑to‑MVEL conversion—to achieve sub‑second processing at massive scale.

ITPUB

Apr 4, 2019

Achieving Sub‑Second Real‑Time Product Selection with Xianyu’s Mach and Blink

Mach System Overview

Mach is a real‑time, high‑performance product selection engine used by Xianyu. It can apply millions of rules to billions of products, marking items within 10 minutes for a 1 million‑rule scale and synchronizing rule hits within one second after a product change.

Stream Computing Fundamentals

Stream computing processes events continuously with low latency. Data is partitioned into windows (tumbling or hopping) and state is maintained across events, allowing immediate output without waiting for batch completion.

Mach uses Blink, Alibaba’s enhanced version of Apache Flink, as its stream‑computing engine. Blink provides high throughput, low latency, stateful computation, strong consistency, and advanced event‑time handling.

Blink State

State stores intermediate results (e.g., aggregation buffers, Kafka offsets). In Mach, state holds merged product data and rule execution outcomes. When a product changes, the new data is merged with the state, rules are re‑executed, and the diff between new and old results yields the effective output.

Blink Window

Windows group data by time. Tumbling windows have a fixed size and trigger once per period (e.g., per minute). Hopping windows have a fixed size but slide more frequently, causing overlapping windows and multiple assignments of the same event.

Blink UDX (User‑Defined Extensions)

UDF : scalar function, one input field, one output field.

UDTF : table‑valued function, one or more input columns, multiple output rows (e.g., STRINGSPLIT, JSONTUPLE).

UDAF : aggregate function, multiple input rows, single aggregated output (e.g., MAX, MIN, AVG).

Mach relies heavily on UDAFs for merging product data, executing rules, and diffing results.

Second‑Level Selection Scheme

Two technical proposals were evaluated:

PostgreSQL‑based solution using functions and triggers. It performed well on small datasets but degraded sharply with millions of rules and billions of products, failing the sub‑second requirement.

Blink‑based solution expressed data‑processing logic in SQL‑like statements and delivered superior performance. This option was selected for production.

Data Processing Module Architecture

Data Ingestion Layer

Connects to multiple channels and data sources.

Parses and validates incoming messages.

Aggregates channel‑wise data volume and monitors trends.

Retrieves field‑level metadata from a central metadata service.

Performs field‑level validation based on metadata.

Normalizes data into Mach’s standard schema.

This decouples source heterogeneity from Blink, allowing a single Kafka topic to feed all data.

Data Merging Layer

Combines the latest product update with the in‑memory snapshot, using timestamps to keep the most recent field values.

Example merge record format: {key:[timestamp,value]} Merge algorithm (illustrated in the diagram) selects the value with the greatest timestamp for each field.

Rule Execution Layer

Consumes merged product data.

Applies field‑level metadata for parsing.

Fetches active rules from the rule center.

Iterates over rules, evaluates each against the product, and stores boolean results in state.

Records exceptions and triggers alerts.

Typical rule: price > 50 AND status == "online". With ~300 rules (growing), each product change may trigger thousands of evaluations, requiring Blink’s massive parallelism.

Result Processing Layer

Retrieves rule execution results for a product.

Filters results based on hit/miss status.

Diffs current results against the previous snapshot stored in Blink state to emit only changed outcomes.

Example diff:

Previous: {"rule1":true,"rule2":true,"rule3":false,"rule4":false}
Current:  {"rule1":true,"rule2":false,"rule3":false,"rule4":true}
Effective output: {"rule2":false,"rule4":true}

Converting Offline SQL Rules to Online MVEL Expressions

Offline rules are stored as SQL WHERE clauses and executed in a relational database. Real‑time rules run as MVEL expressions (Java‑like syntax) inside Blink. The conversion must preserve semantics and performance.

Solution steps:

Parse SQL using Alibaba Druid to obtain an abstract syntax tree and extract the WHERE subtree.

Traverse the subtree in post‑order.

Replace SQL operators with Java equivalents (e.g., AND → &&, OR → ||, = → ==, <> → !=, LIKE → .contains()).

Transform IN clauses into a series of || comparisons.

Example conversion: A LIKE '%test%' becomes A.contains("test").

Performance and Impact

Since launch, Mach has supported nearly 400 campaigns, processing ~140 million messages daily with peak throughput of 50 000 TPS. The design is generic: input and output are MQ messages, making the solution applicable to other real‑time filtering scenarios beyond product selection.

References

Xianyu Real‑Time Selection System: https://mp.weixin.qq.com/s/8ROsZniYD7nIQssC14mn3w

Blink (Flink): https://github.com/apache/flink/tree/blink

PostgreSQL: https://www.postgresql.org/

Druid SQL Parser: https://github.com/alibaba/druid

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

e-commerce Rule Engine Real-time Flink stream processing blink data merging

Written by

ITPUB

Official ITPUB account sharing technical insights, community news, and exciting events.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.