
Real-time Performance Optimization of the Mahé Selection and Delivery System

By classifying data streams, aggregating large‑scale T+1 records in six‑hour windows, encoding attributes with multi‑value mappings, storing compressed rule‑hit backups, and synchronizing recall tables in real time, Mahé’s selection‑and‑delivery pipeline cut end‑to‑end latency from minutes to seconds, achieving robust second‑level responsiveness.

Xianyu Technology

Apr 22, 2021

Mahé is the selection and delivery system of Xianyu, handling single‑stock items that require immediate feedback across the selection and delivery pipeline. The system aims for second‑level latency, but growing data volume created bottlenecks.

The end‑to‑end data flow consists of three stages: selection data ingestion, rule computation, and product delivery. Real‑time performance was improved by tackling each stage.

Selection Data Real‑time Optimization

Selection data comprises basic product information, available in real time through database change subscription, and derived statistical and prediction data, produced by ODPS with hour-level or day-level delay. The original approach joined all of this data once a day and pushed it through BLINK and MetaQ, so every field was only as fresh as the slowest upstream source.

To eliminate the delay, data were classified by production cycle. H+1 data (≈10⁴ rows) are read directly by BLINK and sent to the unified ingestion layer. T+1 data (≈10⁸ rows) are also read by BLINK but are aggregated in a 6-hour sliding window keyed by product ID before forwarding. This reduces latency from days to hours at the cost of a fourfold increase in traffic, which is consistent with four six-hour windows per day replacing a single daily push.
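
The windowed aggregation can be illustrated with a minimal Python sketch. It is not Blink code: it simplifies the window to fixed, non-overlapping six-hour buckets, and the names StatRecord, window_start, and aggregate_by_window are hypothetical, as is the rule that later records overwrite earlier fields within a bucket.

```python
from dataclasses import dataclass

WINDOW_SECONDS = 6 * 60 * 60  # six-hour window, as described in the article


@dataclass
class StatRecord:
    product_id: str
    event_time: int   # epoch seconds (assumed time representation)
    fields: dict      # derived statistical / prediction fields


def window_start(event_time: int) -> int:
    """Align an event time to the start of its six-hour bucket."""
    return event_time - (event_time % WINDOW_SECONDS)


def aggregate_by_window(records):
    """Merge records per (window start, product ID); later records win per field."""
    merged = {}
    for rec in sorted(records, key=lambda r: r.event_time):
        key = (window_start(rec.event_time), rec.product_id)
        merged.setdefault(key, {}).update(rec.fields)
    return merged
```

Each (window, product) pair yields one outgoing message, so a product touched many times inside a window is forwarded once, which is what keeps a source of roughly 10⁸ rows manageable downstream.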

Selection Rule Computation Real‑time Optimization

The rule engine has an offline component (SQL‑based rule execution on the selection wide table) and an online component (re‑evaluation when a product changes). For the offline engine, a multi‑value mapping transformed KV‑style attribute fields into numeric codes, cutting rule‑matching time from 6 minutes to 30 seconds.
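
As a rough illustration of the multi-value mapping idea, the sketch below assigns an integer code to each attribute key/value pair so that a rule check becomes a set comparison on integers rather than string parsing. The "key:value;key:value" input format, the AttributeEncoder class, and the matches helper are assumptions made for the example; the article does not describe Mahé's actual encoding.

```python
class AttributeEncoder:
    """Assigns a stable integer code to each distinct (key, value) pair."""

    def __init__(self):
        self._codes = {}

    def code_for(self, key, value):
        pair = (key, value)
        if pair not in self._codes:
            self._codes[pair] = len(self._codes)
        return self._codes[pair]

    def encode(self, raw):
        """Encode a 'k1:v1;k2:v2' string into a frozenset of integer codes."""
        codes = set()
        for item in raw.split(";"):
            if not item:
                continue  # tolerate empty segments such as a trailing ';'
            key, _, value = item.partition(":")
            codes.add(self.code_for(key.strip(), value.strip()))
        return frozenset(codes)


def matches(product_codes, required_codes):
    """A rule hits when every code it requires is present on the product."""
    return required_codes <= product_codes
```

With this encoding, a product encoded from "brand:apple;color:red;storage:128g" satisfies a rule encoded from "brand:apple;color:red", because the rule's codes are a subset of the product's codes.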

The online engine runs in BLINK, performing MERGE (keep latest record) and DIFF (output only changed results). To achieve zero‑downtime upgrades, the engine now stores a compressed backup of the full rule‑hit set in IGRAPH, allowing seamless BLINK version upgrades and reducing end‑to‑end latency from 2 minutes to 2 seconds.
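
The MERGE and DIFF steps, together with a compressed backup of the rule-hit set, can be sketched in plain Python as follows. This is an illustration under assumptions, not the Blink job: the record shape (a dict with product_id and version), the function names, and the use of JSON plus zlib are all hypothetical, and the write to IGRAPH is left out entirely.

```python
import json
import zlib


def merge(state, record):
    """MERGE: keep only the newest record per product, judged by 'version'."""
    current = state.get(record["product_id"])
    if current is None or record["version"] >= current["version"]:
        state[record["product_id"]] = record
        return True
    return False  # stale record, ignored


def diff(previous_hits, new_hits):
    """DIFF: return (added, removed) rule IDs; two empty sets mean nothing to emit."""
    return new_hits - previous_hits, previous_hits - new_hits


def snapshot(hits_by_product):
    """Serialize and compress the full rule-hit set for an external store."""
    payload = {pid: sorted(hits) for pid, hits in hits_by_product.items()}
    return zlib.compress(json.dumps(payload, sort_keys=True).encode("utf-8"))


def restore(blob):
    """Rebuild the rule-hit set from a compressed snapshot."""
    payload = json.loads(zlib.decompress(blob).decode("utf-8"))
    return {pid: set(hits) for pid, hits in payload.items()}
```

One plausible reading of the design is that a restarted or upgraded job can call something like restore to recover its DIFF baseline, so it does not need to re-emit every rule hit after an upgrade.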

Product Delivery Real‑time Optimization

Delivery retrieves products from a pool based on user ID and pool ID, using both search‑based and algorithmic recall. Algorithmic recall originally relied on T+1 data, conflicting with real‑time goals. Mahé introduced a real‑time sync engine that updates recall tables in IGRAPH, ensuring that both personalized and fallback recall use the latest products.
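
A small in-memory sketch shows why keeping the recall table in sync matters for both recall paths. The RecallTable class, its method names, and the fallback ordering are hypothetical stand-ins; the real tables live in IGRAPH and their schema is not given in the article.

```python
class RecallTable:
    """In-memory stand-in for a recall table keyed by pool ID."""

    def __init__(self):
        self._pool_items = {}   # pool_id -> set of product IDs currently in the pool
        self._user_prefs = {}   # (user_id, pool_id) -> ranked product IDs

    def apply_change(self, pool_id, product_id, in_pool):
        """Apply one real-time pool membership change."""
        items = self._pool_items.setdefault(pool_id, set())
        if in_pool:
            items.add(product_id)
        else:
            items.discard(product_id)

    def set_user_ranking(self, user_id, pool_id, ranked_ids):
        """Store a personalized ranking (assumed to come from an algorithm)."""
        self._user_prefs[(user_id, pool_id)] = list(ranked_ids)

    def recall(self, user_id, pool_id, limit=10):
        """Personalized recall first; fall back to the live pool if it is empty."""
        live = self._pool_items.get(pool_id, set())
        ranked = self._user_prefs.get((user_id, pool_id), [])
        personalized = [pid for pid in ranked if pid in live]
        if personalized:
            return personalized[:limit]
        return sorted(live)[:limit]
```

Because both paths filter against the live pool, a single-stock item that leaves the pool stops being recalled as soon as the change is applied, whether the user has a personalized ranking or not.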

Conclusion

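After the optimizations, latency dropped at every stage: T+1 selection data now arrives within hours rather than a day later, offline rule matching fell from 6 minutes to 30 seconds, and the online engine's end-to-end latency fell from 2 minutes to 2 seconds. Algorithmic recall also moved from T+1 data to recall tables synchronized in real time. The system is now more robust and more responsive, laying the groundwork for future product-level capabilities.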
