How Ant Group Built an Ultra‑Real‑Time Client Feature Center for Smarter AI
This article examines the challenges of traditional data feature acquisition and presents Ant Group’s ultra‑real‑time client feature center, detailing its architecture, data collection, streaming and script computation, backflow mechanisms, and monitoring to deliver rich, timely, and easy‑to‑use features for AI models.
Challenges of Traditional Feature Acquisition
Traditional cloud-side feature acquisition has three weaknesses: minute-level latency, limited richness (fine-grained behaviors are missing), and low usability (changes are hard to make without an app release). Real-time aggregation in the cloud also consumes significant resources.
Ultra‑Real‑Time Client Feature Center
Leveraging the client’s inherent distributed nature, Ant Group built a feature center that provides rich, accurate, and real‑time features.
Overall Architecture
The system consists of three parts: client feature production, client feature backflow, and client feature monitoring.
Client Feature Production
Feature production defines a standard pipeline of data collection, feature processing, and integrated publishing, aiming for high-performance, scalable, and dynamic feature acquisition. Data collection relies on a unified event processing (UEP) framework that combines dynamic hooks, aspect-oriented injection, and built-in modules to automatically capture user actions across the Native, H5, and Mini-Program stacks, including visits, clicks, exposures, scrolls, inputs, and playback.
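As a rough illustration of aspect-oriented capture, the sketch below wraps an ordinary handler with a decorator that records an event before the handler runs. It is not the UEP framework: `track`, `EVENT_LOG`, and `on_pay_button_clicked` are hypothetical names, and a real client would hook Native, H5, or Mini-Program runtimes rather than Python functions.

```python
import functools
import time
from typing import Any, Callable, Dict, List

# Hypothetical in-memory sink; a real client would hand events to its pipeline.
EVENT_LOG: List[Dict[str, Any]] = []


def track(event_type: str) -> Callable:
    """Return a decorator that logs an event each time the wrapped handler runs."""
    def decorator(fn: Callable) -> Callable:
        @functools.wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            # Record the event first, then run the original handler unchanged.
            EVENT_LOG.append({
                "type": event_type,
                "target": fn.__name__,
                "ts_ms": int(time.time() * 1000),
            })
            return fn(*args, **kwargs)
        return wrapper
    return decorator


@track("click")
def on_pay_button_clicked(amount: int) -> str:
    # Business logic stays free of any collection code.
    return f"paying {amount}"
```

The point of the pattern is that the handler itself contains no tracking code, so collection can be added or changed without touching business logic.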
Feature Computation
Feature computation combines on-device streaming computation with Python script execution. A lightweight streaming engine parses a DSL to perform slicing, merging, parameter extraction, counting, summation, averaging, and sequence aggregation with millisecond-level latency. For more complex logic, a lightweight Python VM runs scripts on the device, which enables rapid iteration on algorithmic features, multi-source aggregation, weighting, and on-device model inference. Keeping this work on the client reduces cloud load and preserves privacy.
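The aggregations listed above (counting, summation, averaging, sequence aggregation) can be sketched as a small sliding-window aggregator. This is a minimal sketch under stated assumptions, not the article's engine or DSL: the class name `WindowedFeature`, the time-window semantics, and the event shape `(timestamp_ms, value)` are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, Tuple


@dataclass
class WindowedFeature:
    """Sliding-window aggregator over (timestamp_ms, value) events."""
    window_ms: int
    events: Deque[Tuple[int, float]] = field(default_factory=deque)
    total: float = 0.0

    def update(self, ts_ms: int, value: float) -> None:
        """Add one event, then drop events that fell out of the window."""
        self.events.append((ts_ms, value))
        self.total += value
        cutoff = ts_ms - self.window_ms
        while self.events and self.events[0][0] <= cutoff:
            _, old_value = self.events.popleft()
            self.total -= old_value

    def snapshot(self) -> Dict[str, object]:
        """Return count, sum, average, and the ordered value sequence."""
        n = len(self.events)
        return {
            "count": n,
            "sum": self.total,
            "avg": self.total / n if n else 0.0,
            "sequence": [value for _, value in self.events],
        }


# Usage: a 1-second window over three events; the first one expires.
feature = WindowedFeature(window_ms=1000)
feature.update(0, 2.0)
feature.update(500, 4.0)
feature.update(1200, 6.0)
result = feature.snapshot()
```

Each update costs amortized constant time because every event is appended once and evicted at most once, which is one plausible way to keep per-event latency low on a device.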
Client Feature Backflow
When a feature is needed, a collection task drives the workflow: data capture → backflow → usage. Triggers can be defined over the full range of user behavior, and there are two backflow channels: embedding in network requests (millisecond-level latency) and DataHighway (second-level latency). Users select features in the feature center; the selected features then flow back to the cloud, where they are served through high-concurrency queries.
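The capture → backflow step can be sketched as follows. The article only states that two channels exist and what their latency classes are; the selection rule used here (piggyback on an outgoing request when one exists, otherwise fall back to DataHighway) is an assumption, as are the names `CollectionTask`, `choose_channel`, and `build_backflow`.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, List, Optional


class Channel(Enum):
    REQUEST_EMBEDDING = "request_embedding"  # millisecond-level, per the article
    DATA_HIGHWAY = "data_highway"            # second-level, per the article


@dataclass
class CollectionTask:
    task_id: str
    trigger_event: str            # event that starts the backflow
    selected_features: List[str]  # features chosen in the feature center


def choose_channel(has_outgoing_request: bool) -> Channel:
    # Assumed policy: embed in a request if one is about to be sent.
    if has_outgoing_request:
        return Channel.REQUEST_EMBEDDING
    return Channel.DATA_HIGHWAY


def build_backflow(
    task: CollectionTask,
    event: str,
    local_features: Dict[str, Any],
    has_outgoing_request: bool,
) -> Optional[Dict[str, Any]]:
    """Return the backflow message for this event, or None if not triggered."""
    if event != task.trigger_event:
        return None
    # Send only the selected features that are actually available locally.
    payload = {
        name: local_features[name]
        for name in task.selected_features
        if name in local_features
    }
    channel = choose_channel(has_outgoing_request)
    return {"task_id": task.task_id, "channel": channel.value, "features": payload}
```

Restricting the payload to the task's selected features keeps unrequested on-device data from leaving the client.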
Event Triggers
There are three trigger types: single events (page open/close, button click, exposure, request), complex events (real-time stream matching against user behavior chains), and intent events (outputs of on-device models, such as visit intent or churn intent).
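A complex-event trigger can be pictured as matching an ordered chain of events within a time limit. The sketch below is one simple way to do that, assuming unrelated events may occur between chain steps and that only one partial match is tracked at a time. `ChainMatcher` and the event names are hypothetical, and the article does not describe its actual matching rules.

```python
from typing import List, Optional, Sequence


class ChainMatcher:
    """Fires when the events in `chain` occur in order within `within_ms`."""

    def __init__(self, chain: Sequence[str], within_ms: int) -> None:
        if not chain:
            raise ValueError("chain must contain at least one event name")
        self.chain: List[str] = list(chain)
        self.within_ms = within_ms
        self._pos = 0                         # index of the next expected event
        self._start_ms: Optional[int] = None  # timestamp of the first matched event

    def feed(self, name: str, ts_ms: int) -> bool:
        """Consume one event; return True when the whole chain has matched."""
        # Abandon a partial match that has run past its time limit.
        if self._start_ms is not None and ts_ms - self._start_ms > self.within_ms:
            self._pos = 0
            self._start_ms = None
        if name != self.chain[self._pos]:
            return False  # unrelated events are ignored, not treated as a break
        if self._pos == 0:
            self._start_ms = ts_ms
        self._pos += 1
        if self._pos == len(self.chain):
            self._pos = 0
            self._start_ms = None
            return True
        return False


# Usage: open a page, click, then pay, all within five seconds.
matcher = ChainMatcher(["page_open", "click", "pay"], within_ms=5000)
fired = [
    matcher.feed("page_open", 0),
    matcher.feed("scroll", 100),
    matcher.feed("click", 200),
    matcher.feed("pay", 300),
]
```

A single-event trigger is the degenerate case of a chain with one element.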
Feature Monitoring
Monitoring tracks computation success rate, performance, and backflow timeliness. Its key dimensions are feature lineage (dependency and usage tracking), feature quality (sampling and anomaly detection), and feature latency (per-feature and overall computation time). Together they help keep business operation stable and the client experience smooth.
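Two of the metrics named above, computation success rate and per-feature latency, can be tracked with a small recorder like the one below. This is a sketch under assumptions: `FeatureMonitor` and its report fields are hypothetical, and lineage tracking and anomaly detection are left out.

```python
from statistics import mean
from typing import Dict, List, Optional, Tuple


class FeatureMonitor:
    """Records the outcome and latency of each feature computation."""

    def __init__(self) -> None:
        # feature name -> list of (succeeded, latency_ms)
        self._runs: Dict[str, List[Tuple[bool, float]]] = {}

    def record(self, feature: str, succeeded: bool, latency_ms: float) -> None:
        self._runs.setdefault(feature, []).append((succeeded, latency_ms))

    def report(self, feature: str) -> Optional[Dict[str, float]]:
        """Return summary metrics, or None if the feature has no recorded runs."""
        runs = self._runs.get(feature)
        if not runs:
            return None
        latencies = [latency for _, latency in runs]
        successes = sum(1 for ok, _ in runs if ok)
        return {
            "success_rate": successes / len(runs),
            "avg_latency_ms": mean(latencies),
            "max_latency_ms": max(latencies),
        }


# Usage: one successful and one failed computation of the same feature.
monitor = FeatureMonitor()
monitor.record("clicks_5m", True, 2.0)
monitor.record("clicks_5m", False, 4.0)
summary = monitor.report("clicks_5m")
```

In practice these summaries would be sampled and reported alongside backflow timeliness rather than kept only in memory.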
Future Outlook
Plans include a low‑code feature development platform to abstract common templates, deeper feature mining on the client side, and expanding scenarios to accelerate and improve intelligent services across more business lines.
Alipay Experience Technology
Exploring ultimate user experience and best engineering practices
