Design and Practice of an Online Real-Time Feature System for Intelligent Risk Control
This article presents the concepts, architecture, and practical techniques of an online real‑time feature system used in intelligent risk‑control, covering feature definition, time‑window types, calculation functions, distributed processing, low‑latency storage, and operational challenges in high‑concurrency environments.
In mainstream internet products such as search and recommendation, massive user behavior features are required to capture potential purchase intent and improve experience; similarly, AI‑driven risk‑control engines rely on extensive feature sets to support model algorithms and rule‑based decisions.
The article uses an intelligent risk‑control online feature system as a prototype, introducing the basic concept of a feature (dimension + time window + calculation function) and illustrating four time‑window types—natural, fixed, sliding, and session—with examples.
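The feature triple and the four window types can be sketched as follows. This is a minimal illustration, not the system's actual code: the class and function names are hypothetical, and the window semantics (natural = calendar-aligned, fixed = explicit start and end, sliding = trailing interval, session = events grouped by an inactivity gap) are the commonly used interpretations, assumed here because the summary does not spell them out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Feature:
    """Hypothetical container for the (dimension, window, function) triple."""
    dimension: str  # e.g. "user_id"
    window: str     # "natural", "fixed", "sliding", or "session"
    function: str   # e.g. "COUNT"


def natural_day_window(now):
    """Natural window (assumed): from the start of the calendar day to now."""
    start = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return start, now


def fixed_window(start, end):
    """Fixed window (assumed): an explicitly configured start and end."""
    return start, end


def sliding_window(now, length):
    """Sliding window (assumed): the trailing interval of the given length."""
    return now - length, now


def session_windows(timestamps, gap):
    """Session window (assumed): split events wherever the pause exceeds gap."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])
    return sessions
```

For example, a feature such as "number of logins per user in the last hour" would be `Feature("user_id", "sliding", "COUNT")` evaluated over `sliding_window(now, timedelta(hours=1))`.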
Common calculation functions (SUM, COUNT, COUNT_DISTINCT, LIST, MAX, MIN, AVG) are described, and a matrix shows how each function maps to different window types.
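As a rough illustration of the calculation functions listed above, the sketch below applies each one to the values that fall inside a window. The `AGGREGATORS` table and `compute` helper are hypothetical names, and the empty-window behavior is an assumption. The function-to-window matrix from the original article is not reproduced here.

```python
# Hypothetical lookup table: one plain-Python implementation per function name.
AGGREGATORS = {
    "SUM": sum,
    "COUNT": len,
    "COUNT_DISTINCT": lambda values: len(set(values)),
    "LIST": list,
    "MAX": max,
    "MIN": min,
    "AVG": lambda values: sum(values) / len(values),
}


def compute(function, values):
    """Apply a named calculation function to the values inside one window."""
    if function not in AGGREGATORS:
        raise ValueError(f"unknown calculation function: {function}")
    # Assumption: MAX, MIN, and AVG are undefined on an empty window.
    if not values and function in ("MAX", "MIN", "AVG"):
        return None
    return AGGREGATORS[function](values)
```

With order amounts `[10, 20, 20]` in the window, `compute("SUM", ...)` gives 50, `compute("COUNT", ...)` gives 3, and `compute("COUNT_DISTINCT", ...)` gives 2.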
Early implementations were offline, relying on ETL jobs to populate HBase tables; as traffic grew, this architecture could not meet the demands of high concurrency, low latency, and real‑time decision making.
The online real‑time feature system is designed with five layers: data sources (Kafka and Hive), material extraction, feature computation, feature storage, and feature application. It emphasizes an "extract, compute, store, use" lifecycle and supports flexible configuration via a data dictionary and Groovy parsing functions.
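The idea of configuration-driven material extraction can be sketched as below. This is only an illustration under stated assumptions: the article describes Groovy parsing functions, whereas this sketch substitutes plain Python callables, and the dictionary layout, field names, and helper names are all hypothetical.

```python
# Hypothetical data dictionary: feature material name -> where to find it in
# the raw message and how to parse it. The original system uses Groovy parsing
# functions; plain Python callables stand in for them here.
DATA_DICTIONARY = {
    "user_id": {"path": "user.id", "parse": str},
    "amount": {"path": "order.amount", "parse": float},
}


def get_path(message, path):
    """Follow a dotted path through nested dicts; return None if it is missing."""
    current = message
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def extract(message, dictionary):
    """Build a flat record of parsed fields from one raw message."""
    record = {}
    for name, spec in dictionary.items():
        raw = get_path(message, spec["path"])
        record[name] = None if raw is None else spec["parse"](raw)
    return record
```

Adding a new material field then only requires a new dictionary entry rather than a code change, which is the kind of flexibility the configuration approach aims for.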
To handle massive data volume, the system adopts a distributed design separating spout nodes (data pulling) and worker nodes (processing), enabling horizontal scaling and fault tolerance through delayed queues and automatic retry mechanisms.
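One way to picture the delayed queue with automatic retry is the in-memory sketch below. It is not the system's implementation: the class name, the exponential backoff policy, and the retry limit are assumptions chosen for illustration, and a production version would need durable storage.

```python
import heapq
import itertools


class DelayedRetryQueue:
    """Hypothetical in-memory delayed queue with retry and backoff."""

    def __init__(self, max_retries=3, base_delay=1.0):
        self.max_retries = max_retries
        self.base_delay = base_delay
        self._heap = []                 # entries: (ready_at, seq, attempt, task)
        self._seq = itertools.count()   # tie-breaker so tasks are never compared

    def put(self, task, ready_at, attempt=0):
        heapq.heappush(self._heap, (ready_at, next(self._seq), attempt, task))

    def pop_ready(self, now):
        """Remove and return (attempt, task) pairs whose delay has elapsed."""
        ready = []
        while self._heap and self._heap[0][0] <= now:
            _, _, attempt, task = heapq.heappop(self._heap)
            ready.append((attempt, task))
        return ready

    def process(self, handler, now):
        """Run handler on ready tasks; reschedule failures, return dropped tasks."""
        dropped = []
        for attempt, task in self.pop_ready(now):
            try:
                handler(task)
            except Exception:
                if attempt + 1 >= self.max_retries:
                    dropped.append(task)  # retries exhausted
                else:
                    # Assumed policy: exponential backoff between attempts.
                    delay = self.base_delay * (2 ** attempt)
                    self.put(task, now + delay, attempt + 1)
        return dropped
```

A worker node would call `process` periodically with the current time, so a message that fails once is retried after the delay instead of being lost or blocking the messages behind it.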
The feature computation framework implements several operators (Accumulator, Comparator, Delayed Queue, Sequential Queue, Set, and List), each realized with Redis or Spark Streaming, along with a custom module, TitanCounter, for sub‑hourly, highly time‑sensitive features.
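The summary does not describe how TitanCounter works internally. As a generic illustration of how a sub-hourly counting feature can be maintained, the sketch below keeps one count per time bucket and sums the buckets covering the requested window. The class name, bucket size, and in-memory dictionary are assumptions and should not be read as TitanCounter's design.

```python
from collections import defaultdict


class BucketedCounter:
    """Hypothetical time-bucketed counter for short sliding windows."""

    def __init__(self, bucket_seconds=60):
        self.bucket_seconds = bucket_seconds
        self._buckets = defaultdict(int)  # (key, bucket_index) -> count

    def incr(self, key, ts, amount=1):
        """Add to the bucket that contains timestamp ts (in seconds)."""
        self._buckets[(key, int(ts // self.bucket_seconds))] += amount

    def count(self, key, now, window_seconds):
        """Sum the buckets covering the trailing window, at bucket granularity."""
        newest = int(now // self.bucket_seconds)
        n_buckets = int(window_seconds // self.bucket_seconds)
        oldest = newest - n_buckets + 1
        # .get avoids creating empty entries in the defaultdict on reads.
        return sum(self._buckets.get((key, b), 0) for b in range(oldest, newest + 1))
```

The result is approximate at the window edges because whole buckets are included or excluded. Smaller buckets tighten the approximation at the cost of more keys.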
Low‑latency storage is achieved using a sharded Redis cluster with resource isolation, snapshot mirroring to HBase for offline model training, and size‑capped queues with MD5 hashing to conserve memory.
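The memory-saving idea of a size-capped queue with MD5 hashing can be sketched as follows. This is an assumed reading of the technique: each value is stored as its 16-byte MD5 digest rather than as the raw string, and the queue evicts the oldest entry once it reaches its cap. The class name is hypothetical, and an in-process `deque` stands in for the Redis structure.

```python
import hashlib
from collections import deque


class CappedHashedQueue:
    """Hypothetical size-capped queue that stores MD5 digests, not raw values."""

    def __init__(self, max_len):
        # deque with maxlen drops the oldest item automatically when full.
        self._items = deque(maxlen=max_len)

    def add(self, value):
        # A 16-byte digest replaces a possibly long string such as a URL.
        digest = hashlib.md5(value.encode("utf-8")).digest()
        self._items.append(digest)

    def count_distinct(self):
        return len(set(self._items))

    def __len__(self):
        return len(self._items)
```

Hashing keeps distinct-count and membership semantics intact (barring collisions) while bounding the per-entry size, and the cap bounds the per-key size. In Redis, a comparable cap is commonly enforced with `LPUSH` followed by `LTRIM`.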
Finally, the article summarizes the design insights, suggests combining offline and real‑time features for optimal resource usage, and points to future directions such as feature composition, transformation, scheduling, and knowledge graph integration in the security domain.
58 Tech
Official tech channel of 58, a platform for tech innovation, sharing, and communication.
