How Flink Powers Real‑Time Risk Control at HuoLaLa: Architecture and Insights

This article explains Flink's role in HuoLaLa's risk‑control system, covering its background, the Lambda‑style architecture that combines batch and streaming, the real‑time data pipeline, machine‑learning models, and operational safeguards that together enable proactive fraud detection.

Huolala Tech

May 28, 2020

1. Flink Background

In big‑data systems, processing is commonly split into batch and streaming layers to meet different computation needs; combining the two gives the classic Lambda architecture. The batch layer processes large volumes of historical data at high throughput (e.g., Hadoop, Spark), while the streaming layer serves real‑time tasks at low latency (e.g., Storm, Flink). Flink is a popular open‑source framework that supports both batch and stream processing, and large companies such as Alibaba, Meituan, and Didi have migrated from Storm or Spark Streaming to Flink.

Compared with Storm's simple spout/bolt model, Flink provides richer predefined operators (Source, Sink, map, flatMap, keyBy, window, etc.) and advanced features like checkpoints, exactly‑once guarantees, multiple state backends (memory, filesystem, RocksDB), and diverse window types.
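To make the keyBy and window operators concrete, the following is a minimal sketch in plain Python (not the Flink API) of what a keyed tumbling‑window count computes. The function name, the event tuples, and the 60‑second window are illustrative assumptions, and the sketch ignores out‑of‑order events, which Flink handles with watermarks.

```python
from collections import defaultdict

def tumbling_window_counts(events, window_size_s):
    """Count events per (key, window), mimicking keyBy + tumbling window + count.

    events: iterable of (key, event_time_seconds) tuples (hypothetical shape).
    Returns a dict mapping (key, window_start) to the number of events.
    """
    counts = defaultdict(int)
    for key, ts in events:
        # A tumbling window assigns each timestamp to exactly one
        # non-overlapping interval [window_start, window_start + size).
        window_start = (ts // window_size_s) * window_size_s
        counts[(key, window_start)] += 1
    return dict(counts)

events = [("user_1", 3), ("user_1", 42), ("user_2", 59), ("user_1", 61)]
result = tumbling_window_counts(events, window_size_s=60)
print(result)
```

In a real Flink job the same grouping is expressed declaratively, and the per‑window counts live in managed state that checkpoints can restore after a failure.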

2. Risk‑Control Business

HuoLaLa’s risk control focuses on preventing abnormal transactions and disruptive behaviors. Inspired by a story about a physician who treats disease at different stages, the system works in three phases: pre‑risk (prevention), in‑risk (interception), and post‑risk (penalty). Post‑risk analysis can rely on offline batch processing, while the pre‑risk and in‑risk phases require fast detection of risk signals.

Key steps include identifying risk precursors (e.g., multiple accounts, short distances between orders, high coupon usage) and quickly detecting suspicious events using machine‑learning models built on these features.
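As a rough illustration of how such precursors might be turned into model features, the sketch below computes three hypothetical features from a user's recent orders. The field names (`pickup`, `coupon_used`), the feature names, and the reading of "short distances between orders" as the distance between consecutive pickup points are all assumptions, not details from the original system.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def precursor_features(orders, accounts_on_device):
    """Build a feature dict from a user's orders (hypothetical schema).

    orders: list of dicts with "pickup" as (lat, lon) and "coupon_used" as bool.
    accounts_on_device: number of accounts seen on the same device.
    """
    # Distance between the pickup points of consecutive orders.
    gaps = [
        haversine_km(*a["pickup"], *b["pickup"])
        for a, b in zip(orders, orders[1:])
    ]
    n = len(orders)
    return {
        "accounts_per_device": accounts_on_device,
        "min_gap_km": min(gaps) if gaps else None,
        "coupon_ratio": sum(o["coupon_used"] for o in orders) / n if n else 0.0,
    }

orders = [
    {"pickup": (0.0, 0.0), "coupon_used": True},
    {"pickup": (0.0, 1.0), "coupon_used": True},
    {"pickup": (0.0, 1.0), "coupon_used": False},
]
features = precursor_features(orders, accounts_on_device=4)
print(features)
```

Features like these would feed a trained model; the sketch only shows the shape of the computation, not which signals actually matter.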

3. Overall Architecture

The architecture follows a Lambda design with real‑time and offline components (see Fig. 4). Core parts:

Kafka Event Stream: Ingests user and driver activity logs (order events, login, clicks, etc.).

Offline Data Warehouse: Daily archived data forming wide tables for drivers, users, and orders, enriched with historical risk records.

Graph Store (HBase): Stores relationship graphs for drivers, users, and orders, providing features for machine‑learning models.

Machine‑Learning Models: Trained on offline data, validated on multiple splits, and deployed with a simulation stage before going live.

Log System: Tracks model decisions and supports iterative model improvement.

Blacklist Store: HBase tables, written asynchronously, that hold the driver, user, and order blacklists.

The Flink job orchestrates the pipeline in five stages:

Stage 1 – Feature Computation: Calculates real‑time aggregate features using tumbling or sliding windows and persists them.

Stage 2 – Feature Query: Retrieves real‑time and offline graph features.

Stage 3 – Feature Merge & Filtering: Merges batch and stream features (e.g., 30‑day order count) and filters irrelevant events, routing data to rule or model modules.

Stage 4 – Decision: Applies simple rules (e.g., self‑order loops) or model inference to assign a risk probability and generate blacklist entries.

Stage 5 – Output: Writes blacklist records to HBase and logs decisions to Kafka for downstream analysis.
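The five stages can be sketched end to end in plain Python (not the Flink API). Everything named here is a hypothetical stand‑in: `SlidingCounter` plays the role of a sliding‑window aggregate, a dict replaces the offline feature lookup, the `linked_pairs` set stands in for a graph query that detects self‑order loops, the linear score is a placeholder for model inference, and in‑memory lists replace the HBase and Kafka sinks. The sketch also assumes events arrive in time order.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class OrderEvent:
    order_id: str
    user_id: str
    driver_id: str
    ts: int  # event time in seconds

class SlidingCounter:
    """Stage 1: per-key event count over the trailing window_s seconds."""
    def __init__(self, window_s):
        self.window_s = window_s
        self._times = {}

    def add(self, key, ts):
        q = self._times.setdefault(key, deque())
        q.append(ts)
        # Evict timestamps that have fallen out of the window.
        while q[0] <= ts - self.window_s:
            q.popleft()
        return len(q)

def decide(event, recent_count, offline_30d_count, linked_pairs):
    # Stage 3: merge batch and stream features. Adding the window count
    # to the archived count is a simplification of a real 30-day merge.
    features = {
        "orders_last_hour": recent_count,
        "orders_30d": offline_30d_count + recent_count,
    }
    # Stage 4a: simple rule, user and driver are known to be linked.
    if (event.user_id, event.driver_id) in linked_pairs:
        return {"order_id": event.order_id, "risk": 1.0,
                "reason": "self_order_rule", **features}
    # Stage 4b: placeholder score standing in for model inference.
    risk = min(1.0, features["orders_last_hour"] / 10)
    return {"order_id": event.order_id, "risk": risk,
            "reason": "placeholder_score", **features}

def run(events, offline_counts, linked_pairs, threshold=0.8):
    counter = SlidingCounter(window_s=3600)
    blacklist, decision_log = [], []
    for e in events:
        recent = counter.add(e.user_id, e.ts)           # Stage 1
        offline = offline_counts.get(e.user_id, 0)      # Stage 2
        d = decide(e, recent, offline, linked_pairs)    # Stages 3 and 4
        decision_log.append(d)                          # Stage 5: decision log
        if d["risk"] >= threshold:
            blacklist.append(d["order_id"])             # Stage 5: blacklist
    return blacklist, decision_log

events = [OrderEvent("o1", "u1", "d1", 0), OrderEvent("o2", "u2", "d2", 10)]
blacklist, decision_log = run(events, {"u1": 5}, {("u2", "d2")})
print(blacklist)
```

Keeping the rule path and the model path behind one `decide` function mirrors the routing described in Stage 3: cheap rules can short‑circuit before any model is invoked.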

4. Online Application

The system is deployed in HuoLaLa’s production risk‑control environment, with a monitoring dashboard (Fig. 7) showing the risky events intercepted each day. Accuracy and latency are critical; a strict release process, checkpointing, and real‑time alerts (Fig. 8) help ensure data quality and timely detection.
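One way a latency alert of this kind could work is sketched below: compute a high percentile of decision latencies over a recent batch and raise an alert when it exceeds a limit. The 500 ms limit, the function names, and the alert strings are hypothetical; the original does not describe how its alerts are computed.

```python
import math

def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list; pct is an integer in 1..100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def check_latency(latencies_ms, p99_limit_ms=500):
    """Return an alert message, or None if the batch looks healthy."""
    if not latencies_ms:
        # No decisions at all may mean the job or its source has stalled.
        return "ALERT: no decisions observed"
    p99 = percentile(latencies_ms, 99)
    if p99 > p99_limit_ms:
        return f"ALERT: p99 latency {p99} ms exceeds {p99_limit_ms} ms"
    return None

healthy = check_latency([100] * 99 + [900])
degraded = check_latency([100] * 98 + [900] * 2)
print(healthy, degraded)
```

Alerting on a percentile rather than the mean keeps a single slow event from paging anyone while still catching a sustained tail.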

5. Summary and Outlook

The Flink‑based real‑time risk‑control platform has been built from scratch, integrating graph databases, machine‑learning models, and a risk‑control engine. Current limitations include incomplete data‑quality checks and potential inconsistencies between batch and stream features. Future work will explore Kappa/Dataflow architectures for unified batch‑stream processing and extend the platform with AI‑driven analyses of images, audio, and text.
