How Flink Powers Real-Time Variable Pools for FinTech Risk Assessment
This article details how a fintech company leveraged Apache Flink to build a real-time variable pool, covering architecture choices, development efficiency improvements, multi‑stream association optimizations, and operational monitoring, while also discussing future migration to cloud‑native OLAP solutions.
Abstract
This talk, presented by senior data engineer Mu Jiankui at Flink Forward Asia 2024, is organized into three parts: (1) Building a real‑time variable pool with Flink, (2) Architecture selection and development efficiency strategies for Flink stream processing, and (3) Practices for optimizing the real‑time variable pool architecture and multi‑stream joins.
Background and Need for Real‑Time Variables
MicroFinance Technology (微财科技) provides loan services via a mobile app. Loan approval relies on a risk-evaluation system that runs models and strategies, which in turn depend heavily on variables as input data. Real-time variables, derived from live user behavior, are essential in two cases: assessing new users (T0), for whom little historical data exists, and reacting to sudden changes in existing users, for whom historical data may be misleading.
Limitations of the Original Immediate‑Computation Approach
The initial solution fetched user data from databases (MySQL, MongoDB) on each request and computed variables on the fly. This design suffered from a low QPS ceiling, heavy load on the transactional databases, and costly index management. It also coupled variable computation tightly to database availability, which put the SLA at risk.
Adopting Flink Stream Processing
To overcome these issues, the team switched to Flink. Data is first captured into an ODS layer via Flink CDC, then processed by Flink jobs that generate variables and write them to an OLAP engine, shifting the QPS pressure away from transactional databases. Flink CDC also supports GTID‑based binlog sync, enabling rapid failover and decoupling from the core business services.
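The shift from per-request computation to stream-maintained variables can be sketched in plain Python. This is a minimal illustration, not the team's Flink job: the `ChangeEvent` shape is loosely modeled on Debezium-style change records (an operation code plus before and after row images), and the `VariablePool` class, the field names `user_id` and `amount`, and the two variables are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChangeEvent:
    op: str                  # "c" = insert, "u" = update, "d" = delete (assumed codes)
    before: Optional[dict]   # row image before the change (None for inserts)
    after: Optional[dict]    # row image after the change (None for deletes)


class VariablePool:
    """In-memory stand-in for the OLAP table that serves variables."""

    def __init__(self):
        self.apply_count = defaultdict(int)
        self.apply_amount = defaultdict(float)

    def _retract(self, row):
        # Undo the contribution of the old row image.
        self.apply_count[row["user_id"]] -= 1
        self.apply_amount[row["user_id"]] -= row["amount"]

    def _accumulate(self, row):
        # Add the contribution of the new row image.
        self.apply_count[row["user_id"]] += 1
        self.apply_amount[row["user_id"]] += row["amount"]

    def apply(self, event):
        # An update is handled as retract(before) followed by accumulate(after).
        if event.before is not None:
            self._retract(event.before)
        if event.after is not None:
            self._accumulate(event.after)

    def lookup(self, user_id):
        # Point lookup: no query against the transactional database is needed.
        return {
            "apply_count": self.apply_count.get(user_id, 0),
            "apply_amount": self.apply_amount.get(user_id, 0.0),
        }
```

Because each change event adjusts the stored value incrementally, a risk request only performs a point lookup, which is the property that moves QPS pressure off MySQL and MongoDB in the design described above.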
Architecture Choice: Kappa vs. Lambda
The team evaluated Lambda (batch + real-time) and Kappa (pure streaming) architectures. Kappa was chosen because variables require 100 % consistency. That is hard to guarantee under Lambda, where the same logic must be maintained in separate batch and streaming pipelines whose results can diverge. Flink's Exactly-Once semantics further help ensure variable correctness.
Improving Development Efficiency
While Kappa simplified the pipeline, developers faced challenges: Flink SQL could not easily restore state for rapid variable updates, and long‑running multi‑stream joins caused state explosion. The solution was a data‑layered approach: an atomic layer built with the DataStream API handles cleaning, enrichment, and multi‑stream joins, while a higher layer uses Flink SQL for rapid variable logic, boosting development speed by roughly 30 %.
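The split between the two layers can be illustrated with a small sketch. It is only an analogy under stated assumptions: the standard-library `sqlite3` module stands in for Flink SQL, a plain function stands in for the DataStream-based atomic layer, and the table `atomic_events`, its columns, and the two variable definitions are hypothetical.

```python
import sqlite3


def to_atomic(raw):
    """Atomic layer: clean and normalize one raw event; return None to drop it."""
    if not raw.get("user_id") or not raw.get("event_type") or raw.get("amount") is None:
        return None
    return (raw["user_id"].strip(), raw["event_type"].lower(), float(raw["amount"]))


# Variable layer: each variable is a short SQL statement over the atomic table,
# so adding or changing a variable does not touch the atomic-layer code.
VARIABLE_SQL = {
    "apply_cnt": (
        "SELECT user_id, COUNT(*) FROM atomic_events "
        "WHERE event_type = 'apply' GROUP BY user_id"
    ),
    "max_apply_amount": (
        "SELECT user_id, MAX(amount) FROM atomic_events "
        "WHERE event_type = 'apply' GROUP BY user_id"
    ),
}


def compute_variables(raw_events):
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE atomic_events (user_id TEXT, event_type TEXT, amount REAL)"
    )
    rows = [r for r in map(to_atomic, raw_events) if r is not None]
    conn.executemany("INSERT INTO atomic_events VALUES (?, ?, ?)", rows)
    result = {}
    for name, sql in VARIABLE_SQL.items():
        for user_id, value in conn.execute(sql):
            result.setdefault(user_id, {})[name] = value
    conn.close()
    return result
```

The design point is that cleaning and joining are written once in the lower layer, while variable logic stays declarative and cheap to change.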
Multi‑Stream Join Optimization
Instead of the complex connect API, the team merged streams using union followed by keyBy, reducing redundant state and simplifying code maintenance.
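The union-then-keyBy pattern can be simulated in plain Python. This is a sketch of the idea rather than Flink code, and it reflects one reading of "reducing redundant state": in Flink, connect pairs exactly two streams per operator, so joining N streams means chaining operators that each keep their own state, whereas union (which requires a common record type) followed by keyBy lets a single keyed operator hold one state entry per key. The stream names, the `source` tag, and the fields below are hypothetical.

```python
from collections import defaultdict


def tag(stream_name, events):
    """Wrap each event with its source so all streams share one record shape."""
    return [{"source": stream_name, **e} for e in events]


def union(*streams):
    """Merge tagged streams into one; sorting by ts stands in for arrival order."""
    merged = [e for s in streams for e in s]
    return sorted(merged, key=lambda e: e["ts"])


def process_keyed(merged, key="user_id"):
    """Single keyed operator: one state entry per key, updated by any source."""
    state = defaultdict(dict)
    for e in merged:
        row = state[e[key]]
        for field, value in e.items():
            if field not in (key, "source", "ts"):
                row[field] = value  # latest value per field wins
    return dict(state)
```

A third or fourth stream is added by tagging it and passing it to `union`; the keyed logic and its state layout stay unchanged.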
Variable Pool Storage and Monitoring
Computed variables are stored in Doris, chosen for its high‑concurrency point‑lookup and SQL capabilities. An external query service serves real‑time requests and logs queries to Paimon. Hourly offline jobs monitor variable quality (PSI, missing rate, mean, variance) and trigger alerts when thresholds are breached.
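The quality metrics named above can be computed along the following lines. This is a minimal sketch, assuming PSI is calculated over fixed bins as the sum of (actual share − expected share) × ln(actual share / expected share); the bin edges, the threshold values, and the function names are illustrative and not taken from the talk.

```python
import math
from statistics import fmean, pvariance


def missing_rate(values):
    """Fraction of logged values that are None."""
    return sum(v is None for v in values) / len(values)


def psi(expected, actual, edges, eps=1e-6):
    """Population Stability Index over bins defined by sorted cut points."""
    def shares(values):
        present = [v for v in values if v is not None]  # assumed non-empty
        counts = [0] * (len(edges) + 1)
        for v in present:
            counts[sum(v > edge for edge in edges)] += 1
        # eps keeps empty bins from producing log(0) or division by zero.
        return [max(c / len(present), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def quality_report(baseline, current, edges):
    present = [v for v in current if v is not None]
    return {
        "psi": psi(baseline, current, edges),
        "missing_rate": missing_rate(current),
        "mean": fmean(present),
        "variance": pvariance(present),
    }


def breached(report, thresholds):
    """Names of metrics whose value exceeds its (hypothetical) threshold."""
    return [name for name, limit in thresholds.items() if report[name] > limit]
```

In an hourly job, `baseline` would be a reference window of logged variable values and `current` the latest hour; any name returned by `breached` would trigger an alert.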
Business Impact and Future Outlook
The real‑time variable pool now supports not only risk assessment but also marketing, customer service, and finance decisions across the company. Looking ahead, the team is testing cloud‑native OLAP products such as StarRocks and SelectDB to potentially replace the self‑built Doris cluster for improved stability.
