Applying Automated Feature Engineering and Auto Modeling to Risk Control Scenarios
This article explains how automated feature engineering and auto‑modeling techniques dramatically reduce development time and improve performance in fraud‑risk detection, detailing the underlying RFM concepts, feature generation workflow, model selection, evaluation, deployment, and continuous monitoring within a risk‑control platform.
1. Background and Problem
Model development in risk control traditionally follows a multi‑step pipeline (business analysis, data preparation, feature engineering, model building, evaluation, monitoring). Feature engineering and model construction consume the majority of time—about 60% and 30% respectively—making rapid model delivery difficult.
Rong360 introduced an automated feature‑engineering and auto‑modeling solution that abstracts the most time‑consuming steps into a unified tool, improving efficiency, standardization, and model quality while shortening the end‑to‑end cycle to roughly five days.
2. Automated Feature Engineering
Manual feature engineering relies on domain knowledge and is labor‑intensive. Automated methods leverage the RFM (Recency, Frequency, Monetary) model to generate statistical and trend features from transaction‑level data, and can also construct network‑based features (e.g., number of first‑degree contacts, their borrowing behavior).
The automated pipeline first aggregates basic statistics per variable, then derives ratio and trend features in a second layer, producing a rich feature set with minimal manual effort.
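As a rough illustration of this two-layer flow, the sketch below derives RFM statistics per user in a first pass and ratio/trend features in a second. The column names (`user_id`, `txn_date`, `amount`) and the 30/90-day windows are illustrative assumptions, not the platform's actual schema:

```python
import pandas as pd
import numpy as np

def rfm_features(txns: pd.DataFrame, obs_date: pd.Timestamp) -> pd.DataFrame:
    """First layer: per-user RFM statistics over assumed 30/90-day windows."""
    out = {}
    for days in (30, 90):
        win = txns[txns["txn_date"] > obs_date - pd.Timedelta(days=days)]
        grp = win.groupby("user_id")["amount"]
        out[f"freq_{days}d"] = grp.count()    # Frequency: transaction count
        out[f"amt_sum_{days}d"] = grp.sum()   # Monetary: total amount
        out[f"amt_max_{days}d"] = grp.max()
    feats = pd.DataFrame(out).fillna(0)
    # Recency: days since the user's most recent transaction
    last = txns.groupby("user_id")["txn_date"].max()
    feats["recency_days"] = (obs_date - last).dt.days
    # Second layer: ratio/trend features derived from first-layer statistics
    feats["amt_ratio_30_90"] = feats["amt_sum_30d"] / feats["amt_sum_90d"].replace(0, np.nan)
    feats["freq_ratio_30_90"] = feats["freq_30d"] / feats["freq_90d"].replace(0, np.nan)
    return feats
```

Crossing each base statistic with each window and each second-layer operator is what lets a handful of raw variables fan out into thousands of candidate features with no manual effort.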
3. Automated Modeling
Popular algorithms such as XGBoost, LightGBM, and Logistic Regression (LR) are integrated into the platform. The tool performs automatic EDA‑based feature filtering (high missing rate, low variance, instability), followed by IV screening, tree‑model importance, and collinearity checks, reducing thousands of features to a few hundred high‑value ones.
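A minimal sketch of such a filtering cascade follows, with assumed thresholds (90% missing, IV ≥ 0.02, |corr| ≤ 0.95) that are conventional defaults rather than the platform's documented values, and a simple pairwise-correlation check standing in for a full collinearity analysis:

```python
import numpy as np
import pandas as pd

def filter_features(X: pd.DataFrame, y: pd.Series,
                    max_missing=0.9, min_var=1e-6, min_iv=0.02, max_corr=0.95):
    # EDA filters: drop features that are mostly missing or nearly constant
    keep = [c for c in X.columns
            if X[c].isna().mean() <= max_missing and X[c].var(skipna=True) > min_var]

    def iv(col):
        # Information Value over 10 equal-frequency bins
        bins = pd.qcut(X[col], 10, duplicates="drop")
        tab = pd.crosstab(bins, y)
        good = (tab[0] / tab[0].sum()).clip(lower=1e-6)
        bad = (tab[1] / tab[1].sum()).clip(lower=1e-6)
        return float(((bad - good) * np.log(bad / good)).sum())

    keep = [c for c in keep if iv(c) >= min_iv]
    # Collinearity: for each highly correlated pair, drop the later feature
    corr = X[keep].corr().abs()
    drop = {corr.columns[j] for i in range(len(keep)) for j in range(i + 1, len(keep))
            if corr.iloc[i, j] > max_corr}
    return [c for c in keep if c not in drop]
```

In practice each stage trims the candidate pool in turn, which is how thousands of generated features shrink to a few hundred before model training.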
For logistic regression (LR), the article explains odds, probability, and the scorecard formula, emphasizing the critical role of WOE binning, which is also automated (equal-frequency binning with monotonicity checks) while still allowing manual adjustment.
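The scorecard formula maps the model's odds to a score via score = A − B·ln(odds). A minimal calibration sketch is shown below; the base score of 600 at 1:20 bad odds and 50 points per halving of the odds are conventional example parameters, not values taken from the article:

```python
import math

def scorecard_params(base_score=600.0, base_odds=1 / 20, pdo=50.0):
    """Solve score = A - B*ln(odds) so that the score equals base_score
    at base_odds and rises by pdo each time the bad odds halve."""
    B = pdo / math.log(2)
    A = base_score + B * math.log(base_odds)
    return A, B

def prob_to_score(p, A, B):
    odds = p / (1 - p)            # odds of being bad
    return A - B * math.log(odds)
```

With WOE-encoded inputs, ln(odds) is just the LR linear term, so each variable's binned contribution converts directly into score points, which is what makes the scorecard interpretable.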
For XGBoost, continuous variables are binned to reduce over‑fitting, and categorical variables are encoded (label, one‑hot, etc.). The platform also provides automated hyper‑parameter tuning via GridSearch and RandomSearch.
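The binning and encoding step might look like the following pandas sketch; the function name, 10-bin default, and simple label encoding are illustrative assumptions rather than the platform's actual preprocessing code:

```python
import pandas as pd

def preprocess_for_xgb(df: pd.DataFrame, cont_cols, cat_cols, n_bins=10):
    out = pd.DataFrame(index=df.index)
    for c in cont_cols:
        # Equal-frequency binning caps the influence of extreme values,
        # which helps limit over-fitting on continuous variables
        out[c + "_bin"] = pd.qcut(df[c], n_bins, labels=False, duplicates="drop")
    for c in cat_cols:
        # Label encoding: map each category to an integer code
        # (one-hot is the usual alternative for low-cardinality columns)
        out[c + "_code"] = df[c].astype("category").cat.codes
    return out
```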
4. Automated Model Evaluation and Monitoring
Beyond traditional metrics (AUC, KS), the system evaluates models across dimensions (feature, model), samples (train, test, OOT, early‑performance), and versions (current vs. new). It monitors feature drift, PSI, ranking stability, and triggers alerts when deviations exceed thresholds.
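PSI, the drift metric named above, compares a live sample's score or feature distribution against the baseline's. A minimal sketch follows, binning by the baseline's deciles; the 10-bin choice and the 1e-6 floor for empty bins are standard conventions, not values specified by the article:

```python
import numpy as np

def psi(expected, actual, n_bins=10):
    """Population Stability Index between a baseline sample and a live
    sample, binned by the baseline's quantile edges."""
    edges = np.quantile(expected, np.linspace(0, 1, n_bins + 1))[1:-1]
    # searchsorted assigns each value to one of n_bins baseline-defined bins
    e = np.bincount(np.searchsorted(edges, expected), minlength=n_bins) / len(expected)
    a = np.bincount(np.searchsorted(edges, actual), minlength=n_bins) / len(actual)
    e, a = np.clip(e, 1e-6, None), np.clip(a, 1e-6, None)
    return float(np.sum((a - e) * np.log(a / e)))
```

A common rule of thumb treats PSI below 0.1 as stable and above 0.25 as significant drift warranting an alert, which matches the threshold-based alerting described above.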
5. Model Deployment and Online Monitoring
Instead of hand‑written Python scripts, models are packaged as configuration files that the Rong360 deployment platform consumes, enabling rapid rollout and automatic scoring. Post‑deployment, the platform continuously compares live metrics (KS, AUC, PSI) against baseline and checks feature binning consistency to detect data drift.
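One of the live metrics compared against baseline, KS, measures how well scores separate good from bad accounts. A minimal sketch under the assumption that higher scores indicate higher risk and labels are 0/1:

```python
import numpy as np

def ks_statistic(scores, labels):
    """KS: maximum gap between the cumulative bad-rate and good-rate
    curves as the threshold sweeps over the sorted scores."""
    order = np.argsort(scores)
    labels = np.asarray(labels)[order]
    cum_bad = np.cumsum(labels) / labels.sum()
    cum_good = np.cumsum(1 - labels) / (1 - labels).sum()
    return float(np.max(np.abs(cum_bad - cum_good)))
```

Recomputing KS on live scored traffic and alerting when it falls materially below the offline baseline is the standard way to catch the degradation this section describes.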
Author Introduction
Jiang Hong, Head of Risk‑Control Business Modeling at Rong360, holds a degree from Shanghai Jiao‑Tong University and has extensive experience in credit modeling, fraud detection, and data mining.
Job Opportunities
Rong360 is hiring senior data algorithm engineers (machine‑learning), senior risk‑control algorithm engineers, and senior data analysts in Beijing. Contact: [email protected].
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.