Credit Risk Strategies: From Rule‑Based Scoring to Machine Learning Models
This article presents a comprehensive overview of credit risk control strategies, covering industry background, traditional scoring‑card development, data integration, feature engineering, model evaluation, rate and limit optimization, and advanced machine‑learning approaches for loan underwriting.
Guest speaker Han Shiyuan, Senior Risk Control Director at Baorong Cloud Innovation, shares a complete credit‑risk strategy framework, including the evolution from simple rule‑based decisions to sophisticated data‑driven models.
Background: Growth in the consumer‑credit market has slowed after a period of rapid expansion, while household debt and loan‑default rates have risen. Lenders are therefore shifting from asset‑driven profit toward loss reduction and cost control.
1.1 Consumer Credit Industry Background
The market now faces a consumption downgrade, slower retail growth, higher debt ratios, and rising non‑performing loan ratios, all of which call for finer‑grained risk control.
1.3 Traditional Scoring‑Card Development Process
The traditional scorecard workflow runs in seven steps:
1) Define objectives and business rules.
2) Integrate and clean data (personal ID, phone, bank card, transaction records).
3) Engineer features and perform variable binning.
4) Select features using statistical significance, information value (IV), and clustering.
5) Tune the model by maximizing KS.
6) Evaluate discrimination with KS and stability with PSI.
7) Check multicollinearity (a VIF above 5 flags a problem variable) and model stability (a PSI below 0.1 is treated as stable).
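The three statistics named in this workflow (IV, KS, PSI) can be sketched in a few lines of plain Python. This is a minimal illustration, not the speaker's implementation: the function names, the `eps` floor for empty bins, and the convention that label 1 means "bad" are all assumptions made for the example.

```python
import math

def ks_statistic(scores, labels):
    """Largest gap between the cumulative bad and good distributions.
    Assumed convention: label 1 = bad, label 0 = good."""
    n_bad = sum(labels)
    n_good = len(labels) - n_bad
    if n_bad == 0 or n_good == 0:
        raise ValueError("need at least one good and one bad sample")
    pairs = sorted(zip(scores, labels))
    cum_bad = cum_good = 0
    ks = 0.0
    i = 0
    while i < len(pairs):
        # Consume all ties so the gap is only measured between distinct scores.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                cum_bad += 1
            else:
                cum_good += 1
            j += 1
        ks = max(ks, abs(cum_bad / n_bad - cum_good / n_good))
        i = j
    return ks

def psi(expected_counts, actual_counts, eps=1e-6):
    """Population stability index between two binned distributions."""
    e_total, a_total = sum(expected_counts), sum(actual_counts)
    value = 0.0
    for e, a in zip(expected_counts, actual_counts):
        e_pct = max(e / e_total, eps)  # eps avoids log(0) on empty bins
        a_pct = max(a / a_total, eps)
        value += (a_pct - e_pct) * math.log(a_pct / e_pct)
    return value

def information_value(good_counts, bad_counts, eps=1e-6):
    """IV of one binned variable from per-bin good and bad counts."""
    g_total, b_total = sum(good_counts), sum(bad_counts)
    iv = 0.0
    for g, b in zip(good_counts, bad_counts):
        g_pct = max(g / g_total, eps)
        b_pct = max(b / b_total, eps)
        iv += (g_pct - b_pct) * math.log(g_pct / b_pct)
    return iv
```

PSI and IV share the same functional form; PSI compares one population at two points in time, while IV compares goods against bads within one sample.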
1.4 Machine‑Learning Model Development Process
Compared with scorecards, machine‑learning pipelines involve less manual rule intervention and offer lower interpretability; most of the modeling effort goes into hyper‑parameter tuning to avoid over‑fitting.
2.1 Pre‑loan Risk Control Process Design
The goal is to catch high‑risk cases (fraud, high‑risk users) while reducing cost and improving efficiency. Example workflows from a major bank illustrate identity verification, blacklist checks, low‑cost intent verification, and the integration of People’s Bank rules with third‑party data.
2.2 Rate and Limit Strategies
After scoring, bad‑loan rates per score segment are used to set appropriate interest rates and credit limits. Formulas relate amount (A), expected return (r), and bad‑loan rate (p) for each segment.
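The talk's exact formulas are not reproduced in this summary, so the following is only one plausible way to relate A, r, and p. It assumes a single‑period loan, ignores funding and operating costs, and introduces a hypothetical `loss_given_default` parameter (1.0 means the full principal is lost on default). Here `target_return` plays the role of the expected return r, and `rate` is the interest rate charged to the segment.

```python
def expected_profit(amount, rate, bad_rate, loss_given_default=1.0):
    """Expected profit on `amount` lent to a segment with the given bad rate.
    Good loans earn `rate`; bad loans lose `loss_given_default` of principal."""
    return amount * ((1 - bad_rate) * rate - bad_rate * loss_given_default)

def required_rate(bad_rate, target_return=0.0, loss_given_default=1.0):
    """Interest rate at which the segment earns `target_return` per unit lent.
    Solves (1 - p) * rate - p * LGD = target_return for rate."""
    if not 0 <= bad_rate < 1:
        raise ValueError("bad_rate must be in [0, 1)")
    return (target_return + bad_rate * loss_given_default) / (1 - bad_rate)

# Hypothetical score segments with observed bad rates (illustrative numbers only).
segments = {"A": 0.01, "B": 0.03, "C": 0.08}
rates = {name: required_rate(p, target_return=0.05) for name, p in segments.items()}
```

Under these assumptions the required rate rises faster than linearly in p, which is why the riskiest segments may be priced out by a rate cap and handled through lower limits or rejection instead.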
Limit optimization assumes stable bad‑loan ratios across score bands and replaces the step function with a sigmoid function, so that limits change smoothly with score instead of jumping at band edges. Limits are then adjusted based on income, assets, and cash flow.
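The difference between a step function and a sigmoid limit curve can be sketched as follows. The parameter names (`midpoint`, `steepness`, `min_limit`, `max_limit`) and the example bands are assumptions for illustration, and the sketch covers only the score‑to‑limit curve, not the income, asset, or cash‑flow adjustment.

```python
import math

def step_limit(score, bands):
    """Step-function limit. `bands` is a list of (min_score, limit) pairs
    sorted by min_score; the highest band reached sets the limit."""
    limit = 0.0
    for min_score, band_limit in bands:
        if score >= min_score:
            limit = band_limit
    return limit

def sigmoid_limit(score, min_limit, max_limit, midpoint, steepness):
    """Smooth limit that rises from min_limit to max_limit around `midpoint`.
    Smaller `steepness` values make the transition sharper."""
    z = (score - midpoint) / steepness
    return min_limit + (max_limit - min_limit) / (1 + math.exp(-z))

# Hypothetical bands and curve parameters.
bands = [(500, 1000), (600, 5000), (700, 10000)]
for s in (599, 600, 601):
    print(s, step_limit(s, bands), round(sigmoid_limit(s, 1000, 10000, 600, 40)))
```

With the step function, one score point at a band edge changes the limit from 1000 to 5000; with the sigmoid, neighboring scores receive nearly identical limits.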
2.3 Diagnosing Rule Effectiveness
Rejected customers are rescored and compared with approved customers to identify under‑performing rules. Distribution charts for each rule help decide which rules to keep, adjust, or discard.
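One way to turn this comparison into numbers is sketched below. It assumes each rejected applicant is recorded with the rule that rejected them and a rescored value, and that a hypothetical approval `cutoff` exists; the report fields are illustrative, not the speaker's metrics.

```python
from collections import defaultdict

def rule_diagnostics(rejected, approved_scores, cutoff):
    """Summarize rescored rejects per rule against the approved population.
    `rejected` is a list of (rule_name, score) pairs."""
    approved_mean = sum(approved_scores) / len(approved_scores)
    by_rule = defaultdict(list)
    for rule, score in rejected:
        by_rule[rule].append(score)
    report = {}
    for rule, scores in by_rule.items():
        mean_score = sum(scores) / len(scores)
        report[rule] = {
            "count": len(scores),
            "mean_score": mean_score,
            # Share of this rule's rejects that the model alone would approve.
            "share_above_cutoff": sum(s >= cutoff for s in scores) / len(scores),
            "gap_vs_approved": approved_mean - mean_score,
        }
    return report

# Hypothetical data: rule names and scores are made up for the example.
rejected = [("blacklist", 420), ("blacklist", 450),
            ("age_limit", 680), ("age_limit", 700)]
report = rule_diagnostics(rejected, approved_scores=[650, 700, 720], cutoff=600)
```

A rule whose rejects mostly score above the cutoff, like `age_limit` in this toy data, is a candidate for adjustment; a rule whose rejects score far below the approved population is doing useful work.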
2.4 Model Construction and Optimization
The basic risk‑model pipeline includes data preparation, feature selection, model training, scoring, and deployment. Because the model is trained only on approved applicants, iterative improvements address this sample bias through reject inference: rejected samples are incorporated using proportional assignment, simple augmentation, or parcelling techniques.
Parcelling splits rejected customers into good/bad groups per score band, then retrains the model, yielding better performance than a simple two‑stage approach.
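The labeling step of parcelling can be sketched as below, under stated assumptions: rejects have already been scored by the model built on approved applicants, the bad rate of approved applicants is known per score band, and an optional `uplift` factor (a hypothetical parameter) inflates that rate to reflect that rejects are usually riskier. Labels within a band are assigned at random, and the retraining step on the combined sample is left out.

```python
import random

def parcel_rejects(rejected_scores, band_edges, approved_bad_rate, uplift=1.0, seed=0):
    """Assign inferred labels (1 = bad, 0 = good) to rejects, band by band.
    `band_edges` are ascending lower bounds; `approved_bad_rate` needs
    len(band_edges) + 1 entries, one per resulting band."""
    rng = random.Random(seed)
    bands = {}
    for idx, score in enumerate(rejected_scores):
        band = sum(score >= edge for edge in band_edges)  # band index 0..len(edges)
        bands.setdefault(band, []).append(idx)
    labels = [0] * len(rejected_scores)
    for band, idxs in bands.items():
        bad_rate = min(1.0, approved_bad_rate[band] * uplift)
        n_bad = round(bad_rate * len(idxs))
        # Randomly mark the inferred share of this band's rejects as bad.
        for idx in rng.sample(idxs, n_bad):
            labels[idx] = 1
    return labels

# Hypothetical rejects: ten applicants in each of three score bands.
scores = [450] * 10 + [550] * 10 + [650] * 10
labels = parcel_rejects(scores, band_edges=[500, 600],
                        approved_bad_rate=[0.4, 0.2, 0.1])
```

The labeled rejects are then appended to the approved sample and the model is retrained on the combined data.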
In conclusion, the speaker emphasized the importance of combining rules, data, and machine‑learning models, continuously monitoring KS/PSI, and iterating the scoring system to achieve effective pre‑loan risk control.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.