Graph Machine Learning for Credit Risk Management: Algorithms, Systems, and Applications at Ant Group
This article summarizes how Ant Group applies graph machine learning to credit risk management. It covers the background of small‑business lending, the proprietary AGL graph learning algorithms and system architecture, and applications including supply‑chain risk analysis, GMV prediction, and temporal graph‑based credit scoring.
Background of credit risk control: Small and micro enterprises in China have difficulty obtaining financing, especially since the pandemic. Ant Group's Net Business Loan (Wangshang Dai) provides them with operating loans, and credit risk management runs through the entire loan lifecycle (pre‑loan, in‑loan, post‑loan).
Credit risk management models: Four model types are described – portrait models, credit models, anti‑fraud models, and post‑loan models – each addressing different aspects of enterprise risk.
Challenges: (1) Sparse client information; (2) Strong temporal‑topological attributes of enterprise relationships; (3) Massive graph data scale (billions of nodes, trillions of edges) and complex offline/online processing scenarios.
Ant's proprietary graph learning system (AGL): AGL aggregates neighbor information to enrich representations of thin‑information customers, introduces multiple graph algorithms for temporal‑topological pattern mining, and supports industrial‑scale data applications.
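The idea of enriching a thin‑information customer from its neighbors can be illustrated with a single round of mean aggregation. This is a minimal sketch, not AGL's implementation: the function name `aggregate_neighbors`, the `self_weight` parameter, and the toy shop and supplier nodes are all assumptions made for illustration.

```python
from typing import Dict, List


def aggregate_neighbors(
    features: Dict[str, List[float]],
    adjacency: Dict[str, List[str]],
    self_weight: float = 0.5,
) -> Dict[str, List[float]]:
    """One round of mean-neighbor aggregation (a simplified, untrained GNN layer).

    Each node's new vector blends its own features with the average of its
    neighbors' features. `self_weight` is a hypothetical fixed mixing factor;
    a real GNN would learn this transformation.
    """
    updated: Dict[str, List[float]] = {}
    for node, own in features.items():
        neighbors = [n for n in adjacency.get(node, []) if n in features]
        if not neighbors:
            # Isolated node: nothing to aggregate, keep its own features.
            updated[node] = list(own)
            continue
        dim = len(own)
        mean = [sum(features[n][i] for n in neighbors) / len(neighbors) for i in range(dim)]
        updated[node] = [self_weight * own[i] + (1 - self_weight) * mean[i] for i in range(dim)]
    return updated


# Toy example: a shop with no informative features and two richer suppliers.
features = {
    "thin_shop": [0.0, 0.0],
    "supplier_a": [1.0, 3.0],
    "supplier_b": [3.0, 1.0],
}
adjacency = {
    "thin_shop": ["supplier_a", "supplier_b"],
    "supplier_a": ["thin_shop"],
    "supplier_b": ["thin_shop"],
}
enriched = aggregate_neighbors(features, adjacency)
print(enriched["thin_shop"])  # [1.0, 1.0]
```

After one round, the shop that started with an all‑zero vector carries a signal derived from its suppliers, which is the effect the paragraph above describes.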
Algorithms used: Early work employed DeepWalk and node2vec for large‑scale embedding (2017). Later, graph neural networks (GNNs) with adaptive aggregation and denoising functions were introduced. Heterogeneous graph representations and multimodal embeddings were explored from 2018 onward, along with path‑aware GNNs for dynamic topology.
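For readers unfamiliar with the early embedding methods, the sketch below shows the uniform random‑walk step that DeepWalk uses to turn a graph into node sequences, which are then fed to a skip‑gram model (not shown). The function names, parameters, and toy graph are illustrative assumptions, not Ant Group's code. node2vec differs by biasing each step with its return and in‑out parameters `p` and `q`, which this sketch omits.

```python
import random
from typing import Dict, List


def random_walk(
    adjacency: Dict[str, List[str]], start: str, walk_length: int, rng: random.Random
) -> List[str]:
    """Uniform random walk of at most `walk_length` nodes, starting at `start`."""
    walk = [start]
    while len(walk) < walk_length:
        neighbors = adjacency.get(walk[-1], [])
        if not neighbors:
            break  # dead end: stop the walk early
        walk.append(rng.choice(neighbors))
    return walk


def generate_walks(
    adjacency: Dict[str, List[str]], walks_per_node: int, walk_length: int, seed: int = 0
) -> List[List[str]]:
    """Generate `walks_per_node` walks from every node, shuffling node order each pass."""
    rng = random.Random(seed)
    nodes = list(adjacency)
    walks: List[List[str]] = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)
        for node in nodes:
            walks.append(random_walk(adjacency, node, walk_length, rng))
    return walks


# Toy undirected graph: a triangle (a, b, c) with a tail node d attached to c.
toy_graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
walks = generate_walks(toy_graph, walks_per_node=2, walk_length=5, seed=42)
print(len(walks))  # 8
```

Each walk plays the role of a sentence and each node the role of a word, so nodes that co‑occur in walks end up with similar embeddings.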
System architecture: The system integrates a graph sampling framework (batch pre‑sampling, interactive sampling, online real‑time sampling), a storage engine (GraphFlat, PHStore, GeaBase, IGraph), and unified graph sample generation (GraphFeature) for training and inference. It handles both offline batch training and online scoring.
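Batch pre‑sampling can be pictured as extracting, for each target node, a small k‑hop neighborhood that becomes one self‑contained training sample. The following is a hedged sketch of that idea only: `sample_k_hop`, the `fanouts` parameter, and the returned dictionary layout are hypothetical and do not reflect the actual interfaces of GraphFlat, GraphFeature, or the other components named above.

```python
import random
from typing import Dict, List, Tuple


def sample_k_hop(
    adjacency: Dict[str, List[str]], target: str, fanouts: List[int], seed: int = 0
) -> Dict[str, object]:
    """Sample a k-hop subgraph around `target`.

    `fanouts[i]` caps how many neighbors are kept per node at hop i + 1, which
    bounds the sample size even when some nodes have very high degree.
    """
    rng = random.Random(seed)
    nodes = {target}
    edges: List[Tuple[str, str]] = []  # (source, destination): messages flow toward the target
    frontier = [target]
    for fanout in fanouts:
        next_frontier: List[str] = []
        for node in frontier:
            neighbors = adjacency.get(node, [])
            if len(neighbors) <= fanout:
                chosen = list(neighbors)
            else:
                chosen = rng.sample(neighbors, fanout)
            for nbr in chosen:
                edges.append((nbr, node))
                if nbr not in nodes:
                    nodes.add(nbr)
                    next_frontier.append(nbr)
        frontier = next_frontier
    return {"target": target, "nodes": sorted(nodes), "edges": edges}


# Toy graph: node "a" has three neighbors, but only two are kept at hop 1.
graph = {
    "a": ["b", "c", "d"],
    "b": ["a", "e"],
    "c": ["a"],
    "d": ["a"],
    "e": ["b"],
}
sample = sample_k_hop(graph, "a", fanouts=[2, 1], seed=7)
print(sample["target"], len(sample["nodes"]), len(sample["edges"]))
```

Because each sample carries its own nodes and edges, the same structure can in principle be produced offline in batch or assembled on demand for online scoring, which is the motivation for a unified sample format.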
Application 1 – Supply‑chain based credit analysis: By mining upstream/downstream relationships in the supply chain, Ant can assess the creditworthiness of distributors, reducing information sparsity and improving risk judgments at scale.
Application 2 – GMV (gross merchandise volume) prediction: A temporal GNN model (Gaia) combines historical GMV, other time‑series features, and static shop attributes, using a feature‑mixing layer, temporal encoding layer, and temporal‑shift attention GNN to address data missingness and time‑shift challenges.
Application 3 – Graph‑based credit risk assessment: Credit default prediction is framed as a node binary‑classification problem, leveraging graph structure for risk propagation. Various graph models (network embedding, heterogeneous graphs, temporal graphs, contrastive learning) are evaluated.
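To make "risk propagation" concrete, the sketch below uses plain label propagation rather than any of the graph models listed above: nodes with a known outcome keep their label, and every other node repeatedly takes the average score of its neighbors. This is a deliberately simple stand‑in chosen for illustration, and the function name, `prior` value, and toy graph are assumptions.

```python
from typing import Dict, List


def propagate_risk(
    adjacency: Dict[str, List[str]],
    known_labels: Dict[str, int],
    iterations: int = 20,
    prior: float = 0.5,
) -> Dict[str, float]:
    """Label propagation: 1 = known default, 0 = known good, others are inferred.

    Assumes every neighbor listed in `adjacency` is itself a key of `adjacency`.
    """
    scores = {node: float(known_labels.get(node, prior)) for node in adjacency}
    for _ in range(iterations):
        new_scores: Dict[str, float] = {}
        for node, neighbors in adjacency.items():
            if node in known_labels:
                new_scores[node] = float(known_labels[node])  # clamp known outcomes
            elif neighbors:
                new_scores[node] = sum(scores[n] for n in neighbors) / len(neighbors)
            else:
                new_scores[node] = scores[node]
        scores = new_scores
    return scores


# Chain: defaulted - x - y - healthy. Risk should decay with distance from the default.
chain = {
    "defaulted": ["x"],
    "x": ["defaulted", "y"],
    "y": ["x", "healthy"],
    "healthy": ["y"],
}
scores = propagate_risk(chain, known_labels={"defaulted": 1, "healthy": 0})
predicted_default = {node: score >= 0.5 for node, score in scores.items()}
print(round(scores["x"], 3), round(scores["y"], 3))  # 0.667 0.333
```

Thresholding the propagated score turns it into a binary prediction per node, which mirrors the node classification framing; a trained GNN replaces the fixed averaging rule with learned aggregation over node and edge features.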
Application 4 – Temporal graph learning for credit risk: A spatio‑temporal GNN captures both structural and temporal changes across discrete time slices. It combines spatial aggregation (GAT‑like) over the graph structure, temporal aggregation (LSTM‑like) over the sequence of slices, and a final fusion layer. Continuous‑time extensions encode the time intervals on edges and apply contextual attention.
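The three stages named above (spatial aggregation, temporal aggregation, fusion) can be sketched without any learned parameters. The code below is an assumption‑heavy illustration, not the model from the talk: dot‑product attention stands in for a GAT layer, a fixed gate stands in for an LSTM, and fusion is plain concatenation. All function names and the `gate` value are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Slice = Tuple[Dict[str, Vector], Dict[str, List[str]]]  # (features, adjacency) at one time step


def softmax(values: List[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def spatial_aggregate(node: str, features: Dict[str, Vector], adjacency: Dict[str, List[str]]) -> Vector:
    """Attention-weighted sum over the node and its neighbors within one slice."""
    members = [node] + [n for n in adjacency.get(node, []) if n in features]
    query = features[node]
    # Unlearned dot-product attention scores (a GAT would learn these).
    scores = [sum(q * k for q, k in zip(query, features[m])) for m in members]
    weights = softmax(scores)
    return [sum(w * features[m][i] for w, m in zip(weights, members)) for i in range(len(query))]


def temporal_aggregate(sequence: List[Vector], gate: float = 0.5) -> Vector:
    """Fixed-gate recurrent update over slices, oldest first (stand-in for an LSTM)."""
    state = list(sequence[0])
    for vec in sequence[1:]:
        state = [(1 - gate) * s + gate * v for s, v in zip(state, vec)]
    return state


def embed_node(node: str, slices: List[Slice], gate: float = 0.5) -> Vector:
    """Fuse the temporal summary with the latest spatial view by concatenation.

    Assumes `node` has features in every slice.
    """
    per_slice = [spatial_aggregate(node, feats, adj) for feats, adj in slices]
    return temporal_aggregate(per_slice, gate) + per_slice[-1]


# Two time slices: the shop gains a second trading partner in the later slice.
slice_1: Slice = ({"shop": [1.0, 0.0], "p1": [0.0, 1.0]}, {"shop": ["p1"]})
slice_2: Slice = (
    {"shop": [1.0, 0.0], "p1": [0.0, 1.0], "p2": [1.0, 1.0]},
    {"shop": ["p1", "p2"]},
)
embedding = embed_node("shop", [slice_1, slice_2])
print(len(embedding))  # 4
```

The resulting vector reflects both who the node was connected to in each slice and how that neighborhood changed over time; a downstream classifier would consume it to predict default.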
The article concludes with a summary of Ant Group's research outputs and acknowledges the speaker, Dr. Wang Daixin, an algorithm expert in graph machine learning.
DataFunSummit