Graph Neural Network Approaches for Internet Financial Fraud Detection
The talk examines how the COVID‑19 pandemic accelerated both online financial services and online fraud, and outlines the challenges of traditional and internet‑based fraud detection. It then presents two graph neural network solutions, PC‑GNN and AO‑GNN, reports their results on real‑world and public datasets, and discusses future research directions.
The COVID‑19 pandemic dramatically increased the migration of financial services to online platforms, leading to a surge in internet‑based financial fraud. Statistics from the UK and the US show fraud rates rising by over 30% during the early pandemic period, highlighting the urgent need for more robust detection methods.
Traditional fraud detection faces three main challenges: severe class imbalance, concept drift over time, and unreliable (potentially mislabeled) data. Internet fraud detection compounds these issues with extreme imbalance (fraud rates as low as 0.01%), adversarial attacks that generate out‑of‑distribution samples, and scarce labeled data for new financial products.
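To see why extreme imbalance undermines naive evaluation, here is a minimal sketch with a hypothetical transaction volume. At the 0.01% fraud rate quoted above, a model that labels every transaction as legitimate is almost perfectly accurate and still catches no fraud.

```python
# Hypothetical volume; the 0.01% fraud rate is the figure quoted in the text.
total = 1_000_000
fraud = total // 10_000          # 0.01% of all transactions
legit = total - fraud

# A trivial "model" that predicts "legitimate" for every transaction.
true_negatives = legit           # every legitimate case is classified correctly
true_positives = 0               # no fraud case is ever flagged

accuracy = (true_positives + true_negatives) / total
recall = true_positives / fraud  # share of fraud cases that were caught

print(f"accuracy={accuracy:.4%}, fraud recall={recall:.0%}")
```

This is why the work summarized here reports imbalance-aware metrics such as AUC, F1‑macro, and GMean rather than plain accuracy.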
Data and methods have evolved from rule‑based systems in the 1980s, through classical machine learning in the 1990s, to deep learning today, where complex, heterogeneous data (text, video, graphs) require models that can learn features automatically. Graph Neural Networks (GNNs) are well‑suited because they can integrate multi‑source heterogeneous data into a graph structure and perform semi‑supervised learning.
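As a rough illustration of how multi‑source records can be folded into one graph, the sketch below links accounts that share a device or an IP address and then runs a single neighbor-averaging step. The record layout, the relation names, the function names, and the scalar "risk" feature are all hypothetical; real GNN layers use learned weights and vector features.

```python
from collections import defaultdict

# Hypothetical records: (account, relation type, shared entity).
records = [
    ("a1", "device", "d1"), ("a2", "device", "d1"),
    ("a2", "ip", "ip9"), ("a3", "ip", "ip9"),
    ("a4", "device", "d2"),
]

def build_graph(records):
    """Link accounts that share an entity, with one adjacency map per relation."""
    groups = defaultdict(set)
    for account, relation, entity in records:
        groups[(relation, entity)].add(account)
    graph = defaultdict(lambda: defaultdict(set))
    for (relation, _), accounts in groups.items():
        for u in accounts:
            graph[relation][u] |= accounts - {u}
    return graph

def mean_aggregate(graph, features):
    """One message-passing step: average a node's value with its neighbors' values."""
    out = {}
    for node, x in features.items():
        neighbors = set()
        for adjacency in graph.values():      # merge neighbors across relations
            neighbors |= adjacency.get(node, set())
        values = [x] + [features[n] for n in neighbors]
        out[node] = sum(values) / len(values)
    return out

graph = build_graph(records)
# Toy risk signal: only a1 is known to be suspicious.
features = {"a1": 1.0, "a2": 0.0, "a3": 0.0, "a4": 0.0}
smoothed = mean_aggregate(graph, features)
print(smoothed)
```

After one step, the signal from `a1` reaches `a2` through the shared device, which is the intuition behind using a few labeled nodes to inform their unlabeled neighbors in semi‑supervised learning.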
Two GNN‑based solutions are introduced:
PC‑GNN (Pick‑and‑Choose GNN) addresses class imbalance in two steps: it samples nodes globally so that the classes are balanced (Pick), then locally over‑samples minority‑class nodes while under‑sampling the others (Choose). Experiments on a real‑world Alibaba dataset and on public datasets (YelpChi, Amazon) show that PC‑GNN improves AUC by 3.6% to 5.2% over state‑of‑the‑art baselines.
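The Pick step can be illustrated with a minimal sketch, assuming the simplest possible weighting: each node is drawn with probability inversely proportional to the size of its class. The function name `pick_balanced` and the toy labels are hypothetical. The sampler in PC‑GNN itself may use additional information, such as graph structure, that is not modeled here, and the Choose step is not covered.

```python
import random
from collections import Counter

def pick_balanced(labels, num_samples, seed=0):
    """Sample node indices (with replacement) so that classes are roughly balanced.

    Assumption: weight = 1 / class frequency, so each class carries the same
    total sampling mass regardless of how many nodes it contains.
    """
    freq = Counter(labels)
    weights = [1.0 / freq[y] for y in labels]
    rng = random.Random(seed)
    return rng.choices(range(len(labels)), weights=weights, k=num_samples)

# Toy graph: 10 fraud nodes (label 1) among 1,000 nodes, i.e. 1% fraud.
labels = [1] * 10 + [0] * 990
picked = pick_balanced(labels, num_samples=2000)
fraud_share = sum(labels[i] for i in picked) / len(picked)
print(f"fraud share in sampled batch: {fraud_share:.2f}")
```

In the sampled batch the fraud share is close to one half instead of 1%, so the training loss is no longer dominated by the majority class.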
AO‑GNN (AUC‑Optimized GNN) reformulates AUC maximization as a saddle‑point problem and incorporates a reinforcement‑learning‑based graph topology optimizer to mitigate graph poisoning attacks. AO‑GNN further improves performance over PC‑GNN on the same benchmarks.
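For reference, one well-known saddle‑point form of AUC maximization from the stochastic optimization literature is sketched below for a scoring function h, labels y in {+1, -1}, and p = Pr(y = +1). It is given as background under the assumption of a squared surrogate loss; the exact objective used in AO‑GNN may differ.

```latex
% Squared surrogate for AUC: push each positive score above each
% negative score by a margin of 1.
\min_{h}\; \mathbb{E}\!\left[\bigl(1 - (h(x) - h(x'))\bigr)^2 \,\middle|\, y = +1,\; y' = -1\right]

% Equivalent saddle-point problem (up to the factor p(1-p) and an additive constant).
\min_{h,\,a,\,b}\; \max_{\alpha}\; \mathbb{E}_{(x,y)}\bigl[F(h, a, b, \alpha; x, y)\bigr]

F = (1-p)\,(h(x) - a)^2\,\mathbb{I}[y = +1]
  + p\,(h(x) - b)^2\,\mathbb{I}[y = -1]
  + 2(1+\alpha)\bigl(p\,h(x)\,\mathbb{I}[y = -1] - (1-p)\,h(x)\,\mathbb{I}[y = +1]\bigr)
  - p(1-p)\,\alpha^2
```

Here a and b track the mean scores of the positive and negative classes, and α is the dual variable. The practical benefit is that the objective becomes an expectation over single samples rather than over positive/negative pairs, so it can be optimized on mini-batches with gradient descent on the model parameters and gradient ascent on α.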
Comprehensive experiments show that both models achieve higher F1‑macro, AUC, and GMean scores than GCN, GAT, GraphSAGE, CARE‑GNN, and other recent methods, supporting the case for integrating multi‑source data with GNNs for fraud detection.
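The three metrics can be computed as in the following sketch, which uses their standard textbook definitions with label 1 for fraud and 0 for legitimate. The toy labels, scores, and the 0.5 threshold are hypothetical, and the evaluation protocol in the original experiments may differ in details such as threshold selection.

```python
import math

def f1_for_class(y_true, y_pred, cls):
    """F1 score treating `cls` as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def f1_macro(y_true, y_pred):
    """Unweighted mean of per-class F1, so the rare class counts as much as the common one."""
    return (f1_for_class(y_true, y_pred, 1) + f1_for_class(y_true, y_pred, 0)) / 2

def gmean(y_true, y_pred):
    """Geometric mean of recall on the fraud class and recall on the legitimate class."""
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == 0]
    tpr = sum(1 for p in pos if p == 1) / len(pos)
    tnr = sum(1 for p in neg if p == 0) / len(neg)
    return math.sqrt(tpr * tnr)

def auc(y_true, scores):
    """Probability that a random fraud case outscores a random legitimate one (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if sp > sn else 0.5 if sp == sn else 0.0
               for sp in pos for sn in neg)
    return wins / (len(pos) * len(neg))

# Toy example: 2 fraud cases, 4 legitimate cases.
y_true = [1, 1, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(f"F1-macro={f1_macro(y_true, y_pred):.3f}",
      f"AUC={auc(y_true, scores):.3f}",
      f"GMean={gmean(y_true, y_pred):.3f}")
```

AUC depends only on how the scores rank the cases, while F1‑macro and GMean depend on the chosen threshold. GMean drops to zero if either class is missed entirely, which makes it a strict check under heavy imbalance.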
The presentation concludes with three future research directions: (1) addressing scene‑dependency to enable models to adapt to rapidly changing application contexts; (2) developing defenses against adversarial attacks and dynamic fraud behaviors; and (3) leveraging large‑scale pre‑training on unlabeled behavior data to enhance downstream GNN performance.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.