Real-time Fraud Detection in E-commerce Payments Using Graph Neural Networks
This article presents an end‑to‑end solution for real‑time payment‑fraud detection on eBay's e‑commerce platform. It combines graph neural networks with dynamic bipartite graph construction, addresses the limitations of traditional models and the latency challenges of graph computation, and demonstrates superior performance over GBDT baselines.
The talk introduces eBay's payment risk scenario, describing how fraud can occur before, during, and after a transaction and why real‑time detection is critical for e‑commerce.
Traditional supervised pipelines rely on extensive feature engineering and GBDT models (e.g., LightGBM) trained on transaction, behavior, and third‑party data, but they struggle with sparse early‑stage signals and cannot capture relational patterns effectively.
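The feature engineering these pipelines depend on is typically hand-crafted aggregates per entity (prior order counts, average spend, recency) computed over each transaction's history before the GBDT sees it. A minimal sketch of that style, using hypothetical field names (`order_id`, `card_id`, `amount`) rather than eBay's actual schema:

```python
from collections import defaultdict

def engineer_features(transactions):
    """Hand-crafted aggregate features, computed in arrival order so that
    each row only sees strictly earlier transactions (no label leakage).
    Field names are illustrative, not a real schema."""
    card_count = defaultdict(int)     # prior orders seen per card
    card_amount = defaultdict(float)  # prior total spend per card
    rows = []
    for tx in transactions:
        c = tx["card_id"]
        prior = card_count[c]
        rows.append({
            "order_id": tx["order_id"],
            "amount": tx["amount"],
            "card_prior_orders": prior,
            "card_avg_amount": card_amount[c] / prior if prior else 0.0,
        })
        card_count[c] += 1
        card_amount[c] += tx["amount"]
    return rows
```

Note how a brand-new card yields all-zero aggregates: this is exactly the sparse early-stage signal problem the article describes, since the GBDT has nothing relational to fall back on.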
To overcome these limits, a heterogeneous bipartite graph is built linking orders (events) to entities such as addresses, IPs, and devices. The graph is time‑sliced and directed so that an order can only see information from earlier slices, preventing future‑information leakage.
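The time-sliced, directed construction can be sketched as follows: via a shared entity, an order gathers only neighbor orders from strictly earlier time slices. This is a simplified assumption about the slicing rule (the talk does not specify slice width or the exact edge direction convention), with illustrative entity keys:

```python
from collections import defaultdict

def build_sliced_graph(orders, slice_seconds=3600):
    """Directed, time-sliced bipartite graph (sketch).

    Orders link to entity nodes (ip / device / address, illustrative
    keys); through a shared entity, an order collects only neighbor
    orders from earlier time slices, so no future information leaks
    into its neighborhood."""
    by_entity = defaultdict(list)  # entity -> [(slice, order_id), ...]
    neighbors = {}                 # order_id -> earlier orders sharing an entity
    for o in sorted(orders, key=lambda o: o["ts"]):
        t = o["ts"] // slice_seconds
        seen = set()
        for key in ("ip", "device", "address"):
            ent = (key, o[key])
            seen.update(oid for s, oid in by_entity[ent] if s < t)
            by_entity[ent].append((t, o["order_id"]))
        neighbors[o["order_id"]] = seen
    return neighbors
```

Orders that land in the same slice as a neighbor deliberately do not see it; only fully closed slices contribute, which is what makes the graph safe to use for training and online scoring alike.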
Key challenges include graph construction latency (large neighbor groups causing hundred‑millisecond delays) and online inference latency (feature retrieval and GNN computation). Solutions involve building a directed dynamic slice graph and a Lambda‑architecture where entity embeddings are pre‑computed offline, stored in a key‑value store, and fetched with a single hop during online inference.
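The Lambda-architecture split can be illustrated with a small sketch: a batch job periodically writes precomputed entity embeddings into a key-value store, and the online path does a single hop: look up the order's entities and fetch their embeddings, with no graph traversal at serving time. The class and store backend here are hypothetical (the source only says "key-value store"):

```python
class EmbeddingStore:
    """Sketch of the Lambda pattern for serving entity embeddings.
    A plain dict stands in for a real KV store (e.g. Redis); the
    point is the offline/online split, not the storage engine."""

    def __init__(self, dim=4):
        self.dim = dim
        self.kv = {}

    def batch_refresh(self, entity_embeddings):
        # Offline batch layer: bulk-overwrite precomputed embeddings.
        self.kv.update(entity_embeddings)

    def one_hop_fetch(self, entities):
        # Online layer: one KV lookup per entity of the incoming order.
        # Unseen entities fall back to a zero vector.
        zero = [0.0] * self.dim
        return [self.kv.get(e, zero) for e in entities]
```

Because the expensive multi-hop GNN aggregation happens offline, the online critical path is reduced to a handful of point lookups, which is what keeps inference within the latency budget described above.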
The end‑to‑end pipeline consists of (1) a directed dynamic slice graph, (2) a Lambda‑style network that decouples embedding generation from online scoring, and (3) a final GNN layer that combines the order’s raw features (encoded by a pre‑trained GBDT) with retrieved entity embeddings to produce a fraud risk score.
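Step (3) can be sketched as a single scoring function: the order's GBDT encoding is concatenated with a pooling of the fetched entity embeddings and passed through a logistic output. This is a deliberately simplified stand-in (mean pooling plus a linear layer with assumed weights `w`, `b`), not the actual trained GNN layer:

```python
import math

def score_order(gbdt_leaf_vec, entity_embs, w, b):
    """Final scoring layer (simplified sketch).

    gbdt_leaf_vec: the order's raw features encoded by a pre-trained
                   GBDT (e.g. a one-hot over leaf indices).
    entity_embs:   embeddings fetched for the order's entities.

    Entity embeddings are mean-pooled, concatenated with the GBDT
    encoding, and pushed through a logistic layer; in the real system
    this last step is a learned GNN layer."""
    dim = len(entity_embs[0])
    pooled = [sum(e[i] for e in entity_embs) / len(entity_embs)
              for i in range(dim)]
    x = gbdt_leaf_vec + pooled
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))  # fraud risk score in (0, 1)
```

The design choice worth noting is the division of labor: the GBDT encodes the order's own (non-relational) features, while the retrieved embeddings carry the relational signal, so neither component has to do both jobs.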
Experiments compare GCN and GAT variants of the GNN against LightGBM on a Kaggle insurance cancellation dataset. GCN achieves roughly a 25% relative improvement in average accuracy over LightGBM, while GAT does not outperform GCN, highlighting the value of temporal graph structures.
The conclusion emphasizes the effectiveness of graph partitioning, dynamic slicing, and one‑hop online inference, and outlines future directions such as exploring Temporal Graph Networks (TGN) and more sophisticated partition strategies.
Q&A sections address the purpose of shadow orders and the meaning of one‑hop versus two‑hop features in the graph context.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.