Real‑Time Graph Neural Network for Payment Fraud Detection at eBay

This article describes how eBay applies graph neural networks to real‑time payment fraud detection, covering the anti‑fraud scenario, limitations of traditional GBDT pipelines, challenges of constructing and serving dynamic heterogeneous graphs, the end‑to‑end solution with directed slice graphs and a Lambda‑style architecture, and experimental results comparing GNN with LightGBM.

DataFunTalk

The talk begins with an overview of eBay's payment fraud landscape, highlighting risk assessment points before, during, and after a transaction and explaining why real‑time detection is critical.

It then outlines the traditional end‑to‑end pipeline: feature engineering for account‑level variables, labeling based on unauthorized transactions, handling severe class imbalance, and training a GBDT model (e.g., LightGBM) that is later deployed for online scoring.
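The summary does not say how the class imbalance was handled. As one common option, and purely as an assumption here, the negative-to-positive ratio can be passed to LightGBM through its `scale_pos_weight` parameter. The sketch below only computes that ratio and builds a parameter dictionary; the label counts are invented for illustration.

```python
from collections import Counter

def imbalance_weight(labels):
    """Return the negative-to-positive ratio for binary labels (1 = fraud)."""
    counts = Counter(labels)
    n_pos, n_neg = counts[1], counts[0]
    if n_pos == 0:
        raise ValueError("no positive (fraud) labels in the sample")
    return n_neg / n_pos

# Hypothetical sample: 5 unauthorized transactions out of 1,000.
labels = [0] * 995 + [1] * 5

# Parameter dictionary in the shape LightGBM expects for binary classification.
params = {
    "objective": "binary",
    "scale_pos_weight": imbalance_weight(labels),
}
```

Downsampling the negative class is an equally common alternative; which one the original pipeline used is not stated.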

Next, the limitations of tabular models are discussed, emphasizing that relational features (shared addresses, IPs, emails) are naturally expressed as graph edges, which traditional pipelines struggle to capture efficiently.
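To make the point about relational features concrete, here is a minimal sketch of how shared addresses, IPs, and emails turn into graph links. The order records and field names are hypothetical, and the structure is a plain in-memory index rather than anything eBay is known to use.

```python
from collections import defaultdict

# Hypothetical orders; field names and values are illustrative only.
orders = [
    {"id": "o1", "ip": "10.0.0.1", "email": "a@example.com", "address": "12 Elm St"},
    {"id": "o2", "ip": "10.0.0.1", "email": "b@example.com", "address": "9 Oak Ave"},
    {"id": "o3", "ip": "10.0.0.7", "email": "b@example.com", "address": "9 Oak Ave"},
    {"id": "o4", "ip": "10.0.0.9", "email": "c@example.com", "address": "3 Pine Rd"},
]

def build_entity_index(orders, keys=("ip", "email", "address")):
    """Map each (entity type, value) pair to the set of orders that used it."""
    index = defaultdict(set)
    for order in orders:
        for key in keys:
            index[(key, order[key])].add(order["id"])
    return index

def linked_orders(index):
    """For each order, collect the other orders sharing at least one entity."""
    links = defaultdict(set)
    for order_ids in index.values():
        for oid in order_ids:
            links[oid] |= order_ids - {oid}
    return links

index = build_entity_index(orders)
links = linked_orders(index)
```

Here `o1` and `o3` share no attribute directly, yet both are linked to `o2`. A tabular model would need a hand-built feature to see that two-hop connection, whereas a graph model can pick it up through message passing.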

The core of the presentation focuses on the challenges of deploying GNNs in a real‑time setting: temporal leakage when constructing a bipartite event‑entity graph, high latency of neighbor queries, and the computational cost of deep models.
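Temporal leakage is easiest to see in a neighbor lookup. If an entity's full history is attached to every order, an order scored at time `ts` can see orders placed after it. The sketch below, with a hypothetical in-memory event log, restricts the lookup to strictly earlier events; it illustrates the constraint, not the talk's actual query layer.

```python
from bisect import bisect_left

# Hypothetical event log: entity -> list of (timestamp, order_id), sorted by time.
entity_events = {
    "ip:1.2.3.4": [(100, "o1"), (205, "o2"), (310, "o3")],
}

def neighbors_before(entity, ts):
    """Return order ids linked to `entity` with a timestamp strictly before `ts`."""
    events = entity_events.get(entity, [])
    # (ts,) sorts before any (ts, order_id), so events at exactly `ts` are excluded.
    cut = bisect_left(events, (ts,))
    return [order_id for _, order_id in events[:cut]]

# Scoring order o2 at time 205 may only use o1, never o3.
visible = neighbors_before("ip:1.2.3.4", 205)
```

The same cutoff has to hold during training, otherwise the model learns from information it will not have at serving time.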

To address these, a directed dynamic slice graph is introduced, where each time slice forms a sub‑graph and edges are categorized as (1) order‑to‑entity, (2) historical entity‑to‑entity within a time window, and (3) current‑order propagation edges, with “shadow” orders used to prevent future‑information leakage.
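The three edge categories can be sketched as follows. This is one reading of the description above, not the talk's exact construction: the node naming, the `slice_len` and `window` parameters, and the rule that a shadow order receives information only from strictly earlier slices are all assumptions made for illustration.

```python
from collections import defaultdict

def build_slice_graph(orders, slice_len, window):
    """Build directed edges for orders given as (order_id, timestamp, [entity keys]).

    Nodes: ("order", id), ("entity", key, slice_index), ("shadow", id).
    """
    edges = set()
    entity_slices = defaultdict(set)  # entity key -> slice indexes where it appears
    for oid, ts, entities in orders:
        s = ts // slice_len
        for e in entities:
            # (1) order-to-entity edge inside the order's own time slice
            edges.add((("order", oid), ("entity", e, s)))
            entity_slices[e].add(s)
    for e, slices in entity_slices.items():
        for s in slices:
            for prev in slices:
                # (2) historical entity-to-entity edge, earlier slice to later slice
                if 0 < s - prev <= window:
                    edges.add((("entity", e, prev), ("entity", e, s)))
    for oid, ts, entities in orders:
        s = ts // slice_len
        for e in entities:
            for prev in entity_slices[e]:
                # (3) propagation edge into the shadow copy of the current order,
                # only from strictly earlier slices, so no future data reaches it
                if 0 < s - prev <= window:
                    edges.add((("entity", e, prev), ("shadow", oid)))
    return edges

# Hypothetical orders on one shared IP, with 10-unit slices and a 3-slice window.
orders = [("o1", 5, ["ip:a"]), ("o2", 15, ["ip:a"]), ("o3", 95, ["ip:a"])]
edges = build_slice_graph(orders, slice_len=10, window=3)
```

Because real orders only send information and shadow orders only receive it, there is no directed path from a later order back to an earlier order's score. In this sketch an order also ignores earlier orders inside its own slice, which is a conservative simplification.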

A Lambda‑style architecture is then described: offline embedding of entities via GNNs stored in a key‑value store, and online inference that retrieves a small set of relevant embeddings, combines them with GBDT‑encoded features, and passes them through a final GNN layer for risk scoring.
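The split between offline and online work can be sketched in a few lines. Everything concrete here is an assumption: a dictionary stands in for the key-value store, the embedding values are made up, `order_features` stands in for the GBDT-encoded features, and a mean aggregation followed by a single linear unit with a sigmoid stands in for the final GNN layer.

```python
import math

# Offline (batch) side: entity embeddings precomputed by a GNN and written to a
# key-value store. A plain dict stands in for the store; the numbers are invented.
kv_store = {
    "ip:1.2.3.4": [0.2, -0.1, 0.4],
    "email:a@x.com": [0.0, 0.3, 0.1],
}

def mean_pool(vectors, dim):
    """Average a list of embeddings; fall back to zeros when none were found."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def score_order(order_features, entity_keys, weights, bias, dim=3):
    """Online side: fetch a few embeddings, aggregate, and apply one scoring layer."""
    fetched = [kv_store[k] for k in entity_keys if k in kv_store]
    neighborhood = mean_pool(fetched, dim)
    x = list(order_features) + neighborhood
    if len(weights) != len(x):
        raise ValueError("weights must match order features plus embedding size")
    logit = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical request: two encoded order features and two linked entities.
risk = score_order([1.0, 0.5], ["ip:1.2.3.4", "email:a@x.com"],
                   weights=[0.3, -0.2, 1.0, 0.5, 0.8], bias=-0.1)
```

The online path performs only key lookups and one small layer, which is what keeps latency low; the expensive multi-hop aggregation happens in the offline job.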

Experimental results compare the proposed GNN pipeline against LightGBM and MLP baselines on a large e‑commerce fraud dataset. GCN‑based models achieve roughly a 25% improvement in accuracy over LightGBM. GAT does not outperform GCN, which the talk attributes to limited hyper‑parameter tuning.

The talk concludes with a summary of the end‑to‑end solution (graph partitioning, dynamic slicing, and decoupled inference) and outlines future directions such as exploring temporal GNNs (e.g., TGN) and more sophisticated graph partitioning strategies.
