
Marketing Anti‑Fraud Algorithm Framework and Practice at 58.com

This article details the design, implementation, and evaluation of a multi‑layer anti‑fraud system for 58.com’s marketing activities, covering data and feature engineering, unsupervised and supervised models, graph‑based community detection, and semi‑supervised graph neural networks, with empirical results demonstrating their effectiveness.

58 Tech

Background

58.com spends billions on marketing campaigns such as user acquisition, activation, and promotion. Black‑market actors exploit these incentives by fabricating devices, accounts, or transactions, which causes financial loss and degrades the user experience.

Anti‑Fraud Framework Design

The framework consists of three layers. The data/feature layer aggregates user, device, IP, and behavior attributes. The model layer includes unsupervised anomaly detection, semi‑supervised graph neural networks, and supervised tree‑based models. The service layer exposes offline model and whitelist services, along with real‑time model APIs, to downstream business lines.

Feature Construction

Features fall into three categories: attribute features (e.g., registration, authentication, device tags), action features (e.g., activity frequency, time‑slot distribution, group similarity), and relation features (e.g., shared devices, shared IPs, community size). These features feed both the traditional tree models and the graph‑based algorithms.
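As an illustration of the relation category, the sketch below counts how many distinct users share a device or an IP with each user. The log schema (`user_id`, `device_id`, `ip` tuples) and the feature names are assumptions made for this example; the article does not specify how 58.com computes these features.

```python
from collections import defaultdict


def relation_features(events):
    """Compute per-user sharing counts from a list of (user_id, device_id, ip) tuples.

    The tuple layout and feature names are hypothetical, chosen for illustration.
    """
    users_by_device = defaultdict(set)
    users_by_ip = defaultdict(set)
    for user, device, ip in events:
        users_by_device[device].add(user)
        users_by_ip[ip].add(user)

    features = defaultdict(lambda: {"max_users_on_device": 0, "max_users_on_ip": 0})
    for user, device, ip in events:
        f = features[user]
        # Counts include the user itself, so 1 means "not shared with anyone".
        f["max_users_on_device"] = max(f["max_users_on_device"], len(users_by_device[device]))
        f["max_users_on_ip"] = max(f["max_users_on_ip"], len(users_by_ip[ip]))
    return dict(features)


events = [
    ("u1", "d1", "10.0.0.1"),
    ("u2", "d1", "10.0.0.1"),
    ("u3", "d1", "10.0.0.2"),
    ("u4", "d2", "10.0.0.3"),
]
feats = relation_features(events)
for user, f in sorted(feats.items()):
    print(user, f)
```

A device shared by many accounts is a typical signal worth passing to both the tree models and the graph construction step.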

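Unsupervised Anomaly Detection

Isolation Forest is the primary algorithm for cold‑start scenarios, where no labels are available yet. It isolates outliers through random partitioning: anomalous points need fewer random splits to be separated from the rest. In experiments on a 58 marketing dataset, Isolation Forest achieves higher precision@300 (the share of true fraud cases among the 300 highest‑scored samples) than LOF (Local Outlier Factor), HBOS (Histogram‑Based Outlier Score), and MCD (Minimum Covariance Determinant).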
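The mechanism can be sketched from scratch in a few dozen lines. This is a minimal illustration of the Isolation Forest idea plus a precision@k helper, run on synthetic data; it is not the production implementation, and in practice a library version such as scikit-learn's `IsolationForest` would normally be used. The data, tree count, and sample size are assumptions.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of the harmonic number H(n-1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split on a random feature at a random value."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {"dim": dim, "split": split,
            "left": build_tree(left, depth + 1, max_depth, rng),
            "right": build_tree(right, depth + 1, max_depth, rng)}


def path_length(point, node, depth=0):
    """Depth at which the point lands, adjusted for unsplit leaves."""
    if "size" in node:
        return depth + c_factor(node["size"])
    child = node["left"] if point[node["dim"]] < node["split"] else node["right"]
    return path_length(point, child, depth + 1)


def isolation_scores(data, n_trees=100, sample_size=64, seed=0):
    """Anomaly score in (0, 1); higher means easier to isolate. Assumes len(data) >= 2."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [build_tree(rng.sample(data, sample_size), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = c_factor(sample_size)
    return [2.0 ** (-sum(path_length(p, t) for t in trees) / (n_trees * norm))
            for p in data]


def precision_at_k(scores, labels, k):
    """Fraction of positives among the k highest-scored samples."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sum(labels[i] for i in top) / k


# Synthetic data: 200 "normal" points near the origin and 5 far-away "fraud" points.
rng = random.Random(42)
normal = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
fraud = [(8.0, 8.0), (-8.0, 8.0), (8.0, -8.0), (-8.0, -8.0), (0.0, 9.0)]
data = normal + fraud
labels = [0] * len(normal) + [1] * len(fraud)

scores = isolation_scores(data)
print("precision@5:", precision_at_k(scores, labels, 5))
```

Because no labels are needed to compute the scores, the same ranking can be produced on day one of a new campaign and reviewed from the top down.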

Supervised Models

Due to severe class imbalance, ensemble tree models (LightGBM, XGBoost, Random Forest) are employed. LightGBM and XGBoost outperform Random Forest in accuracy while requiring less training time and memory.
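The article does not say how the imbalance is handled inside these models. One common option, shown here purely as an assumption, is to up-weight the rare fraud class through LightGBM's `scale_pos_weight` parameter, set to the negative-to-positive ratio. The helper name and the other parameter values are illustrative.

```python
def imbalance_aware_params(labels):
    """Build a LightGBM-style parameter dict that up-weights the rare positive class.

    Hypothetical helper: the weighting strategy is an assumption, not the
    article's documented configuration.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0:
        raise ValueError("need at least one positive (fraud) label")
    return {
        "objective": "binary",
        "metric": "auc",
        "scale_pos_weight": n_neg / n_pos,  # each fraud sample counts as this many normal ones
        "learning_rate": 0.05,              # illustrative value
        "num_leaves": 31,                   # illustrative value
    }


labels = [0] * 990 + [1] * 10  # toy label vector with 1% fraud
params = imbalance_aware_params(labels)
print(params)

# With LightGBM installed and a feature matrix X, the dict could be used as:
#   import lightgbm as lgb
#   booster = lgb.train(params, lgb.Dataset(X, label=labels))
```

With heavy imbalance, plain accuracy is dominated by the majority class, so a ranking metric such as AUC or precision@k is usually a more informative way to compare the three models.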

Graph‑Based Community Detection

Fast Unfolding (the Louvain method, which greedily optimizes modularity) is used to partition the user graph into communities. The anomalous communities it detects exhibit distinct device characteristics, which enables further feature mining.
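To make the modularity objective concrete, the sketch below computes modularity for an unweighted graph and runs only the first phase of Fast Unfolding (local moving of nodes between neighboring communities). It recomputes modularity naively on every trial move and omits the community aggregation phase, so it is suitable for toy graphs only. The example graph is made up.

```python
from collections import defaultdict


def modularity(edges, community):
    """Q = sum over communities of L_c / m - (d_c / 2m)^2 for an unweighted graph.

    L_c is the number of edges inside community c, d_c the total degree of its
    nodes, and m the number of edges. Assumes no self-loops.
    """
    m = len(edges)
    internal = defaultdict(int)
    degree = defaultdict(int)
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


def local_moving(edges):
    """Phase 1 only: move each node to the neighboring community that most improves Q."""
    nodes = sorted({n for edge in edges for n in edge})
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    community = {n: n for n in nodes}  # start with every node in its own community
    improved = True
    while improved:
        improved = False
        for n in nodes:
            best_c, best_q = community[n], modularity(edges, community)
            for c in sorted({community[x] for x in neighbors[n]}):
                trial = dict(community)
                trial[n] = c
                q = modularity(edges, trial)
                if q > best_q + 1e-12:  # accept strict improvements only
                    best_c, best_q = c, q
            if best_c != community[n]:
                community[n] = best_c
                improved = True
    return community


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
community = local_moving(edges)
print(community, round(modularity(edges, community), 4))
```

On this toy graph the two triangles end up as separate communities. On a user graph, edges would come from relations such as shared devices or IPs, and per-community statistics like size can be fed back in as features.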

Semi‑Supervised Graph Neural Network (GRAND)

GRAND (Graph Random Neural Network) combines random propagation with consistency regularization to improve robustness and generalization on partially labeled graphs. Compared with GCN, GRAND achieves better metrics when detecting black (fraudulent) samples.
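The two ingredients can be sketched without a deep learning framework. The code below follows the general recipe of the GRAND paper as I understand it: DropNode perturbation, mixed-order propagation over the normalized adjacency matrix, and a consistency loss against a sharpened average prediction. The classifier that maps propagated features to class probabilities is omitted, averaging the loss over nodes is a choice of this sketch, and all inputs are toy values.

```python
import random


def drop_node(features, drop_rate, rng):
    """DropNode: zero whole feature rows at random and rescale the surviving rows."""
    scale = 1.0 / (1.0 - drop_rate)  # keeps the expected feature value unchanged
    return [[x * scale for x in row] if rng.random() >= drop_rate else [0.0] * len(row)
            for row in features]


def normalized_adjacency(edges, n):
    """Dense D^-1/2 (A + I) D^-1/2 for an undirected graph with n nodes."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for u, v in edges:
        a[u][v] = a[v][u] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / (deg[i] * deg[j]) ** 0.5 for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def random_propagation(features, adj, order, drop_rate, rng):
    """Average of A^0 X, A^1 X, ..., A^K X on the DropNode-perturbed features (K = order)."""
    current = drop_node(features, drop_rate, rng)
    total = [row[:] for row in current]
    for _ in range(order):
        current = matmul(adj, current)
        total = [[t + c for t, c in zip(t_row, c_row)] for t_row, c_row in zip(total, current)]
    return [[t / (order + 1) for t in row] for row in total]


def sharpen(probs, temperature):
    """Lower temperatures push each row of probabilities toward its largest class."""
    powered = [[p ** (1.0 / temperature) for p in row] for row in probs]
    return [[p / sum(row) for p in row] for row in powered]


def consistency_loss(predictions, temperature=0.5):
    """Mean squared distance between each augmentation's output and the sharpened average."""
    s, n, k = len(predictions), len(predictions[0]), len(predictions[0][0])
    mean = [[sum(pred[i][c] for pred in predictions) / s for c in range(k)] for i in range(n)]
    target = sharpen(mean, temperature)
    return sum((pred[i][c] - target[i][c]) ** 2
               for pred in predictions for i in range(n) for c in range(k)) / (s * n)


# Toy graph: a path 0-1-2-3 with two features per node.
rng = random.Random(0)
adj = normalized_adjacency([(0, 1), (1, 2), (2, 3)], 4)
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
augmented = [random_propagation(features, adj, order=2, drop_rate=0.5, rng=rng)
             for _ in range(2)]

# Stand-in class probabilities from two augmentations (a real model would produce these).
preds = [[[0.7, 0.3], [0.4, 0.6]], [[0.9, 0.1], [0.5, 0.5]]]
print(consistency_loss(preds))
```

Each call to `random_propagation` yields a different augmented view of the same graph. Penalizing disagreement between the model's outputs on those views is what lets the unlabeled nodes contribute to training.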

Application Results

The tables and figures in the original article (see images) illustrate the performance gains of Isolation Forest, LightGBM/XGBoost, Fast Unfolding, and GRAND on real 58 marketing data.

Conclusion and Outlook

The proposed framework continuously discovers black‑market samples, feeds them to the supervised and semi‑supervised models, and serves detections both offline and in real time, effectively reducing economic loss. Future work aims to standardize the pipeline for rapid deployment across similar marketing scenarios and to incorporate newer algorithms for deeper fraud mining.

Tags: machine learning, anti-fraud, information security, marketing, unsupervised learning, graph neural network