Information Security 18 min read

Graph Mining Algorithms for Advertising Traffic Fraud Detection and Platform Engineering

This article presents graph‑based fraud detection techniques for advertising traffic, detailing dense subgraph algorithms such as Fraudar and D‑Cube, their engineering optimizations, real‑world case studies, and the design of a scalable graph‑mining platform for large‑scale security applications.

JD Retail Technology

Dec 31, 2020

Graph Mining Algorithms for Advertising Traffic Fraud Detection and Platform Engineering

The advertising traffic fraud scenario at JD.com includes malicious clicks, impressions, and orders driven by various motives such as competitor attacks, media channel brushing, and data scraping. Fraudsters employ high‑frequency or low‑frequency cheating modes, the latter requiring modeling of associations among many resource entities.

Graph mining, which captures relationships between entities via nodes and edges, is introduced as an effective tool to detect such coordinated fraud. Two unsupervised dense‑subgraph algorithms are highlighted: Fraudar, a bipartite dense‑subgraph detector originally designed for fake followers and reviews, and D‑Cube, a dense‑subtensor detector for k‑uniform hypergraphs that can incorporate additional dimensions such as time.

Fraudar defines edge suspiciousness based on the target node’s in‑degree decay, aggregates node suspiciousness, and uses a greedy removal process to find the subgraph with maximal average suspiciousness. D‑Cube extends this to higher‑order tensors, enabling detection of fraud groups that exhibit temporal clustering.

Practical case studies demonstrate the application of these algorithms: (1) detecting malicious external ad traffic by constructing a user‑ad click bipartite graph and extracting dense subgraphs with Fraudar, identifying tens of thousands of fraudulent users daily; (2) detecting malicious internal search‑ad clicks by building a four‑mode tensor (user, ad, keyword, time) and applying D‑Cube to uncover coordinated black‑market attacks.

A graph‑mining algorithm platform has been engineered to integrate these methods, providing a six‑layer architecture (data, format conversion, core algorithms, scheduling/output, applications) capable of handling billions of nodes and edges. Engineering optimizations for Fraudar include a shared‑leaf‑node priority tree to reduce memory and accelerate minimum‑suspiciousness node retrieval, and neighbor compression storing adjacency lists in 8‑, 16‑, 24‑, or 32‑bit formats, achieving 5‑8× speedup and ten‑fold memory reduction on a 12.5‑billion‑node graph.

The work concludes that graph models are powerful for structuring relational data, and future directions involve designing distributed, scalable unsupervised graph mining algorithms to meet the challenges of big‑data fraud detection.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

fraud detection graph mining advertising traffic D-Cube dense subgraph Fraudar

Written by

JD Retail Technology

Official platform of JD Retail Technology, delivering insightful R&D news and a deep look into the lives and work of technologists.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.