Graph Deep Learning for Content Risk Control and APT Detection
This article presents a comprehensive overview of Tencent AI Lab's graph‑based approaches for detecting misinformation and advanced persistent threats, detailing the challenges of modeling news content and social context, the design of the Post‑User Interaction Network (PSIN), experimental results on large multi‑topic datasets, and a novel graph‑pretraining pipeline for APT detection.
The talk introduces the critical problem of content risk control in social networks, emphasizing the difficulty of distinguishing true from false news using only textual cues and the importance of incorporating social context such as user interactions and propagation trees.
Two main challenges are identified: (1) jointly modeling the intrinsic information of a news item and its surrounding social context, and (2) handling multi‑topic fake news detection.
To address these, Tencent represents each news item as a heterogeneous graph that combines its text, its propagation tree, and the user‑news bipartite graph. On top of this representation, the Post‑User Interaction Network (PSIN) separates the encoding stage from the feature‑fusion stage, employs specialized graph neural networks (TreeGAT, R‑GAT, GATv2), and adds domain adaptation through a Gradient Reversal Layer to improve cross‑topic generalization.
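The Gradient Reversal Layer mentioned above can be illustrated in a few lines. The following is a minimal sketch in plain Python, not the PSIN implementation: it assumes an auxiliary topic classifier sits behind the layer (the summary does not spell this out), and the class name `GradientReversal`, the scaling factor `lam`, and the example vectors are all hypothetical. The layer passes features through unchanged in the forward pass and flips the sign of the gradient in the backward pass, so the encoder is pushed toward features that the topic classifier cannot exploit.

```python
class GradientReversal:
    """Identity in the forward pass; scales gradients by -lam in the backward pass."""

    def __init__(self, lam=1.0):
        # lam (hypothetical name) controls how strongly the reversed
        # gradient influences the encoder.
        self.lam = lam

    def forward(self, features):
        # Features reach the topic classifier unchanged.
        return list(features)

    def backward(self, grad_output):
        # The gradient flowing back to the encoder is negated and scaled.
        return [-self.lam * g for g in grad_output]


grl = GradientReversal(lam=0.5)

news_features = [0.2, -1.0, 3.0]           # illustrative encoder output
to_topic_head = grl.forward(news_features)

grad_from_topic_head = [1.0, -2.0, 0.5]    # illustrative dL_topic / d(features)
grad_to_encoder = grl.backward(grad_from_topic_head)
```

In an autodiff framework the same idea would typically be written as a custom operation with an identity forward and a negated backward; the manual version here only makes the two passes explicit.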
Experiments use the newly constructed MC‑Fake dataset (27,155 news items across five topics, 5 million posts, 2 million users, 200 million edges). PSIN shows significant gains over baselines in both same‑topic and cross‑topic settings; ablation studies confirm the contribution of each module, and visualizations show clearer separation of real and fake news.
The second part focuses on Advanced Persistent Threat (APT) detection. It outlines traditional anomaly‑based and matching‑based methods and their limitations, then proposes a graph‑pretraining framework that builds system event graphs, caches neighbor information, and learns node embeddings through neighbor‑prediction and resource‑process prediction tasks.
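To make the graph construction step more concrete, here is a minimal sketch of building a system event graph with a cached neighbor table and deriving positive pairs for a neighbor-prediction pretext task. The event format `(process, operation, resource)`, the sample events, and the helper names `build_event_graph` and `neighbor_prediction_pairs` are assumptions for illustration; the talk's actual log schema and training objective are not given in this summary.

```python
from collections import defaultdict

# Hypothetical audit events: (process, operation, resource).
events = [
    ("bash", "exec", "curl"),
    ("curl", "connect", "10.0.0.5:443"),
    ("curl", "write", "/tmp/payload"),
    ("bash", "exec", "/tmp/payload"),
]


def build_event_graph(events):
    """Return a neighbor cache: node -> set of (edge_label, neighbor)."""
    neighbors = defaultdict(set)
    for process, operation, resource in events:
        # Forward edge from the acting process to the resource it touched.
        neighbors[process].add((operation, resource))
        # Reverse edge so a resource can look up the processes that used it.
        neighbors[resource].add((operation + "_by", process))
    return neighbors


def neighbor_prediction_pairs(neighbors):
    """List (node, neighbor) positives for a neighbor-prediction task."""
    pairs = []
    for node in sorted(neighbors):
        for _, other in sorted(neighbors[node]):
            pairs.append((node, other))
    return pairs


graph = build_event_graph(events)
pairs = neighbor_prediction_pairs(graph)
```

Caching the neighbor sets up front means later steps can read a node's context with a dictionary lookup instead of rescanning the raw logs.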
By aligning query graphs with system event graphs, the approach supports efficient top‑K subgraph matching followed by BFS‑based expansion, and the model can be fine‑tuned on accumulated malicious samples. It achieves high recall on DARPA APT datasets containing real attack logs.
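The matching step can be sketched as two small routines: rank system nodes by embedding similarity to a query node and keep the top K, then expand around a chosen seed with a bounded breadth-first search. This is only an illustration under stated assumptions: cosine similarity as the scoring function, the two-dimensional embeddings, the adjacency list, and the function names `top_k_candidates` and `bfs_expand` are all hypothetical and not taken from the talk.

```python
import math
from collections import deque


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_candidates(query_vec, node_vecs, k):
    """Return the k node names whose embeddings are closest to query_vec."""
    ranked = sorted(node_vecs.items(),
                    key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]


def bfs_expand(adjacency, seed, max_hops):
    """Collect every node within max_hops edges of seed."""
    seen = {seed}
    frontier = deque([(seed, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return seen


# Illustrative embeddings and edges for a tiny system event graph.
node_vecs = {
    "bash": [1.0, 0.0],
    "curl": [0.9, 0.1],
    "/tmp/payload": [0.5, 0.5],
    "sshd": [0.0, 1.0],
}
adjacency = {"bash": ["curl"], "curl": ["/tmp/payload"], "/tmp/payload": ["bash"]}

seeds = top_k_candidates([1.0, 0.0], node_vecs, k=2)
one_hop = bfs_expand(adjacency, "bash", max_hops=1)
two_hops = bfs_expand(adjacency, "bash", max_hops=2)
```

Bounding the expansion by hop count keeps each candidate subgraph small, which is one plausible way to keep matching tractable on large event graphs.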
Overall, the work defines the problem of social‑context‑aware rumor detection, releases a large‑scale benchmark, and presents the PSIN model and a graph‑pretraining pipeline for APT detection, both advancing the state of the art in AI‑driven security and misinformation mitigation.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.