Angel Graph: A High‑Performance Distributed Graph Computing Framework for Intelligent Risk Control
Angel Graph is a high‑performance, fault‑tolerant distributed graph computing framework developed by Tencent, featuring scalable node‑metric, community‑detection, and graph‑neural‑network algorithms optimized for billion‑node, trillion‑edge datasets, and demonstrated through practical applications in intelligent financial risk control.
Angel Graph is Tencent's internally developed distributed graph computing framework designed to meet the performance and reliability demands of large‑scale intelligent risk‑control scenarios. It supports billions of nodes and hundreds of billions of edges, enabling both traditional graph mining and modern graph‑learning workloads.
Framework Architecture : The system has three layers. The lower layer handles heterogeneous data ingestion and resource scheduling on YARN/Kubernetes, the middle layer provides a distributed parameter‑server (PS) platform, and the upper layer offers high‑level graph operators and algorithms. This design lets Spark‑based graph mining and PyTorch‑based GNN training run on the same platform.
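The push/pull pattern behind the PS layer can be pictured with a toy, single-process sketch. It is not Angel's API: `ToyParameterServer`, `push`, `pull`, `worker_count_degrees`, and the modulo sharding are all hypothetical names and choices made for illustration. Workers aggregate degree counts over their own edge partition and push only the aggregated deltas to the shards.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Edge = Tuple[int, int]


class ToyParameterServer:
    """In-process stand-in for a sharded key-value parameter server (hypothetical)."""

    def __init__(self, num_shards: int) -> None:
        self.shards: List[Dict[int, float]] = [defaultdict(float) for _ in range(num_shards)]

    def _shard(self, key: int) -> Dict[int, float]:
        # Assumption: vertices are assigned to shards by simple modulo hashing.
        return self.shards[key % len(self.shards)]

    def push(self, updates: Dict[int, float]) -> None:
        """Add each delta to the value stored for its key."""
        for key, delta in updates.items():
            self._shard(key)[key] += delta

    def pull(self, keys: Iterable[int]) -> Dict[int, float]:
        """Read current values; keys never pushed read as 0.0."""
        return {k: self._shard(k).get(k, 0.0) for k in keys}


def worker_count_degrees(edge_partition: List[Edge], ps: ToyParameterServer) -> None:
    """One worker: aggregate degrees locally, then push a single batch of deltas."""
    local: Dict[int, float] = defaultdict(float)
    for u, v in edge_partition:
        local[u] += 1.0
        local[v] += 1.0
    ps.push(local)


ps = ToyParameterServer(num_shards=2)
partitions = [[(0, 1), (1, 2)], [(2, 0), (2, 3)]]  # two workers' edge partitions
for part in partitions:
    worker_count_degrees(part, ps)
degrees = ps.pull([0, 1, 2, 3])
```

Aggregating locally before pushing keeps the number of messages proportional to the distinct vertices in a partition rather than to its edge count, which is the general motivation for batching updates to a PS.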
Node‑Metric Algorithms : Angel Graph implements degree, betweenness centrality, PageRank, K‑core, and motif‑based features at scale. Optimizations include caching high‑degree (super) nodes, data compression to halve adjacency‑list size, and communication‑reduction techniques such as partition‑aware pruning and compute‑pushdown to the PS.
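As one illustration of a node metric that fits a vertex-centric, iterative style, the sketch below computes K-core numbers by repeatedly applying an h-index update to each vertex's neighbor estimates. This is a well-known formulation chosen here for illustration; the summary above does not say which K-core algorithm Angel Graph uses, and the function names are hypothetical.

```python
from typing import Dict, List


def h_index(values: List[int]) -> int:
    """Largest h such that at least h of the values are >= h."""
    h = 0
    for rank, value in enumerate(sorted(values, reverse=True), start=1):
        if value >= rank:
            h = rank
        else:
            break
    return h


def coreness(adj: Dict[int, List[int]]) -> Dict[int, int]:
    """K-core number of every vertex of an undirected graph without self-loops."""
    # Start from the degree, an upper bound on the core number.
    est = {v: len(nbrs) for v, nbrs in adj.items()}
    changed = True
    while changed:
        changed = False
        new_est = {}
        for v, nbrs in adj.items():
            # Each vertex only needs its neighbors' current estimates,
            # which in a distributed setting would be pulled from remote storage.
            h = min(est[v], h_index([est[u] for u in nbrs]))
            new_est[v] = h
            if h != est[v]:
                changed = True
        est = new_est
    return est


# A triangle (0, 1, 2) with a pendant vertex 3 attached to vertex 2.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
cores = coreness(graph)
```

On this graph the triangle vertices end up in the 2-core and the pendant vertex in the 1-core. Estimates only ever decrease, so the loop terminates once a full pass changes nothing.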
Community Detection : The platform provides distributed weakly connected component computation and modularity‑optimizing Louvain variants. To mitigate "community oscillation" under parallel updates, where nodes keep swapping communities instead of converging, Angel Graph uses probabilistic updates and community‑merge steps. Additional enhancements address label bias, negative‑edge handling, and hierarchical graph compression for faster convergence.
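The oscillation problem and the effect of probabilistic updates can be seen in a minimal sketch of a synchronous Louvain-style local-move round. This is an illustration under stated assumptions, not Angel Graph's implementation: the graph is unweighted with no self-loops, every node proposes a move from the same snapshot of assignments, and each proposed move is applied with probability `p`. The names `build_adj`, `best_community`, and `sync_round` are hypothetical, and the community-merge step is not shown.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple


def build_adj(edges: List[Tuple[int, int]]) -> Dict[int, List[int]]:
    """Undirected adjacency lists from an edge list."""
    adj: Dict[int, List[int]] = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return dict(adj)


def best_community(v: int, adj: Dict[int, List[int]], comm: Dict[int, int],
                   tot: Dict[int, int], two_m: int) -> int:
    """Community with the best modularity gain for v, judged on the snapshot."""
    k_v = len(adj[v])
    links: Dict[int, int] = defaultdict(int)  # edges from v into each community
    for u in adj[v]:
        links[comm[u]] += 1
    links.setdefault(comm[v], 0)  # staying put is always a candidate

    def score(c: int) -> float:
        # Total degree of c, excluding v itself if v currently belongs to c.
        tot_c = tot[c] - (k_v if c == comm[v] else 0)
        return links[c] - tot_c * k_v / two_m

    # Ties prefer the current community, then the smallest community id.
    return max(links, key=lambda c: (score(c), c == comm[v], -c))


def sync_round(adj: Dict[int, List[int]], comm: Dict[int, int],
               p: float, rng: random.Random) -> Dict[int, int]:
    """One synchronous round: propose for all nodes, apply each move with probability p."""
    two_m = sum(len(nbrs) for nbrs in adj.values())
    tot: Dict[int, int] = defaultdict(int)
    for v, c in comm.items():
        tot[c] += len(adj[v])
    proposals = {v: best_community(v, adj, comm, tot, two_m) for v in adj}
    return {v: (c if c != comm[v] and rng.random() < p else comm[v])
            for v, c in proposals.items()}


adj = build_adj([(0, 1)])
start = {0: 0, 1: 1}
rng = random.Random(0)

# p = 1.0: both nodes move at once and simply swap communities, forever.
swapped = sync_round(adj, start, 1.0, rng)
swapped_back = sync_round(adj, swapped, 1.0, rng)

# p = 0.5: sooner or later only one node moves, and the pair merges for good.
state = start
for _ in range(60):
    state = sync_round(adj, state, 0.5, rng)
```

With `p = 1.0` the two endpoints of a single edge trade places every round. With `p < 1` the symmetry breaks as soon as exactly one of them moves, after which neither has a better community to propose.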
Application in Intelligent Risk Control : Real‑world use cases include (1) node‑metric‑driven anomaly detection in payment networks, where motif‑derived features improve AUC by about 1.3%; and (2) community‑based fraud‑ring discovery, which leverages positive/negative edge information to increase detection accuracy by 11% to 22%.
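To make "motif‑derived features" concrete, the sketch below computes one of the simplest motif features, the number of triangles each node participates in. Which motifs the payment-network models actually use is not stated above, so the choice of triangles and the name `triangle_counts` are assumptions for illustration only.

```python
from itertools import combinations
from typing import Dict, Set


def triangle_counts(adj: Dict[int, Set[int]]) -> Dict[int, int]:
    """Number of triangles through each vertex of an undirected graph."""
    counts = {v: 0 for v in adj}
    for v, nbrs in adj.items():
        # Every pair of v's neighbors that is itself connected closes a triangle at v.
        for a, b in combinations(sorted(nbrs), 2):
            if b in adj[a]:
                counts[v] += 1
    return counts


# A triangle (0, 1, 2) with a pendant vertex 3 attached to vertex 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
features = triangle_counts(graph)
```

Per-node counts like these can be appended to a feature vector alongside degree or PageRank before training a downstream classifier.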
Q&A Highlights : The session covered the community‑oscillation problem in parallel Louvain, code availability (all algorithms are open‑sourced in the Angel community), and practical tips for scaling to billion‑node graphs.
For more details, see the open‑source Angel repository and its documentation.
DataFunTalk