Baidu's PGL2.2: A Graph Neural Network Framework, Techniques, and Real‑World Applications
This article introduces Baidu's PGL2.2 graph-learning platform, explains graph modeling and message-passing GNN techniques, details training strategies for small, medium, and large graphs, showcases node-classification and link-prediction methods, and describes how the framework is applied in search, recommendation, risk control, and knowledge-graph competitions.
Graphs provide a universal language for representing complex relationships in social networks, proteins, recommendation systems, and more. Graph neural networks (GNNs) extend neural networks to graph structures using a message‑passing paradigm.
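The message-passing paradigm can be illustrated with a minimal sketch: each node gathers its neighbors' feature vectors, aggregates them (mean aggregation here), and combines the result with its own state. This is plain Python for illustration only, not PGL's actual API.

```python
def message_passing_step(features, edges):
    """One round of mean-aggregation message passing.

    features: {node: [float, ...]} feature vector per node.
    edges: list of (src, dst) pairs; src sends its features to dst.
    Returns the updated feature dict.
    """
    # Gather messages: each edge delivers the source node's features to dst.
    inbox = {node: [] for node in features}
    for src, dst in edges:
        inbox[dst].append(features[src])

    updated = {}
    for node, msgs in inbox.items():
        if not msgs:
            updated[node] = features[node][:]  # isolated node keeps its state
            continue
        dim = len(features[node])
        mean = [sum(m[i] for m in msgs) / len(msgs) for i in range(dim)]
        # Combine self state with the aggregated message (simple average).
        updated[node] = [(features[node][i] + mean[i]) / 2 for i in range(dim)]
    return updated
```

On a three-node path graph with bidirectional edges, one step pulls each node's value toward its neighbors', which is the smoothing behavior GNN layers build on.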
Baidu developed PGL2.2 on top of the Paddle deep‑learning framework, exposing APIs for graph construction, sampling, tensorization, and various graph models (walk‑based, message‑passing, knowledge‑embedding). The system supports both static and dynamic graph modes, enabling users to define node and edge tensors and customize message flow.
Training strategies are categorized by graph size: small graphs fit in a single GPU's memory and use full-batch training; medium graphs exceed one GPU's memory and employ graph partitioning or multi-GPU chunked training with NCCL synchronization; large graphs adopt mini-batch training with neighbor sampling and distributed execution, often requiring parameter-server-based embedding storage.
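The large-graph strategy hinges on neighbor sampling: instead of loading the full graph, each mini-batch expands a set of seed nodes by a capped number of sampled neighbors per hop. A minimal sketch of that expansion (uniform sampling; PGL's actual samplers are more elaborate):

```python
import random


def sample_neighbors(adj, seeds, fanout, num_hops, rng=None):
    """Uniformly sample up to `fanout` neighbors per node for `num_hops` hops.

    adj: {node: [neighbor, ...]} adjacency lists.
    Returns the set of nodes needed to compute embeddings for `seeds`.
    """
    rng = rng or random.Random(0)
    frontier = list(seeds)
    subgraph_nodes = set(seeds)
    for _ in range(num_hops):
        next_frontier = []
        for node in frontier:
            neighbors = adj.get(node, [])
            # Cap the expansion at `fanout` neighbors per node.
            if len(neighbors) > fanout:
                neighbors = rng.sample(neighbors, fanout)
            for nb in neighbors:
                if nb not in subgraph_nodes:
                    subgraph_nodes.add(nb)
                    next_frontier.append(nb)
        frontier = next_frontier
    return subgraph_nodes
```

Because the sampled subgraph's size is bounded by `len(seeds) * fanout ** num_hops` rather than by the full graph, each worker in a distributed job can train on its mini-batches independently, fetching only the touched embeddings from the parameter server.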
Key GNN tasks covered include node classification (semi-supervised, using label propagation and the UniMP algorithm for label-feature fusion) and link prediction (addressing depth and neighbor-sampling bottlenecks with the REP relation-aware embedding-propagation technique). The presented methods achieved state-of-the-art results on Open Graph Benchmark datasets and won the KDD Cup 2021 WikiKG90M competition.
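The label-propagation idea underlying semi-supervised node classification can be sketched in a few lines: known labels stay fixed, and each unlabeled node repeatedly takes the majority class among its labeled neighbors. This is the classic baseline, not UniMP itself, which additionally fuses propagated labels with node features inside a transformer-style model.

```python
def propagate_labels(adj, labels, num_iters=10):
    """Classic label propagation on a graph.

    adj: {node: [neighbor, ...]} adjacency lists.
    labels: {node: class_or_None}; None marks unlabeled nodes.
    Returns a dict with unlabeled nodes filled in where reachable.
    """
    current = dict(labels)
    for _ in range(num_iters):
        updates = {}
        for node, lab in current.items():
            if lab is not None:
                continue  # seed labels are kept fixed
            counts = {}
            for nb in adj.get(node, []):
                nb_lab = current.get(nb)
                if nb_lab is not None:
                    counts[nb_lab] = counts.get(nb_lab, 0) + 1
            if counts:
                # Majority vote; ties broken deterministically by sorted key.
                updates[node] = max(sorted(counts), key=counts.get)
        if not updates:
            break  # converged: nothing new could be labeled
        current.update(updates)
    return current
```

On a path graph with one labeled endpoint, the label spreads one hop per iteration until every connected node is covered.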
Industrial applications of PGL span Baidu Search (web-page quality assessment, anti-spam), recommendation systems, risk control, traffic prediction in Baidu Maps, and POI retrieval. The framework abstracts graph types (homogeneous, heterogeneous, bipartite), sampling strategies (node2vec, meta-path), and GNN aggregators, and supports large-scale sparse features as well as automatic heterogeneous-graph extensions such as relation-wise LightGCN.
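The walk-based sampling strategies mentioned above reduce, in their simplest form, to generating random walks over the graph. A minimal uniform-walk sketch (node2vec additionally biases transitions with its p/q return and in-out parameters, and meta-path sampling constrains each step by node type; both are omitted here):

```python
import random


def random_walks(adj, num_walks, walk_len, rng=None):
    """Generate uniform random walks starting from every node.

    adj: {node: [neighbor, ...]} adjacency lists.
    Returns a list of walks, each a list of at most `walk_len` nodes.
    """
    rng = rng or random.Random(0)
    walks = []
    for _ in range(num_walks):
        for start in sorted(adj):
            walk = [start]
            while len(walk) < walk_len:
                neighbors = adj.get(walk[-1], [])
                if not neighbors:
                    break  # dead end: stop this walk early
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks
```

The resulting walks are typically fed to a skip-gram objective (as in DeepWalk/node2vec) to learn node embeddings before or alongside the GNN.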
A short Q&A highlights communication overhead in distributed training, the use of multi-head attention to mitigate over-smoothing, and the successful integration of GNNs with language models (e.g., ERNIESage) for NLP-related tasks.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.