PaSca: A Scalable Graph Neural Architecture Search System

This article introduces PaSca, a scalable graph neural architecture search framework built on the SGAP paradigm. It covers the motivating problem, the method, experiments on scalability and prediction performance, and the open‑source release and its impact on large‑scale graph learning.

DataFunTalk

Problem: Industrial applications need to process massive graph data quickly and at scale, yet traditional message‑passing GNNs incur high communication and computation costs. The result is poor scalability and a high barrier to modeling.

Method: PaSca is an end‑to‑end architecture search system built on the Scalable Graph Neural Architecture Paradigm (SGAP). SGAP splits the workload into pre‑processing, training, and post‑processing stages, which drastically reduces communication rounds and lets the training stage use an arbitrary downstream model (MLP, DNN, or tree model). The system then searches automatically for GNN configurations that balance multiple objectives (accuracy, training time, memory) using Bayesian optimization.
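The three SGAP stages described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not PaSca's or SGL's actual implementation: the symmetric adjacency normalization, the hop-wise feature concatenation, the ridge-regression stand-in for the downstream model, and every function name here (`preprocess`, `train_linear_model`, `postprocess`, `run_toy_example`) are hypothetical choices made for the example.

```python
import numpy as np


def normalized_adjacency(adj: np.ndarray) -> np.ndarray:
    """Return D^{-1/2} (A + I) D^{-1/2}; this normalization is an assumption."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def preprocess(adj: np.ndarray, features: np.ndarray, num_hops: int) -> np.ndarray:
    """Stage 1: parameter-free feature propagation, run once before training."""
    a_norm = normalized_adjacency(adj)
    hops = [features]
    for _ in range(num_hops):
        hops.append(a_norm @ hops[-1])
    # After this step each node is an independent sample with a fixed-size vector.
    return np.concatenate(hops, axis=1)


def train_linear_model(x, y_onehot, train_idx, ridge=1e-2):
    """Stage 2: any graph-free model works; ridge regression stands in here."""
    x_tr, y_tr = x[train_idx], y_onehot[train_idx]
    gram = x_tr.T @ x_tr + ridge * np.eye(x.shape[1])
    return np.linalg.solve(gram, x_tr.T @ y_tr)


def postprocess(adj, scores, num_hops, alpha=0.5):
    """Stage 3: smooth the model's scores over the graph."""
    a_norm = normalized_adjacency(adj)
    out = scores
    for _ in range(num_hops):
        out = alpha * scores + (1.0 - alpha) * (a_norm @ out)
    return out


def run_toy_example() -> np.ndarray:
    # Toy graph: two triangles (nodes 0-2 and 3-5) joined by the edge 2-3.
    edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
    adj = np.zeros((6, 6))
    for i, j in edges:
        adj[i, j] = adj[j, i] = 1.0
    labels = np.array([0, 0, 0, 1, 1, 1])
    features = np.eye(2)[labels]   # deliberately trivial toy features
    y_onehot = np.eye(2)[labels]
    train_idx = np.array([0, 5])   # one labeled node per triangle

    x = preprocess(adj, features, num_hops=2)             # graph touched here
    weights = train_linear_model(x, y_onehot, train_idx)  # no graph access
    smoothed = postprocess(adj, x @ weights, num_hops=2)  # graph touched here
    return smoothed.argmax(axis=1)


if __name__ == "__main__":
    print(run_toy_example())
```

The point of the sketch is structural: the graph is only read in the first and last stages, so the training stage sees plain feature rows and can be swapped for an MLP or a tree model without any neighbor communication during training.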

Experiments: Evaluations on standard benchmarks, larger datasets, and industry data show that SGAP‑based GNNs achieve near‑linear speed‑up and better trade‑offs between accuracy and inference time than traditional message‑passing models such as GraphSAGE. Representative PaSca‑derived architectures (V2, V3) consistently outperform the baselines in both scalability and prediction performance.
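A trade‑off between accuracy and inference time is commonly read off a Pareto front: the set of configurations that no other configuration beats on every objective at once. The sketch below shows only that filtering step, not the Bayesian optimization loop that proposes candidates. The `Candidate` type, the configuration names, and all numbers are made up for illustration and are not results from the PaSca experiments.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str
    error: float         # 1 - accuracy, lower is better
    inference_ms: float  # inference time, lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """a dominates b if it is no worse on both objectives and better on one."""
    no_worse = a.error <= b.error and a.inference_ms <= b.inference_ms
    better = a.error < b.error or a.inference_ms < b.inference_ms
    return no_worse and better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


# Hypothetical search results (illustrative values only).
CANDIDATES = [
    Candidate("cfg_a", error=0.30, inference_ms=10.0),
    Candidate("cfg_b", error=0.25, inference_ms=20.0),
    Candidate("cfg_c", error=0.32, inference_ms=12.0),  # worse than cfg_a on both
    Candidate("cfg_d", error=0.25, inference_ms=25.0),  # slower than cfg_b, same error
]

if __name__ == "__main__":
    for c in pareto_front(CANDIDATES):
        print(c.name, c.error, c.inference_ms)
```

Here `cfg_a` and `cfg_b` survive because each wins on one objective; picking between them is a deployment decision rather than something the filter can settle.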

Conclusion: PaSca automates the design of GNNs for billion‑node graphs. It has been deployed on Tencent’s Taiji platform for recommendation and risk control, and its codebase (SGL) is open source. The framework achieves state‑of‑the‑art results on OGB leaderboards and gives researchers a flexible tool for exploring scalable GNN designs.

machine learning, Graph Neural Networks, distributed training, architecture search, PaSca, Scalable GNN, SGAP