How NetEase Cloud Music Solved Cold‑Start with Large‑Scale Graph Neural Networks

This article explains how NetEase Cloud Music tackled cold‑start recommendation challenges in live streaming by leveraging Baidu's PGL distributed graph learning framework to train massive graph neural networks that transfer user behavior from music domains to live content, achieving significant performance gains.


Recommendation systems are now standard at major internet companies, yet many data sources, such as social and information networks, are inherently non-Euclidean, which makes traditional collaborative-filtering methods based solely on historical behavior insufficient.

NetEase Cloud Music, one of China's most widely used music apps, has evolved from a simple music player into a content community. Its live-streaming recommendation scenario faces a severe cold-start problem: a large proportion of users have never interacted with live streams at all.

To overcome these challenges, the R&D team selected Baidu's open‑source distributed graph learning framework PGL (built on PaddlePaddle) as the foundation for large‑scale graph neural network training. The graph models heterogeneous entities—songs, DJs, queries, radio IDs—and their relationships (user‑host, user‑song, query‑host) to construct a unified graph.
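As a rough illustration of that unified-graph construction, the sketch below groups interaction records into typed edge lists and assigns global integer ids to heterogeneous entities. The entity names, relation names, and record format are hypothetical, not the production schema; real PGL graph construction would build on structures like these.

```python
from collections import defaultdict

# Hypothetical interaction log: (source entity, relation, target entity).
# Entity and relation names here are illustrative placeholders.
interactions = [
    ("user:42", "listens", "song:1001"),
    ("user:42", "follows", "host:7"),
    ("user:99", "listens", "song:1001"),
    ("query:jazz", "clicks", "host:7"),
]

# Group edges by relation type so each relation (user-song, user-host,
# query-host) can be stored and sampled independently.
edges_by_type = defaultdict(list)
for src, rel, dst in interactions:
    edges_by_type[rel].append((src, dst))

# Assign every heterogeneous entity a global integer id, the form most
# graph frameworks expect before training.
node_ids = {}
for src, _, dst in interactions:
    for node in (src, dst):
        node_ids.setdefault(node, len(node_ids))
```

Keeping edges bucketed by relation type is what later makes metapath-guided walks and per-relation sampling straightforward.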

Using algorithms such as DeepWalk, Metapath2Vec, and GraphSage, the team learned strong graph embeddings for entity IDs. These embeddings enable vector‑based recall, transferring users' historical actions in music, playlists, and Mlog domains to the live‑streaming domain, thereby improving the relevance of recommended hosts.
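The core of Metapath2Vec-style training is generating random walks that follow a fixed, repeating pattern of edge types; the resulting node sequences are then fed to a skip-gram model to learn embeddings. A minimal sketch of the walk generation follows, using a toy adjacency structure (the node names, relation names, and helper function are all hypothetical):

```python
import random

# Toy adjacency lists keyed by (node, edge_type); purely illustrative data.
neighbors = {
    ("user:1", "listens"): ["song:a", "song:b"],
    ("song:a", "listened_by"): ["user:1", "user:2"],
    ("song:b", "listened_by"): ["user:1"],
    ("user:2", "listens"): ["song:a"],
}

def metapath_walk(start, metapath, length, rng=random):
    """Walk the graph following the repeating edge-type pattern in
    `metapath`, e.g. user -listens-> song -listened_by-> user -> ..."""
    walk = [start]
    for step in range(length):
        rel = metapath[step % len(metapath)]
        nbrs = neighbors.get((walk[-1], rel))
        if not nbrs:
            break  # dead end: terminate the walk early
        walk.append(rng.choice(nbrs))
    return walk

walk = metapath_walk("user:1", ["listens", "listened_by"], length=4)
```

Constraining walks to a metapath keeps semantically meaningful node types adjacent in the training sequences, which is what lets behavior from the music domain inform embeddings that are later used for live-stream recall.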

Cloud Music’s data graph contains billions of edges even after pruning, which exceeds the capacity of many open‑source GNN frameworks. PGL’s native distributed graph storage, distributed sampling, and integration with PaddlePaddle’s Fleet API allow efficient training on “thin” compute nodes without requiring prohibitively expensive hardware.
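Capped-fanout neighbor sampling is what makes billion-edge graphs trainable on modest hardware: each minibatch only ever touches a bounded neighborhood. The sketch below shows the GraphSage-style sampling idea on a toy adjacency map (node names and fanouts are illustrative; PGL performs this sampling against its distributed graph store):

```python
import random

# Toy adjacency lists; in production this lookup would hit a
# distributed graph storage service rather than a local dict.
adj = {
    "u1": ["h1", "h2", "h3"],
    "h1": ["u1", "u2"],
    "h2": ["u1"],
    "h3": ["u2", "u3"],
}

def sample_neighborhood(seeds, fanouts, rng=random):
    """Expand the seed nodes hop by hop, sampling at most `fanout`
    neighbors per node per hop, as in GraphSage minibatch training."""
    layers = [list(seeds)]
    for fanout in fanouts:
        frontier = []
        for node in layers[-1]:
            nbrs = adj.get(node, [])
            k = min(fanout, len(nbrs))  # cap fanout to bound memory
            frontier.extend(rng.sample(nbrs, k))
        layers.append(frontier)
    return layers

layers = sample_neighborhood(["u1"], fanouts=[2, 2])
```

Because each hop is capped, the cost of a minibatch grows with the product of the fanouts rather than with the true (possibly enormous) neighborhood size, which is why "thin" compute nodes suffice.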

Experimental results demonstrate that graph‑based recommendation substantially boosts view counts, especially for new users and new hosts, confirming the effectiveness of cross‑domain knowledge transfer and large‑scale distributed training.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Tags: AI, graph neural networks, Distributed Training, Large-Scale Graph, PGL
Written by

ITFLY8 Architecture Home

ITFLY8 Architecture Home - focused on architecture knowledge sharing and exchange, covering project management and product design. Includes large-scale distributed website architecture (high performance, high availability, caching, message queues...), design patterns, architecture patterns, big data, project management (SCRUM, PMP, Prince2), product design, and more.
