Exploring Tencent Music's Knowledge Graph: Architecture, Database Selection, and Search Applications
This article details Tencent Music's music knowledge graph, covering data classification, graph database evaluation, system architecture, online and offline data pipelines, advanced search use cases, and practical business scenarios, illustrating how graph technology enhances intelligent retrieval and recommendation.
Music Knowledge Graph Overview – The talk introduces the music knowledge graph, categorizing music data into content (songs, albums, etc.), artist information (profiles, relationships), and artist‑content links (performances, songwriting).
Application Scenarios – The graph enables complex searches (e.g., finding Jay Chou duets with a female singer), recommendation via two‑hop entity traversal, and knowledge‑driven answers such as retrieving 1990s songs by a specific artist.
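The two query shapes mentioned above (a constrained duet search and a two‑hop traversal from artist to song to co‑performer) can be illustrated on a toy in‑memory graph. This is only a sketch: the artist and song identifiers, the `PERFORMS` edge list, and the `GENDER` attribute table are hypothetical stand‑ins, not Tencent Music's schema or data, and a production system would run these as graph database queries rather than Python loops.

```python
from collections import defaultdict

# Hypothetical toy data: "performs" edges (artist -> song) and an artist attribute.
PERFORMS = [
    ("artist_a", "song_1"),
    ("artist_b", "song_1"),
    ("artist_a", "song_2"),
    ("artist_c", "song_2"),
    ("artist_a", "song_3"),
]
GENDER = {"artist_a": "male", "artist_b": "female", "artist_c": "male"}


def build_index(edges):
    """Build forward and reverse adjacency maps for the performs edge."""
    songs_by_artist = defaultdict(set)
    artists_by_song = defaultdict(set)
    for artist, song in edges:
        songs_by_artist[artist].add(song)
        artists_by_song[song].add(artist)
    return songs_by_artist, artists_by_song


SONGS_BY_ARTIST, ARTISTS_BY_SONG = build_index(PERFORMS)


def two_hop_artists(artist):
    """Two-hop traversal: artist -> song -> other artists on the same song."""
    related = set()
    for song in SONGS_BY_ARTIST.get(artist, set()):
        related |= ARTISTS_BY_SONG.get(song, set())
    related.discard(artist)
    return related


def duets_with(artist, partner_gender):
    """Songs performed by exactly two artists, where the partner has the given gender."""
    matches = []
    for song in sorted(SONGS_BY_ARTIST.get(artist, set())):
        performers = ARTISTS_BY_SONG.get(song, set())
        partners = performers - {artist}
        if len(performers) == 2 and all(GENDER.get(p) == partner_gender for p in partners):
            matches.append(song)
    return matches


if __name__ == "__main__":
    print(two_hop_artists("artist_a"))
    print(duets_with("artist_a", "female"))
```

The two‑hop result is a natural candidate set for "related artist" recommendation, while the duet search shows how an attribute filter on the far end of an edge narrows a traversal.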
Graph Database Selection – Evaluation criteria were: open‑source availability to keep costs down, distributed scalability, millisecond‑level multi‑hop queries, support for billions of vertices and edges, and bulk import/export. Three families were compared: (1) Neo4j (high performance, but single‑node unless the paid distributed edition is used), (2) JanusGraph/HugeGraph (distributed, but constrained by their reliance on external storage backends), and (3) NebulaGraph (native storage, computation push‑down, and the best overall performance of the three). NebulaGraph was chosen.
Project Architecture – The system has two layers. The online layer consists of NebulaGraph's storaged (data storage), metad (schema metadata), and graphd (stateless query execution), fronted by a Nebula proxy that acts as a gateway and a broker that assembles queries and routes them via Zookeeper. The offline layer handles full‑batch and incremental data pipelines using HDFS, Spark, Kafka, and DataSender. A dual‑cluster deployment provides high availability.
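One way to picture how a broker could take advantage of the dual‑cluster deployment is a simple ordered failover: try the first cluster, and fall back to the second if it is unreachable. The sketch below is an assumption about the general pattern, not a description of Tencent Music's broker. The `Broker` class, the cluster names, and the use of `ConnectionError` as the failure signal are all hypothetical, and service discovery via Zookeeper is left out entirely.

```python
from typing import Callable, List, Sequence, Tuple

# A cluster is modelled as a name plus a callable that runs a query string.
QueryFn = Callable[[str], List[str]]


class Broker:
    """Hypothetical broker that routes a query to the first reachable cluster."""

    def __init__(self, clusters: Sequence[Tuple[str, QueryFn]]):
        if not clusters:
            raise ValueError("at least one cluster is required")
        self._clusters = list(clusters)

    def execute(self, query: str) -> Tuple[str, List[str]]:
        """Return (cluster_name, rows) from the first cluster that answers."""
        last_error = None
        for name, run in self._clusters:
            try:
                return name, run(query)
            except ConnectionError as exc:
                # Assumption: an unreachable cluster surfaces as ConnectionError.
                last_error = exc
        raise RuntimeError("all clusters unavailable") from last_error


def unreachable_cluster(query: str) -> List[str]:
    """Stand-in for a cluster that is down."""
    raise ConnectionError("cluster is down")


def healthy_cluster(query: str) -> List[str]:
    """Stand-in for a cluster that answers every query with one row."""
    return ["row for: " + query]


if __name__ == "__main__":
    broker = Broker([("cluster_1", unreachable_cluster), ("cluster_2", healthy_cluster)])
    print(broker.execute("example query"))
```

Because graphd is stateless, a retry against the other cluster needs no session hand‑off in this simplified model; a real deployment would also have to consider whether both clusters have received the same offline and incremental updates.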
Search Recall Strategies – Traditional recall follows query → statement → result → mixing → display, which is inflexible. A template‑based recall generates graph queries automatically and applies preset mixing policies, enabling rapid rollout of new scenarios such as school‑anthem or singer‑related searches.
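A minimal sketch of template‑based recall follows, assuming each search intent maps to a parameterised graph query plus a preset mixing policy. The intent names, the tag and edge names (`has_anthem`, `performs`), the policy labels, and the nGQL‑style query strings are illustrative assumptions; they have not been checked against the real schema or the production templates.

```python
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class RecallTemplate:
    """A graph query with named slots, plus the mixing policy to apply to its results."""
    query: Template
    mixing_policy: str


# Hypothetical templates: adding a scenario means adding an entry, not new code.
TEMPLATES = {
    "school_anthem": RecallTemplate(
        query=Template('GO FROM "$school" OVER has_anthem YIELD dst(edge) AS song'),
        mixing_policy="pin_to_top",
    ),
    "singer_collaborators": RecallTemplate(
        query=Template(
            'GO 2 STEPS FROM "$singer" OVER performs BIDIRECT YIELD dst(edge) AS artist'
        ),
        mixing_policy="interleave_with_keyword_results",
    ),
}


def escape_slot(value: str) -> str:
    """Escape backslashes and double quotes so a slot value stays inside its string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def build_recall(intent: str, **slots: str):
    """Fill the template for an intent and return (query_text, mixing_policy)."""
    template = TEMPLATES[intent]
    safe_slots = {name: escape_slot(value) for name, value in slots.items()}
    return template.query.substitute(safe_slots), template.mixing_policy


if __name__ == "__main__":
    print(build_recall("school_anthem", school="Example University"))
```

Keeping the query text and the mixing policy together in one record is what lets a new scenario be rolled out as configuration, which matches the flexibility the template approach is meant to provide.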
Business Use Cases – Implemented scenarios include school‑anthem retrieval, singer‑centric recommendations (collaborations, groups), and film‑related song lookup, all powered by the knowledge graph.
Q&A Highlights – Discussed audio‑based semantic search, ranking of semantic results alongside keyword matches, index switching without double buffering, and online truncation for vector search.
Conclusion – Adopting graph data allows Tencent Music to embed expert knowledge into the search experience, improving relevance, recommendation, and visualization, and demonstrating the practical impact of knowledge‑graph technology in a large‑scale music platform.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.