
Automated Knowledge Graph Representation Learning: From Triples to Subgraphs

This talk covers the background, key directions, and model designs of automated knowledge‑graph representation learning. It surveys triple‑based, path‑based, and subgraph‑based approaches, explains how AutoML can search for optimal bilinear scoring functions, and closes with future research challenges such as scalability, inductive inference, and domain‑specific applications.

DataFunTalk

Knowledge graphs encode entities and relations as (head, relation, tail) triples and are used in question answering, recommendation, drug discovery, and stock prediction. Knowledge representation learning maps these symbolic elements into low‑dimensional vectors, enabling efficient similarity computation and downstream tasks.
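As a minimal sketch of this symbolic‑to‑vector mapping (the toy triples and random embeddings below are illustrative assumptions, not the talk's actual models or data):

```python
import numpy as np

# Hypothetical toy knowledge graph: each fact is a (head, relation, tail) triple.
triples = [
    ("aspirin", "treats", "headache"),
    ("aspirin", "interacts_with", "warfarin"),
]

# Map each symbol to an index, then to a low-dimensional vector.
entities = sorted({h for h, _, _ in triples} | {t for _, _, t in triples})
relations = sorted({r for _, r, _ in triples})
ent_idx = {e: i for i, e in enumerate(entities)}
rel_idx = {r: i for i, r in enumerate(relations)}

dim = 8
rng = np.random.default_rng(0)
E = rng.normal(size=(len(entities), dim))   # entity embedding table
R = rng.normal(size=(len(relations), dim))  # relation embedding table

# A triple becomes three vectors; similarity is now a cheap vector operation.
h, r, t = E[ent_idx["aspirin"]], R[rel_idx["treats"]], E[ent_idx["headache"]]
```

In a real system these tables are trained parameters rather than random draws, but the lookup structure is the same.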

The overall framework samples positive triples from the graph, generates negative triples by corrupting the head, tail, or relation, and optimizes a scoring function with regularization using stochastic gradient descent. Evaluation metrics include mean rank (MR), mean reciprocal rank (MRR), and Hits@K.
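The corruption step and the ranking metrics can be sketched as follows (function names and the scoring model are placeholders of my own; only the corruption scheme and metric definitions come from the text):

```python
import random

def corrupt(triple, entities, relations):
    """Generate one negative triple by corrupting head, tail, or relation."""
    h, r, t = triple
    slot = random.choice(["head", "relation", "tail"])
    if slot == "head":
        return (random.choice([e for e in entities if e != h]), r, t)
    if slot == "relation":
        return (h, random.choice([x for x in relations if x != r]), t)
    return (h, r, random.choice([e for e in entities if e != t]))

def mrr_and_hits(ranks, k=10):
    """ranks: 1-based rank of each true answer among all scored candidates."""
    mrr = sum(1.0 / rank for rank in ranks) / len(ranks)
    hits = sum(rank <= k for rank in ranks) / len(ranks)
    return mrr, hits

# e.g. true answers ranked 1st, 2nd, 10th, and 50th:
mrr, hits10 = mrr_and_hits([1, 2, 10, 50], k=10)
```

Mean rank (MR) would simply average the ranks instead of their reciprocals; MRR is preferred because it is bounded and less sensitive to a few very poor ranks.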

Key research directions include triple‑based models (TransE, RotatE, ConvE, RESCAL, DistMult, ComplEx), path‑based models (PTransE, RSN, Interstellar), and subgraph‑based graph neural network models (R‑GCN, CompGCN, GraIL, RED‑GNN). Each class captures different semantic and structural information, with trade‑offs in expressiveness and computational cost.
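Two of the named triple‑based models have scoring functions simple enough to state directly; the sketch below follows their standard published forms (a translation‑based score for TransE and a diagonal bilinear score for DistMult):

```python
import numpy as np

def transe_score(h, r, t):
    """TransE: plausible triples satisfy h + r ≈ t, so score is negative distance."""
    return -float(np.linalg.norm(h + r - t))

def distmult_score(h, r, t):
    """DistMult: bilinear score h^T diag(r) t, i.e. elementwise product summed."""
    return float(np.sum(h * r * t))
```

RESCAL generalizes DistMult by using a full (non‑diagonal) relation matrix, and ComplEx moves to complex‑valued embeddings, illustrating the expressiveness/cost trade‑off the talk mentions.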

AutoML techniques such as KGTuner, AutoSF, and AutoSF+ automate the search for optimal bilinear scoring functions and hyper‑parameters. AutoSF defines a search space over relation‑matrix entries, uses progressive or genetic algorithms to explore high‑quality structures, and employs filters and predictors to reduce redundant evaluations.
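A hypothetical sketch of what a block‑wise bilinear search space can look like (this is my own simplified illustration inspired by the description above, not AutoSF's actual formulation): embeddings are split into blocks, and a candidate "structure" specifies which signed relation block mediates each head‑block/tail‑block interaction.

```python
import numpy as np

def blockwise_bilinear_score(h, r, t, structure, blocks=4):
    """Score a triple under one candidate structure.

    structure: list of (i, j, k, sign) entries, each meaning
    score += sign * sum(h_block[i] * r_block[k] * t_block[j]).
    The search algorithm explores which entries to include and their signs.
    """
    hb = np.split(h, blocks)
    rb = np.split(r, blocks)
    tb = np.split(t, blocks)
    return float(sum(sign * np.sum(hb[i] * rb[k] * tb[j])
                     for i, j, k, sign in structure))

# The purely diagonal structure recovers a DistMult-style score:
diag = [(i, i, i, 1) for i in range(4)]
```

Under this framing, the progressive or genetic search explores the discrete space of structures, while the filters and predictors mentioned above prune candidates that are symmetric to, or predicted to perform worse than, ones already evaluated.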

Experimental results show that bilinear models generally outperform translation‑based and neural‑network‑based models, and that AutoSF/AutoSF+ achieve dataset‑specific optimal performance. Subgraph‑based GNNs like RED‑GNN attain state‑of‑the‑art results in both transductive and inductive link prediction while being more efficient than earlier GNN approaches.

The talk concludes with a summary of the three model families, emphasizes the benefits of AutoML for improving knowledge‑graph representation learning, and outlines future directions: automated GNN architecture search, scalable inference on massive graphs, reasoning over dynamic/event graphs, and applications in biology, finance, and other domains.

Tags: embedding, Graph Neural Networks, Knowledge Graph, link prediction, AutoML, representation learning
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
