Automated Knowledge Graph Representation Learning: From Triples to Subgraphs
This talk introduces automated knowledge graph representation learning, covering background, key techniques such as triple‑based, path‑based and subgraph‑based models, AutoML‑driven model search (AutoSF, Interstellar, RED‑GNN), evaluation metrics, and future research directions in AI.
Knowledge graphs are special graph structures that store entities and relations as triples (head, relation, tail). Knowledge graph representation learning maps these symbolic elements into low‑dimensional vectors, enabling efficient similarity computation and downstream tasks such as link prediction, entity matching, and classification.
The presentation is organized into four parts: (1) background on knowledge representation learning, (2) important directions including AutoML for model search, (3) model design ranging from triple‑level to subgraph‑level, and (4) a summary of findings and future work.
Triple‑based models include translation‑based methods (TransE, TransH, RotatE), neural‑network‑based methods (MLP, ConvE, RSN), and bilinear models (RESCAL, DistMult, ComplEx). Bilinear models achieve strong performance, and they differ mainly in how they parameterize the relation matrix. AutoSF (and its improved version AutoSF+) automatically searches the structure of the relation matrix using progressive or evolutionary search, yielding a scoring function tailored to each dataset rather than a single fixed form.
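As a rough illustration (not the papers' reference implementations), the three scoring‑function families can be sketched in a few lines of NumPy; the embeddings below are random toy vectors, and lower distance / higher product means a more plausible triple:

```python
import numpy as np

def transe_score(h, r, t):
    """TransE: relations act as translations; score is the negative
    distance between h + r and t (higher = more plausible)."""
    return -np.linalg.norm(h + r - t)

def distmult_score(h, r, t):
    """DistMult: a bilinear model whose relation matrix is diagonal,
    so the score is a simple trilinear product (symmetric in h and t)."""
    return np.sum(h * r * t)

def complex_score(h, r, t):
    """ComplEx: DistMult over complex embeddings; taking the real part of
    <h, r, conj(t)> lets the model score asymmetric relations."""
    return np.real(np.sum(h * r * np.conj(t)))

rng = np.random.default_rng(0)
h, r, t = (rng.normal(size=8) for _ in range(3))
print(transe_score(h, r, t), distmult_score(h, r, t))
```

AutoSF, in effect, searches over how the relation parameters enter this bilinear product (which blocks are shared, negated, or zeroed) instead of fixing one of the forms above by hand.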
Path‑based models extend triples to multi‑hop paths. PTransE aggregates relation vectors along a path, while RSN uses recurrent networks with skip connections. The Interstellar model performs neural‑architecture search on path structures, allowing a mixture of triple‑only and path‑aware configurations.
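A minimal sketch of the PTransE idea, using its additive composition operator (PTransE also supports multiplicative and RNN composition); the relation names here are hypothetical toy vectors:

```python
import numpy as np

def compose_path_add(relation_vectors):
    """PTransE-style additive composition: a multi-hop path is
    represented by the sum of its relation vectors."""
    return np.sum(relation_vectors, axis=0)

def path_score(path_relations, direct_relation):
    """A path supports a direct relation r if its composed
    representation lies close to r in embedding space."""
    return -np.linalg.norm(compose_path_add(path_relations) - direct_relation)

# Toy example: born_in followed by city_of should approximate nationality.
rng = np.random.default_rng(1)
born_in, city_of = rng.normal(size=4), rng.normal(size=4)
nationality = born_in + city_of + 0.01 * rng.normal(size=4)
print(path_score(np.stack([born_in, city_of]), nationality))
```

Interstellar goes one step further and searches over how each hop is processed (which gates, which combination operators), rather than fixing a single composition rule like the sum above.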
Subgraph‑based GNN models extract the subgraph surrounding a head‑tail pair and apply graph neural networks. Early approaches (R‑GCN, CompGCN, KE‑GCN) rely on full‑graph embeddings and have limited scalability. GraIL reasons inductively over the extracted subgraph without pretrained entity embeddings. RED‑GNN improves efficiency by merging relational paths of the same length into a shared structure, using dynamic programming and attention so that many subgraphs are scored in parallel.
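The subgraph‑extraction step can be illustrated with plain BFS: a GraIL‑style enclosing subgraph is (roughly) the intersection of the k‑hop neighborhoods of the head and tail. This is a simplified sketch on a toy undirected adjacency dict, not GraIL's actual pipeline:

```python
from collections import deque

def k_hop_neighbors(adj, start, k):
    """All nodes reachable from `start` within k hops (BFS)."""
    seen, frontier = {start}, deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue
        for nbr in adj.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, depth + 1))
    return seen

def enclosing_subgraph(adj, head, tail, k=2):
    """GraIL-style enclosing subgraph (sketch): nodes that are within
    k hops of both the head and the tail entity."""
    return k_hop_neighbors(adj, head, k) & k_hop_neighbors(adj, tail, k)

# Toy graph with hypothetical entities a-d.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(enclosing_subgraph(adj, "a", "d", k=2))
```

Extracting one such subgraph per candidate triple is exactly the cost RED‑GNN avoids: its dynamic‑programming formulation shares computation across all tails of a given head instead of rebuilding each subgraph independently.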
AutoML techniques are also applied to hyper‑parameter optimization. KGTuner performs a two‑stage search, first narrowing the space on sampled subgraphs and then fine‑tuning on the full graph. AutoML frames both hyper‑parameter and model‑structure search as a bi‑level optimization problem.
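The two‑stage idea can be sketched abstractly: rank many configurations with a cheap proxy (training on a sampled subgraph), then re‑evaluate only the top candidates with the expensive full‑graph objective. This is a loose approximation of KGTuner's strategy, and the objective here is a hypothetical toy (finding a learning rate near 0.01):

```python
import random

def two_stage_search(sample_config, proxy_eval, full_eval, n_coarse=50, n_fine=5):
    """Two-stage search (sketch): a cheap, possibly noisy proxy prunes the
    space; only the finalists pay the full evaluation cost."""
    candidates = [sample_config() for _ in range(n_coarse)]
    candidates.sort(key=proxy_eval, reverse=True)   # stage 1: coarse ranking
    return max(candidates[:n_fine], key=full_eval)  # stage 2: fine evaluation

random.seed(0)
sampler = lambda: {"lr": 10 ** random.uniform(-4, 0)}           # log-uniform lr
proxy = lambda c: -abs(c["lr"] - 0.01) - random.gauss(0, 0.005)  # noisy subgraph proxy
full = lambda c: -abs(c["lr"] - 0.01)                            # full-graph objective
best = two_stage_search(sampler, proxy, full)
print(best)
```

The bi‑level framing mentioned above corresponds to the outer loop here (search over configurations) wrapped around an inner loop (training the model under each configuration), which `proxy_eval` and `full_eval` stand in for.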
Evaluation metrics for link prediction include Mean Rank (MR), Mean Reciprocal Rank (MRR), and Hit@K. Each emphasizes a different aspect of ranking quality: MR averages raw ranks and is sensitive to a few badly ranked triples, MRR rewards placing correct answers near the top, and Hit@K reports the fraction of correct answers ranked within the top K.
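Given a list of (filtered) ranks of the correct entities, all three metrics are one‑liners; a minimal sketch:

```python
def ranking_metrics(ranks, k=10):
    """Link-prediction metrics from a list of ranks (1 = best).
    MR averages raw ranks, MRR averages reciprocal ranks, and
    Hit@K is the fraction of ranks within the top K."""
    n = len(ranks)
    return {
        "MR": sum(ranks) / n,
        "MRR": sum(1.0 / r for r in ranks) / n,
        f"Hit@{k}": sum(r <= k for r in ranks) / n,
    }

# One outlier (rank 100) dominates MR but barely moves MRR.
print(ranking_metrics([1, 2, 100], k=10))
```

The toy example shows why papers usually report MRR alongside MR: a single hard triple inflates MR to about 34 while MRR stays near 0.5.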
The summary highlights three model families: triple‑based (high efficiency, limited structural modeling), path‑based (captures sequential semantics), and subgraph‑based (captures high‑order topology). Automated model design via AutoML substantially improves performance, but challenges remain in scalability, inductive reasoning, and domain‑specific applications such as biology and finance.
Future research directions include automated GNN architecture design, efficient inference on massive graphs, reasoning over dynamic/event graphs, and broader domain adoption.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.