Entity Contrastive Learning via Multi-Token Parallel Prediction for Knowledge Graph Completion
Researchers from Ant Group and Zhejiang University propose K-ON, a multi-token parallel prediction method that enables large language models to perceive knowledge graph entities through entity-level contrastive learning, achieving superior performance, lower cost, and higher efficiency on KG completion benchmarks.
The rapid development of large language models (LLMs) has broken many barriers in natural language processing, but their token-by-token prediction objective is mismatched with the multi-token nature of knowledge-graph entities. To bridge this gap, Ant Group and Zhejiang University introduce K-ON, a multi-token parallel prediction approach that lets LLMs learn entity-level representations via contrastive learning.
K-ON treats knowledge-graph completion as a textual instruction fed to the LLM. After the LLM's Transformer layers process the input, a dedicated K-ON module, comprising multiple MLP heads (one per position of an entity's token sequence), receives the final hidden states. A conditional Transformer then mixes information across positions while respecting token-order dependencies.
Low‑rank adaptation (LoRA) expands the original LLM scoring layer into K new scoring heads, producing probability distributions for each token position of every candidate entity in parallel. These distributions are then aligned with conventional single‑step token predictions, and a contrastive loss (positive vs. negative entity scores) is applied to embed knowledge‑graph structure into the model.
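The entity-level contrastive objective can be sketched roughly as follows. Here each candidate entity is represented by its K token ids, an entity's score is taken as the sum of the log-probabilities its tokens receive from the K position-wise heads, and an InfoNCE-style loss makes the positive entity compete against sampled negatives. The score decomposition and function names are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def entity_scores(head_logprobs, entity_token_ids):
    """Score each candidate entity by summing the log-probabilities
    its tokens receive from the K position-wise heads.

    head_logprobs:    (K, vocab) log-probabilities, one row per head.
    entity_token_ids: (num_entities, K) token ids of each candidate.
    Returns:          (num_entities,) one scalar score per entity.
    """
    K = head_logprobs.shape[0]
    # Gather log p(token at position k) for every entity, then sum over k.
    return head_logprobs[np.arange(K), entity_token_ids].sum(axis=1)

def contrastive_loss(scores, positive_idx):
    """InfoNCE-style loss: cross-entropy of the positive entity's score
    against the scores of all sampled negative entities."""
    logits = scores - scores.max()  # shift for numerical stability
    log_softmax = logits - np.log(np.exp(logits).sum())
    return -log_softmax[positive_idx]
```

Because the negatives enter only through one softmax over precomputed entity scores, scaling to hundreds or thousands of negatives adds little training cost, which is consistent with the efficiency analysis reported below.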
The training pipeline consists of five steps: (1) format KG completion as a text instruction; (2) feed the encoded representation to K‑ON’s multiple heads; (3) aggregate positional information via Conditional Transformer; (4) use LoRA to generate K parallel scoring layers; (5) extract token‑wise probabilities to score all candidate entities in one pass.
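Steps (2)-(5) above amount to fanning one final hidden state out through K parallel scoring layers, so every token position of every candidate entity is predicted in a single forward pass. A minimal sketch, using plain weight matrices as stand-ins for the LoRA-expanded heads (the tensor shapes are assumptions for illustration):

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def k_on_heads(hidden, head_weights):
    """Apply K independent scoring heads to the same hidden state.

    hidden:       (d,) final hidden state of the encoded instruction.
    head_weights: (K, vocab, d) one linear scoring layer per entity-token
                  position (stand-ins for the LoRA-derived heads).
    Returns:      (K, vocab) one probability distribution per position.
    """
    logits = head_weights @ hidden  # batched matmul -> (K, vocab)
    return softmax(logits)
```

All K distributions come from one pass over the input, which is why inference time stays nearly flat as K grows: only the small head matrices multiply, not the LLM backbone.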
Experimental results on several KG completion datasets show that K‑ON consistently outperforms traditional methods, other LLM‑based approaches, and even multimodal models that use additional image data. Increasing the token count K improves performance up to K≈8, after which gains plateau while model size continues to grow. Inference time remains largely unaffected by K, demonstrating high efficiency.
Further analysis reveals that K‑ON’s entity‑level contrastive learning can handle thousands of negative samples with minimal training overhead; setting around 128 negatives yields optimal results.
Overall, K‑ON provides a more efficient, cost‑effective, and higher‑performing solution for knowledge‑graph completion, enabling LLMs to directly perceive and reason over KG knowledge.
Paper: https://arxiv.org/pdf/2502.06257
AntTech