Challenges and Future Directions for Knowledge Graph Construction in the Era of Large Models
The article examines the high construction cost and lack of unified standards in knowledge graphs, explains why large language models cannot fully solve core issues such as hallucination and multi‑hop reasoning, and argues that a new, unified semantic framework integrating large models is essential for future progress.
Knowledge graphs have long suffered from high construction complexity and cost, and no single framework has emerged that addresses every knowledge-building problem. The root cause lies at the data layer: most knowledge is expressed in natural language, whose representations are diverse and non-standardized.
Against this backdrop, the rapid rise of large models has sparked claims that "knowledge graphs are doomed." Experts counter that current large models still cannot resolve hallucination, timeliness, factuality, or multi-hop reasoning, and therefore cannot solve the challenges of knowledge-graph construction once and for all.
In highly structured application scenarios such as risk control, purely textual descriptions of risk groups are hard for systems to interpret, and the industry doubts that large models can replace core risk-control capabilities; such decision-heavy scenarios therefore still depend on knowledge graphs to deliver real impact.
The biggest current difficulty for knowledge graphs is the lack of unified standards: their semantic foundations date back to the early semantic web era and have evolved piecemeal over decades, while the term "knowledge graph" itself was only coined by Google in 2012.
Over this long development period, many concepts have emerged—static, dynamic, entity, concept, event, causal, temporal, multimodal, and others—each with its own definition and representation.
Although early semantic web frameworks such as RDF and OWL existed, they never took hold in industry, so practitioners instead store knowledge graphs as attribute-based property graphs in graph databases.
However, a graph database is not a knowledge graph: it is merely a storage format with no built-in semantics. As a result, each organization defines its own knowledge graph, and differences between these ad-hoc protocols hamper data exchange.
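The gap between the two models can be sketched with plain data structures. The entities, predicates, and attribute keys below are hypothetical examples, not drawn from any real dataset: an RDF-style triple set carries its semantics in shared vocabulary terms, while a property graph leaves every label and key up to the organization that built it.

```python
# RDF-style: a set of (subject, predicate, object) triples. Shared terms
# like "rdf:type" give the data a common, exchangeable semantics.
rdf_triples = {
    ("ex:Alice", "rdf:type", "ex:Person"),
    ("ex:Alice", "ex:worksFor", "ex:AcmeCorp"),
    ("ex:AcmeCorp", "rdf:type", "ex:Company"),
}

# Property-graph style: nodes and edges with free-form attribute maps.
# Nothing constrains the labels or keys, so each organization tends to
# invent its own vocabulary, which hampers exchange.
property_graph = {
    "nodes": {
        "alice": {"label": "Person", "name": "Alice"},
        "acme": {"label": "Company", "name": "AcmeCorp"},
    },
    "edges": [
        {"from": "alice", "to": "acme", "type": "WORKS_FOR"},
    ],
}

# The triple set answers "who is a Person?" via the shared rdf:type
# predicate; the property graph needs out-of-band agreement on "label".
people = {s for (s, p, o) in rdf_triples if p == "rdf:type" and o == "ex:Person"}
print(people)  # {'ex:Alice'}
```

Two graph databases can both hold the second structure and still disagree on what `label` or `type` mean, which is exactly the interoperability problem described above.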
Therefore, a unified semantic framework that can be integrated into industrial scenarios is crucial for the future: it must define the key semantic capabilities, standardize the knowledge-construction pipeline, and fuse multiple inference engines, such as expert rule reasoning and graph representation learning.
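To make "expert rule reasoning" concrete, here is a minimal forward-chaining sketch over triples. The rule format, facts, and relation names are all hypothetical illustrations, not any specific framework's API:

```python
# Minimal forward-chaining rule engine over (subject, relation, object)
# triples: apply every rule repeatedly until no new facts are derived.
def forward_chain(facts, rules):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            for new_fact in list(rule(facts)):
                if new_fact not in facts:
                    facts.add(new_fact)
                    changed = True
    return facts

# Example expert rule: if X worksFor Y and Y subsidiaryOf Z,
# then X also worksFor Z.
def subsidiary_rule(facts):
    for (x, p1, y) in list(facts):
        if p1 != "worksFor":
            continue
        for (y2, p2, z) in list(facts):
            if y2 == y and p2 == "subsidiaryOf":
                yield (x, "worksFor", z)

facts = {
    ("Alice", "worksFor", "AcmeLabs"),
    ("AcmeLabs", "subsidiaryOf", "AcmeCorp"),
}
closed = forward_chain(facts, [subsidiary_rule])
print(("Alice", "worksFor", "AcmeCorp") in closed)  # True
```

A production engine would index facts by predicate and support a declarative rule language, but the fixpoint loop above is the core idea that a unified framework would need to expose alongside learned (embedding-based) inference.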
The most important step remains the integration with large models, which still faces many difficulties.
In practice, building knowledge graphs with large models reveals three issues: hallucinations introduce noisy data that must be cleaned; extracting knowledge from diverse data demands collaboration between large and small models; and current knowledge representations must evolve to become more model-friendly, since the best existing frameworks were designed before the large-model era.
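One common way to clean hallucinated extractions is to validate candidate triples against a typed schema before they enter the graph. The schema, entity types, and example triples below are invented for illustration; real pipelines would add entity linking and confidence scores on top of this kind of filter:

```python
# Hypothetical post-processing step: keep only LLM-extracted triples whose
# relation exists in the schema and whose arguments have the expected types.
SCHEMA = {
    # relation: (required subject type, required object type)
    "founded": ("Person", "Company"),
    "headquartered_in": ("Company", "City"),
}
ENTITY_TYPES = {
    "Jack Ma": "Person",
    "Alibaba": "Company",
    "Hangzhou": "City",
}

def clean(extracted):
    kept = []
    for (subj, rel, obj) in extracted:
        signature = SCHEMA.get(rel)
        if signature is None:
            continue  # relation not in schema: likely hallucinated
        if (ENTITY_TYPES.get(subj), ENTITY_TYPES.get(obj)) != signature:
            continue  # argument types violate the schema
        kept.append((subj, rel, obj))
    return kept

raw = [
    ("Jack Ma", "founded", "Alibaba"),    # valid
    ("Alibaba", "founded", "Jack Ma"),    # type violation: dropped
    ("Jack Ma", "invented", "Hangzhou"),  # unknown relation: dropped
]
print(clean(raw))  # [('Jack Ma', 'founded', 'Alibaba')]
```

The filter is where the "large and small model collaboration" fits naturally: a large model proposes candidate triples, while cheaper, schema-aware components (classifiers or rules like the one above) verify them.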
Achieving a unified, standardized knowledge graph will only be possible after deep integration with large models, a journey that still has a long way to go.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.