Advances in Knowledge Graph Construction: AI Development, Named Entity Recognition, Relation Extraction, and Attribute Completion
This technical report presents a comprehensive overview of the evolution of artificial intelligence and of knowledge-graph construction techniques—including traditional, cross-lingual, and reading-comprehension-based named entity recognition, weakly supervised and joint relation extraction, attribute completion from multi-source cues, and conditional knowledge-graph modeling—highlighting recent research findings and experimental results.
The speaker, Dr. Liu Ming from Harbin Institute of Technology, introduces a technical report covering four main topics: the history of artificial intelligence, named entity recognition (NER), automatic relation identification, and automatic completion of missing entity attributes.
AI Development Stages: AI development is divided into three stages—computational intelligence, perceptual intelligence, and cognitive intelligence—emphasizing the role of knowledge graphs as the foundation for machine cognition.
Knowledge Extraction: Knowledge can be extracted from text either by building structured knowledge bases or by using knowledge graphs to enhance text understanding, supporting downstream tasks such as dialogue and QA.
Named Entity Recognition: Four approaches are discussed: traditional, cross-lingual, reading-comprehension-based, and open-domain NER. Traditional NER identifies person, location, and organization names; early methods used dictionaries and heuristics, then statistical models, and since 2013 deep learning (e.g., LSTM+CNN+CRF) has become dominant. Cross-lingual NER leverages bilingual dictionaries to transfer knowledge from English to Chinese, using attention to select appropriate translations. Reading-comprehension-based NER treats entity extraction as a QA problem, handling nested entities by generating type-specific questions. Open-domain NER discovers entity types automatically, avoiding exhaustive predefined label sets.
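The reading-comprehension formulation can be sketched in a few lines: one question per entity type, with extraction reduced to answer-span finding. The question templates and the toy gazetteer below are illustrative assumptions standing in for the neural QA reader described in the talk; the point is that querying each type independently handles nested entities (here "Harbin" inside "Harbin Institute of Technology") naturally.

```python
# Reading-comprehension-style NER sketch: each entity type becomes a
# question; a toy gazetteer lookup stands in for a neural span predictor.
QUESTION_TEMPLATES = {  # hypothetical templates, one per type
    "PER": "Which person names are mentioned in the text?",
    "LOC": "Which locations are mentioned in the text?",
    "ORG": "Which organizations are mentioned in the text?",
}
GAZETTEER = {  # toy answer source for illustration only
    "PER": {"Liu Ming"},
    "LOC": {"Harbin"},
    "ORG": {"Harbin Institute of Technology"},
}

def extract_entities(text):
    """Ask one question per type; types are queried independently,
    so overlapping (nested) spans can all be returned."""
    results = []
    for etype in QUESTION_TEMPLATES:
        for mention in GAZETTEER[etype]:
            start = text.find(mention)
            if start != -1:
                results.append((etype, mention, start))
    return sorted(results, key=lambda r: r[2])

text = "Liu Ming works at Harbin Institute of Technology."
for etype, mention, start in extract_entities(text):
    print(etype, mention, start)
```

Note that a single flat tagging scheme could not emit both the LOC and ORG spans above, which is exactly the nesting problem the QA framing avoids.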
Relation Extraction: Relations are categorized as hierarchical (is-a) and horizontal (entity-entity) links. Weakly supervised methods generate large training sets by aligning knowledge-base triples with text (distant supervision), but this introduces noise; attention mechanisms mitigate it. A deep memory-network model combines word-level and relation-level memory networks with attention to capture important words and relation dependencies. Synchronous joint extraction models integrate entity and relation extraction in a reading-comprehension framework, addressing overlapping triples by driving extraction with relation queries.
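The weakly supervised labeling step is simple enough to show directly: any sentence mentioning both arguments of a knowledge-base triple is labeled with that triple's relation. The tiny KB and sentences below are invented for illustration; the second sentence shows the characteristic noise that the attention mechanisms in the talk are meant to down-weight.

```python
# Distant-supervision sketch: project KB triples onto raw sentences to
# produce a (noisy) labeled training set for relation extraction.
kb_triples = [("Paris", "capital_of", "France")]

sentences = [
    "Paris is the capital of France.",
    "Paris is the largest city in France.",  # mentions both args but not the relation: label noise
]

def distant_label(sentences, kb_triples):
    """Label every sentence containing both entities of a triple."""
    labeled = []
    for sent in sentences:
        for head, rel, tail in kb_triples:
            if head in sent and tail in sent:
                labeled.append((sent, head, rel, tail))
    return labeled

for sent, head, rel, tail in distant_label(sentences, kb_triples):
    print(rel, "<-", sent)
```

Both sentences receive the `capital_of` label even though only the first expresses it, which is why attention over instances (or words) is needed to suppress the mislabeled examples.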
Attribute Completion: Attributes describe entities; many entities lack attributes in existing graphs. Multi-source cues (search results, online encyclopedias, core words) provide candidate attribute labels, which are ranked heuristically. Concept-based attribute recommendation uses shared hypernyms (e.g., "pain-relief tablet") to transfer attributes between entities. A model encodes concept paths with LSTM, applies attention to weight multiple paths, and scores entity-attribute pairs.
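The hypernym-based transfer idea can be sketched without the neural scorer: entities sharing a concept inherit each other's attributes as candidates, ranked by how many sibling entities carry the attribute. The toy taxonomy and attribute sets below are assumptions for illustration; the actual model encodes concept paths with an LSTM and weights multiple paths with attention rather than counting siblings.

```python
# Concept-based attribute recommendation sketch: transfer attributes
# between entities that share a hypernym, ranked by sibling support.
from collections import Counter

hypernym = {  # toy entity -> concept taxonomy (illustrative)
    "ibuprofen tablet": "pain-relief tablet",
    "aspirin tablet": "pain-relief tablet",
    "naproxen tablet": "pain-relief tablet",
}
attributes = {  # known attributes per entity (illustrative)
    "ibuprofen tablet": {"dosage", "side effects"},
    "aspirin tablet": {"dosage", "shelf life"},
    "naproxen tablet": set(),  # the entity missing attributes
}

def recommend(entity):
    """Rank candidate attributes by how many siblings under the
    shared concept already carry them."""
    concept = hypernym[entity]
    siblings = [e for e in hypernym if hypernym[e] == concept and e != entity]
    counts = Counter(a for e in siblings for a in attributes[e])
    known = attributes[entity]
    return [a for a, _ in counts.most_common() if a not in known]

print(recommend("naproxen tablet"))  # "dosage" ranks first: two siblings share it
```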
Conditional Knowledge Graphs: Traditional KGs store only factual triples; a conditional KG adds condition triples that constrain facts (e.g., "fifth president" → "term 2007–2012"). The proposed dynamic multi-input-output model treats fact and condition extraction as a sequence labeling task, assigning role tags to each token.
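The role-tagging idea can be illustrated with a toy labeled sentence: fact arguments and condition arguments carry distinct tag prefixes, and decoding is just grouping consecutive tokens by role. The tag inventory (`F-*` for fact roles, `C-*` for condition roles) and the example labeling are assumptions for illustration, not the model's exact scheme.

```python
# Sequence-labeling sketch for conditional KG extraction: each token
# gets a role tag; decoding groups consecutive tokens per role prefix.
tokens = ["He", "was", "the", "fifth", "president", "during", "2007-2012"]
tags   = ["F-S", "O",  "O",  "F-O",   "F-O",       "O",      "C-O"]

def collect(tokens, tags, prefix):
    """Group consecutive tokens whose tags share the given role prefix."""
    spans, cur = [], []
    for tok, tag in zip(tokens, tags):
        if tag.startswith(prefix):
            cur.append(tok)
        elif cur:
            spans.append(" ".join(cur))
            cur = []
    if cur:
        spans.append(" ".join(cur))
    return spans

print(collect(tokens, tags, "F"))  # fact arguments: ['He', 'fifth president']
print(collect(tokens, tags, "C"))  # condition arguments: ['2007-2012']
```

Because every token can carry a role, one pass over the sentence yields both the fact ("He", "fifth president") and the condition ("2007-2012") that constrains it.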
The presentation concludes with a summary of the discussed methods and an invitation to follow the DataFunTalk community for further AI and big‑data knowledge sharing.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.