How to Build a Knowledge Graph from Scratch: Bottom‑Up Techniques Explained

This article explains the fundamentals of knowledge graphs, compares top‑down and bottom‑up construction methods, and describes data types, storage options, and the logical and technical architectures. It then walks through the iterative pipeline of information extraction, knowledge fusion, and knowledge processing, followed by knowledge updating and real‑world applications.

Alibaba Cloud Developer

What Is a Knowledge Graph?

A knowledge graph is a structured semantic knowledge base that models concepts and the relationships between them. It breaks document‑level data down into fine‑grained triples (entity‑relation‑entity) that support fast reasoning.
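To make the triple structure concrete, here is a minimal sketch in Python. The `Triple` type, the `index_by_head` helper, and the sample facts are illustrative assumptions, not something the original article defines.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    head: str      # subject entity
    relation: str  # relationship between head and tail
    tail: str      # object entity or literal value


def index_by_head(triples):
    """Group triples by head entity so a lookup does not scan every fact."""
    index = defaultdict(list)
    for triple in triples:
        index[triple.head].append(triple)
    return index


# Hypothetical sample facts, for illustration only.
facts = [
    Triple("Marie Curie", "born_in", "Warsaw"),
    Triple("Marie Curie", "field", "Physics"),
    Triple("Warsaw", "capital_of", "Poland"),
]

by_head = index_by_head(facts)
print([t.tail for t in by_head["Marie Curie"]])
```

Indexing by head entity is only one possible access path; a real store would typically index by relation and tail as well.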

Construction Approaches

Two main strategies exist. The top‑down approach extracts ontologies from structured sources such as encyclopedias. The bottom‑up approach automatically extracts entities, relations, and attributes from heterogeneous raw data. This article focuses on the bottom‑up pipeline.

Data Types and Storage

Raw data falls into three categories: structured (e.g., relational databases), semi‑structured (XML, JSON, wiki pages), and unstructured (free text, images, audio, video). The graph itself can be stored as RDF, for example with a framework such as Apache Jena, or in a graph database such as Neo4j.
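The two storage options represent the same fact differently. The sketch below, offered as an illustration only, renders one fact as an N-Triples line (an RDF serialization) and as a parameterized Cypher `MERGE` statement of the kind Neo4j accepts. The `http://example.org/` namespace, the `Entity` label, and both function names are assumptions made for this example.

```python
import re
from urllib.parse import quote

BASE = "http://example.org/"  # hypothetical namespace for the example


def to_ntriples(head, relation, tail):
    """Serialize one fact as an N-Triples line with IRIs under BASE."""
    def iri(name):
        return "<" + BASE + quote(name.replace(" ", "_")) + ">"
    return f"{iri(head)} {iri(relation)} {iri(tail)} ."


def to_cypher(head, relation, tail):
    """Build a Cypher MERGE statement plus its parameters for the same fact."""
    rel_type = relation.upper()
    # Relationship types cannot be passed as query parameters,
    # so validate the name before interpolating it into the query text.
    if not re.fullmatch(r"[A-Z_][A-Z0-9_]*", rel_type):
        raise ValueError(f"unsafe relationship type: {relation!r}")
    query = (
        "MERGE (a:Entity {name: $head}) "
        "MERGE (b:Entity {name: $tail}) "
        f"MERGE (a)-[:{rel_type}]->(b)"
    )
    return query, {"head": head, "tail": tail}


print(to_ntriples("Marie Curie", "born_in", "Warsaw"))
print(to_cypher("Marie Curie", "born_in", "Warsaw")[0])
```

The code only builds strings; sending them to a triple store or to Neo4j would require the corresponding client library, which is outside this sketch.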

Architecture

The architecture is described at two levels: a logical layer, which divides the graph into a data layer and a schema layer, and a technical layer, which orchestrates the construction process.

Logical Layer

The schema layer sits above the data layer and stores refined knowledge, usually managed as an ontology. The data layer holds the facts themselves, stored as triples.
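One way to picture how the two layers interact is to let the schema layer constrain what the data layer may contain. The following sketch assumes a toy schema that gives each relation a (domain, range) pair; the relation names, entity types, and the `conforms` helper are all hypothetical.

```python
# Schema layer (hypothetical): each relation maps to (head type, tail type).
SCHEMA = {
    "works_at": ("Person", "Organization"),
    "located_in": ("Organization", "Place"),
}

# Entity types known to the graph (hypothetical sample data).
ENTITY_TYPES = {
    "Jane Doe": "Person",
    "Acme Corp": "Organization",
    "Springfield": "Place",
}


def conforms(triple, schema=SCHEMA, types=ENTITY_TYPES):
    """Return True if a data-layer triple respects the schema layer."""
    head, relation, tail = triple
    if relation not in schema:
        return False
    domain, range_ = schema[relation]
    return types.get(head) == domain and types.get(tail) == range_


print(conforms(("Jane Doe", "works_at", "Acme Corp")))
print(conforms(("Acme Corp", "works_at", "Jane Doe")))
```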

Technical Layer

The pipeline iterates through three stages: information extraction, knowledge fusion, and knowledge processing.
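The iteration over the three stages can be sketched as a loop over incoming batches. Everything here is a placeholder under stated assumptions: each "document" is already a `(head, relation, tail, confidence)` tuple, fusion is reduced to name normalization, and processing is reduced to a confidence cut-off. None of the function names come from the original article.

```python
def extract(documents):
    """Stage 1 placeholder: documents are assumed to be candidate tuples already."""
    return list(documents)


def fuse(candidates):
    """Stage 2 placeholder: normalize names, merge duplicates, keep the best confidence."""
    merged = {}
    for head, relation, tail, confidence in candidates:
        key = (head.strip().lower(), relation, tail.strip().lower())
        merged[key] = max(confidence, merged.get(key, 0.0))
    return merged


def process(fused, threshold=0.5):
    """Stage 3 placeholder: keep only triples at or above the confidence threshold."""
    return {triple for triple, confidence in fused.items() if confidence >= threshold}


def build(batches):
    """Run every batch through the three stages and accumulate the graph."""
    graph = set()
    for batch in batches:
        graph |= process(fuse(extract(batch)))
    return graph


batches = [
    [("Alice", "works_at", "Acme", 0.9), (" alice ", "works_at", "ACME", 0.4)],
    [("Bob", "works_at", "Acme", 0.3)],
]
print(build(batches))
```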

1. Information Extraction

Extracts entities, relations, and attributes from heterogeneous sources.

Entity Extraction (NER)

Identifies named entities in text, historically moving from domain‑specific to open‑domain extraction.
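As a deliberately simple illustration of entity extraction, the sketch below uses a dictionary (gazetteer) lookup with longest-match-first ordering. This is a stand-in technique chosen for brevity, not the statistical or neural NER that production systems use, and the gazetteer entries and the `find_entities` name are assumptions.

```python
import re

# Hypothetical gazetteer mapping surface forms to entity types.
GAZETTEER = {"Alibaba Cloud": "ORG", "Alibaba": "ORG", "Hangzhou": "LOC"}


def find_entities(text, gazetteer=GAZETTEER):
    """Return (surface form, type, start offset) for every gazetteer hit."""
    # Longer names first, so "Alibaba Cloud" wins over "Alibaba".
    names = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, names)) + r")\b")
    return [(m.group(), gazetteer[m.group()], m.start()) for m in pattern.finditer(text)]


print(find_entities("Alibaba Cloud is based in Hangzhou."))
```

A gazetteer only finds names it already knows, which is exactly the limitation that open-domain extraction tries to overcome.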

Relation Extraction

Detects semantic links between entities, evolving from rule‑based to statistical and deep‑learning methods.
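The rule-based end of that spectrum can be sketched with a few hand-written patterns. The patterns, relation names, and the capitalized-word heuristic for entity names below are illustrative assumptions; statistical and deep-learning extractors replace such rules with learned models.

```python
import re

# Naive entity-name heuristic: one or more capitalized words.
NAME = r"[A-Z]\w+(?: [A-Z]\w+)*"

# Hypothetical hand-written rules: (compiled pattern, relation name).
RULES = [
    (re.compile(rf"(?P<head>{NAME}) was founded by (?P<tail>{NAME})"), "founded_by"),
    (re.compile(rf"(?P<head>{NAME}) is headquartered in (?P<tail>{NAME})"), "headquartered_in"),
]


def extract_relations(text):
    """Apply every rule to the text and collect (head, relation, tail) triples."""
    triples = []
    for pattern, relation in RULES:
        for m in pattern.finditer(text):
            triples.append((m.group("head"), relation, m.group("tail")))
    return triples


text = "Acme Corp was founded by Jane Doe. Acme Corp is headquartered in Springfield."
print(extract_relations(text))
```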

Attribute Extraction

Collects attribute values (e.g., birthdate, nationality) for each entity, often by treating attribute extraction as a special case of relation extraction.
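Treating an attribute as a relation whose tail is a literal value rather than another entity can be sketched as follows. The two patterns, the attribute names, and the ISO date format are assumptions chosen for this example.

```python
import re
from datetime import date

NAME = r"[A-Z]\w+(?: [A-Z]\w+)*"  # naive entity-name heuristic

# Hypothetical patterns: attribute name -> pattern with entity and value groups.
ATTRIBUTE_PATTERNS = {
    "birthdate": re.compile(rf"(?P<entity>{NAME}) was born on (?P<value>\d{{4}}-\d{{2}}-\d{{2}})"),
    "nationality": re.compile(rf"(?P<entity>{NAME}) is a citizen of (?P<value>{NAME})"),
}


def extract_attributes(text):
    """Return (entity, attribute, value) triples; the value is a literal, not an entity."""
    found = []
    for attribute, pattern in ATTRIBUTE_PATTERNS.items():
        for m in pattern.finditer(text):
            value = m.group("value")
            if attribute == "birthdate":
                value = date.fromisoformat(value)  # store dates as typed literals
            found.append((m.group("entity"), attribute, value))
    return found


text = "Jane Doe was born on 1980-02-29. Jane Doe is a citizen of Canada."
print(extract_attributes(text))
```

The output has the same (head, relation, tail) shape as a relation triple, which is why the two tasks can share one extraction framework.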

2. Knowledge Fusion

Aligns and merges extracted facts to resolve ambiguities.

Entity Linking

Maps extracted mentions to canonical entities in a knowledge base using similarity scoring and collective linking.
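The similarity-scoring part of entity linking can be sketched with string similarity from Python's standard library. The tiny knowledge base, its identifiers, the 0.8 threshold, and the `link` function are assumptions for illustration; collective linking, which scores several mentions jointly, is not shown.

```python
from difflib import SequenceMatcher

# Hypothetical knowledge base: canonical ID -> known aliases.
KB = {
    "kb:alibaba_cloud": ["Alibaba Cloud", "Aliyun"],
    "kb:alibaba_group": ["Alibaba Group", "Alibaba"],
}


def link(mention, kb=KB, threshold=0.8):
    """Return (entity ID or None, best score) for the closest alias match."""
    best_id, best_score = None, 0.0
    for entity_id, aliases in kb.items():
        for alias in aliases:
            score = SequenceMatcher(None, mention.lower(), alias.lower()).ratio()
            if score > best_score:
                best_id, best_score = entity_id, score
    if best_score < threshold:
        return None, best_score  # too dissimilar: treat as a new or unknown entity
    return best_id, best_score


print(link("alibaba cloud"))
print(link("Tencent"))
```

Surface similarity alone cannot separate entities that share a name, which is the gap that context features and collective linking are meant to close.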

Knowledge Merging

Combines external structured sources (e.g., existing ontologies, relational databases) with the extracted graph, handling conflicts via schema alignment.
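A minimal sketch of merging a relational source into the graph follows. It assumes a hand-written column-to-relation mapping as the schema alignment and resolves conflicts by preferring one side; the column names, the mapping, and both function names are hypothetical.

```python
# Hypothetical schema alignment: external column name -> graph relation name.
SCHEMA_MAP = {"dob": "birthdate", "country": "nationality"}


def rows_to_triples(rows, key="name", schema_map=SCHEMA_MAP):
    """Turn relational rows into triples, skipping columns with no alignment."""
    for row in rows:
        for column, value in row.items():
            if column != key and column in schema_map and value is not None:
                yield (row[key], schema_map[column], value)


def merge(graph, external, prefer_external=True):
    """Merge triples into a {(head, relation): tail} graph and report conflicts."""
    merged = dict(graph)
    conflicts = []
    for head, relation, tail in external:
        existing = merged.get((head, relation))
        if existing is not None and existing != tail:
            conflicts.append((head, relation, existing, tail))
            if not prefer_external:
                continue  # keep the value already in the graph
        merged[(head, relation)] = tail
    return merged, conflicts


graph = {("Jane Doe", "nationality"): "Canada", ("Jane Doe", "birthdate"): "1980-02-28"}
rows = [{"name": "Jane Doe", "dob": "1980-02-29", "country": "Canada", "shoe_size": 38}]
merged, conflicts = merge(graph, rows_to_triples(rows))
print(merged)
print(conflicts)
```

The dictionary keyed by (head, relation) assumes single-valued relations; multi-valued relations would need a set of tails per key.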

3. Knowledge Processing

Transforms the fused graph into a usable knowledge base.

Ontology Construction

Builds or extends ontologies by computing entity similarity, extracting hierarchical (is‑a) relations, and generating the final schema.
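To illustrate the is‑a extraction step only, the sketch below applies a Hearst-style lexical pattern ("X such as A, B, and C"). This is a swapped-in, well-known heuristic rather than the method the article describes, and the pattern and the `extract_is_a` name are assumptions. It will misfire on sentences where text follows the list without punctuation.

```python
import re

# Hearst-style pattern: "<parent> such as <child>, <child>, and <child>".
HEARST = re.compile(r"(?P<parent>\w+) such as (?P<children>[^.;]+)")

# Separators between list items: commas (optionally followed by and/or) or bare and/or.
SEPARATOR = re.compile(r"\s*,\s*(?:and\s+|or\s+)?|\s+(?:and|or)\s+")


def extract_is_a(text):
    """Return (child, parent) pairs suggested by the pattern."""
    pairs = []
    for m in HEARST.finditer(text):
        for child in SEPARATOR.split(m.group("children")):
            child = child.strip()
            if child:
                pairs.append((child, m.group("parent")))
    return pairs


print(extract_is_a("The store sells fruits such as apples, pears, and plums."))
```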

Knowledge Reasoning

Applies logical, graph‑based, or deep‑learning inference to infer missing relations or attribute values (e.g., deducing a person’s residence from spouse and organization links).
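The logical flavor of reasoning can be sketched with a single hand-written rule in the spirit of the residence example: if two people are spouses and one has a known residence, propose the same residence for the other. The rule, the relation names, and the `infer_residence` function are assumptions, and the output should be read as candidate facts, not certainties.

```python
def infer_residence(triples):
    """Propose lives_in facts for people whose spouse has a known residence."""
    residence = {h: t for h, r, t in triples if r == "lives_in"}
    inferred = set()
    for head, relation, tail in triples:
        if relation != "spouse_of":
            continue
        # Treat spouse_of as symmetric and check both directions.
        for person, partner in ((head, tail), (tail, head)):
            if person not in residence and partner in residence:
                inferred.add((person, "lives_in", residence[partner]))
    return inferred


# Hypothetical sample facts.
facts = [
    ("Ann", "spouse_of", "Ben"),
    ("Ben", "lives_in", "Oslo"),
    ("Cy", "lives_in", "Rome"),
]
print(infer_residence(facts))
```

Inferred triples like these would normally pass through quality assessment before entering the graph.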

Quality Assessment

Assigns a confidence score to each triple and discards those that score too low, so that only reliable knowledge enters the graph.
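A minimal sketch of confidence filtering follows. It assumes that each extractor reports its own confidence for a triple and combines independent observations with a noisy-OR rule; the combination rule, the 0.7 threshold, and the function names are assumptions rather than the article's method.

```python
from collections import defaultdict
from math import prod


def score(observations):
    """Combine per-source confidences for each triple with a noisy-OR rule."""
    by_triple = defaultdict(list)
    for triple, confidence in observations:
        by_triple[triple].append(confidence)
    # A triple is wrong only if every independent observation is wrong.
    return {t: 1.0 - prod(1.0 - c for c in cs) for t, cs in by_triple.items()}


def keep_reliable(observations, threshold=0.7):
    """Return the triples whose combined confidence reaches the threshold."""
    return {t for t, s in score(observations).items() if s >= threshold}


# Hypothetical observations: (triple, confidence reported by one source).
observations = [
    (("Jane Doe", "works_at", "Acme Corp"), 0.6),
    (("Jane Doe", "works_at", "Acme Corp"), 0.5),
    (("Jane Doe", "lives_in", "Atlantis"), 0.3),
]
print(keep_reliable(observations))
```

Here the first triple is seen twice with moderate confidence and survives, while the single weak observation is dropped.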

Knowledge Update

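Updates occur at two levels: the schema (concept) layer, when new concepts are added, and the data layer, when triples are added or changed. Two update strategies are supported: a full rebuild from all data, and an incremental update that processes only the new data.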
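The difference between the two strategies can be sketched at the data layer: an incremental update applies only the changed triples, and should arrive at the same graph a full rebuild would produce. The `diff` and `incremental_update` helpers and the sample facts are assumptions for illustration.

```python
def diff(old, new):
    """Return (added, removed) triples between two snapshots of the data layer."""
    old, new = set(old), set(new)
    return new - old, old - new


def incremental_update(graph, added=(), removed=()):
    """Apply a change set to a copy of the graph without rebuilding it."""
    updated = set(graph)
    updated.difference_update(removed)
    updated.update(added)
    return updated


# Hypothetical snapshots: the current graph and what a full rebuild would yield.
graph = {
    ("Jane Doe", "works_at", "Acme Corp"),
    ("Jane Doe", "lives_in", "Springfield"),
}
rebuilt = {
    ("Jane Doe", "works_at", "Globex"),
    ("Jane Doe", "lives_in", "Springfield"),
}

added, removed = diff(graph, rebuilt)
print(incremental_update(graph, added, removed) == rebuilt)
```

In practice the change set comes from newly arrived data rather than from a finished rebuild; the comparison here only demonstrates that both routes converge.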

Applications

Intelligent search and entity‑centric results.

Person‑relationship graphs for richer profiling.

Fraud detection using heterogeneous relational patterns.

Consistency verification via reasoning.

Anomaly analysis (static and dynamic) on graph structures.

Lost‑customer management and outreach.


Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
