How GraphRAG Boosts Answer Accuracy with Knowledge Graphs (Part 1)

This article explains GraphRAG’s architecture, compares it with traditional RAG, and presents experimental results showing that GraphRAG’s knowledge‑graph‑driven retrieval markedly improves answer accuracy, especially on low‑match, multi‑paragraph queries.

Fun with Large Models

Retrieval‑augmented generation (RAG) is a core technique for large language models, but traditional RAG suffers from reduced precision as text length grows, low matching for complex questions, and difficulty handling cross‑segment entity information. The author outlines these limitations and motivates a graph‑based approach.

GraphRAG Development Timeline

Feb 2024 – Microsoft Research blog introduced the basic GraphRAG concept.

Apr 2024 – Microsoft published the paper “From Local to Global: A Graph RAG Approach to Query‑Focused Summarization,” providing academic validation.

Jun 2024 – Microsoft officially released the GraphRAG project.

Nov 2024 – Microsoft added the dynamic search component DRIFT search (`drift_search`), completing the current GraphRAG framework.

How GraphRAG Works

Traditional RAG splits documents into short chunks, matches each chunk to the query, and feeds the matched passages to an LLM via prompts. GraphRAG instead first constructs a large knowledge graph from the entire input using an LLM, extracts core topics, and then matches the user query against graph entities and relationships. Relevant passages are retrieved by following these graph connections and are finally injected into the LLM for answer generation. This graph‑based linking enables cross‑segment reasoning that traditional RAG lacks.

Overall, GraphRAG pre‑builds a document‑specific knowledge graph, and retrieval operates on this graph, yielding more logical and complete answers.
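The retrieval step above can be sketched with a toy in-memory graph. All of the data and helper names below are hypothetical; real GraphRAG extracts entities and relationships with LLM prompts and stores the index on disk, but the core idea is the same: entities link passages that never co-occur in a single chunk.

```python
# Toy sketch of GraphRAG-style retrieval (hypothetical data and helpers).
# Entities link passages across segments, enabling cross-segment answers.
passages = {
    "p1": "K-Means partitions points into K clusters around centroids.",
    "p2": "Mini Batch K-Means speeds up K-Means on large datasets.",
    "p3": "DBSCAN groups points by density instead of distance to a centroid.",
}
# Hypothetical extraction output (real GraphRAG derives this via LLM prompts).
entity_to_passages = {
    "K-Means": {"p1", "p2"},
    "Mini Batch K-Means": {"p2"},
    "DBSCAN": {"p3"},
}
relations = {  # undirected entity-entity edges
    "K-Means": {"Mini Batch K-Means"},
    "Mini Batch K-Means": {"K-Means"},
    "DBSCAN": set(),
}

def graph_retrieve(query: str, hops: int = 1) -> set:
    """Match query terms to entities, expand over relations, collect passages."""
    seeds = {e for e in entity_to_passages if e.lower() in query.lower()}
    frontier = set(seeds)
    for _ in range(hops):
        frontier |= {n for e in frontier for n in relations.get(e, ())}
    return {p for e in frontier for p in entity_to_passages[e]}

# A query naming only "K-Means" also pulls in the Mini Batch passage
# through the relationship edge, which pure chunk matching would miss.
print(sorted(graph_retrieve("explain K-Means clustering")))  # ['p1', 'p2']
```

Plain chunk matching would return only the single best-matching passage; the one-hop graph expansion is what surfaces the related Mini Batch K-Means segment.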

Retrieval Characteristics

High accuracy but high token consumption (graph construction uses many model tokens).

Performance heavily depends on the underlying LLM.

Flexibility varies with graph‑construction strategies.

Initially, Microsoft suggested that only GPT‑3.5/4‑class models could meet GraphRAG's token demands, but newer models such as DeepSeek, Qwen, Gemini, and Claude have largely removed this bottleneck.

Performance Comparison Experiment

The author evaluated GraphRAG against the open‑source RAG framework RagFlow using the Microsoft GraphRAG GitHub repository (https://github.com/microsoft/graphrag) and a 600‑page PDF of the book “Classic Machine Learning,” which contains code, text, images, and execution results.

2.1 Small‑Scope Targeted Questions

Two questions whose answers reside in a single document segment were posed to both systems.

Q: Please introduce the basic principles of K-Means clustering.
A: K-Means is a distance-based clustering algorithm. Its basic principle is to partition the points in a dataset into K clusters so that each point lies as close as possible to the center of its assigned cluster. The following are the basic principles and execution flow of K-Means clustering: ...
Q: How many decision-tree algorithms does Lesson 8.3 introduce in total?
A: The document introduces the following decision-tree algorithms: ID3, C4.5, and C5.0. It explicitly mentions these three algorithms and describes their background, characteristics, and relationships in detail.

Traditional RAG performed adequately on these focused queries.

2.2 Broad Low‑Match Questions

When asked a broad question that lacks an exact matching paragraph, traditional RAG produced vague or incorrect answers:

Q: What do you think of the quality of this book?
A: The document does not directly comment on the book's quality, so that information cannot come from the document. Based on its content, we can see that the book covers basic machine-learning concepts, the modeling workflow, and training linear regression models, but it contains no evaluation of the book's quality.

Another example:

Q: The content in your knowledge base is the book "Classic Machine Learning". How many machine-learning algorithms does it introduce in total?
A: The document does not explicitly state how many machine-learning algorithms it covers. Its content mainly revolves around basic machine-learning concepts and the modeling workflow and does not enumerate specific algorithms, so the number of algorithms introduced cannot be determined from the provided content.

These failures stem from the absence of a directly relevant paragraph.
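A toy lexical-similarity check illustrates the failure mode: a focused question shares many terms with some chunk, while a broad "about the whole book" question overlaps with none. The chunks and queries below are invented, and bag-of-words cosine stands in for the embedding similarity a real RAG system would use.

```python
from collections import Counter
import math

def cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity (a stand-in for embedding similarity)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    na = math.sqrt(sum(v * v for v in va.values()))
    nb = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

# Invented stand-ins for indexed document chunks.
chunks = [
    "K-Means partitions a dataset into K clusters around centroids.",
    "Gradient descent updates weights to minimize the loss function.",
    "Decision trees split data by the feature with the highest gain.",
]

focused = "how does K-Means partition a dataset into clusters"
broad = "how many machine learning algorithms does this book cover"

# The focused query matches one chunk well; the broad query matches nothing,
# so chunk-based retrieval has no relevant passage to hand the LLM.
print(max(cosine(focused, c) for c in chunks))
print(max(cosine(broad, c) for c in chunks))
```

No single chunk is "about" the whole book, so every chunk scores near zero for the broad query, and the LLM receives no usable context.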

GraphRAG, however, answered the same low‑match question by extracting multiple relevant algorithms from the entire book:

Q: The content in your knowledge base is the book "Classic Machine Learning". How many machine-learning algorithms does it introduce in total?
A: The material introduces the following machine-learning algorithms: K-Means clustering (including K-Means and Mini Batch K-Means), DBSCAN density clustering, linear regression, ...

GraphRAG not only provided a direct answer but also supplemented it with additional context, improving completeness.
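Conceptually, this works because the "how many algorithms" question becomes an aggregation over typed entities in the graph rather than a search for one matching paragraph. The sketch below uses an invented entity schema; real GraphRAG answers such global queries by summarizing over LLM-generated community reports, not a simple filter.

```python
# Toy global query over a GraphRAG-style entity index (hypothetical data).
# Entities carry a type tag, so a book-wide question reduces to an
# aggregation over the whole graph instead of single-chunk matching.
entities = [
    {"name": "K-Means", "type": "algorithm", "source": "clustering lesson"},
    {"name": "Mini Batch K-Means", "type": "algorithm", "source": "clustering lesson"},
    {"name": "DBSCAN", "type": "algorithm", "source": "density lesson"},
    {"name": "Linear Regression", "type": "algorithm", "source": "regression lesson"},
    {"name": "scikit-learn", "type": "library", "source": "intro lesson"},
]

# Filter by type, then aggregate: non-algorithm entities are excluded.
algorithms = sorted(e["name"] for e in entities if e["type"] == "algorithm")
print(f"{len(algorithms)} algorithms: {', '.join(algorithms)}")
```

Because each entity also records its source segment, the answer can cite passages scattered across the whole book, which is exactly what chunk-level retrieval could not assemble.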

Conclusion

The article demonstrates that GraphRAG’s knowledge‑graph construction enables higher answer accuracy and better handling of low‑match, multi‑paragraph queries compared with traditional RAG. Its flexibility and reliance on powerful LLMs are trade‑offs, but recent model advances mitigate the performance bottleneck.

Tags: large language models, RAG, performance evaluation, knowledge graph, GraphRAG, retrieval-augmented generation
Written by

Fun with Large Models

Master's graduate from Beijing Institute of Technology, published four top‑journal papers, previously worked as a developer at ByteDance and Alibaba. Currently researching large models at a major state‑owned enterprise. Committed to sharing concise, practical AI large‑model development experience, believing that AI large models will become as essential as PCs in the future. Let's start experimenting now!
