Boosting RAG Accuracy with GraphRAG: Large‑Scale KG Build and Mode Comparison

This article demonstrates how to construct a knowledge graph from a 58,099‑character document using GraphRAG, measures the indexing time on an M2 PRO machine, and compares Global, Local, and Drift query modes, highlighting three scenarios where Local mode is unsuitable.

Fun with Large Models

Environment Preparation

Place the public document "机器学习决策树算法详解" ("A Detailed Guide to Machine Learning Decision Tree Algorithms", 58,099 characters) in ./openl_big/input, creating the folder first with mkdir -p ./openl_big/input. Initialize the project with graphrag init --root ./openl_big, which generates the prompts directory, .env, and settings.yaml. In settings.yaml, replace the default OpenAI models with SiliconFlow services: set the chat model to Qwen/Qwen3-8B and the embedding model to BAAI/bge-m3, and configure the API base URL, API key, and concurrency parameters accordingly.
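The relevant settings.yaml changes might look roughly like the sketch below. Note this is an illustrative fragment, not a complete file: exact section and field names vary between GraphRAG versions (check the settings.yaml that graphrag init generated), and the SiliconFlow base URL and the environment-variable name for the key are assumptions to verify against your provider's documentation.

```yaml
models:
  default_chat_model:
    type: openai_chat
    api_base: https://api.siliconflow.cn/v1   # assumed SiliconFlow endpoint
    api_key: ${GRAPHRAG_API_KEY}              # read from .env
    model: Qwen/Qwen3-8B
    concurrent_requests: 5                    # tune to your rate limits
  default_embedding_model:
    type: openai_embedding
    api_base: https://api.siliconflow.cn/v1
    api_key: ${GRAPHRAG_API_KEY}
    model: BAAI/bge-m3
```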

Knowledge Graph Construction

Run graphrag index --root ./openl_big to build the KG. On an M2 PRO machine using the Qwen3-8B model with the no_think prompt, indexing the 58,099-character document takes about 30 minutes. The resulting tables are stored under ./openl_big/output as Parquet files; with pandas, the entity table, relationship table, and community report table can be inspected.
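A minimal sketch of that pandas inspection, using small in-memory stand-ins for the output tables rather than reading the real Parquet files (file names such as entities.parquet and the exact column names vary by GraphRAG version, so treat the schema below as an assumption and check your own output folder):

```python
import pandas as pd

# Hypothetical miniatures of the tables GraphRAG writes under ./openl_big/output.
# Real usage would be e.g. pd.read_parquet("./openl_big/output/entities.parquet").
entities = pd.DataFrame({
    "title": ["ID3", "C4.5", "CART"],
    "type": ["ALGORITHM"] * 3,
    "description": ["uses information gain", "uses gain ratio", "uses Gini index"],
})
relationships = pd.DataFrame({
    "source": ["ID3", "C4.5"],
    "target": ["C4.5", "CART"],
    "description": ["C4.5 extends ID3", "both are decision-tree learners"],
})

# Typical inspection: count entities per type, then join each relationship's
# source entity back to its description.
counts = entities["type"].value_counts()
joined = relationships.merge(
    entities[["title", "description"]],
    left_on="source", right_on="title", suffixes=("", "_source"),
)
print(counts.to_dict())   # {'ALGORITHM': 3}
print(len(joined))        # 2
```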

Retrieval Query Modes

Local mode: matches entities from the query against the entity table, retrieves related entities and their text blocks, and answers from those blocks.

Global mode: first matches the query to community reports, summarizes the matched reports with an LLM, then performs a second LLM call to answer using the summary and linked entities; this consumes more tokens.

Drift mode: matches query entities to KG entities, expands hierarchically to related entities, and incorporates their text into the answer.

Comparison of Global and Local Modes

Summary‑type question

Query: “请问文档中总共介绍了几种决策树算法?” (“How many decision tree algorithms does the document introduce in total?”)

Local mode command:

graphrag query --root openl_big --method local --query "请问文档中总共介绍了几种决策树算法?"

Result: no answer is returned, because the query contains no entity that appears in the entity table.

Global mode command:

graphrag query --root openl_big --method global --query "请问文档中总共介绍了几种决策树算法?"

Result: answer returned together with the related community report.

Comprehensive evaluation question

Query: “你觉得文档中介绍的决策树算法,哪个算法最有应用前景?” (“Of the decision tree algorithms introduced in the document, which do you think has the most promising applications?”)

Local mode fails for the same reason (no matching entity). Global mode succeeds, providing an answer and citing the relevant community report.

Subjective evaluation question

Query: “你觉得这个文档写得如何?” (“What do you think of the quality of this document?”)

Local mode cannot answer; Global mode can answer by leveraging community reports.

Practical Guidance on Mode Selection

Use Local mode for queries that can be satisfied by direct entity‑text matches; it offers slightly better performance and faster response than traditional RAG.

Use Global mode for summary, comprehensive evaluation, or subjective questions, as it can retrieve and synthesize information from community reports.
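The guidance above suggests a simple routing heuristic: send a query to Local mode only when it names an entity from the entity table, otherwise fall back to Global mode. The sketch below is a hypothetical illustration of that idea, with an invented entity set; a production router would load entity names from the indexed output instead.

```python
# Invented entity names for illustration; in practice these would be
# loaded from the entity table GraphRAG produced during indexing.
KNOWN_ENTITIES = {"ID3", "C4.5", "CART", "信息增益"}

def choose_mode(query: str) -> str:
    """Route to Local mode on an entity match, else Global mode."""
    if any(entity in query for entity in KNOWN_ENTITIES):
        return "local"
    return "global"  # summary, evaluative, and subjective questions

print(choose_mode("ID3 和 C4.5 有什么区别?"))            # local
print(choose_mode("请问文档中总共介绍了几种决策树算法?"))  # global
```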

Conclusion

The KG was built from a document of tens of thousands of characters with GraphRAG, with an indexing time of roughly 30 minutes on an M2 PRO. The experiments demonstrate that Global mode can answer summary-style and subjective questions that Local mode cannot, thanks to its use of community reports. For corpora larger than roughly 20 GB, GraphRAG retrieval slows down; lighter-weight alternatives such as LightRAG and nano-graphrag are worth noting.

Tags: RAG, Knowledge Graph, GraphRAG, local mode, query modes, global mode
Written by

Fun with Large Models

Master's graduate from Beijing Institute of Technology, published four top‑journal papers, previously worked as a developer at ByteDance and Alibaba. Currently researching large models at a major state‑owned enterprise. Committed to sharing concise, practical AI large‑model development experience, believing that AI large models will become as essential as PCs in the future. Let's start experimenting now!
