Building a Private LLM‑Powered Knowledge Base with LangChain and ChatGLM3
This article explains how to migrate personal notes into a private knowledge base by combining a large language model with an external vector store, detailing the concepts of tokenization, embedding, vector databases, and step‑by‑step deployment using LangChain‑Chatchat and the open‑source ChatGLM3 model.
After moving personal notes from Yuque to Obsidian, the author decided to reuse the accumulated markdown files by building a private knowledge base that leverages a large language model (LLM) to retrieve and synthesize relevant information.
What is an external knowledge base?
ChatGPT can generate fluent text and perform reasoning, but its answers are limited to its training data (with a cutoff in 2022), and it cannot directly access user-provided documents. Pasting documents into the prompt works only up to the model's context length, so the problem becomes a retrieval task: store the documents in a vector store, retrieve the chunks most similar to the query, and include only those chunks in the prompt.
The pipeline consists of tokenizing the text, converting tokens into embeddings, and persisting the vectors in a vector database (e.g., FAISS).
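The pipeline above can be illustrated with a minimal, self-contained sketch. This is a toy: the `embed` function below hashes character trigrams into a fixed-size count vector purely for demonstration, whereas a real deployment would use a trained embedding model (such as text2vec-bge-large-chinese) and a vector database such as FAISS for the similarity search.

```python
import math
import zlib


def embed(text, dims=256):
    # Toy embedding: hash each character trigram into a bucket and count.
    # Overlapping trigrams between two texts land in the same buckets,
    # so similar texts get similar vectors. Not a real embedding model.
    vec = [0.0] * dims
    for i in range(len(text) - 2):
        vec[zlib.crc32(text[i:i + 3].encode("utf-8")) % dims] += 1.0
    return vec


def cosine(a, b):
    # Cosine similarity: dot product of the vectors over the product of their norms.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_k(query, chunks, k=2):
    # Retrieve the k chunks most similar to the query, by cosine similarity.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


chunks = [
    "FAISS is a library for efficient similarity search over dense vectors.",
    "Obsidian stores notes as plain markdown files on local disk.",
    "Embeddings map text into vectors so similar text lands close together.",
]
print(top_k("How does similarity search over vectors work?", chunks, k=1))
```

In the real system, the vectors are persisted once at indexing time, and only the query is embedded at question time; the toy version recomputes chunk embeddings per query for brevity.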
Solution selection
Four main approaches are considered: (1) ChatGPT + fine‑tuning, (2) ChatGPT + retrieval plugin, (3) open‑source LLM + fine‑tuning, and (4) LangChain + open‑source LLM. The author chose option 4 to avoid OpenAI dependencies and reduce complexity, selecting LangChain‑Chatchat together with the Chinese open‑source model ChatGLM3‑6B.
LangChain‑Chatchat overview
LangChain‑Chatchat implements the following workflow:
load files → read text → split text → embed text → embed query → retrieve top-k similar chunks → construct prompt → call LLM → generate answer

The framework provides an out-of-the-box solution that can be privately deployed.
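The "construct prompt" step of this workflow can be sketched as follows. The exact template LangChain-Chatchat uses differs; this hypothetical `build_prompt` only illustrates the idea of placing the retrieved chunks before the question so the LLM answers from them.

```python
def build_prompt(question, chunks):
    # Assemble a retrieval-augmented prompt: retrieved context first,
    # then the user's question, with an instruction to stay grounded.
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


prompt = build_prompt(
    "How do I create a knowledge base?",
    [
        "Click 'New Knowledge Base' in the web UI.",
        "Upload markdown files to index them.",
    ],
)
print(prompt)
```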
Environment preparation and installation
Clone the required repositories and set up the environment:
# /mnt/workspace/
git clone https://www.modelscope.cn/ZhipuAI/chatglm3-6b.git
git clone https://www.modelscope.cn/Jerry0/text2vec-bge-large-chinese.git
git clone https://github.com/chatchat-space/Langchain-Chatchat.git
cd Langchain-Chatchat
git checkout v0.2.2
pip install -r requirements.txt
pip install -r requirements_api.txt
pip install -r requirements_webui.txt
cp configs/model_config.py.example configs/model_config.py
cp configs/server_config.py.example configs/server_config.py

Modify configs/model_config.py to point to the downloaded embedding model and LLM:
embedding_model_dict = {
    "text2vec-base": "/mnt/workspace/text2vec-bge-large-chinese",
    ...
}
llm_model_dict = {
    "chatglm3-6b": {
        "local_model_path": "/mnt/workspace/chatglm3-6b",
        "api_base_url": "",
        "api_key": "",
    },
    ...
}
EMBEDDING_MODEL = "text2vec-base"
LLM_MODEL = "chatglm3-6b"

Start the service:
python startup.py -a

When the startup message appears, open the provided URL to access the web UI.
Practical usage
The system can be used in pure chat mode or with an attached knowledge base. After importing documents and creating a knowledge base, queries are answered by retrieving relevant chunks and letting the LLM generate a response. Demo screenshots show successful retrieval from a previously written article on front‑end RBAC.
Conclusion
The experiment demonstrates that a locally deployed LLM combined with a vector store can effectively recycle personal notes and provide private, document‑grounded Q&A. Future improvements include using larger models (e.g., ChatGLM‑130B) for better performance.