Step-by-Step Guide to Deploying LangChain‑Chatchat with ChatGLM‑2 on a Local Machine
This article provides a comprehensive tutorial on setting up the LangChain‑Chatchat open‑source project, covering environment preparation, model and embedding downloads, configuration files, database initialization, API service launch, and example curl commands for interacting with both the large language model and the local knowledge base.
I. Introduction
LangChain‑Chatchat is an open‑source question‑answering application built on LangChain concepts, designed to run offline with a local knowledge base and support Chinese scenarios and open‑source models.
The system creates embeddings (using m3e‑base in this example) for uploaded documents, stores them in a VectorStore, and retrieves the most relevant vectors to construct prompts for the LLM.
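The retrieve-then-prompt loop can be sketched in a few lines of plain Python. This is a toy illustration only: the helper names and the cosine-similarity scoring are illustrative assumptions, while the real project computes embeddings with m3e-base and stores them in a VectorStore such as FAISS.

```python
# Toy sketch of retrieval-augmented prompting: score document vectors
# against a query vector, keep the top-k, and stuff them into a prompt.
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, doc_vecs, k=2):
    """Indices of the k document vectors most similar to the query."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]

def build_prompt(question, contexts):
    """Stuff the retrieved passages into a prompt for the LLM."""
    joined = "\n".join(contexts)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"
```

In the real system the vectors come from the embedding model and the nearest-neighbor search is done by the vector store, but the overall data flow is the same.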
II. Environment Setup
A. Python Environment & Dependencies
Ensure your GPU’s CUDA version matches the Docker image, or install the environment manually. Use Python 3.8.10 and CUDA 11.7, then install the required packages:

pip install -r requirements.txt
pip install -r requirements_api.txt
pip install -r requirements_webui.txt

B. Model Download and Preparation
Install Git LFS to handle large binary files, then clone the ChatGLM‑2 model and the embedding model:
$ git clone https://huggingface.co/THUDM/chatglm2-6b
$ git clone https://huggingface.co/moka-ai/m3e-base

C. Configurations
Copy the example configuration files and rename them:
./configs/model_config.py.example → ./configs/model_config.py
./configs/server_config.py.example → ./configs/server_config.py

Example snippets:
llm_model_dict = {
    "chatglm2-6b": {
        "local_model_path": "/Users/xxx/Downloads/chatglm2-6b",
        "api_base_url": "http://localhost:8888/v1",
        "api_key": "EMPTY"
    }
}

embedding_model_dict = {
    "m3e-base": "/Users/xxx/Downloads/m3e-base"
}

III. Running the Code
A. Initialize Knowledge Base
First run:
$ python init_database.py --recreate-vs

Subsequent runs:
$ python init_database.py

B. Large Model Service (llm_api.py)
Adjust the port if needed (e.g., change 8888 to 8880) and start the service. Verify with a simple OpenAI‑compatible request:
import openai

openai.api_key = "EMPTY"
openai.api_base = "http://localhost:8880/v1"

model = "chatglm2-6b"
completion = openai.ChatCompletion.create(
    model=model,
    messages=[{"role": "user", "content": "Hello! What is your name?"}]
)
print(completion.choices[0].message.content)

C. API Service (api.py)
Start the API server:
python api.py

Visit http://localhost:7861/docs to view and test the Swagger documentation.
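To confirm the server is up without opening a browser, you can fetch the OpenAPI schema that FastAPI serves at /openapi.json (the same schema that backs the /docs Swagger page). A minimal stdlib sketch; the helper names are my own:

```python
import json
import urllib.request

def list_paths(schema):
    """Given a FastAPI OpenAPI schema (a dict), return its route paths sorted."""
    return sorted(schema.get("paths", {}))

def fetch_schema(base="http://localhost:7861"):
    """Fetch /openapi.json from the running api.py server."""
    with urllib.request.urlopen(base + "/openapi.json") as resp:
        return json.loads(resp.read().decode("utf-8"))

# With the server running:
# print(list_paths(fetch_schema()))
```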
IV. API Call Examples
A. Direct LLM Conversation
curl -X 'POST' \
  'http://localhost:7861/chat/fastchat' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "chatglm2-6b",
    "messages": [{"role": "user", "content": "hello"}],
    "temperature": 0.7,
    "max_tokens": 1024,
    "stream": false
  }'

B. Knowledge Base Conversation
curl -X 'POST' \
  'http://localhost:7861/chat/knowledge_base_chat' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "你好",
    "knowledge_base_name": "samples",
    "top_k": 5,
    "score_threshold": 1,
    "history": [],
    "stream": false,
    "local_doc_url": false
  }'

C. Creating a New Knowledge Base
curl -X 'POST' \
  'http://localhost:7861/knowledge_base/create_knowledge_base' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "knowledge_base_name": "test",
    "vector_store_type": "faiss",
    "embed_model": "m3e-base"
  }'

Response example (the msg field means "Knowledge base test created"):

{ "code": 200, "msg": "已新增知识库 test" }

D. Uploading Documents to the Knowledge Base
curl -X 'POST' \
  'http://localhost:7861/knowledge_base/upload_doc' \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F 'file=@filename;type=text/plain' \
  -F 'knowledge_base_name=test' \
  -F 'override=false' \
  -F 'not_refresh_vs_cache=false'

Response example:

{ "page_content": "your document content", "metadata": {"source": "path to the document"} }

These steps enable you to set up, configure, and interact with a locally hosted LLM-powered knowledge base using LangChain-Chatchat.
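As a recap, the JSON endpoints above can also be driven from Python with only the standard library. A hedged sketch: the helper names are mine, while the endpoints, field names, and defaults mirror the curl examples in this section.

```python
# Recap sketch: building and posting the JSON bodies from the curl examples.
import json
import urllib.request

API = "http://localhost:7861"  # the api.py server from section III

def build_create_kb(name, vector_store="faiss", embed_model="m3e-base"):
    """Body for /knowledge_base/create_knowledge_base."""
    return {
        "knowledge_base_name": name,
        "vector_store_type": vector_store,
        "embed_model": embed_model,
    }

def build_kb_chat(query, kb_name, top_k=5, score_threshold=1):
    """Body for /chat/knowledge_base_chat."""
    return {
        "query": query,
        "knowledge_base_name": kb_name,
        "top_k": top_k,
        "score_threshold": score_threshold,
        "history": [],
        "stream": False,
        "local_doc_url": False,
    }

def post_json(path, payload):
    """POST a JSON body to the api.py server and parse the JSON reply."""
    req = urllib.request.Request(
        API + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "accept": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

# With api.py running:
# post_json("/knowledge_base/create_knowledge_base", build_create_kb("test"))
# post_json("/chat/knowledge_base_chat", build_kb_chat("你好", "test"))
```

Note that uploading documents uses multipart/form-data rather than JSON, so for that endpoint the curl example above (or a multipart-capable HTTP client) is the simpler route.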