
Step-by-Step Guide to Deploying LangChain‑Chatchat with ChatGLM‑2 on a Local Machine

This article provides a comprehensive tutorial on setting up the LangChain‑Chatchat open‑source project, covering environment preparation, model and embedding downloads, configuration files, database initialization, API service launch, and example curl commands for interacting with both the large language model and the local knowledge base.

Rare Earth Juejin Tech Community

I. Introduction

LangChain‑Chatchat is an open‑source question‑answering application built on LangChain concepts, designed to run offline with a local knowledge base and support Chinese scenarios and open‑source models.

The system creates embeddings (using m3e‑base in this example) for uploaded documents, stores them in a VectorStore, and retrieves the most relevant vectors to construct prompts for the LLM.
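As a rough sketch of that embed–store–retrieve–prompt flow, the pure-Python snippet below uses toy 3-dimensional vectors in place of real m3e-base embeddings and a plain dict in place of the VectorStore; all names here are illustrative, not the project's actual API:

```python
import math

# Toy vectors stand in for m3e-base embeddings; in the real system each
# document chunk is embedded and stored in a VectorStore (e.g. FAISS).
doc_store = {
    "LangChain-Chatchat runs offline with a local knowledge base.": [0.9, 0.1, 0.0],
    "ChatGLM2-6B is an open-source bilingual chat model.": [0.1, 0.9, 0.2],
}

def cosine(a, b):
    # Cosine similarity between two vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def retrieve(query_vec, top_k=1):
    # Rank stored chunks by similarity to the query embedding
    ranked = sorted(doc_store.items(), key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]

def build_prompt(query, query_vec):
    # Stuff the most relevant chunks into the prompt sent to the LLM
    context = "\n".join(retrieve(query_vec))
    return f"Answer using the context below.\nContext:\n{context}\nQuestion: {query}"

print(build_prompt("What is LangChain-Chatchat?", [0.8, 0.2, 0.1]))
```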

II. Environment Setup

A. Python Environment & Dependencies

Either use the project's Docker image (making sure your GPU driver supports its CUDA version) or prepare the environment manually. This guide assumes Python 3.8.10 and CUDA 11.7. Install the required packages:

$ pip install -r requirements.txt
$ pip install -r requirements_api.txt
$ pip install -r requirements_webui.txt

B. Model Download and Preparation

Install Git LFS to handle large binary files, then clone the ChatGLM‑2 model and the embedding model:

$ git lfs install
$ git clone https://huggingface.co/THUDM/chatglm2-6b
$ git clone https://huggingface.co/moka-ai/m3e-base

C. Configurations

Copy the example configuration files, dropping the .example suffix:

$ cp ./configs/model_config.py.example ./configs/model_config.py
$ cp ./configs/server_config.py.example ./configs/server_config.py

Example snippets:

llm_model_dict = {
    "chatglm2-6b": {
        "local_model_path": "/Users/xxx/Downloads/chatglm2-6b",
        "api_base_url": "http://localhost:8888/v1",
        "api_key": "EMPTY"
    }
}

embedding_model_dict = {
    "m3e-base": "/Users/xxx/Downloads/m3e-base"
}

III. Running the Code

A. Initialize Knowledge Base

First run (the --recreate-vs flag rebuilds the vector store from the documents in the knowledge base):

$ python init_database.py --recreate-vs

Subsequent runs (the existing vector store is reused):

$ python init_database.py

B. Large Model Service (llm_api.py)

Adjust the port in llm_api.py if needed (e.g., change 8888 to 8880) and start the service with python llm_api.py. Verify it with a simple OpenAI-compatible request:

# Requires the pre-1.0 openai SDK interface (openai<1.0)
import openai

openai.api_key = "EMPTY"
openai.api_base = "http://localhost:8880/v1"

completion = openai.ChatCompletion.create(
    model="chatglm2-6b",
    messages=[{"role": "user", "content": "Hello! What is your name?"}],
)
print(completion.choices[0].message.content)

C. API Service (api.py)

Start the API server:

$ python api.py

Visit http://localhost:7861/docs to view and test the Swagger documentation.

IV. API Call Examples

A. Direct LLM Conversation

curl -X 'POST' \
  'http://localhost:7861/chat/fastchat' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "model": "chatglm2-6b",
  "messages": [{"role": "user", "content": "hello"}],
  "temperature": 0.7,
  "max_tokens": 1024,
  "stream": false
}'
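
The same call can be made from Python using only the standard library. The build_payload and chat helpers below are illustrative names, and the code assumes api.py is listening on localhost:7861:

```python
import json
import urllib.request

API_URL = "http://localhost:7861/chat/fastchat"  # adjust host/port if you changed them

def build_payload(user_content):
    # Same fields as the curl body above
    return {
        "model": "chatglm2-6b",
        "messages": [{"role": "user", "content": user_content}],
        "temperature": 0.7,
        "max_tokens": 1024,
        "stream": False,
    }

def chat(user_content):
    # POST the JSON body and decode the response; only works
    # while the API service started by api.py is running
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(user_content)).encode("utf-8"),
        headers={"Content-Type": "application/json", "accept": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

# chat("hello")  # uncomment once the service is up
```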

B. Knowledge Base Conversation

curl -X 'POST' \
  'http://localhost:7861/chat/knowledge_base_chat' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "query": "你好",
  "knowledge_base_name": "samples",
  "top_k": 5,
  "score_threshold": 1,
  "history": [],
  "stream": false,
  "local_doc_url": false
}'
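
For reference, here is the same request body as a commented Python dict; the per-field comments are my reading of the parameters, not official documentation:

```python
# Knowledge-base chat request body, field by field (mirrors the curl example)
kb_request = {
    "query": "你好",                   # the user question ("hello")
    "knowledge_base_name": "samples",  # which knowledge base to search
    "top_k": 5,                        # how many retrieved chunks to use
    "score_threshold": 1,              # drop low-similarity matches
    "history": [],                     # prior conversation turns, if any
    "stream": False,                   # True enables streaming responses
    "local_doc_url": False,            # True returns local file paths for sources
}
```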

Creating a New Knowledge Base

curl -X 'POST' \
  'http://localhost:7861/knowledge_base/create_knowledge_base' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "knowledge_base_name": "test",
  "vector_store_type": "faiss",
  "embed_model": "m3e-base"
}'

Response example:

{ "code": 200, "msg": "已新增知识库 test" }

Uploading Documents to the Knowledge Base

curl -X 'POST' \
  'http://localhost:7861/knowledge_base/upload_doc' \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F 'file=@filename;type=text/plain' \
  -F 'knowledge_base_name=test' \
  -F 'override=false' \
  -F 'not_refresh_vs_cache=false'

Response example:

{ "page_content": "xxx你的文档内容", "metadata": {"source": "文档路径"} }

These steps enable you to set up, configure, and interact with a locally hosted LLM‑powered knowledge base using LangChain‑Chatchat.
