Build a Retrieval‑Augmented Generation (RAG) Chatbot with LangChain and Streamlit
This guide walks through building a RAG-powered question-answering bot with LangChain, Streamlit, and vector-store embeddings, covering the theory and architecture as well as data loading, chunking, vector indexing, retrieval, LLM integration, and a complete code implementation with practical examples.
RAG concepts
RAG (Retrieval‑Augmented Generation) combines a retriever that selects relevant document chunks with a generator (LLM) that produces answers using those chunks.
Typical RAG pipeline
Load documents → split into chunks → embed chunks → store vectors in a vector store → retrieve relevant chunks for a query → feed retrieved chunks into LLM prompt → generate final answer.
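The pipeline above can be sketched end to end with a toy in-memory example. This is a sketch only: a bag-of-words term-frequency `Counter` stands in for the real embedding model, and the "generation" step just assembles the prompt the LLM would receive.

```python
from collections import Counter
import math

def embed(text):
    # Toy "embedding": a bag-of-words term-frequency vector.
    # A real pipeline would call an embedding model here instead.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Steps 1-2 (load + split): pretend these are chunks from a document.
chunks = [
    "RAG combines a retriever with a generator",
    "Streamlit builds simple web UIs in Python",
    "Chroma is an in-memory vector store",
]
# Steps 3-4 (embed + store).
index = [(chunk, embed(chunk)) for chunk in chunks]

# Step 5 (retrieve): most similar chunk to the query.
query = "what is a vector store"
best_chunk, _ = max(index, key=lambda item: cosine(embed(query), item[1]))

# Step 6 (generate): insert the retrieved chunk into the LLM prompt.
prompt = f"Answer using this context:\n{best_chunk}\n\nQuestion: {query}"
print(best_chunk)  # → Chroma is an in-memory vector store
```

The real implementation below replaces the toy pieces with OllamaEmbeddings, Chroma, and an LLM, but the data flow is the same.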
Indexing (vector store creation)
Load data with LangChain document loaders (e.g., PyPDFLoader, TextLoader, UnstructuredImageLoader).
Split documents using RecursiveCharacterTextSplitter (chunk size 1000, overlap 200).
Generate embeddings with an OllamaEmbeddings model (e.g., shaw/dmeta-embedding-zh) and store them in a Chroma vector store.
Embedding model role
Embedding models map unstructured, high‑dimensional content (text, images, video) to fixed‑length numeric vectors, so that semantically similar items land close together and can be found with efficient similarity search.
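The fixed-dimensionality property can be illustrated with the hashing trick (a toy sketch only — a real model such as shaw/dmeta-embedding-zh learns vectors that place semantically similar texts close together, which this does not):

```python
import hashlib

def toy_embed(text, dims=8):
    # Toy fixed-length "embedding" via the hashing trick: every token is
    # hashed into one of `dims` buckets, so any input maps to a vector
    # of the same length. Demonstrates dimensionality only, not semantics.
    vec = [0.0] * dims
    for token in text.lower().split():
        h = int(hashlib.md5(token.encode()).hexdigest(), 16)
        vec[h % dims] += 1.0
    return vec

short_vec = toy_embed("hello")
long_vec = toy_embed("a much longer piece of text with many more tokens")
# Both inputs land in the same 8-dimensional space.
assert len(short_vec) == len(long_vec) == 8
```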
Retrieval and generation
Convert user query to a vector with the same embedding model.
Search the vector store for the top‑k most similar chunks.
Insert those chunks into the LLM prompt and generate the final answer.
Implementation details
Environment setup
# Pull models via Ollama
ollama pull deepseek-r1:7b
ollama pull shaw/dmeta-embedding-zh:latest
# Python dependencies (versions used)
pip install streamlit==1.39.0
pip install langchain==0.3.21
pip install langchain-chroma==0.2.2
pip install langchain-community==0.3.20
pip install langchain-ollama==0.2.3
Streamlit UI (bot_chat.py)
Key sections:
File uploader – sidebar widget that accepts .txt files.
import streamlit as st
st.set_page_config(page_title="RAG测试问答", layout="wide")
st.title("RAG测试问答")
upload_file = st.sidebar.file_uploader(label="上传文件", type=["txt"])
if not upload_file:
    st.info("请上传 txt 文件")
    st.stop()
Knowledge‑base construction – cached function that writes the uploaded file to /tmp, loads it with TextLoader, splits, embeds, and creates a Chroma retriever.
@st.cache_resource(ttl="1h")
def get_knowledge_base(uploaded_file):
    import tempfile, os
    from langchain_community.document_loaders import TextLoader
    from langchain_text_splitters import RecursiveCharacterTextSplitter
    from langchain_ollama.embeddings import OllamaEmbeddings
    from langchain_chroma import Chroma
    temp_dir = tempfile.TemporaryDirectory(dir="/tmp")
    temp_path = os.path.join(temp_dir.name, uploaded_file.name)
    with open(temp_path, "wb") as f:
        f.write(uploaded_file.getvalue())
    docs = TextLoader(temp_path, encoding="utf-8").load()
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
    splits = splitter.split_documents(docs)
    embeddings = OllamaEmbeddings(base_url="http://127.0.0.1:11434",
                                  model="shaw/dmeta-embedding-zh")
    chroma_db = Chroma.from_documents(splits, embeddings)
    return chroma_db.as_retriever()
retriever = get_knowledge_base(upload_file)
Chat history handling – session state stores the messages; Streamlit re-renders them with st.chat_message.
if "messages" not in st.session_state or st.sidebar.button("清空聊天记录"):
    st.session_state["messages"] = [{"role": "assistant",
                                     "content": "我是测试 RAG 问答小助手"}]
for msg in st.session_state["messages"]:
    st.chat_message(msg["role"]).write(msg["content"])
user_query = st.chat_input(placeholder="请输入要测试的问题")
Retriever tool and ReAct agent – creates a LangChain tool wrapping the retriever, defines a system prompt, instantiates an OllamaLLM (deepseek‑r1:7b), and builds a ReAct agent.
from langchain.tools.retriever import create_retriever_tool
from langchain.prompts import PromptTemplate
from langchain_ollama import OllamaLLM
from langchain.agents import create_react_agent, AgentExecutor
tool = create_retriever_tool(retriever=retriever,
                             name="文档检索",
                             description="根据关键词检索相关文档")
tools = [tool]
instruction = """你是一个设计用于查询文档回答问题的代理...如果从文档找不到任何信息,返回'非常抱歉,这个问题暂时没有录入到知识库中。'"""
base_template = """{instruction}
TOOLS:
{tools}
...
{input}
{agent_scratchpad}"""
prompt = PromptTemplate.from_template(base_template).partial(instruction=instruction)
llm = OllamaLLM(base_url="http://127.0.0.1:11434", model="deepseek-r1:7b")
agent = create_react_agent(llm=llm, prompt=prompt, tools=tools)
agent_executor = AgentExecutor(agent=agent,
                               tools=tools,
                               memory=None,
                               verbose=True,
                               handle_parsing_errors="从知识库没找到对应内容或者答案")
User query execution – appends the query to the history, runs the agent with a StreamlitCallbackHandler to show intermediate reasoning, and displays the final answer.
if user_query:
    st.session_state["messages"].append({"role": "user", "content": user_query})
    st.chat_message("user").write(user_query)
    with st.chat_message("assistant"):
        from langchain.callbacks import StreamlitCallbackHandler
        callback = StreamlitCallbackHandler(st.container())
        response = agent_executor.invoke({"input": user_query},
                                         config={"callbacks": [callback]})
        answer = response["output"]
        st.session_state["messages"].append({"role": "assistant", "content": answer})
        st.write(answer)
Running the app
streamlit run bot_chat.py
Key observations
A chunk size of 1000 characters with a 200‑character overlap balances retrieval relevance against vector store size (RecursiveCharacterTextSplitter measures length in characters by default, not tokens).
Using OllamaEmbeddings locally avoids external API latency.
Chroma runs in memory here, which is fine for prototyping; for production, persist the index (Chroma's persist_directory) or use a dedicated vector database (e.g., Pinecone, Milvus).
The ReAct agent’s prompt explicitly forces a retrieval step even when the LLM “knows” the answer, ensuring consistency with the knowledge base.
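The chunking observation above can be illustrated with a simplified fixed-size splitter. This is not RecursiveCharacterTextSplitter itself (which also tries to break on separators like paragraphs and sentences), only a sketch of how the overlap makes consecutive chunks share context:

```python
def naive_split(text, chunk_size=1000, overlap=200):
    # Fixed-size character windows: each chunk starts
    # (chunk_size - overlap) characters after the previous one,
    # so consecutive chunks share `overlap` characters.
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

text = "".join(str(i % 10) for i in range(2500))
chunks = naive_split(text)
print([len(c) for c in chunks])      # → [1000, 1000, 900, 100]
# The last 200 characters of one chunk reappear at the start of the next,
# so a sentence cut at a chunk boundary still appears whole in one chunk.
assert chunks[0][-200:] == chunks[1][:200]
```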