Master AI Agents with LangGraph: Build Adaptive RAG, Translation, and ReAct Agents
This comprehensive guide explains what an AI Agent is, its core capabilities and design patterns, and walks through step‑by‑step implementations of RAG, Translation, and ReAct agents using LangGraph, complete with code samples, workflow diagrams, and practical tips for building personal ops knowledge‑base agents.
What Is an Agent?
An Agent is an AI system that can understand user instructions, think, plan, call tools, and act autonomously to achieve complex goals. It combines large language model (LLM) capabilities with external tool integration and memory.
Four Core Capabilities of an Agent
Perception: Understands user input, environment, and context.
Thinking: Performs logical reasoning, task decomposition, and decision making.
Action: Calls external tools or APIs to execute tasks.
Memory: Stores and retrieves historical information for multi‑turn conversations.
Four Design Patterns for Agents
Feedback (Reflective) Mode: The agent self‑reflects on its output and iteratively improves it.
Tool‑Calling Mode: The agent invokes predefined tools (functions, APIs) to fetch data or perform operations.
Planning Mode: The agent breaks a complex task into smaller steps and solves them sequentially.
Multi‑Agent Collaboration Mode: Multiple agents work in parallel, each handling a sub‑task, improving efficiency.
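The feedback (reflective) pattern is the simplest to sketch without any framework: generate a draft, critique it, and revise until the critique passes. Below is a minimal illustration with stub `generate`/`critique` functions standing in for LLM calls; all names here are hypothetical, not from any library.

```python
def reflective_agent(task, generate, critique, max_rounds=3):
    """Generate a draft, then iteratively revise it until the critic approves."""
    draft = generate(task, feedback=None)
    for _ in range(max_rounds):
        ok, feedback = critique(task, draft)
        if ok:
            break
        draft = generate(task, feedback=feedback)  # revise using the critique
    return draft

# Stub generator/critic so the loop is runnable without an LLM
def generate(task, feedback=None):
    return task.upper() if feedback else task

def critique(task, draft):
    # Approve only fully upper-case drafts
    return (draft.isupper(), "make it upper-case")

print(reflective_agent("ship the fix", generate, critique))  # SHIP THE FIX
```

In a real agent the generator and critic would be two LLM calls (often with different prompts), but the control flow is exactly this loop.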
Building an Agent from Scratch
Development Goal
Implement a RAG (Retrieval‑Augmented Generation) Agent that:
Retrieves relevant document chunks from a vector store.
Checks whether the retrieved chunks can answer the question.
Iteratively expands the retrieval window (up to 15 rounds) until a satisfactory answer is found.
Uses Python to implement the loop.
Implementation Steps
1. Install Dependencies
pip install -qU langchain-openai langchain langchain_community langchainhub
pip install chromadb==0.5.3
2. Import Packages
from langchain_openai import ChatOpenAI
from langchain.schema import StrOutputParser
from langchain.schema.runnable import RunnablePassthrough
from langchain.text_splitter import MarkdownHeaderTextSplitter
from langchain_openai import OpenAIEmbeddings
import os
from langchain_community.vectorstores.chroma import Chroma
from langchain_core.prompts import PromptTemplate
from string import Template
import uuid
3. Vectorize the Knowledge Base
# Read the markdown knowledge base
file_path = os.path.join('data', 'data.md')
with open(file_path, 'r', encoding='utf-8') as file:
    docs_string = file.read()
# Split by markdown headers
headers_to_split_on = [('#', 'Header 1'), ('##', 'Header 2'), ('###', 'Header 3')]
text_splitter = MarkdownHeaderTextSplitter(headers_to_split_on=headers_to_split_on)
splits = text_splitter.split_text(docs_string)
# Create embeddings and store in Chroma
embedding = OpenAIEmbeddings(model='text-embedding-3-small', openai_api_key='sk-xxxx', openai_api_base='https://vip.apiyi.com/v1')
vectorstore = Chroma.from_documents(documents=splits, embedding=embedding, persist_directory=str(uuid.uuid4()), collection_name='rag-chroma')
retriever = vectorstore.as_retriever()
4. Traditional RAG (Baseline)
def old_rag():
    retriever = vectorstore.as_retriever()

    def format_docs(docs):
        # Join retrieved chunks into a single context string
        return "\n\n".join(doc.page_content for doc in docs)

    template = """Use the following context to answer the question at the end.
If you don't know the answer, say you don't know; do not try to make one up.
Use at most three sentences and keep the answer as concise as possible.
Always end the answer with "Thanks for asking!".
{context}
Question: {question}
Helpful Answer:"""
    custom_rag_prompt = PromptTemplate.from_template(template)
    llm = ChatOpenAI(model='gpt-4o', api_key='sk-xxxx', base_url='https://vip.apiyi.com/v1')
    rag_chain = (
        {"context": retriever | format_docs, "question": RunnablePassthrough()}
        | custom_rag_prompt
        | llm
        | StrOutputParser()
    )
    res = rag_chain.invoke("Who manages the most systems?")
    print("\nLLM answer:", res)
5. Adaptive RAG Loop
def new_rag(question="Who manages the most systems?"):
    original_question = question

    # Map the question to a focused search query via simple keyword routing
    def get_search_query(question):
        if "responsible" in question or "manage" in question or "most systems" in question:
            return "business owner microservice owner most systems"
        elif "response time" in question or "performance" in question or "slow" in question:
            return "response time performance optimization solutions"
        elif "security" in question or "attack" in question:
            return "security issues security policy DDoS"
        elif "outage" in question or "failure" in question:
            return "system outage troubleshooting incident response"
        else:
            return question

    search_query = get_search_query(original_question)
    # AnswerabilityCheck is a pydantic model with can_answer, confidence and reason;
    # search_docs, slice_context, format_docs_content and MAX_CONTEXT_LENGTH are
    # helper utilities defined elsewhere in the project.
    llm = ChatOpenAI(model='gpt-4o', api_key='sk-xxxx', base_url='https://vip.apiyi.com/v1')
    check_llm = llm.with_structured_output(AnswerabilityCheck)
    k = 1
    excluded_docs = set()
    final_docs = []
    while k <= 15:
        print(f"Retrieval round {k}")
        docs_with_scores = search_docs(search_query, k, excluded_docs)
        if not docs_with_scores:
            print("No more document chunks available")
            break
        sliced_docs = slice_context(docs_with_scores, MAX_CONTEXT_LENGTH)
        docs_content = format_docs_content(sliced_docs)
        # Structured check whether the context can answer the question
        check_prompt = f"""Analyze whether the following context is sufficient to answer the user's question.
Context:
{docs_content}
Question: {original_question}
Output can_answer (true/false), confidence (0-1) and reason."""
        result = check_llm.invoke(check_prompt)
        if result.can_answer and result.confidence >= 0.7:
            final_docs = sliced_docs
            break
        else:
            # Exclude the chunks already seen and widen the retrieval window
            for doc, _ in docs_with_scores:
                excluded_docs.add(hash(doc.page_content))
            k += 1
    if not final_docs:
        print("Not enough context found to answer the question")
        return
    final_context = format_docs_content(final_docs)
    final_prompt = f"""You are an assistant for question-answering tasks; use the retrieved context to answer the user's question.
If you don't know the answer, say you don't know. Use at most three sentences and keep the answer concise.
Context:
{final_context}
Question: {original_question}
Provide an accurate, concise answer:"""
    final_response = llm.invoke(final_prompt)
    print("\nFinal answer:", final_response.content)
    print(f"Retrieval stats: total rounds={k}, excluded chunks={len(excluded_docs)}, final chunks used={len(final_docs)}")
Translation Agent Source Code and Architecture
The Translation Agent from Andrew Ng’s open‑source project follows a three‑step workflow: initial translation, reflection, and improvement.
1. Initial Translation
def one_chunk_initial_translation(source_lang: str, target_lang: str, source_text: str) -> str:
    system_message = f"You are an expert linguist, specializing in translation from {source_lang} to {target_lang}."
    translation_prompt = f"""This is an {source_lang} to {target_lang} translation, please provide the {target_lang} translation for this text.
Do not provide any explanations or text apart from the translation.
{source_lang}: {source_text}
{target_lang}:"""
    translation = get_completion(translation_prompt, system_message=system_message)
    return translation
2. Reflection
def one_chunk_reflect_on_translation(source_lang: str, target_lang: str, source_text: str, translation_1: str, country: str = "") -> str:
    system_message = f"You are an expert linguist specializing in translation from {source_lang} to {target_lang}. You will be provided with a source text and its translation and your goal is to improve the translation."
    if country:
        reflection_prompt = f"""Your task is to carefully read a source text and a translation from {source_lang} to {target_lang}, and then give constructive criticism and helpful suggestions to improve the translation. The final style and tone should match the style of {target_lang} colloquially spoken in {country}.
<SOURCE_TEXT>
{source_text}
</SOURCE_TEXT>
<TRANSLATION>
{translation_1}
</TRANSLATION>
When writing suggestions, pay attention to (i) accuracy, (ii) fluency, (iii) style, (iv) terminology. Output only the suggestions and nothing else."""
    else:
        reflection_prompt = f"""Your task is to carefully read a source text and a translation from {source_lang} to {target_lang}, and then give constructive criticism and helpful suggestions to improve the translation.
<SOURCE_TEXT>
{source_text}
</SOURCE_TEXT>
<TRANSLATION>
{translation_1}
</TRANSLATION>
When writing suggestions, pay attention to (i) accuracy, (ii) fluency, (iii) style, (iv) terminology. Output only the suggestions and nothing else."""
    reflection = get_completion(reflection_prompt, system_message=system_message)
    return reflection
3. Improvement
def one_chunk_improve_translation(source_lang: str, target_lang: str, source_text: str, translation_1: str, reflection: str) -> str:
    system_message = f"You are an expert linguist, specializing in translation editing from {source_lang} to {target_lang}."
    improve_prompt = f"""Your task is to carefully read, then edit, a translation from {source_lang} to {target_lang}, taking into account a list of expert suggestions and constructive criticisms.
<SOURCE_TEXT>
{source_text}
</SOURCE_TEXT>
<TRANSLATION>
{translation_1}
</TRANSLATION>
<EXPERT_SUGGESTIONS>
{reflection}
</EXPERT_SUGGESTIONS>
Please edit the translation ensuring (i) accuracy, (ii) fluency, (iii) style, (iv) terminology, (v) other errors. Output only the new translation and nothing else."""
    translation_2 = get_completion(improve_prompt, system_message=system_message)
    return translation_2
LangGraph Agent Development Practice
What Is LangGraph?
LangGraph is a library built on top of LangChain that models an AI application as a directed graph, allowing explicit state management, conditional branching, loops, and parallel execution.
Core Concepts
Stateful Graph: Stores persistent state (memory) across graph executions.
Nodes: Functions or computational steps that read/write the state.
Edges: Define the flow between nodes; conditional edges enable dynamic routing.
Example: Simple Chatbot with LangGraph
# Install dependencies
pip install langgraph langsmith langchain-openai
# Import packages
from typing import TypedDict, Annotated
from langchain_openai import ChatOpenAI
from langgraph.graph.message import add_messages
from langgraph.graph import StateGraph
# Initialize LLM
llm = ChatOpenAI(model='gpt-4o', api_key='sk-xxxx', base_url='https://vip.apiyi.com/v1')
class State(TypedDict):
    messages: Annotated[list, add_messages]

def chat(state: State):
    response = llm.invoke(state["messages"])
    return {"messages": [response]}

workflow = StateGraph(State)
workflow.add_node(chat)
workflow.set_entry_point("chat")
workflow.set_finish_point("chat")
graph = workflow.compile()
# Interactive loop
while True:
    user_input = input("User: ")
    if user_input.lower() == "exit":
        break
    for event in graph.stream({"messages": [("user", user_input)]}):
        for value in event.values():
            print("Assistant:", value["messages"][-1].content)
The rendered graph shows a single "chat" node wired between the start and end nodes.
LangGraph Translation Agent
The Translation Agent is re‑implemented as a three‑node LangGraph workflow:
initial_translation → reflect_on_translation → improve_translation. The graph is visualized and saved as translation_workflow_graph.png.
ReAct Agent with LangGraph
A ReAct Agent that decides whether to call tools or finish is built using conditional edges. The core functions are:
def call_model(state: MessagesState):
    response = model_with_tools.invoke(state["messages"])
    return {"messages": [response]}

def should_continue(state: MessagesState) -> Literal["tools", "__end__"]:
    last_message = state["messages"][-1]
    if last_message.tool_calls:
        return "tools"
    return "__end__"
The workflow includes a ToolNode that executes the requested tools and loops back to the model until no tool calls remain.
Parallel Execution Example
LangGraph supports parallel branches. The following graph runs nodes b and c in parallel after a, then merges at d:
from langgraph.graph import StateGraph, START, END
# State has an `aggregate` list field reduced with operator.add, and
# ReturnNodeValue is a small callable node that appends its value to it
builder = StateGraph(State)
builder.add_node("a", ReturnNodeValue("I'm A"))
builder.add_node("b", ReturnNodeValue("I'm B"))
builder.add_node("c", ReturnNodeValue("I'm C"))
builder.add_node("d", ReturnNodeValue("I'm D"))
builder.add_edge(START, "a")
builder.add_edge("a", "b")  # fan out: b and c run in parallel
builder.add_edge("a", "c")
builder.add_edge(["b", "c"], "d")  # fan in: d waits for both branches
builder.add_edge("d", END)
graph = builder.compile()
The execution log shows each node runs exactly once, demonstrating deterministic parallelism.
Building a Personal Ops Knowledge‑Base Agent
Why Build One?
Automate answers to routine ops questions, reducing manual effort.
Create a digital twin of your expertise for self‑service.
Capture institutional knowledge and make it searchable.
Implementation Approaches
Self‑hosted services (QAnything, RAGFlow, Open Web UI).
SaaS platforms (Coze, Dify).
Custom development with LangChain / LangGraph (focus of this guide).
Adaptive RAG + Web Search Agent
Workflow Overview
The agent first routes the user query to either the vector store (knowledge base) or a web search (Tavily). If the vector store is chosen, an adaptive RAG loop checks relevance, rewrites the query if needed, and grades the final answer for hallucinations and usefulness.
Vectorizing the Knowledge Base
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain.text_splitter import MarkdownHeaderTextSplitter
import uuid, os
file_path = os.path.join('data', 'data.md')
with open(file_path, 'r', encoding='utf-8') as f:
    docs_string = f.read()
headers_to_split_on = [("#", "Header 1"), ("##", "Header 2"), ("###", "Header 3")]
text_splitter = MarkdownHeaderTextSplitter(headers_to_split_on=headers_to_split_on)
doc_splits = text_splitter.split_text(docs_string)
embedding = OpenAIEmbeddings(model='text-embedding-3-small', openai_api_key='sk-xxxx', openai_api_base='https://vip.apiyi.com/v1')
vectorstore = Chroma.from_documents(documents=doc_splits, embedding=embedding, persist_directory=str(uuid.uuid4()), collection_name='rag-chroma')
retriever = vectorstore.as_retriever()
Relevance Grader
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langchain_core.pydantic_v1 import BaseModel, Field
class GradeDocuments(BaseModel):
    binary_score: str = Field(description="Documents are relevant to the question, 'yes' or 'no'")
llm = ChatOpenAI(model='gpt-4o-mini', api_key='sk-xxxx', base_url='https://vip.apiyi.com/v1', temperature=0)
structured_grader = llm.with_structured_output(GradeDocuments)
system = """You are a grader. Decide if a retrieved document is relevant to the user question. Answer 'yes' or 'no'."""
grade_prompt = ChatPromptTemplate.from_messages([
    ("system", system),
    ("human", "Retrieved document: {document}\nUser question: {question}")
])
retrieval_grader = grade_prompt | structured_grader
Answer Generation
from langchain import hub
from langchain_core.output_parsers import StrOutputParser
prompt = hub.pull("rlm/rag-prompt")
llm = ChatOpenAI(model='gpt-4o-mini', api_key='sk-xxxx', base_url='https://vip.apiyi.com/v1', temperature=0)
rag_chain = prompt | llm | StrOutputParser()
Hallucination and Answer Graders
# Hallucination grader
class GradeHallucinations(BaseModel):
    binary_score: str = Field(description="Answer is grounded in the facts, 'yes' or 'no'")
structured_hallucination = llm.with_structured_output(GradeHallucinations)
hallucination_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a grader. Does the answer rely on the provided documents? Answer 'yes' or 'no'."),
    ("human", "Documents: {documents}\nAnswer: {generation}")
])
hallucination_grader = hallucination_prompt | structured_hallucination
# Answer usefulness grader
class GradeAnswer(BaseModel):
    binary_score: str = Field(description="Answer addresses the question, 'yes' or 'no'")

structured_answer = llm.with_structured_output(GradeAnswer)
answer_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a grader. Does the answer solve the user question? Answer 'yes' or 'no'."),
    ("human", "Question: {question}\nAnswer: {generation}")
])
answer_grader = answer_prompt | structured_answer
Query Rewriter
# Re‑write poorly matched queries
system = "You are a query rewriter. Produce a better version of the user question for vector retrieval."
rewrite_prompt = ChatPromptTemplate.from_messages([
    ("system", system),
    ("human", "Here is the initial question: {question}. Formulate an improved question.")
])
question_rewriter = rewrite_prompt | llm | StrOutputParser()
Router Between Vector Store and Web Search
from langchain_core.pydantic_v1 import BaseModel, Field
from typing import Literal
class RouteQuery(BaseModel):
    datasource: Literal["vectorstore", "web_search"] = Field(description="Route the question to either vectorstore or web_search.")
router_llm = ChatOpenAI(model='gpt-4o-mini', api_key='sk-xxxx', base_url='https://vip.apiyi.com/v1', temperature=0)
structured_router = router_llm.with_structured_output(RouteQuery)
system = """You are an expert router. If the question is about ops, services, logs, etc., route to vectorstore; otherwise route to web_search."""
router_prompt = ChatPromptTemplate.from_messages([
    ("system", system),
    ("human", "{question}")
])
question_router = router_prompt | structured_router
Web Search Node (Tavily)
from langchain_community.tools.tavily_search import TavilySearchResults
web_search_tool = TavilySearchResults(k=3)
def web_search(state: dict):
    question = state["question"]
    docs = web_search_tool.invoke({"query": question})
    web_content = "\n".join([d["content"] for d in docs])
    return {"documents": web_content, "question": question}
LangGraph Construction
from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class GraphState(TypedDict):
    question: str
    documents: list
    generation: str
workflow = StateGraph(GraphState)
# retrieve, grade_documents, generate and transform_query are node functions
# wrapping the retriever, graders, rag_chain and question_rewriter defined above
workflow.add_node("web_search", web_search)
workflow.add_node("retrieve", retrieve)
workflow.add_node("grade_documents", grade_documents)
workflow.add_node("generate", generate)
workflow.add_node("transform_query", transform_query)
# Conditional routing from start
workflow.add_conditional_edges(
    START,
    route_question,
    {"web_search": "web_search", "vectorstore": "retrieve"}
)
workflow.add_edge("web_search", "generate")
workflow.add_edge("retrieve", "grade_documents")
workflow.add_conditional_edges(
    "grade_documents",
    decide_to_generate,
    {"transform_query": "transform_query", "generate": "generate"}
)
workflow.add_edge("transform_query", "retrieve")
workflow.add_conditional_edges(
    "generate",
    grade_generation_v_documents_and_question,
    {"not supported": "generate", "useful": END, "not useful": "transform_query"}
)
app = workflow.compile()
Running the Agent
inputs = {"question": "Who manages the most systems?"}
for output in app.stream(inputs):
    for key, value in output.items():
        print(f"Node '{key}':")
        if "generation" in value:
            print(value["generation"])
    print("---")
Conclusion
This article provides a thorough exploration of AI agents, covering their definition, core capabilities, design patterns, and step‑by‑step implementations of RAG, Translation, and ReAct agents using LangGraph. It also demonstrates how to build a personal ops knowledge‑base agent that intelligently switches between vector‑store retrieval and web search, handling relevance grading, query rewriting, hallucination detection, and answer validation.
By leveraging LangGraph’s stateful graphs, conditional edges, and parallel execution, developers can construct sophisticated, loop‑enabled agents with clear visual workflows, making complex AI applications easier to design, debug, and extend.
Ops Development Stories
Maintained by a like‑minded team, covering both operations and development. Topics span Linux ops, DevOps toolchain, Kubernetes containerization, monitoring, log collection, network security, and Python or Go development. Team members: Qiao Ke, wanger, Dong Ge, Su Xin, Hua Zai, Zheng Ge, Teacher Xia.