Mastering Structured Output for DeepSeek‑R1 with LangChain, LangGraph, and ReAct Agents
DeepSeek‑R1 excels at deep reasoning but lacks native structured output. This guide explains why structured output matters, outlines common API‑level techniques, and walks through three practical solutions with code snippets: an auxiliary model in a LangChain chain, a LangGraph workflow, and a ReAct agent, plus JSON‑mode tips for models without schema support.
Understanding Structured Output
Structured output means the LLM returns its response in a well‑defined, machine‑readable format such as JSON, rather than free‑form text. A typical example is a JSON object describing a user profile with fields like name, age, and email.
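For instance, a hypothetical user‑profile response (field values made up for illustration) might look like this:

{
  "name": "Alice",
  "age": 30,
  "email": "alice@example.com"
}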
Why Structured Output Is Needed
Returning data in a standardized format simplifies downstream processing and, in enterprise‑grade multi‑step agent workflows, reduces uncertainty and eases integration. Typical uses include:
Import extracted information into a relational database
Automate filling of web forms
Integrate with external or corporate APIs
Common Ways to Enforce Structured Output
Many modern LLM APIs support structured output natively. Apart from function calling, two popular mechanisms are:
JSON mode: set the API's response_format to json_object and guide the model with a prompt.
JSON Schema / Pydantic: pass a schema or a Pydantic model to the API so the model returns an object that matches the definition without extra parsing.
Both mechanisms are sketched below.
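As a minimal sketch of both mechanisms (assuming the OpenAI Python SDK and a hypothetical UserProfile Pydantic model; prompts and values are illustrative):

from openai import OpenAI
from pydantic import BaseModel
from langchain_openai import ChatOpenAI

class UserProfile(BaseModel):
    name: str
    age: int
    email: str

# JSON mode: the prompt itself must ask for JSON explicitly
client = OpenAI()
completion = client.chat.completions.create(
    model="gpt-4o-mini",
    response_format={"type": "json_object"},
    messages=[{"role": "user",
               "content": "Return a JSON object with keys name, age, email for: Alice, 30, alice@example.com"}],
)

# JSON Schema / Pydantic: LangChain binds the schema and parses the result
llm = ChatOpenAI(model="gpt-4o-mini").with_structured_output(UserProfile)
profile = llm.invoke("Extract the user profile from: Alice, 30, alice@example.com")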
Method 1 – Using an Auxiliary Model with a LangChain Chain
The idea is to let DeepSeek‑R1 generate a raw outline, then pass that text to a second model (e.g., GPT‑4o) that enforces a structured schema.
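All three methods reference an Outline model that the original code omits; a plausible minimal definition (my assumption, including the as_str helper used when printing results later) is:

from pydantic import BaseModel, Field

class Outline(BaseModel):
    """Structured research-report outline."""
    title: str = Field(description="Report title")
    sections: list[str] = Field(description="Top-level section headings")

    @property
    def as_str(self) -> str:
        return "\n".join([self.title, *self.sections])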
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama import ChatOllama
from langchain_openai import ChatOpenAI

# DeepSeek-R1 prompt: produce the raw outline draft
system_prompt_deepseek = ChatPromptTemplate.from_template("""
For the topic provided by the user, write a research-report outline. Be comprehensive and specific. My topic: {topic}
""")
# Formatting prompt: convert the draft into structured output
system_prompt_format = ChatPromptTemplate.from_template("""
You are a content-formatting expert, responsible only for converting the input into structured output; do not do any extra reasoning. My content: {draft}
""")
llm_ollama_deepseek = ChatOllama(model='deepseek-r1:1.5b')
llm_openai = ChatOpenAI(model='gpt-4o-mini').with_structured_output(Outline)
# Build the chain; StrOutputParser extracts plain text from the DeepSeek message
# before it reaches the formatting prompt
chain = system_prompt_deepseek | llm_ollama_deepseek | StrOutputParser() | system_prompt_format | llm_openai
Running the chain yields a clean Outline object that can be serialized directly.
response = await chain.ainvoke({"topic": "Technical principles and innovations of DeepSeek-R1"})
print("\nGenerated outline:")
print(response.as_str)
Method 2 – Building a LangGraph Workflow
For multi‑step or streaming tasks, a LangGraph workflow can orchestrate the same two‑model pattern with explicit state nodes.
from typing import TypedDict
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_ollama import ChatOllama
from langchain_openai import ChatOpenAI

class State(TypedDict):
    """Graph state type"""
    topic: str
    draft: str | None
    output: Outline | None

def deepseek_node(state: State) -> State:
    """Use DeepSeek-R1 to generate the raw outline"""
    ollama = ChatOllama(model="deepseek-r1:1.5b")
    messages = [SystemMessage(content="For the topic provided by the user, write a research-report outline. Be comprehensive and specific."),
                HumanMessage(content=state["topic"])]
    response = ollama.invoke(messages)
    state["draft"] = response.content
    return state

def formatter_node(state: State) -> State:
    """Use GPT-4o to format the draft into a structured Outline"""
    llm = ChatOpenAI(model="gpt-4o-mini").with_structured_output(Outline)
    messages = [SystemMessage(content="You are a content-formatting expert, responsible only for converting the input into structured output; do not do any extra reasoning."),
                HumanMessage(content=f"My content: {state['draft']}")]
    response = llm.invoke(messages)
    state["output"] = response
    return state
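The original post stops at the node functions; wiring them into a runnable graph with LangGraph's StateGraph API might look like this (my sketch, not from the original):

from langgraph.graph import StateGraph, START, END

builder = StateGraph(State)
builder.add_node("deepseek", deepseek_node)
builder.add_node("formatter", formatter_node)
builder.add_edge(START, "deepseek")
builder.add_edge("deepseek", "formatter")
builder.add_edge("formatter", END)
graph = builder.compile()

result = graph.invoke({"topic": "Technical principles and innovations of DeepSeek-R1",
                       "draft": None, "output": None})
print(result["output"].as_str)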
Method 3 – Using a ReAct Agent
The ReAct pattern lets an agent decide when to call a tool (here, the DeepSeek‑R1 reasoning step) and then format the result with a second model.
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

@tool
def deepseek_research(prompt: str) -> str:
    """Generate a research outline draft using DeepSeek-R1"""
    print('start think...')
    messages = [HumanMessage(content=f'For the topic provided by the user, write a research-report outline. Be comprehensive and specific. My topic: {prompt}')]
    # llm_ollama_deepseek is the ChatOllama instance defined in Method 1
    result = llm_ollama_deepseek.invoke(messages)
    print(result.content)
    return result.content

llm = ChatOpenAI(model='gpt-4o-mini')
agent = create_react_agent(llm, tools=[deepseek_research], response_format=Outline)
response = agent.invoke({"messages": [
    SystemMessage(content="First use the tool (deepseek_research) to generate a research-outline draft, then format the draft as structured output. Pass the input topic to the tool verbatim; do not reason about or rewrite it."),
    HumanMessage(content="Technical principles and innovations of deepseek-r1")
]})
last_message = response["structured_response"]
print(last_message.as_str)
Handling Models Without Native Structured‑Output Support
If the auxiliary model does not provide a JSON‑schema interface (e.g., Qwen), you can fall back to JSON mode.
response_format = {"type": "json_object"}
Prompt the model to return a JSON string that matches the desired schema, then parse it in your code, as in the sketch below.
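For example, with a Qwen model behind an OpenAI‑compatible endpoint (the base_url, api_key, and model name below are placeholders for your deployment), the call might look like:

import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # placeholder endpoint
completion = client.chat.completions.create(
    model="qwen2.5-7b-instruct",  # placeholder model name
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "Return a JSON object with keys 'title' and 'sections' (a list of strings)."},
        {"role": "user", "content": "Outline a research report on DeepSeek-R1."},
    ],
)
result = completion.choices[0].message.content  # raw JSON string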
json_object = json.loads(result)
By combining these techniques, developers can reliably obtain structured data from DeepSeek‑R1 and integrate it into complex enterprise applications.