Mastering Structured Output for DeepSeek‑R1 with LangChain, LangGraph, and ReAct Agents
DeepSeek‑R1 excels at deep reasoning but lacks native structured output. This guide explains why structured output matters, outlines common API‑level techniques, and walks through three practical solutions with code snippets: an auxiliary model in a LangChain chain, a LangGraph workflow, and a ReAct agent, plus JSON‑mode tips for models without schema support.
Understanding Structured Output
Structured output means the LLM returns its response in a well‑defined, machine‑readable format such as JSON, rather than free‑form text. A typical example is a JSON object describing a user profile with fields like name, age, and email.
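For instance, a hypothetical user‑profile response (field values made up for illustration) might look like this:

{
  "name": "Alice",
  "age": 30,
  "email": "alice@example.com"
}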
Why Structured Output Is Needed
Returning data in a standardized format simplifies downstream processing and, in enterprise‑grade multi‑step agent workflows, reduces uncertainty and eases integration. Typical uses include:
Import extracted information into a relational database
Automate filling of web forms
Integrate with external or corporate APIs
Common Ways to Enforce Structured Output
Many modern LLM APIs support structured output natively. Apart from function calling, two popular mechanisms are:
JSON mode: set the API's response_format to json_object and guide the model with a prompt.
JSON Schema / Pydantic: pass a schema or a Pydantic model to the API so the model returns an object that matches the definition without extra parsing.
Both mechanisms are sketched below.
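As a minimal sketch of both mechanisms (assuming the OpenAI Python SDK and a hypothetical UserProfile Pydantic model; prompts and values are illustrative):

from openai import OpenAI
from pydantic import BaseModel
from langchain_openai import ChatOpenAI

class UserProfile(BaseModel):
    name: str
    age: int
    email: str

# JSON mode: the prompt itself must ask for JSON explicitly
client = OpenAI()
completion = client.chat.completions.create(
    model="gpt-4o-mini",
    response_format={"type": "json_object"},
    messages=[{"role": "user",
               "content": "Return a JSON object with keys name, age, email for: Alice, 30, alice@example.com"}],
)

# JSON Schema / Pydantic: LangChain binds the schema and parses the result
llm = ChatOpenAI(model="gpt-4o-mini").with_structured_output(UserProfile)
profile = llm.invoke("Extract the user profile from: Alice, 30, alice@example.com")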
Method 1 – Using an Auxiliary Model with a LangChain Chain
The idea is to let DeepSeek‑R1 generate a raw outline, then pass that text to a second model (e.g., GPT‑4o) that enforces a structured schema.
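All three methods reference an Outline model that the original code omits; a plausible minimal definition (my assumption, including the as_str helper used when printing results later) is:

from pydantic import BaseModel, Field

class Outline(BaseModel):
    """Structured research-report outline."""
    title: str = Field(description="Report title")
    sections: list[str] = Field(description="Top-level section headings")

    @property
    def as_str(self) -> str:
        return "\n".join([self.title, *self.sections])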
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama import ChatOllama
from langchain_openai import ChatOpenAI

# DeepSeek-R1 prompt: produce the raw outline draft
system_prompt_deepseek = ChatPromptTemplate.from_template("""
For the topic provided by the user, write a research-report outline. Be comprehensive and specific. My topic: {topic}
""")
# Formatting prompt: convert the draft into structured output
system_prompt_format = ChatPromptTemplate.from_template("""
You are a content-formatting expert, responsible only for converting the input into structured output; do not do any extra reasoning. My content: {draft}
""")
llm_ollama_deepseek = ChatOllama(model='deepseek-r1:1.5b')
llm_openai = ChatOpenAI(model='gpt-4o-mini').with_structured_output(Outline)
# Build the chain; StrOutputParser extracts plain text from the DeepSeek message
# before it reaches the formatting prompt
chain = system_prompt_deepseek | llm_ollama_deepseek | StrOutputParser() | system_prompt_format | llm_openai
Running the chain yields a clean Outline object that can be serialized directly.
response = await chain.ainvoke({"topic": "Technical principles and innovations of DeepSeek-R1"})
print("\nGenerated outline:")
print(response.as_str)
Method 2 – Building a LangGraph Workflow
For multi‑step or streaming tasks, a LangGraph workflow can orchestrate the same two‑model pattern with explicit state nodes.
from typing import TypedDict
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_ollama import ChatOllama
from langchain_openai import ChatOpenAI

class State(TypedDict):
    """Graph state type"""
    topic: str
    draft: str | None
    output: Outline | None

def deepseek_node(state: State) -> State:
    """Use DeepSeek-R1 to generate the raw outline"""
    ollama = ChatOllama(model="deepseek-r1:1.5b")
    messages = [SystemMessage(content="For the topic provided by the user, write a research-report outline. Be comprehensive and specific."),
                HumanMessage(content=state["topic"])]
    response = ollama.invoke(messages)
    state["draft"] = response.content
    return state

def formatter_node(state: State) -> State:
    """Use GPT-4o to format the draft into a structured Outline"""
    llm = ChatOpenAI(model="gpt-4o-mini").with_structured_output(Outline)
    messages = [SystemMessage(content="You are a content-formatting expert, responsible only for converting the input into structured output; do not do any extra reasoning."),
                HumanMessage(content=f"My content: {state['draft']}")]
    response = llm.invoke(messages)
    state["output"] = response
    return state
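The original post stops at the node functions; wiring them into a runnable graph with LangGraph's StateGraph API might look like this (my sketch, not from the original):

from langgraph.graph import StateGraph, START, END

builder = StateGraph(State)
builder.add_node("deepseek", deepseek_node)
builder.add_node("formatter", formatter_node)
builder.add_edge(START, "deepseek")
builder.add_edge("deepseek", "formatter")
builder.add_edge("formatter", END)
graph = builder.compile()

result = graph.invoke({"topic": "Technical principles and innovations of DeepSeek-R1",
                       "draft": None, "output": None})
print(result["output"].as_str)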
Method 3 – Using a ReAct Agent
The ReAct pattern lets an agent decide when to call a tool (here, the DeepSeek‑R1 reasoning step) and then format the result with a second model.
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

@tool
def deepseek_research(prompt: str) -> str:
    """Generate a research outline draft using DeepSeek-R1"""
    print('start think...')
    messages = [HumanMessage(content=f'For the topic provided by the user, write a research-report outline. Be comprehensive and specific. My topic: {prompt}')]
    # llm_ollama_deepseek is the ChatOllama instance defined in Method 1
    result = llm_ollama_deepseek.invoke(messages)
    print(result.content)
    return result.content

llm = ChatOpenAI(model='gpt-4o-mini')
agent = create_react_agent(llm, tools=[deepseek_research], response_format=Outline)
response = agent.invoke({"messages": [
    SystemMessage(content="First use the tool (deepseek_research) to generate a research-outline draft, then format the draft as structured output. Pass the input topic to the tool verbatim; do not reason about or rewrite it."),
    HumanMessage(content="Technical principles and innovations of deepseek-r1")
]})
last_message = response["structured_response"]
print(last_message.as_str)
Handling Models Without Native Structured‑Output Support
If the auxiliary model does not provide a JSON‑schema interface (e.g., Qwen), you can fall back to JSON mode.
response_format = {"type": "json_object"}
Prompt the model to return a JSON string that matches the desired schema, then parse it in your code, as in the sketch below.
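For example, with a Qwen model behind an OpenAI‑compatible endpoint (the base_url, api_key, and model name below are placeholders for your deployment), the call might look like:

import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # placeholder endpoint
completion = client.chat.completions.create(
    model="qwen2.5-7b-instruct",  # placeholder model name
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "Return a JSON object with keys 'title' and 'sections' (a list of strings)."},
        {"role": "user", "content": "Outline a research report on DeepSeek-R1."},
    ],
)
result = completion.choices[0].message.content  # raw JSON string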
json_object = json.loads(result)
By combining these techniques, developers can reliably obtain structured data from DeepSeek‑R1 and integrate it into complex enterprise applications.