How ReAct (Reasoning + Acting) Empowers LLM Agents to Solve Real‑World Tasks

This article explains the ReAct paradigm—combining reasoning, action, and observation—to turn large language models into controllable agents, detailing its core concepts, architecture, workflow, code implementation, application scenarios, advantages over other methods, and future research directions.

Tencent Cloud Developer

What is ReAct?

ReAct (Reasoning + Acting) is a paradigm that enables large language models (LLMs) to solve complex tasks by iteratively performing a Thought → Act → Observe (TAO) cycle. The model generates an explicit reasoning step (Thought), selects a tool and emits a standardized action (Act), receives a structured observation (Observe) from the tool, and repeats until a finish[...] action is emitted.

ReAct overview diagram

Design Principles

Environment anchoring: factual queries must be answered via external tools, preventing hallucinations.

Explainability first: each Thought must state the task status, the purpose of the action, and the expected result.

Modular decoupling: reasoning, action planning, and loop control are separate modules, allowing the tool set to be replaced without code changes.

Fault tolerance: automatic retries, error handling, and context pruning keep the loop robust.
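The fault-tolerance principle can be sketched as a retry wrapper around a tool call. This is a minimal illustration, not part of the original design; the retry count, backoff schedule, and 50-character error truncation are arbitrary choices:

```python
import time

def run_with_retries(tool_run, params, max_retries=3, delay=0.1):
    """Call a tool's run() callable, retrying on failure.

    Returns the tool output, or an error observation string after
    the final attempt so the TAO loop can keep going.
    """
    last_error = None
    for attempt in range(max_retries):
        try:
            return tool_run(params)
        except Exception as e:
            last_error = e
            time.sleep(delay * (attempt + 1))  # simple linear backoff
    # Surface a truncated error message as the observation
    return f"Tool failed after {max_retries} attempts: {str(last_error)[:50]}"
```

Returning an error string rather than raising keeps the loop alive: the model sees the failure as an ordinary Observation and can decide to retry differently or finish.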

ReAct Workflow

Initialization: parse the natural-language task, load 1–3 few-shot examples (Task–Thought–Act–Observe–Result), and create a ContextManager to store TAO triples.

Iterative TAO loop: at each step the LLM generates a Thought and an Action. The Action is validated, routed to the corresponding tool, and the tool's result becomes the Observation. The triple is appended to the context; when the accumulated context exceeds the model's token limit, the manager retains the three most recent steps plus a concise summary of earlier steps.

Termination: the loop stops when the model emits a finish[...] action, when a maximum step count (typically 5–10) is reached, or after repeated tool failures. The final result and the full execution trace are returned.

TAO loop diagram

Technical Architecture

ReAct is organized into three layers.

Core Logic Layer: an LLM plus prompt engineering produces Thoughts and formats Actions.

Execution Loop Layer: a ContextManager stores the TAO history, an ActionParser validates and extracts tool calls, and a Scheduler decides whether to continue or terminate.

External Interaction Layer: a set of standardized tools (search, data processing, service booking, device control) that expose a run(params) method and return structured observations.
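One way to implement the ActionParser in the Execution Loop Layer is a small regex over the tool_name[params] action format used throughout this article. This is a sketch; the exact validation rules and error strings are assumptions, not part of the original design:

```python
import re

# Matches actions of the form tool_name[params]
ACTION_RE = re.compile(r"^(\w+)\[(.*)\]$", re.DOTALL)

def parse_action(action: str, known_tools: set):
    """Split an action string into (tool_name, params).

    Returns ("finish", result) for finish[...] actions, or
    (None, error_message) when the action is malformed or
    the tool is unknown.
    """
    m = ACTION_RE.match(action.strip())
    if not m:
        return None, f"Malformed action: {action}"
    name, params = m.group(1), m.group(2)
    if name == "finish":
        return "finish", params
    if name not in known_tools:
        return None, f"Unknown tool: {name}"
    return name, params
```

Centralizing this check means the Scheduler only ever sees three cases: a valid tool call, a finish action, or an error observation to feed back to the model.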

Key Implementation Details (Python)

from typing import Any, List

class BaseTool:
    """Standard tool interface"""
    def __init__(self, name: str, description: str):
        self.name = name
        self.description = description

    def run(self, params: Any) -> str:
        raise NotImplementedError("Tool must implement run()")

class FlightSearchTool(BaseTool):
    def __init__(self):
        super().__init__(name="flight_search",
                         description="Search flights; param format: 'origin,destination,date,period'")

    def run(self, params: str) -> str:
        try:
            dep, arr, date, period = params.split(',')
            # Hard-coded lookup table standing in for a real flight API
            flight_map = {
                "Shenzhen,Hainan,tomorrow,evening": "Matching flights: 1. HU7089 (fare 480 CNY)"
            }
            return flight_map.get(f"{dep},{arr},{date},{period}",
                                  f"No flights found from {dep} to {arr}")
        except Exception as e:
            return f"Search failed: {str(e)[:50]}"

class FlightBookTool(BaseTool):
    def __init__(self):
        super().__init__(name="flight_book",
                         description="Book a flight; param format: 'flight_no,passenger_name,id_number'")

    def run(self, params: str) -> str:
        try:
            flight_no, name, id_card = params.split(',')
            return f"Flight {flight_no} booked for {name}, ID ending in {id_card[-4:]}"
        except Exception as e:
            return f"Booking failed: {str(e)[:50]}"

class ContextManager:
    def __init__(self, max_length: int = 4000):
        self.max_length = max_length
        self.tao_trajectory = []

    def add_tao(self, thought: str, action: str, observation: str) -> None:
        self.tao_trajectory.append({"thought": thought,
                                    "action": action,
                                    "observation": observation})
        self._prune()

    def _prune(self) -> None:
        # Keep the three most recent steps plus a short summary of earlier ones
        if len(str(self.tao_trajectory)) <= self.max_length:
            return
        recent = self.tao_trajectory[-3:]
        early = [t["action"] for t in self.tao_trajectory[:-3]][:2]
        summary = f"Earlier actions: {', '.join(early)}..."
        self.tao_trajectory = [{"thought": "[early summary]",
                                "action": "",
                                "observation": summary}] + recent

    def get_context(self) -> str:
        if not self.tao_trajectory:
            return "No history yet"
        return "\n".join(
            f"Step {i+1}: Thought: {t['thought']} | Action: {t['action']} | Observation: {t['observation']}"
            for i, t in enumerate(self.tao_trajectory))

def react_core_loop(task: str, tools: List[BaseTool], max_steps: int = 6):
    ctx = ContextManager()
    tool_map = {t.name: t for t in tools}
    for step in range(max_steps):
        # Prompt construction (simplified for illustration)
        tool_desc = "\n".join(f"- {n}: {t.description}" for n, t in tool_map.items())
        prompt = (f"Task: {task}\nHistory: {ctx.get_context()}\n"
                  f"Available tools:\n{tool_desc}\nOutput a Thought and an Action.")
        # Simulated LLM output for demo purposes
        if step == 0:
            llm_output = "Thought: Need to search for flights. Action: flight_search[Shenzhen,Hainan,tomorrow,evening]"
        elif step == 1:
            llm_output = "Thought: Got the flight list; pick the cheapest. Action: flight_book[HU7089,Li Si,123456199505056789]"
        else:
            llm_output = "Thought: Task complete. Action: finish[Booked the cheapest evening flight]"
        thought = llm_output.split("Thought:")[1].split("Action:")[0].strip()
        action = llm_output.split("Action:")[1].strip()
        if action.startswith("finish["):
            return action[7:-1], ctx.get_context()
        tool_name = next((n for n in tool_map if action.startswith(n)), None)
        if tool_name:
            params = action[len(tool_name)+1:-1]
            observation = tool_map[tool_name].run(params)
        else:
            observation = f"Invalid action: {action}"
        ctx.add_tao(thought, action, observation)
    return "Unfinished (step limit reached)", ctx.get_context()

Typical Applications

Knowledge-intensive tasks: multi-hop QA, fact-checking, literature retrieval.

Interactive decision making: itinerary planning, e-commerce shopping, schedule optimization.

Intelligent customer service: personalized advice, troubleshooting, health guidance.

Embodied intelligence: household robots, assembly-line automation, autonomous driving.

Advantages over Prior Methods

Strong reasoning‑action synergy.

Effective hallucination suppression via external grounding.

Explicit step‑by‑step explainability.

Modular tool replacement enables rapid adaptation to new domains.

Low deployment cost: few‑shot prompting without model fine‑tuning.

Limitations and Future Directions

Two main limitations are identified:

Context‑window constraints: long TAO sequences require aggressive summarization, which may discard essential logical information.

Action selection relies solely on LLM output; without quantitative reward feedback, the system can issue redundant or sub-optimal tool calls.

Potential research directions include:

Integrating reinforcement‑learning reward signals to guide action selection and reduce unnecessary tool invocations.

Connecting external memory stores (vector databases, knowledge graphs) to extend effective context length.

Improving error‑handling policies and dynamic step‑budget allocation.
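The external-memory direction above can be sketched in miniature: store every TAO step in a retrieval index and pull back only the most relevant steps, instead of keeping the whole trajectory in the prompt. In this toy sketch, a bag-of-words Counter with cosine similarity stands in for a real embedding model and vector database:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' standing in for a real model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class ExternalMemory:
    """Stores every TAO step; retrieves the k most relevant ones
    for the current query rather than the k most recent."""
    def __init__(self):
        self.entries = []  # (text, vector) pairs

    def add(self, text: str) -> None:
        self.entries.append((text, embed(text)))

    def retrieve(self, query: str, k: int = 3):
        qv = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(qv, e[1]), reverse=True)
        return [text for text, _ in ranked[:k]]
```

Unlike the recency-based pruning in ContextManager, relevance-based retrieval can recover an early step that matters for the current decision even after hundreds of intervening steps.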

References

[1] ReAct: Synergizing Reasoning and Acting in Language Models, arXiv:2210.03629.

[2] ReAct Project Homepage, https://react-lm.github.io
