How Core Agent Concepts and Paradigms Have Evolved and the Rationale Behind Them

The article traces the evolution of AI agents from early ReAct‑style models through workflow‑based systems to autonomous and self‑evolving agents, analyzing six core dimensions—Prompt, Planning, Memory, Tools, Workflow, and Environment—and explains why each paradigm shift occurred, citing recent frameworks and research.

Alibaba Cloud Developer

May 22, 2026

In the past few years, rapid gains in large-model capability have set off an explosion of agent technologies such as Claude Code, Codex, OpenClaw and Hermes. The author observes that many practitioners still mix early-stage concepts with the newest ones, which causes confusion. To clarify the landscape, the article systematically reviews how agent paradigms have evolved.

Four development stages

Stage 1 – Early ReAct agents (2023, “enlightenment” period): Following Lilian Weng’s blog post LLM Powered Autonomous Agents, the architecture combines an LLM with planning, tools and memory. Interaction is essentially a one-turn “question-answer” or “instruction-execution” loop, limited to a few reasoning steps and short tool chains.

Stage 2 – Workflow agents (2024): Pure ReAct is not stable enough for B2B use. Frameworks like LangGraph and Dify introduce hard engineering constraints, turning the agent into a structured workflow in which the LLM is embedded at key nodes.

Stage 3 – Autonomous agents (2025): With products such as Manus, Claude Code and Codex, agents gain true planning abilities: decomposing vague goals into sub-tasks, generating todo lists, iterating over multiple steps, and self-verifying results.

Stage 4 – Self-evolving agents (2026 onward): New frameworks (e.g., Hermes, OpenClaw) let agents persist skills, maintain a knowledge base, and even improve via reinforcement-learning loops, turning them from disposable tools into long-lived digital employees.
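For contrast with the later stages, the short Stage 1 loop can be sketched as follows. This is a minimal illustration, not any framework's implementation: the `Action:` / `Final:` line format, the `react_loop` function, and the scripted stand-in for a model are all assumptions made for the example.

```python
from typing import Callable, Dict

def react_loop(llm: Callable[[str], str], tools: Dict[str, Callable[[str], str]],
               question: str, max_steps: int = 5) -> str:
    """Alternate model replies and tool calls until the model gives a final answer."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = llm(transcript).strip()
        transcript += reply + "\n"
        if reply.startswith("Final:"):
            return reply[len("Final:"):].strip()
        if reply.startswith("Action:"):
            # Hypothetical format: "Action: <tool name> <argument>"
            name, _, arg = reply[len("Action:"):].strip().partition(" ")
            tool = tools.get(name)
            observation = tool(arg) if tool else f"unknown tool {name}"
            transcript += f"Observation: {observation}\n"
    return "no answer within step budget"

def scripted_llm(transcript: str) -> str:
    """Stand-in for a real model: call the tool once, then report what it returned."""
    if "Observation:" not in transcript:
        return "Action: add 2,3"
    return "Final: " + transcript.rsplit("Observation: ", 1)[1].strip()

def add(arg: str) -> str:
    a, b = arg.split(",")
    return str(int(a) + int(b))

answer = react_loop(scripted_llm, {"add": add}, "What is 2 + 3?")
stuck = react_loop(lambda t: "Action: add 1,1", {"add": add}, "loop forever?", max_steps=2)
```

The `max_steps` budget reflects the "few reasoning steps and short tool chains" limit: a model that never emits a final answer simply runs out of steps.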

Six core technical dimensions and their evolution

Prompt: Early agents required a monolithic system prompt for each task, which was costly to maintain. Modern practice separates stable system-level instructions from dynamic task-specific data and loads the latter progressively from markdown files (e.g., SKILL.md, USER.md), a shift from “deep coupling” to “progressive loading”.
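One way to picture progressive loading is a prompt builder that always includes the stable instructions and a one-line summary of every skill, but pulls in the full text of a skill only when it is selected. The sketch below assumes a hypothetical layout of one directory per skill containing a `SKILL.md` whose first line is a summary; the function names and the demo skills are illustrative only.

```python
import tempfile
from pathlib import Path
from typing import Dict, List

SYSTEM_PROMPT = "You are a careful assistant. Use a skill only after it is loaded."

def skill_index(skills_dir: Path) -> Dict[str, str]:
    """Map each skill name to the first line of its SKILL.md, used as a summary."""
    index = {}
    for skill_file in sorted(skills_dir.glob("*/SKILL.md")):
        lines = skill_file.read_text(encoding="utf-8").splitlines()
        index[skill_file.parent.name] = lines[0] if lines else ""
    return index

def build_prompt(skills_dir: Path, task: str, selected: List[str]) -> str:
    """Stable instructions, skill summaries, then full text of selected skills only."""
    index = skill_index(skills_dir)
    parts = [SYSTEM_PROMPT, "Available skills:"]
    parts += [f"- {name}: {summary}" for name, summary in index.items()]
    for name in selected:
        if name in index:
            body = (skills_dir / name / "SKILL.md").read_text(encoding="utf-8")
            parts.append(f"--- skill: {name} ---\n{body}")
    parts.append(f"Task: {task}")
    return "\n".join(parts)

# Demo with two throwaway skills in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name, text in {
        "pdf-report": "Render a PDF report.\nStep 1: collect tables.",
        "db-backup": "Back up a database.\nStep 1: lock writes.",
    }.items():
        (root / name).mkdir()
        (root / name / "SKILL.md").write_text(text, encoding="utf-8")
    prompt = build_prompt(root, "Summarize Q3 sales as a PDF", ["pdf-report"])
```

Here the prompt lists both skills by summary but carries the detailed steps of `pdf-report` alone, so unused skills cost one line each rather than their full length.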

Planning: Planning began as a simple CoT chain (“Let’s think step by step”). It is now a planner that structurally decomposes complex goals, creates sub-tasks, and can instantiate sub-agents on the fly. This leap is driven by the improved reasoning ability of base models.
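The structural decomposition described here can be modeled as a task tree that the agent walks one open leaf at a time. The following is a minimal sketch under that assumption; the `Task` class, its methods, and the example plan are hypothetical, and a real planner would have the model produce and revise the tree rather than hard-code it.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Task:
    title: str
    done: bool = False
    subtasks: List["Task"] = field(default_factory=list)

    def is_done(self) -> bool:
        """A parent is done when all of its sub-tasks are; a leaf uses its own flag."""
        if self.subtasks:
            return all(t.is_done() for t in self.subtasks)
        return self.done

    def next_open(self) -> Optional["Task"]:
        """Depth-first search for the first unfinished leaf task."""
        if not self.subtasks:
            return None if self.done else self
        for t in self.subtasks:
            found = t.next_open()
            if found is not None:
                return found
        return None

def render(task: Task, depth: int = 0) -> str:
    """Print the tree as an indented todo list."""
    mark = "x" if task.is_done() else " "
    lines = [f"{'  ' * depth}- [{mark}] {task.title}"]
    lines += [render(t, depth + 1) for t in task.subtasks]
    return "\n".join(lines)

plan = Task("Ship report", subtasks=[
    Task("Collect data"),
    Task("Write summary", subtasks=[Task("Draft"), Task("Review")]),
])

executed = []
while (current := plan.next_open()) is not None:
    executed.append(current.title)  # a real agent would do the work here
    current.done = True
```

Keeping the plan as data rather than free text is what lets an agent resume after interruption, show progress as a todo list, or hand a subtree to a sub-agent.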

Memory: Short-term memory moved from raw dialogue logs to compressed, token-efficient representations (threshold-based pruning, structured summaries, key-fact extraction). Long-term memory moved from pure vector-store retrieval to a hybrid of file-system-based episodic logs (e.g., MEMORY.md) and semantic stores (local notebooks, SQLite, or vector DBs).
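Threshold-based pruning of short-term memory can be sketched as: once the history exceeds a token budget, replace everything but the most recent turns with a summary. The example below is an assumption-laden illustration; it estimates tokens by whitespace splitting instead of a real tokenizer, and the `summarize` callable stands in for a model-generated summary.

```python
from typing import Callable, List

def estimate_tokens(text: str) -> int:
    """Crude token estimate (word count); a real system would use a tokenizer."""
    return len(text.split())

def compact(messages: List[str], budget: int, keep_recent: int,
            summarize: Callable[[List[str]], str]) -> List[str]:
    """Summarize older messages once the history exceeds the token budget."""
    total = sum(estimate_tokens(m) for m in messages)
    if total <= budget or len(messages) <= keep_recent:
        return list(messages)
    split = len(messages) - keep_recent
    old, recent = messages[:split], messages[split:]
    return ["[summary] " + summarize(old)] + recent

history = [
    "user: my name is Ada",
    "agent: hello Ada",
    "user: book a table for two",
    "agent: which day?",
    "user: friday at seven",
]

def count_only(msgs: List[str]) -> str:
    """Placeholder summarizer; a model call would go here."""
    return f"{len(msgs)} earlier messages"

compacted = compact(history, budget=12, keep_recent=2, summarize=count_only)
untouched = compact(history, budget=100, keep_recent=2, summarize=count_only)
```

The history is left alone while it fits the budget, so the cost of summarization is paid only when the context is actually under pressure.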

Tools: Early agents used API-based function calls, which carried high development overhead. The new paradigm embraces native CLI commands and script execution, leveraging the model’s pre-trained knowledge of Unix tools (grep, cat, vim) and wrapping them as reusable Skills. Scripts can be local or remote, and are described in markdown to enable zero-shot usage.
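A CLI-style tool can be as small as one guarded wrapper around `subprocess.run`, in place of a hand-written function schema per API. The sketch below is a hypothetical illustration: `run_cli` and its allowlist are assumptions, and the demo invokes the current Python interpreter only so that the example runs on any machine; an agent would typically allow commands such as `grep` or `cat` instead.

```python
import subprocess
import sys
from typing import List, Set

def run_cli(argv: List[str], allowed: Set[str], timeout: float = 10.0) -> str:
    """Run an allowlisted command without a shell and return its standard output."""
    if not argv or argv[0] not in allowed:
        raise PermissionError(f"command not allowed: {argv[:1]}")
    done = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    if done.returncode != 0:
        raise RuntimeError(done.stderr.strip() or f"exit code {done.returncode}")
    return done.stdout

# Portable demo: the only allowed executable is the running interpreter.
allowed = {sys.executable}
output = run_cli([sys.executable, "-c", "print('hello')"], allowed)
```

Passing an argument list rather than a shell string avoids shell injection, and the allowlist keeps the model's freedom to compose commands within a bounded set. Note that some tools use non-zero exit codes for ordinary outcomes (`grep` returns 1 when nothing matches), so a production wrapper would need per-command handling.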

Workflow: Rigid, hard-coded pipelines gave way to dynamic skill-driven compositions. Core logic is now expressed in markdown skill files, while optional fixed workflows remain for high-stability paths, yielding a “skill-first, workflow-as-fallback” hybrid architecture.
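One possible reading of “skill-first, workflow-as-fallback” is a dispatcher that tries the flexible skill-driven path and drops to a fixed pipeline when that path fails or returns something unusable. The sketch below follows that reading; the `handle` function, the validity check, and the refund example are all hypothetical.

```python
from typing import Callable, Dict, Tuple

Task = Dict[str, str]
Handler = Callable[[Task], str]

def handle(task: Task, skill_agent: Handler, fixed_workflows: Dict[str, Handler],
           is_valid: Callable[[str], bool]) -> Tuple[str, str]:
    """Try the skill-driven path; use the fixed workflow if it fails or is rejected."""
    try:
        result = skill_agent(task)
        if is_valid(result):
            return ("skill", result)
    except Exception:
        pass  # treat any skill-path failure as a reason to fall back
    workflow = fixed_workflows.get(task["kind"])
    if workflow is None:
        raise RuntimeError(f"no fixed workflow for kind {task['kind']!r}")
    return ("workflow", workflow(task))

def flaky_agent(task: Task) -> str:
    """Stand-in for a skill-driven agent that returns nothing usable for refunds."""
    if task["kind"] == "refund":
        return ""
    return f"handled {task['kind']} via skills"

def refund_workflow(task: Task) -> str:
    """Stand-in for a hard-coded, high-stability pipeline."""
    return f"refund issued for order {task['order']}"

def non_empty(text: str) -> bool:
    return bool(text.strip())

workflows = {"refund": refund_workflow}
first = handle({"kind": "faq", "order": ""}, flaky_agent, workflows, non_empty)
second = handle({"kind": "refund", "order": "A17"}, flaky_agent, workflows, non_empty)
```

An alternative design routes designated high-stability task kinds straight to the fixed workflow without attempting the skill path at all; which is preferable depends on how costly a failed attempt is.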

Environment (Runtime): Agents have progressed from stateless calls to stateful, sandboxed runtimes. Two main deployment shapes exist: a local desktop workspace for personal automation (high flexibility, low safety) and container-based sandboxes (Docker/K8s) for enterprise use, providing isolation and resource control.
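For the container-based shape, isolation and resource control largely come down to the flags passed when the sandbox starts. The sketch below only builds a `docker run` argument list and does not execute it; the flags are standard Docker CLI options, while the function name, image name, and workspace path are assumptions for illustration.

```python
from typing import List

def sandbox_command(image: str, workspace: str, command: List[str],
                    memory: str = "512m", cpus: str = "1.0") -> List[str]:
    """Build a `docker run` argv with no network, capped resources, and one mounted dir."""
    return [
        "docker", "run", "--rm",          # remove the container when it exits
        "--network", "none",              # no network access from inside the sandbox
        "--memory", memory,               # memory cap
        "--cpus", cpus,                   # CPU cap
        "--read-only",                    # read-only root filesystem
        "-v", f"{workspace}:/workspace",  # the only writable location
        "-w", "/workspace",
        image, *command,
    ]

# Hypothetical image and path, shown for shape only.
argv = sandbox_command("agent-sandbox:latest", "/tmp/agent-ws", ["python", "task.py"])
```

The resulting list could be handed to `subprocess.run`; a Kubernetes deployment would express the same limits through pod resource settings and network policy instead.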

The article concludes that, although the high‑level modules (Prompt, Planning, Memory, Tools, Workflow, Environment) remain recognizable from Lilian Weng’s original framework, their internal implementations have been fundamentally re‑engineered. The shift from “model‑only magic” to “engineered determinism + model uncertainty” marks the maturation of Agent technology and provides a roadmap for building robust, scalable agents.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
