Why Do AI Agents Get Dumber Over Time? The ICML 2026 Theory of Agent Explains

The article introduces the ICML 2026 Theory of Agent (ToA), analyzes four common failure modes of modern agents, explains the internal‑vs‑external tool trade‑off through a knowledge‑boundary framework, and outlines how effort‑conservation and the β parameter guide self‑evolving agent design and future research.

Data Party THU

May 31, 2026

Overview

The authors present the Theory of Agent (ToA), a unified framework for AI agents proposed jointly by the University of Edinburgh, Princeton, UIUC, Northwestern, and CUHK and accepted as an ICML 2026 position paper. ToA reframes agent research from an engineering contest into a scientific discipline that asks not only whether an agent works, but why it works and when it should work.

Motivating Analogy

Using a simple exam scenario, the article contrasts two students: Student A solves problems by recalling knowledge and reasoning (closed‑book), while Student B repeatedly looks up answers (open‑book). Both achieve perfect scores, yet after a semester Student A’s “problem‑solving intuition” improves, whereas Student B’s knowledge does not grow, illustrating the long‑term divergence between internal reasoning and external tool reliance.

Four Failure Modes

The authors identify four symptom categories that all stem from a single underlying decision: whether to continue internal reasoning or to delegate to an external tool. These are under‑thinking, over‑thinking, under‑acting, and over‑acting. Current research typically patches each symptom separately (e.g., length penalties for over‑reasoning, action‑budget limits for tool overuse), but ToA argues that they are manifestations of a shared mis‑allocation of epistemic effort.

Internal vs. External Tools

ToA defines two tool families:

Internal cognitive tools (chain‑of‑thought, reflection, decomposition) that reorganize information already present in the model.

External physical tools (search, API calls, UI actions, code execution) that inject information the model does not possess.

Both reduce epistemic uncertainty; the difference lies in the source of information. Figure 1 visualizes how the same correct answer can arise from either an agent that over‑relies on external tools or one that maximizes internal reasoning.

Knowledge Boundary and Effort Conservation

ToA introduces the knowledge boundary that separates the internal task set (tasks solvable by pure reasoning) from the world task set (all tasks the environment presents). The gap between them represents tasks that truly require external tools. The authors further propose an effort‑conservation principle: the total epistemic effort required to solve a task is fixed; agents merely redistribute this effort between internal reasoning (E_int) and external action (E_ext). This is formalized as β·E_int + E_ext = E*, where β captures the relative cost of internal reasoning versus external calls.

When β is large (internal reasoning is expensive), the optimal allocation shifts toward external tools (the “small model + strong toolchain” regime). When β is small (external calls are costly), agents favor internal reasoning (the “large model + self‑contained reasoning” regime). This connects ToA to resource‑bounded rationality.
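The redistribution described above can be made concrete by rearranging the conservation relation for E_ext. The following is a minimal sketch, assuming efforts are non-negative scalars in arbitrary units; the function name `external_effort` and all numeric values are hypothetical and do not come from the paper.

```python
def external_effort(e_total: float, beta: float, e_int: float) -> float:
    """Solve beta * E_int + E_ext = E* for E_ext.

    Assumption (not from the paper): efforts are non-negative scalars
    measured in arbitrary units.
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    if e_int < 0 or beta * e_int > e_total:
        raise ValueError("need 0 <= beta * e_int <= e_total")
    return e_total - beta * e_int


# Hypothetical task with E* = 10 and beta = 2: the more effort is spent
# internally, the less is left to be covered by external action.
for e_int in (0.0, 2.0, 5.0):
    print(e_int, external_effort(10.0, 2.0, e_int))
```

With these illustrative numbers, the external share falls from 10 to 6 to 0 as internal effort rises from 0 to 2 to 5, which is the redistribution the principle describes.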

Self‑Evolving Agents

An agent is deemed self‑evolving if its internal task set expands over time. In static worlds this expansion is a coverage problem—agents gradually internalize tasks previously outsourced. In dynamic worlds the expansion must outpace the rate at which new tasks appear, leading to a quantitative inequality that governs whether the knowledge boundary can keep moving.
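The summary does not reproduce the paper's inequality, so the toy simulation below only illustrates the intuition. It assumes constant per-step rates: `learn_rate` tasks internalized and `arrival_rate` new tasks appearing per step. These names, and the model itself, are hypothetical simplifications.

```python
def simulate_gap(internal: int, world: int, learn_rate: int,
                 arrival_rate: int, steps: int) -> list[int]:
    """Track the gap |world tasks| - |internal tasks| over time.

    Assumption (not from the paper): both rates are constant, and the
    internal task set can never exceed the world task set.
    """
    gaps = []
    for _ in range(steps):
        world += arrival_rate                          # new tasks appear
        internal = min(world, internal + learn_rate)   # tasks get internalized
        gaps.append(world - internal)
    return gaps


# Static world: no new tasks, so any positive learn_rate closes the gap.
print(simulate_gap(50, 100, learn_rate=5, arrival_rate=0, steps=5))
# Dynamic world, learning outpaces arrivals: the gap shrinks.
print(simulate_gap(50, 100, learn_rate=5, arrival_rate=2, steps=5))
# Dynamic world, arrivals outpace learning: the gap widens.
print(simulate_gap(50, 100, learn_rate=2, arrival_rate=5, steps=5))
```

Under this simplified model, the condition for the boundary to keep closing on the world task set reduces to `learn_rate > arrival_rate`; the paper's actual inequality may take a different form.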

Training Directions

The paper outlines four complementary training pathways to address the “only‑correctness” pathology:

Agentic Pre‑training (Next‑Tool Prediction): extend next‑token pre‑training to predict the next tool, turning interaction traces into first‑class modeling targets.

Agentic SFT (Ability‑Specific Supervision): tailor supervised fine‑tuning datasets to each model’s internal solvability (Q_int), avoiding a one‑size‑fits‑all tool‑use standard.

Agentic RL (Process‑Level Rewards): reward not only correct answers but also the process (when to think vs. when to act), e.g., the OTC‑PO method penalizes unnecessary tool calls.

Agentic Prompting: use ReAct‑style scaffolds and memory prompts to elicit complex tool‑use without changing parameters, while acknowledging their limited ability to evaluate decision quality.

These pathways are envisioned to iterate (RL → SFT → RL) to gradually align agents with effort‑consistent behavior.
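To illustrate what a process‑level reward might look like, the sketch below subtracts a penalty for each tool call beyond an estimated minimum. It is a toy shaping rule, not the OTC‑PO formulation; the `Episode` fields, the `needed_tool_calls` estimate, and the penalty value are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    correct: bool            # did the final answer match the reference?
    tool_calls: int          # external tool calls the agent actually made
    needed_tool_calls: int   # hypothetical estimate of the minimum required


def process_reward(ep: Episode, penalty: float = 0.1) -> float:
    """Outcome reward minus a penalty per tool call beyond the estimated minimum.

    Assumption: only over-acting is penalized here; under-acting would
    need a separate term.
    """
    outcome = 1.0 if ep.correct else 0.0
    excess = max(0, ep.tool_calls - ep.needed_tool_calls)
    return outcome - penalty * excess


# Two correct episodes: the second one made three unnecessary tool calls,
# so it scores lower even though the final answer is the same.
frugal = Episode(correct=True, tool_calls=1, needed_tool_calls=1)
wasteful = Episode(correct=True, tool_calls=4, needed_tool_calls=1)
print(process_reward(frugal), process_reward(wasteful))
```

An outcome‑only reward would score both episodes identically, which is the “only‑correctness” pathology the pathways aim to address.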

Open Research Questions

How to reliably estimate the internal solvability function Q_int(m, W) for a given model and environment?

How to train agents that respect the effort‑conservation invariant, given that standard RL only observes outcomes?

How to benchmark effort allocation rather than mere answer correctness, distinguishing internal reasoning from external outsourcing?
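One possible proxy for the first question is an empirical closed‑book pass rate: sample the model several times without tools and count how often it is correct. The sketch below assumes that approach; it is not the paper's estimator, and `solve_closed_book`, `is_correct`, and the stub model are hypothetical stand‑ins.

```python
from typing import Callable, Sequence


def estimate_q_int(
    solve_closed_book: Callable[[str], str],
    is_correct: Callable[[str, str], bool],
    tasks: Sequence[tuple[str, str]],
    samples_per_task: int = 4,
) -> dict[str, float]:
    """Empirical closed-book pass rate per task (a proxy for internal solvability)."""
    rates = {}
    for prompt, reference in tasks:
        hits = sum(
            is_correct(solve_closed_book(prompt), reference)
            for _ in range(samples_per_task)
        )
        rates[prompt] = hits / samples_per_task
    return rates


# Hypothetical stub standing in for a real model called without tools.
def stub_model(prompt: str) -> str:
    return "4" if prompt == "2+2" else "unknown"


def exact_match(answer: str, reference: str) -> bool:
    return answer == reference


tasks = [("2+2", "4"), ("weather today", "sunny")]
print(estimate_q_int(stub_model, exact_match, tasks))
```

Tasks with a pass rate near 1 would sit inside the knowledge boundary and tasks near 0 outside it; where to place the threshold in between is left open here.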

Implications for Long Context vs. Retrieval‑Augmented Generation

Under ToA, long‑context scaling and RAG are two points on the same continuum: long context pushes information into the model (internal), while RAG fetches it on demand (external). The authors argue that, when accuracy is equal, long‑context is generally preferable because it internalizes knowledge, whereas RAG remains valuable for dynamic or massive knowledge sources.

Conclusion

ToA reframes agents as systems that allocate a fixed budget of epistemic effort between internal reasoning and external action. By making the knowledge boundary explicit and introducing the β parameter, the framework unifies disparate failure modes, guides training strategies, and sets a research agenda for truly self‑evolving, increasingly autonomous agents.


Written by

Data Party THU

Official platform of Tsinghua Big Data Research Center, sharing the team's latest research, teaching updates, and big data news.
