From Agentic Tools to Agentive Systems: A Review of “Critique of Agent Model”
The paper distinguishes agentic tools, which rely on external scaffolding, from truly agentive systems, whose goals, identity, decision‑making, self‑regulation and learning are internalized. It proposes the GIC (Goal‑Identity‑Configurator) architecture and evaluates its safety, auditability and applicability through a pilot‑training use case.
1 Introduction
The authors start from the fundamental question: where does automation end and agency begin? Modern LLM agents can call tools, write code and plan steps, but the paper argues that these capabilities alone do not confer genuine agency.
Current agentic systems depend on external engineering: humans supply goals, prompts fix identity, workflows dictate steps, and training pipelines are human‑controlled. The authors claim that the task performance of such systems stems from engineered orchestration rather than from an internal autonomous structure.
In contrast, biological intelligence exhibits layered abilities—language, physical interaction, social understanding, moral drive, and personal growth—suggesting that a truly open‑world general agent must model, revise and regulate these structures internally.
2 The Boundary Between Agentic and Agentive Systems
Two Classes of Systems
The paper defines a spectrum: at one end are agentic systems whose structure and process are defined by external scaffolding; at the other end are agentive systems that maintain goals, identity, decision‑making, self‑regulation and learning as internal, updatable components.
Agentic examples include coding agents that invoke editors, test runners and bug fixers, or research agents that retrieve literature and generate hypotheses. These rely on external orchestration—each step is prescribed by prompts, tools and controllers—so they lack long‑term goals and a self‑model.
Agentive systems, by contrast, must generate and maintain long‑term goals, evolve identity, simulate action consequences, decide when deep reasoning is needed, and learn from experience.
Five Distinguishing Dimensions
Goal: short‑term instruction vs. hierarchical long‑term goal decomposition.
Identity: fixed prompt description vs. adaptive self‑model that updates with experience.
Decision‑making: end‑to‑end black‑box policy vs. world‑model‑based simulation and critic evaluation.
Self‑regulation: static planning at every step vs. learned scheduler that chooses when to plan, react or learn.
Learning: human‑curated training pipelines vs. autonomous improvement from real and simulated experience.
3 Current Landscape and Critique
Common Problems of Existing Approaches
The survey lists current agentic systems—tool‑calling agents, multi‑agent workflows, code agents, web agents, robotic agents and various LLM‑based planning frameworks. All externalize key capabilities: goals are user‑provided, identity is hard‑coded in prompts, planning is scripted, and learning occurs in offline pipelines.
While this externalization improves reliability and debuggability, it makes the systems brittle in open‑world settings where tasks, environments or social contexts change.
Why Scaling End‑to‑End Strategies Is Insufficient
The authors reject the view that simply enlarging models, adding compute or training with more RL data will automatically yield grounded planning. Hidden activations or chain‑of‑thought tokens may produce useful intermediate representations, but they do not constitute world‑model‑based reasoning.
Robust planning requires verifiable prediction of action outcomes, a world model, a critic and reliability estimates; text‑only narrative reasoning cannot guarantee decisions based on true environment dynamics.
4 Five Dimensions of Agency
Goal: From Short‑Term Instructions to Hierarchical Goals
Agentic systems typically receive a short‑term goal \(g_t\) at each step, which disappears after execution. The paper proposes that an agentive model should receive a long‑term overall goal \(g\) and use a goal‑decomposition module \(\delta\) to generate executable sub‑goals.
This decomposition is not a fixed script but a learnable decision process that continuously selects, orders and revises sub‑goals based on state, world‑model predictions and execution feedback.
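As a rough illustration of decomposition as a revisable decision process rather than a fixed script, the sketch below re-selects and re-orders sub-goals every time the state changes. It is not the paper's implementation: `GoalDecomposer`, `toy_score` and the sub-goal names are hypothetical, and the hand-written scoring function stands in for whatever learned component would play that role.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class GoalDecomposer:
    """Hypothetical stand-in for a goal-decomposition module."""
    long_term_goal: str
    score_fn: Callable[[str, Dict], float]  # placeholder for a learned scorer
    completed: Set[str] = field(default_factory=set)

    def next_subgoals(self, candidates: List[str], state: Dict) -> List[str]:
        """Select and order the sub-goals that are useful in this state."""
        open_goals = [c for c in candidates if c not in self.completed]
        scored = [(self.score_fn(c, state), c) for c in open_goals]
        kept = [pair for pair in scored if pair[0] > 0]  # drop useless ones
        kept.sort(key=lambda pair: pair[0], reverse=True)
        return [name for _, name in kept]

    def feedback(self, subgoal: str, succeeded: bool) -> None:
        """Execution feedback: finished sub-goals leave the agenda."""
        if succeeded:
            self.completed.add(subgoal)


def toy_score(subgoal: str, state: Dict) -> float:
    """Hand-written scores, purely for illustration."""
    if subgoal == "avoid_storm":
        return 1.0 if state.get("storm_ahead") else 0.0
    if subgoal == "refuel":
        return 0.9 if state.get("fuel", 1.0) < 0.3 else 0.0
    if subgoal == "follow_route":
        return 0.5
    return 0.0


decomposer = GoalDecomposer("complete the mission", toy_score)
candidates = ["follow_route", "avoid_storm", "refuel"]
first = decomposer.next_subgoals(candidates, {"storm_ahead": True, "fuel": 0.8})
decomposer.feedback("avoid_storm", succeeded=True)
second = decomposer.next_subgoals(candidates, {"storm_ahead": False, "fuel": 0.2})
```

The same candidate list yields a different agenda once the state and the execution feedback change, which is the property the paragraph above describes.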
Identity: From Prompt Constraints to Adaptive Self‑Model
Identity is more than a static prompt such as “you are a helpful assistant”. The authors define identity as the agent’s self‑model of its abilities, constraints, roles, relationships and commitments, which influences what it pursues, avoids and trusts.
Fixed identity aids controllability but cannot adapt to new capabilities or tasks. The paper argues for an identity that evolves with experience while remaining auditable.
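One way to picture an identity that evolves yet stays auditable is a self-model whose every change is recorded. The sketch below is an assumption-laden toy, not the paper's design: the `SelfModel` class, the exponential-average update, the default estimate of 0.5 and the threshold are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SelfModel:
    """Hypothetical self-model: capability estimates plus an audit trail."""
    capabilities: Dict[str, float] = field(default_factory=dict)
    # Each entry: (skill, old estimate, new estimate, reason for the change)
    audit_log: List[Tuple[str, float, float, str]] = field(default_factory=list)

    def update_capability(self, skill: str, succeeded: bool,
                          reason: str, rate: float = 0.2) -> None:
        """Move the estimate toward the observed outcome and log the change."""
        old = self.capabilities.get(skill, 0.5)  # 0.5 = no evidence yet
        outcome = 1.0 if succeeded else 0.0
        new = old + rate * (outcome - old)
        self.capabilities[skill] = new
        self.audit_log.append((skill, old, new, reason))

    def trusts_itself_with(self, skill: str, threshold: float = 0.65) -> bool:
        """The self-model shapes behaviour: only attempt trusted skills."""
        return self.capabilities.get(skill, 0.5) >= threshold


me = SelfModel()
before = me.trusts_itself_with("crosswind_landing")
me.update_capability("crosswind_landing", True, "simulator session 1")
me.update_capability("crosswind_landing", True, "simulator session 2")
after = me.trusts_itself_with("crosswind_landing")
```

Because each update appends to `audit_log`, a reviewer can trace why the agent came to trust itself with a skill, which is the auditability requirement in miniature.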
Decision‑Making: From Reactive Policies to Simulation‑Based Reasoning
The paper distinguishes System I (fast reactive policies) from System II (world‑model‑based simulation). System II first simulates candidate trajectories, then a critic evaluates long‑term value before selecting an action.
Open‑world agents must learn when reactive responses suffice and when deep simulation is required; without a grounded world model, long‑horizon textual reasoning is merely plausible narrative.
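The simulate-then-evaluate loop attributed to System II can be sketched on a toy grid. Everything here is an illustrative assumption: the grid, the `GOAL` and `HAZARD` cells, the deterministic `world_model`, the hand-written `critic` and the exhaustive search are hypothetical stand-ins for learned components.

```python
from itertools import product

MOVES = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}
GOAL, HAZARD = (2, 0), (1, 0)  # toy layout: the hazard sits on the direct path


def world_model(state, action):
    """Predict the next state (deterministic toy dynamics)."""
    dx, dy = MOVES[action]
    return (state[0] + dx, state[1] + dy)


def rollout(state, actions):
    """Simulate a candidate action sequence without acting in the world."""
    trajectory = []
    for action in actions:
        state = world_model(state, action)
        trajectory.append(state)
    return trajectory


def critic(trajectory):
    """Score a simulated trajectory: end near the goal, avoid the hazard."""
    end = trajectory[-1]
    distance = abs(end[0] - GOAL[0]) + abs(end[1] - GOAL[1])
    hazard_visits = sum(1 for s in trajectory if s == HAZARD)
    return -distance - 10 * hazard_visits


def plan(state, horizon=4):
    """Enumerate candidate sequences and keep the one the critic rates best."""
    best = max(product(MOVES, repeat=horizon),
               key=lambda seq: critic(rollout(state, seq)))
    return best, critic(rollout(state, best))


best_actions, best_value = plan((0, 0))
```

A purely reactive rule such as "step toward the goal" would walk straight into the hazard cell here, while the simulated search finds a detour. Exhaustive enumeration only works at this toy scale.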
Self‑Regulation: From Fixed Planning to Learned Scheduling
Fixed Model‑Predictive Control (MPC) either over‑computes in simple scenarios or under‑computes in complex ones. The authors introduce a learned configurator \(\kappa\) that decides at each timestep whether to create a new plan, continue an existing one, act directly, or invoke learning routines.
This makes the decision of when to think, and how deeply, an internal capability, allowing dynamic trade‑offs among resources, risk and task difficulty.
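The four choices attributed to the configurator can be written down as a small dispatch function. In the sketch below the `Context` features and the thresholds are hypothetical, and the hand-written rules only mark where a learned policy would sit; the paper describes the configurator as learned, not rule-based.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    NEW_PLAN = "new_plan"
    CONTINUE_PLAN = "continue_plan"
    ACT_DIRECTLY = "act_directly"
    LEARN = "learn"


@dataclass
class Context:
    """Hypothetical features a configurator might condition on."""
    risk: float             # 0..1, estimated cost of a mistake
    novelty: float          # 0..1, how unfamiliar the situation looks
    plan_still_valid: bool  # does the current plan still match the state?
    idle: bool              # no urgent action is required right now


def configurator(ctx: Context, risk_threshold: float = 0.5,
                 novelty_threshold: float = 0.7) -> Mode:
    """Rule-based placeholder for a learned scheduling policy."""
    if ctx.idle and ctx.novelty >= novelty_threshold:
        return Mode.LEARN          # spare time plus unfamiliar data
    if ctx.risk < risk_threshold and ctx.novelty < novelty_threshold:
        return Mode.ACT_DIRECTLY   # cheap reactive response suffices
    if ctx.plan_still_valid:
        return Mode.CONTINUE_PLAN  # avoid paying for re-planning
    return Mode.NEW_PLAN           # risky or novel, and no usable plan


routine = configurator(Context(0.1, 0.2, True, False))
emergency = configurator(Context(0.9, 0.8, False, False))
```

Replacing the fixed thresholds with a trained model is what would turn this from a static schedule into the learned trade-off the section describes.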
Learning: From Human‑Curated Training to Self‑Improvement
Current agents are trained via human‑designed pipelines—RL in simulators, supervised data, frozen deployment models. Such pipelines limit the agent’s ability to decide when to learn, from which experiences, or how to supplement real data with simulated data.
The paper proposes that an agentive system should autonomously learn from both real interactions and simulated experiences generated by its world model, making learning a lifelong internal routine.
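Learning from both real and model-generated experience has a well-known precedent in the Dyna-Q pattern from reinforcement learning. The sketch below uses that pattern as an analogy only; this review does not state that the paper uses Dyna-Q, and the function name, tabular setting and hyperparameters are assumptions.

```python
import random
from collections import defaultdict


def dyna_q_update(q, model, transition, actions, rng,
                  alpha=0.5, gamma=0.9, n_simulated=5):
    """One Dyna-Q style step: a real update, then replayed model updates.

    Terminal states are not treated specially in this toy version.
    """
    def backup(s, a, r, s_next):
        target = r + gamma * max(q[(s_next, b)] for b in actions)
        q[(s, a)] += alpha * (target - q[(s, a)])

    s, a, r, s_next = transition
    backup(s, a, r, s_next)        # learn from the real interaction
    model[(s, a)] = (r, s_next)    # remember what the world did

    for _ in range(n_simulated):   # learn from simulated experience
        (ps, pa), (pr, ps_next) = rng.choice(list(model.items()))
        backup(ps, pa, pr, ps_next)


q_values = defaultdict(float)
learned_model = {}
dyna_q_update(q_values, learned_model,
              transition=(0, "right", 1.0, 1),
              actions=["left", "right"],
              rng=random.Random(0))
```

With a single real transition, the value estimate after the real update alone would be 0.5; the five simulated replays push it further toward the target, which shows how model-generated experience supplements scarce real data.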
5 The GIC Agent Model
Overall Architecture
Building on the analysis, the authors present the GIC (Goal‑Identity‑Configurator) architecture, comprising six components: belief encoder, long‑term goals, sub‑goals, identity, configurator, and a planner/world‑model/policy/critic bundle.
The execution flow is: the Universe provides observations → belief encoder creates an internal belief state → long‑term goals are hierarchically decomposed into sub‑goals → identity evolves with experience and influences behavior → configurator decides whether to invoke the planner or act directly → the planner, using the world model, simulates candidate trajectories, the critic evaluates long‑term value, and the policy executes the chosen action.
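The execution flow above can be read as a single control step. The skeleton below only wires the stages together in the order listed; the `GICAgent` class, its field names and the lambda components are hypothetical placeholders, not interfaces defined by the paper.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class GICAgent:
    """Hypothetical wiring of the stages; each callable is a stub."""
    encode_belief: Callable[[Any], Dict]             # observation -> belief
    decompose: Callable[[str, Dict], str]            # goal, belief -> sub-goal
    update_identity: Callable[[Dict, Dict], Dict]    # identity, belief -> identity
    should_plan: Callable[[Dict, str, Dict], bool]   # configurator decision
    plan: Callable[[Dict, str], str]                 # simulate, score, choose
    react: Callable[[Dict, str], str]                # direct policy action
    long_term_goal: str
    identity: Dict

    def step(self, observation: Any) -> str:
        belief = self.encode_belief(observation)
        subgoal = self.decompose(self.long_term_goal, belief)
        self.identity = self.update_identity(self.identity, belief)
        if self.should_plan(belief, subgoal, self.identity):
            return self.plan(belief, subgoal)
        return self.react(belief, subgoal)


agent = GICAgent(
    encode_belief=lambda obs: {"turbulence": obs["turbulence"]},
    decompose=lambda goal, b: "stabilise" if b["turbulence"] > 0.5 else "cruise",
    update_identity=lambda ident, b: {**ident, "steps": ident.get("steps", 0) + 1},
    should_plan=lambda b, subgoal, ident: subgoal == "stabilise",
    plan=lambda b, subgoal: "planned:" + subgoal,
    react=lambda b, subgoal: "reactive:" + subgoal,
    long_term_goal="complete the flight",
    identity={},
)
rough_air = agent.step({"turbulence": 0.9})
calm_air = agent.step({"turbulence": 0.1})
```

Keeping each stage behind its own callable mirrors the modular separation that the safety discussion later relies on.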
Pilot‑Training Use Case
The paper illustrates GIC with a pilot‑training scenario. Low‑level actions involve controlling the aircraft; mid‑level tasks handle weather, routes and emergencies; high‑level responsibilities include multi‑day mission planning and team coordination. No single fixed strategy can cover all levels, and GIC aims to unify short‑term reaction, long‑term planning, identity evolution and self‑learning within one architecture.
Training, Deployment and Evaluation
Different GIC components use distinct training signals. The world model is trained via generative prediction; the policy via supervised learning, RL or simulated experience; the configurator by learning when to plan versus act; identity by updating from the agent's own behaviour and external feedback; and goal decomposition by learning hierarchical task structures.
Evaluation must go beyond task success rate and test each capability separately: goal decomposition, identity adaptation, simulation‑based reasoning reliability, self‑regulation strategy, learning efficiency, open‑world generalisation and safety constraints.
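To see why a single success rate can mislead, consider a per-capability report. The sketch below is illustrative only: the `capability_report` function, the threshold and the pass/fail data are made up for the example, and only the capability names come from the list above.

```python
from statistics import mean


def capability_report(results, threshold=0.7):
    """Summarise pass rates per capability and flag the weak ones."""
    report = {}
    for capability, outcomes in results.items():
        rate = mean(outcomes)  # outcomes are 1 (pass) or 0 (fail)
        report[capability] = {"pass_rate": rate, "flagged": rate < threshold}
    return report


# Fabricated toy outcomes, used only to exercise the function.
toy_results = {
    "goal_decomposition": [1, 1, 1, 1],
    "identity_adaptation": [1, 0, 0, 1],
    "self_regulation": [1, 1, 1, 0],
}
report = capability_report(toy_results)
overall = mean(x for outcomes in toy_results.values() for x in outcomes)
```

In this toy data the overall pass rate is 0.75, which clears the threshold, yet `identity_adaptation` sits at 0.5 and is flagged. An aggregate metric would have hidden that weakness.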
6 Safety, Auditability, and Controllability
Why Stronger Agency Is Not Inherently Dangerous
The authors acknowledge that increased autonomy raises safety concerns—self‑preservation, resource acquisition, goal drift—but argue that GIC’s modular separation of goal, identity, world model, critic and configurator actually improves auditability and intervention.
If dangerous behaviour emerges, developers can pinpoint the faulty component (goal decomposition, identity model, world‑model prediction, critic valuation, or configurator policy) rather than debugging a monolithic black‑box.
Human Supervision Position
Long‑term goals in GIC remain human‑provided; the system has no mechanism to generate its own ultimate objectives, which serves as a safety boundary. Identity evolution, sub‑goal generation and policy learning all serve the externally supplied goal and operate within an auditable structure.
The paper’s stance is not to abandon control but to internalise autonomy in a way that remains observable, constrainable and safely evolvable.
7 Conclusion
The core contribution of “Critique of Agent Model” is a clearer boundary for the over‑used term “agent”. Many existing systems are powerful agentic tools, but their abilities stem from external workflows and engineering scaffolds. To progress toward truly agentive open‑world systems, goals, identity, decision‑making, self‑regulation and learning must be internalised as trainable, auditable and controllable structures.
GIC offers a concrete route: hierarchical goal decomposition, experience‑driven identity, world‑model‑supported simulation reasoning, a learned configurator for dynamic planning versus reaction, and lifelong learning from real and simulated experience. The paper emphasizes that while tool‑calling and prompt engineering can produce strong agentic behaviour, genuine agency requires internalising these five dimensions while preserving human oversight.
Machine Learning Algorithms & Natural Language Processing
Focused on frontier AI technologies, empowering AI researchers' progress.
