Demystifying Harness, Scaffold, and Other Tricky AI Agent Terms
This article breaks down the core terminology of AI agents—Model, Scaffold, Harness, Context Engineering, Policy, Tool Use, Skills, Sub‑agents, and the training‑side concepts of RL Environment, Trainer, Rollout, and Reward—explaining their roles, differences, and how they combine to form functional agents.
Model and Its Limits
Model refers to the large language model (LLM) itself; examples include Claude, Qwen, GPT, Kimi, and DeepSeek. A model consumes a text prompt and generates a text response. On its own it keeps no memory between calls and cannot run a loop: every call starts fresh, and anything that should persist has to be supplied again in the prompt.
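A minimal sketch of what "no cross-call memory" means in practice. The `fake_model` function is a hypothetical stand-in for a real LLM call, and `chat_turn` is an illustrative helper, not any vendor's API. The point is only that the caller, not the model, carries the conversation forward.

```python
from typing import Callable, List


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: text in, text out, no stored state."""
    asked = "What is my name?" in prompt
    if asked and "my name is Ada" in prompt:
        return "Your name is Ada."
    if asked:
        return "I don't know your name."
    return "Nice to meet you."


def chat_turn(model: Callable[[str], str], history: List[str], user_message: str) -> List[str]:
    """Resend the whole history on every call, because the model keeps none of it."""
    history = history + [f"User: {user_message}"]
    reply = model("\n".join(history))
    return history + [f"Assistant: {reply}"]


# Two isolated calls: the second one has no idea what the first one said.
isolated = fake_model("What is my name?")

# Same question, but the caller replays the earlier turn inside the prompt.
history = chat_turn(fake_model, [], "Hello, my name is Ada")
history = chat_turn(fake_model, history, "What is my name?")
```

Here `isolated` shows the model failing without history, while the last entry of `history` shows it succeeding once the earlier turn is replayed.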
From Model to Agent: Scaffold and Harness
To turn a static model into an autonomous agent, two layers are added: Scaffold and Harness. Scaffold is the behavior‑definition layer surrounding the model, providing system prompts, tool descriptions, output formats, and context‑management rules. It determines how the model perceives the world and what actions it can take.
Harness is the execution layer that drives the model, handles tool calls, and decides when the agent should stop. While Scaffold supplies information visible to the model, Harness supplies invisible logic that orchestrates the entire run.
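The split between the two layers can be sketched as follows. This is an illustration under stated assumptions, not how any particular product is built: `Scaffold`, `run_harness`, `scripted_model`, and the JSON action format are all hypothetical names invented for this example. The Scaffold produces the text the model sees; the Harness is the loop the model never sees.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Scaffold:
    """Model-visible layer: system prompt plus tool descriptions."""
    system_prompt: str
    tool_descriptions: Dict[str, str]

    def render(self, history: List[str]) -> str:
        tools = "\n".join(f"- {name}: {desc}" for name, desc in self.tool_descriptions.items())
        return f"{self.system_prompt}\nTools:\n{tools}\n" + "\n".join(history)


def run_harness(model: Callable[[str], str], scaffold: Scaffold,
                tools: Dict[str, Callable], task: str, max_steps: int = 5) -> str:
    """Model-invisible layer: drives the loop, executes tool calls, decides when to stop."""
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        action = json.loads(model(scaffold.render(history)))
        if action["type"] == "final":           # stop condition 1: the model is done
            return action["answer"]
        result = tools[action["tool"]](**action["args"])
        history.append(f"Observation from {action['tool']}: {result}")
    return "stopped: step limit reached"        # stop condition 2: safety limit


def scripted_model(prompt: str) -> str:
    """Hypothetical model: calls the tool once, then reports the last observation."""
    if "Observation" not in prompt:
        return json.dumps({"type": "tool", "tool": "add", "args": {"a": 2, "b": 3}})
    last_line = prompt.strip().splitlines()[-1]
    return json.dumps({"type": "final", "answer": last_line.split(": ")[-1]})


scaffold = Scaffold("You are a careful calculator.", {"add": "add two integers a and b"})
answer = run_harness(scripted_model, scaffold, {"add": lambda a, b: a + b}, "What is 2 + 3?")
```

The `max_steps` limit is one simple way a Harness can trade autonomy for controllability: the model chooses its actions, but the loop decides how long it may keep going.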
Analogy and Design Considerations
Think of Scaffold as the script and props, and Harness as the director and stage manager that make the performance happen. Good Harness engineering balances autonomy with controllability, preventing agents from drifting off course.
Product‑Level Variations
Some products (e.g., Claude Code) tightly bind a specific Harness to a model, allowing deep optimization. Others (e.g., Antigravity CLI, Hermes Agent) keep the Harness modular so any model can be plugged in, trading some optimization for flexibility. The same model paired with different Harnesses can yield dramatically different user experiences.
Orchestrator and Sub‑Agents
When multiple agents are coordinated, an Orchestrator (or higher-level controller) manages each agent's Harness. Sub-agents are agents invoked to handle sub-tasks. Each has its own model, Scaffold, and Harness, and returns only its result to the caller without exposing its internal steps.
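A small sketch of that boundary, with hypothetical class names (`SubAgent`, `Orchestrator`) and plain functions standing in for each sub-agent's own model, Scaffold, and Harness. The Orchestrator only ever sees the returned result; the sub-agent's working trace stays internal.

```python
from typing import Callable, Dict, List, Tuple


class SubAgent:
    """Wraps its own model/Scaffold/Harness (here: one function) behind run()."""

    def __init__(self, name: str, solve: Callable[[str], str]):
        self.name = name
        self._solve = solve
        self._trace: List[str] = []   # internal steps, never returned to the caller

    def run(self, subtask: str) -> str:
        self._trace.append(f"received: {subtask}")
        result = self._solve(subtask)
        self._trace.append(f"produced: {result}")
        return result                 # only the result crosses the boundary


class Orchestrator:
    """Routes each sub-task to a named sub-agent and collects the results."""

    def __init__(self, agents: Dict[str, SubAgent]):
        self.agents = agents

    def handle(self, plan: List[Tuple[str, str]]) -> Dict[str, str]:
        return {subtask: self.agents[name].run(subtask) for name, subtask in plan}


orchestrator = Orchestrator({
    "upper": SubAgent("upper", lambda text: text.upper()),
    "reverse": SubAgent("reverse", lambda text: text[::-1]),
})
results = orchestrator.handle([("upper", "harness"), ("reverse", "scaffold")])
```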
Context Engineering, Memory, and Policy
Context Engineering designs what the agent sees at each step: system prompts, tool descriptions, conversation history, and retrieved knowledge. Short-term memory lives within the context window; long-term memory is stored externally and injected when relevant. Policy is the probability distribution over possible actions in any given situation. In an agent, the effective policy is shaped both by what the model has learned and by the choices made in the Scaffold and Harness.
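One way to picture context assembly is as packing a fixed budget. The sketch below is an assumption-heavy illustration: `build_context` is a hypothetical helper, and token cost is approximated by counting whitespace-separated words, which real tokenizers do not do. Fixed material goes in first, then retrieved long-term notes, then as many of the most recent turns as still fit.

```python
from typing import List


def build_context(system_prompt: str, tool_descriptions: List[str],
                  history: List[str], retrieved: List[str], budget: int) -> List[str]:
    """Assemble what the model sees this step, within a rough word budget."""
    def cost(text: str) -> int:
        return len(text.split())      # crude stand-in for a token count

    fixed = [system_prompt] + tool_descriptions
    remaining = budget - sum(cost(item) for item in fixed)
    if remaining < 0:
        raise ValueError("budget too small for the system prompt and tool descriptions")

    # Long-term memory: inject retrieved notes while they fit.
    notes = []
    for note in retrieved:
        if cost(note) <= remaining:
            notes.append(note)
            remaining -= cost(note)

    # Short-term memory: keep the newest turns, dropping the oldest first.
    recent: List[str] = []
    for turn in reversed(history):
        if cost(turn) > remaining:
            break
        recent.insert(0, turn)
        remaining -= cost(turn)

    return fixed + notes + recent


context = build_context(
    system_prompt="You are helpful.",
    tool_descriptions=["search: look things up"],
    history=["user: hi", "assistant: hello there", "user: what is GRPO"],
    retrieved=["note: user prefers short answers"],
    budget=20,
)
```

With a budget of 20 words the oldest turn ("user: hi") is dropped, while the retrieved note and the two newest turns survive. The ordering and priority rules here are a design choice; other schemes, such as summarizing old turns instead of dropping them, are equally plausible.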
Tool Use, Skills, and Their Spectrum
Tool Use is how an agent interacts with external resources (APIs, code interpreters, databases, web search, file systems). A Skill packages reusable, structured knowledge to accomplish multi‑step tasks, sitting between simple tools and full sub‑agents on the complexity spectrum.
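The difference between a tool and a Skill can be sketched roughly like this. Everything here is hypothetical: the `Skill` dataclass, the two toy tools, and the in-memory `DOCS` store are made up for illustration and do not mirror any specific product's skill format. A tool is a single callable; a Skill bundles instructions with an ordered, reusable sequence of tool steps.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# In-memory stand-in for an external resource such as a file system or database.
DOCS = {"readme": "agents need a harness"}

# Tools: single-purpose callables the agent can invoke.
TOOLS: Dict[str, Callable[[Any], Any]] = {
    "fetch": lambda key: DOCS[key],
    "count_words": lambda text: len(text.split()),
}


@dataclass
class Skill:
    """Reusable multi-step procedure: instructions plus an ordered list of tool names."""
    name: str
    instructions: str
    steps: List[str]

    def run(self, tools: Dict[str, Callable[[Any], Any]], value: Any) -> Any:
        for step in self.steps:       # each step consumes the previous step's output
            value = tools[step](value)
        return value


doc_length = Skill(
    name="doc_length",
    instructions="Fetch the named document, then report how many words it contains.",
    steps=["fetch", "count_words"],
)
word_total = doc_length.run(TOOLS, "readme")
```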
Training‑Side Concepts
Training mirrors inference but adds reward‑driven updates. An RL Environment provides a stateful interface where actions (often tool calls) produce new observations. The Trainer (e.g., the GRPOTrainer class) runs many agent rollouts, scores them, and updates model weights. A Rollout is a complete trajectory of observations, actions, and rewards; thousands of rollouts are typically needed for convergence.
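To make the environment and rollout vocabulary concrete, here is a toy sketch. `CounterEnv`, `Step`, and `rollout` are hypothetical names for this example only; they are not the interface of OpenEnv, `GRPOTrainer`, or any other library. The reward is sparse and verifiable: it is 1.0 only on the step that reaches the target.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Step:
    observation: int
    action: str
    reward: float


class CounterEnv:
    """Toy stateful environment: reach the target count using 'inc' or 'dec'."""

    def __init__(self, target: int = 3):
        self.target = target
        self.state = 0

    def reset(self) -> int:
        self.state = 0
        return self.state

    def step(self, action: str) -> Tuple[int, float, bool]:
        self.state += 1 if action == "inc" else -1
        done = self.state == self.target
        return self.state, (1.0 if done else 0.0), done


def rollout(env: CounterEnv, policy: Callable[[int], str], max_steps: int = 10) -> List[Step]:
    """Run one episode and record the full trajectory of (observation, action, reward)."""
    observation = env.reset()
    trajectory: List[Step] = []
    for _ in range(max_steps):
        action = policy(observation)
        next_observation, reward, done = env.step(action)
        trajectory.append(Step(observation, action, reward))
        observation = next_observation
        if done:
            break
    return trajectory


good = rollout(CounterEnv(target=3), lambda obs: "inc")
bad = rollout(CounterEnv(target=3), lambda obs: "dec")
```

A trainer would collect many such trajectories, compare their total rewards (1.0 for `good`, 0.0 for `bad`), and update the model weights toward the higher-scoring behavior. That update step is deliberately left out of this sketch.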
Reward signals can be verifiable (pass/fail) or learned (human preference), sparse (only at episode end) or dense (per step). Rubrics decompose rewards into weighted dimensions, offering fine‑grained feedback. OpenEnv and Verifiers implement composable rubric objects such as WeightedSum, Sequential, and Gate.
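The idea of composing a rubric from weighted dimensions can be sketched in a few lines. The `weighted_sum` and `gate` helpers below are hypothetical and only borrow the names mentioned above; they are not the actual OpenEnv or Verifiers API, whose signatures may differ. Each scorer maps an output string to a value in [0, 1].

```python
from typing import Callable, List, Tuple

Scorer = Callable[[str], float]


def weighted_sum(parts: List[Tuple[float, Scorer]]) -> Scorer:
    """Combine scorers into one score, normalized by the total weight."""
    total = sum(weight for weight, _ in parts)

    def score(output: str) -> float:
        return sum(weight * scorer(output) for weight, scorer in parts) / total
    return score


def gate(check: Scorer, then: Scorer, threshold: float = 1.0) -> Scorer:
    """Score with `then` only if `check` passes; otherwise the reward is zero."""
    def score(output: str) -> float:
        return then(output) if check(output) >= threshold else 0.0
    return score


# Three illustrative dimensions: two verifiable checks and one style check.
def non_empty(output: str) -> float:
    return 1.0 if output.strip() else 0.0

def has_answer(output: str) -> float:
    return 1.0 if "42" in output else 0.0

def is_short(output: str) -> float:
    return 1.0 if len(output.split()) <= 5 else 0.0


rubric = gate(non_empty, weighted_sum([(0.75, has_answer), (0.25, is_short)]))

scores = {
    "concise_correct": rubric("The answer is 42"),
    "verbose_correct": rubric("I think after much thought the answer is 42"),
    "concise_wrong": rubric("no idea"),
    "empty": rubric(""),
}
```

The weights make the trade-off explicit: a correct but verbose answer still earns most of the reward, while an empty output is rejected by the gate before any weighted scoring happens.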
Conclusion
Clarifying these terms—Model, Scaffold, Harness, Context Engineering, Policy, Tool Use, Skills, Sub‑agents, and the training pipeline—removes the fog around AI agents and enables more precise design, evaluation, and discussion.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
SuanNi
A community for AI developers that aggregates large-model development services, models, and compute power.
