Progressive Disclosure & Dynamic Context: Making LLM Agents Reliable Execution Systems

This article explains how progressive disclosure and dynamic context management address the three core bottlenecks of complex LLM agents—context explosion, tool overload, and uncontrolled execution—by structuring context, tools, and SOPs into layered, token‑efficient, and verifiable workflows.


Why Agent Architecture Matters

When an LLM‑based agent reaches a certain complexity, the challenge shifts from model capability to how context, tools, and processes are supplied and controlled. The same model can become a stable execution system for one team and an unreliable chatbot for another, depending on the architecture.

Progressive Disclosure: Contract & Expand

Contract: expose information gradually, hide what is irrelevant, reduce token consumption, and focus attention (solves accuracy).

Expand: dynamically inject external topics, memory fragments, or world‑view settings based on interaction state (solves richness and continuity).

The strategy is to use limited tokens to manage unlimited information and let a nondeterministic model execute standardized processes.

Three Typical Bottlenecks

Context Explosion : Documents, code, dialogue history, user profiles, task state cannot all fit into a prompt; over‑filling leads to “Lost in the Middle”.

Tool Overload : More tools mean longer definitions and higher token cost; the probability of selecting the correct tool drops, especially with similar tools.

Uncontrolled Execution : When following an SOP, the model may skip steps, repeat steps, or fabricate results to “smooth” the conversation.

Progressive disclosure solves these by never giving the model the whole world at once; it only sees what it needs at each decision point.

2. Progressive Disclosure Mechanics

It is not about giving less information, but about giving it in phases: decision → feedback → re‑decision. Each step receives the minimal relevant information, reducing noise and improving controllability.

Do not build a "full context".

Maintain a "growing context" with controlled expansion.

The two alternating actions are:

Contract: hide, trim, summarize, or replace content with an index.

Expand: load fragments, tool subsets, memories, world‑view settings, and process state on demand.
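As a rough Python sketch of the two alternating actions (the helper names and the crude character-count budget are illustrative assumptions, not any specific framework's API):

```python
# Contract: shrink the context by replacing the oldest entries with
# short index stubs until a rough character budget is met.
def contract(context: list, budget: int) -> list:
    out = list(context)
    i = 0
    while sum(len(e) for e in out) > budget and i < len(out):
        out[i] = "[index] " + out[i][:20]  # keep a pointer, drop the bulk
        i += 1
    return out

# Expand: inject exactly one fragment needed for the current step.
def expand(context: list, fragment: str) -> list:
    return context + [fragment]
```

Real systems would count tokens rather than characters and summarize rather than truncate, but the shape is the same: every round either contracts or expands, never both unboundedly.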

3. Data Layer

Naïve RAG often concatenates retrieved content directly into the prompt, leading to token bloat and diluted attention.

Progressive disclosure at the data layer treats information acquisition as a sequence of actions rather than a single massive pull.

3.1 Disclosure Levels (L0–L3)

L0: Task & Constraints – user intent, output format, prohibitions, success criteria. Stable and short.

L1: Evidence Index – file list, chapter titles, table names, log summaries, search result titles. Provides only "where".

L2: Evidence Fragments – relevant paragraphs, code snippets, table schemas, key log intervals. Provides only "what part".

L3: Full Evidence – entire documents, full conversation history. Used sparingly.

The system first uses L1 for locating, then L2 for judging, and only opens L3 when a full read‑through is required, saving tokens and limiting model hallucination.
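A minimal sketch of the L1 → L2 → L3 ladder, with a hypothetical in-memory evidence store (the document name, section titles, and content are invented for illustration):

```python
# Each evidence source is stored at three depths; the agent requests
# deeper levels only when the shallower one is insufficient.
EVIDENCE = {
    "report.md": {
        "index": "report.md: 1. Intro, 2. Results, 3. Appendix",           # L1: "where"
        "fragments": {"2. Results": "Revenue grew 12% quarter over quarter."},  # L2: "what part"
        "full": "...entire document text...",                              # L3: everything
    }
}

def disclose(doc, level, section=None):
    entry = EVIDENCE[doc]
    if level == 1:
        return entry["index"]           # locate first
    if level == 2:
        return entry["fragments"][section]  # then judge a fragment
    return entry["full"]                # open L3 only for a full read-through
```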

3.2 Dynamic Injection

Common mistake: after answering query A, the system simply appends query B’s context on top, causing uncontrolled growth. Instead, enforce a per‑round token budget and define zones:

Hard token limit for each injection.

Resident zone – long‑term stable info (user identity, preferences).

Workspace – evidence fragments for the current step.

Cold storage – old evidence kept as index or summary.

When pruning, discard full old evidence, not task state, to avoid losing progress.
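One way to sketch the per-round budget, using a crude word count as a stand-in for a real tokenizer (zone names follow the article; the function itself is an illustrative assumption):

```python
# Assemble one round's context under a hard budget. Pruning order:
# cold storage first, then workspace; the resident zone is never dropped.
def assemble(resident, workspace, cold, budget):
    def cost(items):
        return sum(len(x.split()) for x in items)  # word count ~ token proxy
    cold, workspace = list(cold), list(workspace)
    while cost(resident) + cost(workspace) + cost(cold) > budget and cold:
        cold.pop(0)        # discard full old evidence first
    while cost(resident) + cost(workspace) + cost(cold) > budget and workspace:
        workspace.pop(0)   # then current-step evidence, oldest first
    return resident + workspace + cold
```

Note what is absent: task state is not in any evictable zone, so progress can never be pruned away.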

4. Tool Layer

More tools increase hesitation and wrong selections. Progressive disclosure applies hierarchical routing and visibility control.

Root layer exposes only five broad categories: code, documentation, deployment, database, and notification.

The model first selects a category (e.g., "database").

In the next round, specific tools like sql_query or get_table_schema are disclosed.

Benefits:

Direct token control – schema definitions are spread over multiple rounds.

Higher tool‑selection accuracy – fewer options, less confusion among similar tools.

Better security – disallowed capabilities are simply invisible, eliminating the need for repeated warnings.

Tool visibility functions as a permission system: invisible tools cannot be misused, visible‑but‑unusable tools cause wasted attempts, and usable tools may have conditional activation tied to SOP state.
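A toy two-round router illustrates the mechanism; the category and tool names here are placeholders, not a real tool registry:

```python
# Round 1 exposes only category names; round 2 exposes the chosen
# category's concrete tools. Anything outside the catalog is invisible,
# so it cannot be selected at all -- visibility doubles as permission.
CATALOG = {
    "database": ["sql_query", "get_table_schema"],
    "code": ["read_file", "apply_patch"],
    "deployment": ["deploy_service"],
}

def visible_tools(chosen_category=None):
    if chosen_category is None:
        return sorted(CATALOG)       # round 1: categories only
    return CATALOG[chosen_category]  # round 2: concrete tool schemas
```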

5. SOP Layer

SOPs (Skills) embed process logic into the disclosure mechanism rather than into static prompts. The system locks the next step until the previous step’s tool returns a verifiable receipt.

Stage 1 (Lint): expose only lint tools and the current diff; hide commit tools.

Stage 2 (Test): after lint succeeds, expose test tools.

Stage 3 (Commit): only after tests pass, expose git_commit.

This prevents “talk‑but‑no‑action” scenarios: the model may claim success, but the state machine only proceeds on actual tool receipts.
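A minimal state-machine gate makes this concrete; the stage and tool names are hypothetical:

```python
# The gate advances only on an actual successful receipt from the
# current stage's tool -- never on the model's own claim of success.
STAGES = ["lint", "test", "commit"]
TOOLS = {"lint": ["run_lint"], "test": ["run_tests"], "commit": ["git_commit"]}

class SopGate:
    def __init__(self):
        self.stage = 0

    def visible(self):
        # Only the current stage's tools are disclosed to the model.
        return TOOLS[STAGES[self.stage]]

    def record_receipt(self, tool, ok):
        # Ignore receipts from tools that are not even visible yet.
        if ok and tool in self.visible() and self.stage < len(STAGES) - 1:
            self.stage += 1
```

Because `git_commit` is literally invisible until tests pass, "I already committed" is not a state the system can reach by talk alone.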

5.1 SOP Control Points

When a tool returns a receipt, trust the receipt.

When no receipt, require human confirmation or external system status.

Never use unverifiable conditions as release gates.

6. Agent Skill as Engineering Abstraction

A Skill acts as a container for the context, tools, and knowledge needed for a specific task:

Load it when required.

Unload it when not needed.

Keep the surrounding prompt clean.

Skills make dynamic injection boundaries explicit: activation of a skill injects only the minimal information needed, enabling budgeting, auditing, and replay.

Routing under a skill system outputs which skills to activate instead of expanding the prompt, allowing metrics such as trigger rate, success rate, and average token usage per skill.
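A bare-bones skill runtime, with invented class and method names, might look like this:

```python
# A skill bundles the context and tools one task needs; activation
# injects them, deactivation removes them, and trigger counts are
# recorded so per-skill metrics can be audited later.
class Skill:
    def __init__(self, name, tools, context):
        self.name, self.tools, self.context = name, tools, context

class SkillRuntime:
    def __init__(self):
        self.active = {}
        self.metrics = {}  # per-skill trigger counts

    def activate(self, skill):
        self.active[skill.name] = skill
        self.metrics[skill.name] = self.metrics.get(skill.name, 0) + 1

    def deactivate(self, name):
        self.active.pop(name, None)  # unload: context and tools disappear

    def prompt_context(self):
        # Only active skills contribute lines to the prompt.
        return [line for s in self.active.values() for line in s.context]
```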

7. Dynamic Context Management

Context should be treated as a projection of system state rather than a concatenated string.

Four object types with distinct lifecycles:

Task State – current phase, completed checkpoints, allowed next actions; short, stable, structured.

Evidence – retrieved fragments, tool outputs, external info; referenceable, traceable, evictable.

Preferences & Long‑Term Memory – influences style or strategy; changes infrequently and under control.

Capabilities & Permissions – tool visibility, usability, process release conditions; act as constraints.
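These four object types can be modeled as structured state from which the prompt is projected each round, rather than a string that only ever grows (field names here are illustrative):

```python
from dataclasses import dataclass, field

@dataclass
class TaskState:
    phase: str
    checkpoints: list = field(default_factory=list)
    allowed_actions: list = field(default_factory=list)

@dataclass
class Evidence:
    source: str
    fragment: str
    evictable: bool = True  # unlike task state, evidence may be pruned

@dataclass
class AgentContext:
    task: TaskState
    evidence: list
    preferences: dict
    permissions: dict

    def project(self):
        # Re-render the prompt from state each round instead of appending.
        lines = [f"phase: {self.task.phase}"]
        lines += [f"evidence[{e.source}]: {e.fragment}" for e in self.evidence]
        return "\n".join(lines)
```

The point of `project()` is that the prompt is always derivable: you can replay, audit, or rebuild it from state, which is impossible with an append-only string.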

8. Executable Architecture Checklist

Control tool visibility first – expose only root categories, then branch‑level tools.

Turn SOPs into state‑machine gates – next tool appears only after a successful receipt.

Partition context into resident / workspace / cold‑store zones.

Adopt index‑then‑fragment disclosure – always give directory or title before full content.

Skill‑ify context‑tool combos – start with 5‑10 high‑frequency skills, ensure they are stable.

Record observations each round – which evidence was disclosed, which tools were visible, which skills fired, token usage, checkpoint hits.

9. Summary

Information is disclosed step‑by‑step, not dumped all at once.

Tools are revealed hierarchically, not fully open.

Processes are enforced by a state machine, not by model self‑assessment.

Memory is writable, evictable, and traceable, not endlessly growing.

When these four principles are applied, an agent behaves like a reliable execution system: it asks clear questions, retrieves evidence, follows verified steps, and stops when it cannot proceed, rather than fabricating responses.

Tags: SOP, AI engineering, LLM agents, progressive disclosure, dynamic context
Written by Architecture and Beyond

Focused on AIGC SaaS technical architecture and tech team management, sharing insights on architecture, development efficiency, team leadership, startup technology choices, large‑scale website design, and high‑performance, highly‑available, scalable solutions.
