Sub‑Agent Delegation: Turning Complex Tasks into Parallel Sub‑Tasks

The article explains how Hermes' sub‑agent delegation transforms a serial, context‑heavy workflow—such as researching multiple vector databases—into parallel, isolated sub‑tasks, detailing three‑layer isolation, orchestrator role, heartbeat monitoring, approval safety, credential handling, and compares industry approaches.

James' Growth Diary
James' Growth Diary
James' Growth Diary
Sub‑Agent Delegation: Turning Complex Tasks into Parallel Sub‑Tasks

Why Sub‑Agent Delegation Is Needed: From Serial to Parallel Factory

When an Agent is asked to research the top‑5 vector databases, a single Agent would query each database sequentially, inflating context and multiplying total time. By delegating each database research to a separate sub‑Agent, the total duration becomes roughly the slowest sub‑Agent, achieving parallelism while keeping each sub‑Agent’s context small.

"Research the top‑5 vector databases, compare performance, price, ecosystem, and write a selection report."

The core value of sub‑Agent delegation is converting a serial process into parallel execution with isolated contexts.

Context Isolation : each sub‑Agent runs in its own AIAgent instance with a separate session and task_id.

Toolset Restriction : dangerous tools are black‑listed via DELEGATE_BLOCKED_TOOLS (e.g., delegate_task, clarify, memory, send_message, execute_code).

Depth Control : recursion is limited by max_spawn_depth (default 1, flat); deeper trees require role="orchestrator" and an increased depth setting.

delegate_task Core Design: Two Modes + Three‑Layer Isolation

The entry function delegate_task supports:

Single‑Task Mode

delegate_task(
    goal="Research Milvus performance",
    context="User needs comparison of 5 vector DBs",
    toolsets=["web", "file"],
    role="leaf"  # default, sub‑Agent cannot further delegate
)

Batch‑Task Mode

delegate_task(
    tasks=[
        {"goal": "Research Milvus", "toolsets": ["web"]},
        {"goal": "Research Qdrant", "toolsets": ["web"]},
        {"goal": "Research Pinecone", "toolsets": ["web"]}
    ],
    role="leaf"
)

Batch mode uses a ThreadPoolExecutor with concurrency limited by delegation.max_concurrent_children (default 3).

Three‑Layer Isolation Mechanism

Layer 1 – Context Isolation : each sub‑Agent gets a fresh AIAgent instance, inheriting model and provider but not the parent’s message history.

# _build_child_agent core logic
child = AIAgent(
    model=parent.model,
    provider=parent.provider,
    api_key=parent.api_key,
    base_url=parent.base_url,
    system_prompt=child_system_prompt,  # task‑focused prompt
    tools=child_tools,                # restricted toolset
    _delegate_depth=parent_depth + 1,
)

Layer 2 – Toolset Isolation : the helper _strip_blocked_tools removes black‑listed tools.

DELEGATE_BLOCKED_TOOLS = frozenset([
    "delegate_task",   # forbid recursive delegation unless role="orchestrator"
    "clarify",        # forbid user clarification
    "memory",         # forbid writing shared MEMORY.md
    "send_message",   # forbid cross‑platform messaging
    "execute_code"    # forbid direct script execution
])

Layer 3 – Depth Control : _delegate_depth tracks nesting; default max_spawn_depth=1. When role="orchestrator" and depth permits, sub‑Agents may use delegate_task to form a tree.

Orchestrator Role: From Flat to Tree Delegation

Flat delegation (all sub‑Agents are leaves) suits most cases. For complex tasks, a hierarchical decomposition is needed:

Parent Agent (depth 0)
└── Orchestrator (depth 1, role="orchestrator")
    ├── Worker A (depth 2, leaf)
    ├── Worker B (depth 2, leaf)
    └── Worker C (depth 2, leaf)

The orchestrator’s prompt is generated dynamically, embedding a "Sub‑Agent Generation Guide" that tells the LLM when delegation is appropriate and when to act directly, thus preventing hallucination.

Parallel Execution & Heartbeat: What If a Sub‑Agent Goes Rogue

Sub‑Agents run in a ThreadPoolExecutor. The parent manages each sub‑Agent’s lifecycle via _run_single_child.

Timeout Control

DEFAULT_CHILD_TIMEOUT = 600  # 10 minutes
# Configurable via delegation.child_timeout_seconds (minimum 30 s)

Heartbeat Mechanism

Every 30 seconds the parent sends a heartbeat. Stale detection distinguishes idle waiting (15 cycles = 450 s) from long‑running tool execution (40 cycles = 1200 s).

_HEARTBEAT_INTERVAL = 30  # seconds
_HEARTBEAT_STALE_CYCLES_IDLE = 15
_HEARTBEAT_STALE_CYCLES_IN_TOOL = 40

Stale sub‑Agents are marked as stale and can be interrupted.

Observability

Runtime events are emitted as DelegateEvent enums (e.g., TASK_SPAWNED, TASK_PROGRESS, TASK_COMPLETED, TASK_FAILED, TASK_TOOL_STARTED, TASK_TOOL_COMPLETED) and displayed in the parent’s TUI.

Approval Safety: Auto‑Deny vs Auto‑Approve

Because sub‑Agents run in worker threads, interactive input() calls would deadlock with the parent’s TUI. Hermes replaces them with non‑interactive callbacks:

def _subagent_auto_deny(command, description, **kwargs):
    """Automatically deny dangerous commands (secure default)"""
    logger.warning("Sub Agent auto‑denied dangerous command: %s (%s)", command, description)
    return "deny"

def _subagent_auto_approve(command, description, **kwargs):
    """Automatically approve (YOLO mode)"""
    logger.warning("Sub Agent auto‑approved dangerous command: %s (%s)", command, description)
    return "once"

The flag delegation.subagent_auto_approve toggles the mode (default false). Both modes log audit entries.

Credential Inheritance & Multi‑Provider Fallback

By default a sub‑Agent inherits the parent’s model and credentials. Hermes also allows an independent credential block:

delegation:
  provider: "openrouter"
  model: "anthropic/claude-sonnet-4"
  api_key: "sk-xxx"
  base_url: "https://openrouter.ai/api/v1"

When delegation.provider is set, _resolve_delegation_credentials resolves a separate credential set, enabling cost‑optimized or load‑balanced provider usage.

Output Extraction & Result Aggregation

After a sub‑Agent finishes, the parent receives only the final reply and a tail summary of the last 12 tool calls (max 8000 chars), keeping the parent’s context compact.

def _extract_output_tail(result, max_entries=12, max_chars=8000):
    """Extract the last N tool‑call results from a sub‑Agent’s dialogue"""
    messages = result.get("messages", [])
    # Build tool_call_id → tool_name map, walk backwards, collect previews

This design prevents context blow‑up, correctly flags errors, and limits preview length.

Runtime Controls: Pause & Interrupt

def set_spawn_paused(paused: bool) -> bool:
    """Globally pause or resume creation of new delegations"""

def interrupt_subagent(subagent_id: str) -> bool:
    """Request interruption of a running sub‑Agent"""
    agent = _active_subagents.get(subagent_id)
    agent.interrupt("Interrupted via TUI")

Operators can pause new delegations via delegation.pause or interrupt specific sub‑Agents; interruptions propagate to any nested sub‑Agents.

Industry Practices: How Mature Agents Handle Sub‑Task Delegation

Claude Code / Codex CLI : single main Agent, minimal explicit sub‑Agents, relies on tool calls; prioritises continuous user‑Agent interaction.

LangGraph / AutoGen : explicit multi‑Agent orchestration with Supervisor/Planner and specialized roles; emphasizes observability, testability, and replayability.

Cursor / Devin (engineering coding agents) : treats each task as an isolated sandboxed unit with budgets, timeouts, logs, and approval boundaries.

Enterprise workflow platforms : enforce templates, permission constraints, and mandatory approvals; focus on safety and auditability over raw parallelism.

Hermes : combines parallel efficiency with safety via three‑layer isolation, dynamic orchestrator prompts, heartbeat monitoring, and configurable approval callbacks.

Key takeaways from the comparison:

Whether sub‑Agents are exposed depends on the product’s audience (end‑users vs system developers).

Any task with side effects demands stronger isolation to avoid accident amplification.

Mature solutions bundle task splitting with budgeting, permission checks, logging, and graceful termination.

Conclusion

Sub‑Agent delegation’s core value is parallelism plus isolation.

Three‑layer isolation: independent AIAgent, tool blacklist, and depth limit.

Orchestrator role enables tree‑structured delegation with dynamic prompts.

Heartbeat, timeout, and event streams keep sub‑Agents observable and prevent runaway execution.

Non‑interactive approval callbacks avoid deadlocks while providing audit logs.

Separate credential configuration allows cost‑effective or load‑balanced provider usage.

Industry patterns converge on splitting, isolating, budgeting, approving, and reclaiming sub‑tasks.

Diagram of serial vs parallel delegation
Diagram of serial vs parallel delegation
Three‑layer isolation architecture
Three‑layer isolation architecture
Orchestrator tree delegation
Orchestrator tree delegation
Heartbeat and stale detection
Heartbeat and stale detection
Industry comparison
Industry comparison
Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

AI agentsHermesparallel executionOrchestratorContext IsolationSubagent Delegation
James' Growth Diary
Written by

James' Growth Diary

I am James, focusing on AI Agent learning and growth. I continuously update two series: “AI Agent Mastery Path,” which systematically outlines core theories and practices of agents, and “Claude Code Design Philosophy,” which deeply analyzes the design thinking behind top AI tools. Helping you build a solid foundation in the AI era.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.