12 Reusable Agentic Harness Patterns: When to Use and Avoid Over‑Design

The article breaks down twelve reusable Agentic Harness design patterns extracted from Claude Code, grouping them into memory, workflow, tool‑permission, and automation categories, explains the architectural pain points each solves, shows when to apply or over‑engineer them, and provides concrete Python implementations.

Data STUDIO
Data STUDIO
Data STUDIO
12 Reusable Agentic Harness Patterns: When to Use and Avoid Over‑Design

Memory & Context – Agent Cortex

The first five patterns answer the question “What should an agent remember, where, and for how long?” The author identifies three mutually conflicting requirements: capacity (how much can be stored), speed (how quickly it can be retrieved), and relevance (whether the retrieved item is useful). Satisfying any two forces the third to collapse, forming an impossible triangle. The five memory patterns resolve this trade‑off.

Pattern 3 – Hierarchical Memory

A three‑layer approach keeps a tiny index permanently in the prompt (≈200 lines), loads hot files on demand, and stores the full history on disk for cold search. The minimal implementation is shown below:

# memory_loader.py — three‑layer memory loader
from pathlib import Path

class MemoryLoader:
    """Three‑layer memory: index always loaded + hot on‑demand + cold search"""
    def __init__(self, memory_dir: Path):
        self.memory_dir = memory_dir
        self.index_path = memory_dir / "MEMORY.md"

    def load_index(self) -> str:
        """Layer 1 – index always loaded, hard limit 200 lines"""
        if not self.index_path.exists():
            return ""
        content = self.index_path.read_text()
        lines = content.split("
")
        if len(lines) > 200:
            raise ValueError(f"MEMORY.md {len(lines)} lines > 200, please trim the index")
        return content

    def load_hot(self, topic: str) -> str:
        """Layer 2 – load up to three matching memory files"""
        matches = list(self.memory_dir.glob(f"*{topic}*.md"))
        if not matches:
            return ""
        return "
---
".join(f.read_text() for f in matches[:3])

    def search_cold(self, query: str) -> str:
        """Layer 3 – full‑text search in cold storage (production uses vector search)"""
        import subprocess
        result = subprocess.run([
            "grep", "-rl", query,
            str(self.memory_dir / "archive")
        ], capture_output=True, text=True)
        return result.stdout

The hard line‑count limit on the index is the key rule; once the index exceeds 200 lines the hierarchical logic collapses and the system falls back to dumping everything into the prompt.

Other Memory Patterns

Pattern 1 – Persistent Instructions : Provide a project‑level “employee handbook”. Use when cross‑session behavior is needed. Over‑design if a single file exceeds 500 lines without splitting.

Pattern 2 – Scoped Context : Automatically load different rule files per directory (useful for monorepos or multi‑language projects). Over‑design for tiny projects where a single CLAUDE.md suffices.

Pattern 4 – Memory Consolidation : Periodically clean duplicate, conflicting, or expired memories. Needed for agents running weeks or more; unnecessary for very young projects.

Pattern 5 – Progressive Compression : Automatically compress old conversation turns. Apply to single sessions longer than 20‑30 turns; avoid for short sessions where compression would lose information.

Workflow & Orchestration – Splitting “Think” and “Do”

Patterns 6‑8 address “How can an agent handle complex tasks without turning the context into a garbage dump?” A long conversation can fill 90 % of the context window with irrelevant history, making it hard for the model to locate useful information.

Pattern 7 – Context‑Isolated Sub‑Agents

Each sub‑agent receives its own independent context window and permissions:

Research sub‑agent – read‑only, produces a report.

Planning sub‑agent – designs a solution, never touches raw data.

Execution sub‑agent – full tool access but only receives the blueprint and research summary.

The implementation uses the subagent_type field to set the role and the isolation flag to create an independent worktree, ensuring that information editing happens in the main agent rather than blindly forwarding raw research.

Other Workflow Patterns

Pattern 6 – Explore‑Plan‑Execute : Read‑only exploration, then planning, then modification. Suitable for unfamiliar codebases or multi‑file changes; over‑design for a simple one‑line config change.

Pattern 8 – Branch‑Merge Parallel : Run independent sub‑tasks in parallel (e.g., batch refactoring multiple modules). Over‑design when sub‑tasks have dependencies that would cause merge conflicts.

Tools & Permissions – Least‑Privilege in Agent Systems

Patterns 9‑11 focus on what operations an agent may perform and how to prevent accidental damage. The author describes “confirmation fatigue” where repeated prompts to confirm dangerous commands become ignored.

Pattern 10 – Command‑Risk Classification

Commands are classified into three deterministic risk levels:

Low (read, status, search): auto‑allow.

Medium (write, script execution, package install): require user confirmation.

High (sudo, rm -rf /, system changes): block outright.

A ~30‑line implementation demonstrates the classification logic:

# command_risk.py — three‑level command risk classification
import shlex
from enum import Enum

class RiskLevel(Enum):
    LOW = "low"      # auto‑allow
    MEDIUM = "medium"  # require confirmation
    HIGH = "high"    # block

HIGH_RISK_PATTERNS = [
    "rm -rf /", "sudo ", "chmod 777", "> /dev/sda", "mkfs.", "dd if=",
    ":(){ :|:& }:", "chown -R /", "mv /* /dev/null"
]
MEDIUM_RISK_PREFIXES = [
    "rm ", "git push", "git reset --hard", "pip install", "npm install -g",
    "brew ", "docker rm", "kubectl delete", "chmod "
]

def classify_command(command: str) -> tuple[RiskLevel, str]:
    """Return (risk level, reason)"""
    cmd = command.strip().lower()
    for pattern in HIGH_RISK_PATTERNS:
        if pattern.lower() in cmd:
            return RiskLevel.HIGH, f"High‑risk pattern matched: {pattern}"
    for prefix in MEDIUM_RISK_PREFIXES:
        if cmd.startswith(prefix.lower()):
            return RiskLevel.MEDIUM, f"Medium‑risk prefix: {prefix}"
    return RiskLevel.LOW, "Low‑risk operation"

The classification must live in deterministic code, not in a prompt, because a mis‑interpreted prompt could mistakenly allow a dangerous command.

Other Tool Patterns

Pattern 9 – Progressive Tool Extension : Start with read‑only tools; enable write/execute on demand. Useful for early‑stage prototypes; over‑design when fewer than five tools exist.

Pattern 11 – Single‑Purpose Tools : Separate Read, Write, and Edit tools; no shared shell. Beneficial when the agent performs frequent file operations; over‑design when fewer than three tools are needed.

Automation – Deterministic Bottom‑Line

Pattern 12 is the only pattern that explicitly avoids LLM involvement. It provides deterministic lifecycle hooks that guarantee essential actions run regardless of the model’s memory.

Pattern 12 – Deterministic Lifecycle Hooks

Four hook points are defined:

PreToolUse : runs before a tool is invoked (e.g., risk classification).

PostToolUse : runs after a tool finishes (e.g., auto‑format with ruff).

SessionStart : loads project‑level CLAUDE.md at the beginning of a session.

Stop : runs final checks before the session ends.

A minimal implementation:

# hook_system.py — deterministic lifecycle hook system
import subprocess, sys
from typing import Callable

class HookManager:
    """Agent lifecycle hooks – any hook returning False or raising aborts further execution"""
    def __init__(self):
        self._pre_tool: list[Callable] = []   # PreToolUse hooks
        self._post_tool: list[Callable] = []  # PostToolUse hooks
        self._stop: list[Callable] = []       # Stop hooks

    def on_pre_tool(self, fn: Callable):
        """Register a PreToolUse hook"""
        self._pre_tool.append(fn)
        return fn

    def on_post_tool(self, fn: Callable):
        """Register a PostToolUse hook"""
        self._post_tool.append(fn)
        return fn

    def on_stop(self, fn: Callable):
        """Register a Stop hook"""
        self._stop.append(fn)
        return fn

    def _run(self, hooks: list[Callable], phase: str) -> bool:
        """Execute a list of hooks; any False or exception blocks the phase"""
        for hook in hooks:
            try:
                if hook() is False:
                    print(f"[HOOK FAIL] {phase}: {hook.__name__} returned False", file=sys.stderr)
                    return False
            except Exception as e:
                print(f"[HOOK ERROR] {phase}: {hook.__name__} → {e}", file=sys.stderr)
                return False
        return True

    def pre_tool_check(self, tool_name: str) -> bool:
        return self._run(self._pre_tool, f"PreToolUse:{tool_name}")

    def post_tool_check(self, tool_name: str) -> bool:
        return self._run(self._post_tool, f"PostToolUse:{tool_name}")

    def stop_check(self) -> bool:
        return self._run(self._stop, "Stop")

# Example usage ----------------------------------------------------------
hooks = HookManager()

@hooks.on_pre_tool
def check_dangerous_commands():
    """PreToolUse: run command‑risk classification (Pattern 10)"""
    print("  ✓ command risk classification passed")
    return True

@hooks.on_post_tool
def auto_format_code():
    """PostToolUse: run ruff formatter after writing code"""
    try:
        subprocess.run(["ruff", "check", "--fix", "."], capture_output=True, timeout=30, check=True)
        print("  ✓ ruff --fix completed")
        return True
    except subprocess.CalledProcessError:
        print("  ✗ ruff failed, fix before continuing", file=sys.stderr)
        return False

@hooks.on_stop
def final_validate():
    """Stop: final validation before session ends"""
    checks = [
        ("ruff", ["ruff", "check", "."]),
        ("pytest", ["pytest", "--tb=short", "-q"]),
    ]
    all_pass = True
    for name, cmd in checks:
        try:
            subprocess.run(cmd, capture_output=True, timeout=60, check=True)
            print(f"  ✓ {name} passed")
        except subprocess.CalledProcessError:
            print(f"  ✗ {name} failed", file=sys.stderr)
            all_pass = False
    return all_pass

Design principles:

Hooks are deterministic – they never invoke the LLM.

Hooks are decoupled from prompts – even if the model forgets a step, the hook still runs.

Failure blocks further actions – returning False or raising aborts the pipeline.

Conclusion

The twelve patterns form a language that addresses four unavoidable architectural problems in agent engineering:

Memory management – capacity vs. speed vs. relevance (Patterns 1‑5).

Task decomposition – preventing context pollution (Patterns 6‑8).

Permission control – deterministic risk gating for commands and tools (Patterns 9‑11).

Reliable automation – deterministic lifecycle actions independent of LLM memory (Pattern 12).

These designs remain relevant regardless of future model improvements because the physical limits of context windows, the inherent danger of shell commands, memory decay, and prompt forgetfulness persist.

References

Kubernetes Patterns – https://k8spatterns.com/

Prompt Patterns – https://promptpatterns.dev/

12 Agentic Harness Patterns from Claude Code – https://generativeprogrammer.com/p/12-agentic-harness-patterns-from

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

Design Patternsmemory managementPythonworkflow orchestrationAgentic HarnessTool PermissionsAutomation Hooks
Data STUDIO
Written by

Data STUDIO

Click to receive the "Python Study Handbook"; reply "benefit" in the chat to get it. Data STUDIO focuses on original data science articles, centered on Python, covering machine learning, data analysis, visualization, MySQL and other practical knowledge and project case studies.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.