12 Reusable Agentic Harness Patterns: When to Use and Avoid Over‑Design
The article breaks down twelve reusable Agentic Harness design patterns extracted from Claude Code and groups them into four categories: memory, workflow, tool permissions, and automation. For each pattern it explains the architectural pain point it solves and when applying it is justified or becomes over-engineering, and it provides concrete Python implementations.
Memory & Context – Agent Cortex
The first five patterns answer the question “What should an agent remember, where, and for how long?” The author identifies three mutually conflicting requirements: capacity (how much can be stored), speed (how quickly it can be retrieved), and relevance (whether the retrieved item is useful). Satisfying any two forces the third to collapse, forming an impossible triangle. The five memory patterns resolve this trade‑off.
Pattern 3 – Hierarchical Memory
A three‑layer approach keeps a tiny index permanently in the prompt (≈200 lines), loads hot files on demand, and stores the full history on disk for cold search. The minimal implementation is shown below:
```python
# memory_loader.py: three-layer memory loader
import subprocess
from pathlib import Path


class MemoryLoader:
    """Three-layer memory: index always loaded + hot on-demand + cold search."""

    def __init__(self, memory_dir: Path):
        self.memory_dir = memory_dir
        self.index_path = memory_dir / "MEMORY.md"

    def load_index(self) -> str:
        """Layer 1: index always loaded, hard limit of 200 lines."""
        if not self.index_path.exists():
            return ""
        content = self.index_path.read_text(encoding="utf-8")
        lines = content.splitlines()  # a trailing newline does not count as an extra line
        if len(lines) > 200:
            raise ValueError(f"MEMORY.md has {len(lines)} lines > 200, please trim the index")
        return content

    def load_hot(self, topic: str) -> str:
        """Layer 2: load up to three memory files whose names match the topic."""
        matches = sorted(self.memory_dir.glob(f"*{topic}*.md"))  # sorted for a stable selection
        if not matches:
            return ""
        return "\n---\n".join(f.read_text(encoding="utf-8") for f in matches[:3])

    def search_cold(self, query: str) -> str:
        """Layer 3: full-text search in cold storage (production uses vector search).

        Returns the paths of matching archive files, one per line.
        """
        result = subprocess.run(
            # "--" stops option parsing so a query starting with "-" is not read as a flag
            ["grep", "-rl", "--", query, str(self.memory_dir / "archive")],
            capture_output=True,
            text=True,
        )
        return result.stdout
```

The hard line-count limit on the index is the key rule: once the index exceeds 200 lines, the hierarchical logic collapses and the system falls back to dumping everything into the prompt.
Other Memory Patterns
Pattern 1 – Persistent Instructions: Provide a project-level “employee handbook”. Use when cross-session behavior is needed. Over-design if a single file exceeds 500 lines without splitting.
Pattern 2 – Scoped Context: Automatically load different rule files per directory (useful for monorepos or multi-language projects). Over-design for tiny projects where a single CLAUDE.md suffices.
Pattern 4 – Memory Consolidation: Periodically clean duplicate, conflicting, or expired memories. Needed for agents running weeks or more; unnecessary for very young projects.
Pattern 5 – Progressive Compression: Automatically compress old conversation turns. Apply to single sessions longer than 20-30 turns; avoid for short sessions where compression would lose information.
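As an illustration of Pattern 2 (Scoped Context), per-directory rule loading could look roughly like the sketch below. It assumes every directory may hold its own CLAUDE.md and that rules are collected from the project root down to the working directory, so the most specific rules come last. The function names are hypothetical and are not taken from the article or from Claude Code.

```python
from pathlib import Path

RULE_FILENAME = "CLAUDE.md"  # assumed rule-file name, borrowed from the article's example


def collect_scoped_rules(project_root: Path, working_dir: Path) -> list[Path]:
    """Return rule files from project_root down to working_dir, outermost first."""
    root = project_root.resolve()
    current = working_dir.resolve()
    if current != root and root not in current.parents:
        raise ValueError(f"{working_dir} is not inside {project_root}")

    # Walk upward from the working directory to the root, then reverse the chain
    # so that general (root) rules come first and the most specific rules last.
    chain = [current]
    while chain[-1] != root:
        chain.append(chain[-1].parent)
    return [d / RULE_FILENAME for d in reversed(chain) if (d / RULE_FILENAME).is_file()]


def load_scoped_context(project_root: Path, working_dir: Path) -> str:
    """Concatenate the applicable rule files into a single context string."""
    parts = []
    for path in collect_scoped_rules(project_root, working_dir):
        text = path.read_text(encoding="utf-8")
        parts.append(f"# rules from {path.parent}\n{text}")
    return "\n\n".join(parts)
```

Rule files in sibling directories are never loaded, which is the point of the pattern: an agent working in one package of a monorepo does not pay context for the rules of another.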
Workflow & Orchestration – Splitting “Think” and “Do”
Patterns 6‑8 address “How can an agent handle complex tasks without turning the context into a garbage dump?” A long conversation can fill 90 % of the context window with irrelevant history, making it hard for the model to locate useful information.
Pattern 7 – Context‑Isolated Sub‑Agents
Each sub‑agent receives its own independent context window and permissions:
Research sub‑agent – read‑only, produces a report.
Planning sub‑agent – designs a solution, never touches raw data.
Execution sub‑agent – full tool access but only receives the blueprint and research summary.
The implementation uses the subagent_type field to set the role and the isolation flag to create an independent worktree. The main agent is responsible for editing what gets passed along, so each sub-agent receives a curated summary rather than blindly forwarded raw research.
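The three roles above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not Claude Code's actual API: SubAgentSpec, run_subagent, check_tool, and the ModelFn callable are all hypothetical names, and the model is a stand-in function that maps a context to text.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical model interface: takes a list of context strings, returns text.
ModelFn = Callable[[list[str]], str]


@dataclass(frozen=True)
class SubAgentSpec:
    role: str
    allowed_tools: frozenset[str]


@dataclass
class SubAgentResult:
    role: str
    output: str
    context_seen: list[str] = field(default_factory=list)


# Assumed tool names; the permissions mirror the three roles described above.
RESEARCH = SubAgentSpec("research", frozenset({"read", "search"}))
PLANNING = SubAgentSpec("planning", frozenset())
EXECUTION = SubAgentSpec("execution", frozenset({"read", "write", "edit", "shell"}))


def check_tool(spec: SubAgentSpec, tool: str) -> None:
    """Deterministic permission check: raise if the role may not use the tool."""
    if tool not in spec.allowed_tools:
        raise PermissionError(f"{spec.role} sub-agent may not use tool '{tool}'")


def run_subagent(spec: SubAgentSpec, task: str, handoff: list[str], model: ModelFn) -> SubAgentResult:
    """Run one sub-agent in a fresh context built only from the task and the handoff."""
    context = [f"role: {spec.role}", f"tools: {sorted(spec.allowed_tools)}", f"task: {task}", *handoff]
    return SubAgentResult(spec.role, model(context), context)


def pipeline(task: str, model: ModelFn) -> SubAgentResult:
    """Main agent: forwards only each stage's output, never its working context."""
    research = run_subagent(RESEARCH, task, [], model)
    summary = f"research summary: {research.output}"
    plan = run_subagent(PLANNING, task, [summary], model)
    return run_subagent(EXECUTION, task, [summary, f"blueprint: {plan.output}"], model)
```

The execution sub-agent's context contains exactly the role header, the task, the research summary, and the blueprint. Whatever the research sub-agent read along the way stays in its own window.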
Other Workflow Patterns
Pattern 6 – Explore-Plan-Execute: Read-only exploration, then planning, then modification. Suitable for unfamiliar codebases or multi-file changes; over-design for a simple one-line config change.
Pattern 8 – Branch-Merge Parallel: Run independent sub-tasks in parallel (e.g., batch refactoring multiple modules). Over-design when sub-tasks have dependencies that would cause merge conflicts.
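For Pattern 8, one way to guard against the merge-conflict case is to check that the sub-tasks touch disjoint file sets before running them in parallel. The sketch below assumes each sub-task declares up front which files it will modify; SubTask, find_conflicts, and branch_and_merge are hypothetical names used only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class SubTask:
    name: str
    files: frozenset[str]    # files the task declares it will modify
    run: Callable[[], str]   # hypothetical worker; returns a result summary


def find_conflicts(tasks: list[SubTask]) -> list[tuple[str, str, frozenset[str]]]:
    """Return (task_a, task_b, shared_files) for every pair touching the same file."""
    conflicts = []
    for i, a in enumerate(tasks):
        for b in tasks[i + 1:]:
            shared = a.files & b.files
            if shared:
                conflicts.append((a.name, b.name, shared))
    return conflicts


def branch_and_merge(tasks: list[SubTask], max_workers: int = 4) -> dict[str, str]:
    """Run tasks in parallel only if their file sets are disjoint; merge results by name."""
    conflicts = find_conflicts(tasks)
    if conflicts:
        raise ValueError(f"tasks overlap, run them sequentially instead: {conflicts}")
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {task.name: pool.submit(task.run) for task in tasks}
        # .result() re-raises any worker exception, so a failed branch fails the merge.
        return {name: future.result() for name, future in futures.items()}
```

Declared file sets are only a proxy for real dependencies. Two tasks can edit different files and still conflict semantically, so this check reduces the risk without removing it.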
Tools & Permissions – Least‑Privilege in Agent Systems
Patterns 9-11 focus on which operations an agent may perform and how to prevent accidental damage. The author describes “confirmation fatigue”: when users are asked to confirm commands over and over, they start approving prompts without reading them.
Pattern 10 – Command‑Risk Classification
Commands are classified into three deterministic risk levels:
Low (read, status, search): auto‑allow.
Medium (write, script execution, package install): require user confirmation.
High (sudo, rm -rf /, system changes): block outright.
A ~30‑line implementation demonstrates the classification logic:
```python
# command_risk.py: three-level command risk classification
import re
from enum import Enum


class RiskLevel(Enum):
    LOW = "low"        # auto-allow
    MEDIUM = "medium"  # require confirmation
    HIGH = "high"      # block


HIGH_RISK_PATTERNS = [
    "rm -rf /", "sudo ", "chmod 777", "> /dev/sda", "mkfs.", "dd if=",
    ":(){ :|:& }:", "chown -R /", "mv /* /dev/null",
]
MEDIUM_RISK_PREFIXES = [
    "rm ", "git push", "git reset --hard", "pip install", "npm install -g",
    "brew ", "docker rm", "kubectl delete", "chmod ",
]

# Split compound commands so that "cd /tmp && rm x" is checked segment by segment.
_SEGMENT_SPLIT = re.compile(r"&&|\|\||[;|]")


def classify_command(command: str) -> tuple[RiskLevel, str]:
    """Return (risk level, reason)."""
    cmd = command.strip().lower()
    # High-risk patterns are substring matches anywhere in the command.
    for pattern in HIGH_RISK_PATTERNS:
        if pattern.lower() in cmd:
            return RiskLevel.HIGH, f"High-risk pattern matched: {pattern}"
    # Medium-risk prefixes are matched at the start of each command segment.
    for segment in _SEGMENT_SPLIT.split(cmd):
        segment = segment.strip()
        for prefix in MEDIUM_RISK_PREFIXES:
            if segment.startswith(prefix.lower()):
                return RiskLevel.MEDIUM, f"Medium-risk prefix: {prefix}"
    return RiskLevel.LOW, "Low-risk operation"
```

The classification must live in deterministic code, not in a prompt, because a model that misreads the prompt could let a dangerous command through.
Other Tool Patterns
Pattern 9 – Progressive Tool Extension: Start with read-only tools; enable write/execute on demand. Useful for early-stage prototypes; over-design when fewer than five tools exist.
Pattern 11 – Single-Purpose Tools: Separate Read, Write, and Edit tools; no shared shell. Beneficial when the agent performs frequent file operations; over-design when fewer than three tools are needed.
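Pattern 9 can be sketched as a registry that exposes only the tools in currently enabled tiers. The code below is a minimal illustration under assumptions: the tier names ("read", "write", "execute") and the ToolRegistry class are hypothetical, and how a tier gets approved (user prompt, config, policy) is left out.

```python
from typing import Callable

Tool = Callable[..., object]


class ToolRegistry:
    """Tools are registered with a tier; only enabled tiers are exposed to the agent."""

    TIERS = ("read", "write", "execute")  # assumed tier names

    def __init__(self) -> None:
        self._tools: dict[str, tuple[str, Tool]] = {}
        self._enabled: set[str] = {"read"}  # every session starts read-only

    def _check_tier(self, tier: str) -> None:
        if tier not in self.TIERS:
            raise ValueError(f"unknown tier: {tier}")

    def register(self, name: str, tier: str, fn: Tool) -> None:
        self._check_tier(tier)
        self._tools[name] = (tier, fn)

    def enable(self, tier: str) -> None:
        """Unlock a tier on demand, for example after the user approves it."""
        self._check_tier(tier)
        self._enabled.add(tier)

    def available(self) -> list[str]:
        """Names of the tools the agent is currently allowed to see."""
        return sorted(name for name, (tier, _) in self._tools.items() if tier in self._enabled)

    def call(self, name: str, *args, **kwargs):
        tier, fn = self._tools[name]
        if tier not in self._enabled:
            raise PermissionError(f"tool '{name}' needs the '{tier}' tier, which is not enabled")
        return fn(*args, **kwargs)
```

Because available() filters by tier, a locked tool is not merely refused when called; it is never listed to the model in the first place.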
Automation – A Deterministic Safety Net
Pattern 12 is the only pattern that explicitly avoids LLM involvement. It provides deterministic lifecycle hooks that guarantee essential actions run regardless of the model’s memory.
Pattern 12 – Deterministic Lifecycle Hooks
Four hook points are defined:
PreToolUse: runs before a tool is invoked (e.g., risk classification).
PostToolUse: runs after a tool finishes (e.g., auto-fix with ruff).
SessionStart: loads project-level CLAUDE.md at the beginning of a session.
Stop: runs final checks before the session ends.
A minimal implementation, which covers PreToolUse, PostToolUse, and Stop and leaves out SessionStart:
```python
# hook_system.py: deterministic lifecycle hook system
import subprocess
import sys
from typing import Callable


class HookManager:
    """Agent lifecycle hooks: any hook returning False or raising aborts further execution."""

    def __init__(self):
        self._pre_tool: list[Callable] = []   # PreToolUse hooks
        self._post_tool: list[Callable] = []  # PostToolUse hooks
        self._stop: list[Callable] = []       # Stop hooks

    def on_pre_tool(self, fn: Callable):
        """Register a PreToolUse hook (usable as a decorator)."""
        self._pre_tool.append(fn)
        return fn

    def on_post_tool(self, fn: Callable):
        """Register a PostToolUse hook (usable as a decorator)."""
        self._post_tool.append(fn)
        return fn

    def on_stop(self, fn: Callable):
        """Register a Stop hook (usable as a decorator)."""
        self._stop.append(fn)
        return fn

    def _run(self, hooks: list[Callable], phase: str) -> bool:
        """Execute a list of hooks; any False or exception blocks the phase."""
        for hook in hooks:
            try:
                if hook() is False:
                    print(f"[HOOK FAIL] {phase}: {hook.__name__} returned False", file=sys.stderr)
                    return False
            except Exception as e:
                print(f"[HOOK ERROR] {phase}: {hook.__name__} → {e}", file=sys.stderr)
                return False
        return True

    def pre_tool_check(self, tool_name: str) -> bool:
        return self._run(self._pre_tool, f"PreToolUse:{tool_name}")

    def post_tool_check(self, tool_name: str) -> bool:
        return self._run(self._post_tool, f"PostToolUse:{tool_name}")

    def stop_check(self) -> bool:
        return self._run(self._stop, "Stop")


# Example usage ----------------------------------------------------------
hooks = HookManager()


@hooks.on_pre_tool
def check_dangerous_commands():
    """PreToolUse: run command-risk classification (Pattern 10). Placeholder that always passes."""
    print(" ✓ command risk classification passed")
    return True


@hooks.on_post_tool
def auto_format_code():
    """PostToolUse: apply ruff's automatic lint fixes after writing code."""
    try:
        subprocess.run(["ruff", "check", "--fix", "."], capture_output=True, timeout=30, check=True)
        print(" ✓ ruff --fix completed")
        return True
    except subprocess.CalledProcessError:
        print(" ✗ ruff failed, fix before continuing", file=sys.stderr)
        return False


@hooks.on_stop
def final_validate():
    """Stop: final validation before the session ends."""
    checks = [
        ("ruff", ["ruff", "check", "."]),
        ("pytest", ["pytest", "--tb=short", "-q"]),
    ]
    all_pass = True
    for name, cmd in checks:
        try:
            subprocess.run(cmd, capture_output=True, timeout=60, check=True)
            print(f" ✓ {name} passed")
        except subprocess.CalledProcessError:
            print(f" ✗ {name} failed", file=sys.stderr)
            all_pass = False
    return all_pass
```

Design principles:
Hooks are deterministic – they never invoke the LLM.
Hooks are decoupled from prompts – even if the model forgets a step, the hook still runs.
Failure blocks further actions – returning False or raising aborts the pipeline.
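The third principle is easiest to see when a tool call is wrapped between its hooks. The following is a self-contained sketch under assumptions, not the article's implementation: here hooks receive the tool name as an argument, and guarded_call and run_hooks are hypothetical helper names.

```python
from typing import Callable, Optional

Hook = Callable[[str], bool]  # assumed signature: receives the tool name


def run_hooks(hooks: list[Hook], tool_name: str) -> bool:
    """Return False as soon as one hook returns False or raises."""
    for hook in hooks:
        try:
            if hook(tool_name) is False:
                return False
        except Exception:
            return False
    return True


def guarded_call(
    tool_name: str,
    tool: Callable[[], object],
    pre: list[Hook],
    post: list[Hook],
) -> tuple[bool, Optional[object]]:
    """Run a tool between its hooks and return (ok, result)."""
    if not run_hooks(pre, tool_name):
        return False, None      # blocked: the tool never runs
    result = tool()
    if not run_hooks(post, tool_name):
        return False, result    # the tool ran, but the pipeline must stop here
    return True, result
```

The wrapper never consults the model. Whether a tool runs depends only on what the hooks return.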
Conclusion
The twelve patterns form a language that addresses four unavoidable architectural problems in agent engineering:
Memory management – capacity vs. speed vs. relevance (Patterns 1‑5).
Task decomposition – preventing context pollution (Patterns 6‑8).
Permission control – deterministic risk gating for commands and tools (Patterns 9‑11).
Reliable automation – deterministic lifecycle actions independent of LLM memory (Pattern 12).
These designs remain relevant regardless of future model improvements because the underlying constraints persist: context windows are finite, shell commands are inherently dangerous, memories decay, and models forget instructions given only in a prompt.
References
Kubernetes Patterns – https://k8spatterns.com/
Prompt Patterns – https://promptpatterns.dev/
12 Agentic Harness Patterns from Claude Code – https://generativeprogrammer.com/p/12-agentic-harness-patterns-from
