Designing Agent Memory: Comparative Analysis of Claude Agent SDK, OpenAI Codex CLI, OpenClaw, and Claude Code
This article defines agent memory, outlines its three core components and memory classifications, then provides a detailed comparative analysis of the memory designs in Claude Agent SDK, OpenAI Codex CLI, OpenClaw, and Claude Code, highlighting trade‑offs, implementation details, and engineering implications.
What Is Agent Memory
Definition
Agent memory is a system that enables AI agents to persist, organize, retrieve, and reuse information across time, interactions, and execution contexts. It typically consists of architectural components, control mechanisms, tools, and a software harness, aiming to maintain continuity in both temporal and contextual dimensions even in fragmented interaction scenarios.
Three Core Components
Embedding Model converts unstructured data (text, dialogue, documents) into vector representations for semantic retrieval, allowing agents to match similarity rather than exact keywords. Without it, memory degrades to a keyword database with much lower recall quality.
Database Layer provides persistent storage and vector indexing, enabling cross‑session and cross‑time memory retention. Implementations may use SQLite, Redis, Chroma, Milvus, Faiss, or other databases that now support vectors.
Large Language Model (LLM) acts as the controller that decides what to store, when to retrieve, and how to integrate retrieved context into current reasoning. The quality ceiling of memory is set by the LLM’s judgment; the capacity floor is set by the database layer.
The core of Agent Memory is the Data‑Infrastructure Layer, distinct from the inference layer (LLM) and the tool/action layer. It manages the full lifecycle of memory across three dimensions:
Persistent storage: writes each interaction to durable media so that information survives session termination or process restart.
Efficient retrieval: supports low‑latency, semantically relevant lookup, combining speed and accuracy.
Memory operations: extends CRUD semantics with LLM‑driven capabilities such as adapting to new information, learning from interactions, and maintaining cross‑session consistency.
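Taken together, the three dimensions can be sketched as a tiny in-process store: a deterministic toy function stands in for a real embedding model, and an in-memory list stands in for the durable database layer. All names here are illustrative, not any framework's API.

```python
import math
from dataclasses import dataclass, field

def toy_embed(text: str) -> list[float]:
    """Deterministic stand-in for a real embedding model: hash each word's
    character codes into one of 8 buckets, then L2-normalize."""
    vec = [0.0] * 8
    for word in text.lower().split():
        vec[sum(map(ord, word)) % 8] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

@dataclass
class MemoryStore:
    """Sketch of the data-infrastructure layer: records (which a real store
    would write to durable media) plus a vector index for retrieval."""
    records: list[tuple[str, list[float]]] = field(default_factory=list)

    def write(self, text: str) -> None:
        # Persistent storage: append the text alongside its vector.
        self.records.append((text, toy_embed(text)))

    def search(self, query: str, k: int = 1) -> list[str]:
        # Efficient retrieval: rank by dot product of normalized vectors.
        q = toy_embed(query)
        ranked = sorted(self.records,
                        key=lambda r: -sum(a * b for a, b in zip(q, r[1])))
        return [text for text, _ in ranked[:k]]
```

A real system would swap `toy_embed` for an embedding model and `records` for a vector database; the write/search lifecycle stays the same.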
Memory Classification
Agents often categorize memory by lifespan and function:
Short‑term memory (within a single session):
Semantic Cache – caches recent query results to avoid repeated inference.
Working Memory – temporary workspace for the current task.
Long‑term memory (across sessions):
Procedural – stores skills, workflows, and toolbox state.
Semantic – factual knowledge, often via Entity Memory or a Knowledge Base.
Episodic – logs past interactions, personas, summaries, and conversational continuity.
1. Claude Agent SDK
Design Positioning
Claude Agent SDK does not provide automatic cross‑session memory; each new session starts with a fresh context unless the developer explicitly uses the resume feature. Memory is supplied by two orthogonal mechanisms: session‑level context management and a Memory Tool for cross‑session file persistence.
Session‑Level Memory: Compaction + Context Editing
Compaction summarizes the context window into a compact abstract, keeping agents efficient as dialogues grow. Tool‑result clearing drops stale tool outputs that can be recomputed on demand, a cheaper alternative to full compression. These mechanisms complement each other: compaction handles accumulated dialogue, tool‑result clearing handles reusable tool data, and the Memory Tool (described below) persists structured knowledge across sessions.
The instructions parameter of compaction can replace the default summarisation prompt, allowing developers to control which content survives compression and thereby improve quality for specific agents.
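A toy sketch of how an instructions override can steer what survives compaction. The `summarize` callable stands in for an LLM call, and none of these names are the SDK's actual API; this only illustrates the mechanism.

```python
from typing import Callable

DEFAULT_INSTRUCTIONS = "Summarize the conversation, keeping decisions and open tasks."

def compact(messages: list[str],
            summarize: Callable[[str, str], str],
            instructions: str = DEFAULT_INSTRUCTIONS,
            keep_last: int = 2) -> list[str]:
    """Toy compaction: replace all but the last `keep_last` messages with a
    single summary produced under `instructions` (illustrative names only)."""
    if len(messages) <= keep_last:
        return messages
    head, tail = messages[:-keep_last], messages[-keep_last:]
    summary = summarize(instructions, "\n".join(head))
    return [f"[compacted] {summary}"] + tail
```

Passing different `instructions` changes what the summarizer is told to preserve, which is the lever the SDK exposes for agent-specific compaction quality.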
Cross‑Session Memory: Memory Tool
The Memory Tool lets Claude create, read, update, and delete persistent files between sessions, enabling knowledge accumulation without keeping everything in the context window. Developers can subclass BetaAbstractMemoryTool (Python) or use betaMemoryTool (TypeScript) to implement custom back‑ends such as file systems, databases, cloud storage, or encrypted files.
For long‑running workflows, compaction and the Memory Tool can be combined: compaction keeps the active context manageable, while the Memory Tool persists important information beyond the summary.
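A minimal file-system back-end illustrating the create/read/update/delete operations such a tool needs. This is a hypothetical stand-in, not the actual BetaAbstractMemoryTool interface, whose method signatures may differ.

```python
from pathlib import Path

class FileMemoryBackend:
    """Minimal file-system back-end sketching the memory operations the
    Memory Tool exposes (illustrative, not the SDK's interface)."""

    def __init__(self, root: Path):
        self.root = root.resolve()
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, name: str) -> Path:
        p = (self.root / name).resolve()
        if self.root not in p.parents:
            raise ValueError("path escapes memory root")  # traversal guard
        return p

    def create(self, name: str, text: str) -> None:
        self._path(name).write_text(text)

    def read(self, name: str) -> str:
        return self._path(name).read_text()

    def update(self, name: str, text: str) -> None:
        self._path(name).write_text(text)

    def delete(self, name: str) -> None:
        self._path(name).unlink()
```

The same four operations could be backed by a database, cloud storage, or encrypted files, which is precisely the flexibility the SDK's subclassing model offers.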
Session Continuity
Claude Agent SDK uses session_id for explicit continuity. ResultMessage.session_id supports two modes: resume (continue with the full context) and fork (branch from a historical node). Configuration can be persisted at user, project, and local scopes, and manual context files (markdown, plain text) can be loaded at session start.
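The resume/fork distinction can be illustrated with a small in-memory session store. This is hypothetical: the SDK exposes these semantics through session_id, not a class like this.

```python
import uuid

class SessionStore:
    """Sketch of resume vs fork semantics over recorded transcripts."""

    def __init__(self) -> None:
        self.sessions: dict[str, list[str]] = {}

    def start(self) -> str:
        sid = uuid.uuid4().hex
        self.sessions[sid] = []
        return sid

    def append(self, sid: str, message: str) -> None:
        self.sessions[sid].append(message)

    def resume(self, sid: str) -> list[str]:
        # Resume: continue with the full recorded context.
        return self.sessions[sid]

    def fork(self, sid: str, at: int) -> str:
        # Fork: branch a new session from a historical node,
        # leaving the original untouched.
        new_sid = self.start()
        self.sessions[new_sid] = list(self.sessions[sid][:at])
        return new_sid
```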
Design Summary
The SDK follows a "responsibility‑downshift" philosophy: the framework supplies primitives (compaction, context editing, memory tool) while developers implement storage back‑ends and memory strategies, offering flexibility at the cost of added complexity.
2. OpenAI Codex CLI
Design Positioning
Codex CLI treats the repository as the primary source of context. Most context is derived from the codebase itself rather than from dialogue history, shifting context management to repository preparation and configuration files.
Persistent Configuration: AGENTS.md Hierarchy
Codex looks for AGENTS.md files in the repository, similar to README.md, to learn navigation, test commands, and project conventions. A default persistent configuration in the Codex home directory (~/.codex/AGENTS.override.md) can globally override settings without deleting the base files.
During discovery, Codex walks upward from the working directory until it finds a directory containing .git, which it treats as the project root. Parameters such as project_doc_max_bytes and project_doc_fallback_filenames control the maximum content read from each AGENTS.md and fallback filenames when a file is missing.
Session History Persistence
By default Codex stores local session transcripts under CODEX_HOME (e.g., ~/.codex/history.jsonl). The history.max_bytes setting caps file size; when exceeded, the oldest entries are discarded and the file is compressed, preserving recent records.
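The size cap amounts to oldest-first trimming over JSONL lines. A sketch of that behaviour; the real handling of history.max_bytes may differ in detail:

```python
def trim_history(lines: list[str], max_bytes: int) -> list[str]:
    """Discard the oldest JSONL entries until the transcript fits max_bytes,
    preserving the most recent records."""
    total = sum(len(line.encode()) + 1 for line in lines)  # +1 per newline
    oldest = 0
    while total > max_bytes and oldest < len(lines):
        total -= len(lines[oldest].encode()) + 1
        oldest += 1
    return lines[oldest:]
```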
Context Management: /compact and Automatic Compression
After long interactions, the /compact command triggers Codex to summarise early dialogue rounds, replacing them with concise abstracts while retaining key details, thus freeing context space.
For each task, Codex clones the repository into a sandbox, dynamically assembling context from the repository, AGENTS.md, and any persisted project memory.
Cross‑Session Memory
Persisted project memory allows agents to retain project history and context across sessions, eliminating the need to rebuild context each time. Multiple agents can run in parallel, each in its own Git worktree.
Design Summary
Codex CLI’s memory design centres on the repository; AGENTS.md is the core carrier of cross‑session knowledge, session history is stored in JSONL, and context compression is explicitly invoked via /compact.
3. OpenClaw
Design Positioning
OpenClaw follows a "file‑system is memory" principle, persisting memory as plain Markdown files within the agent workspace. The model only "remembers" what is written to disk, providing full transparency, auditability, and version control.
Memory Hierarchy
SOUL.md (Identity Layer): defines the agent’s personality, communication style, core values, and boundaries.
AGENTS.md (Behavior Layer): defines operational rules and workflow control.
MEMORY.md (Long‑Term Layer): stores persistent facts, preferences, and decisions, loaded at the start of each DM session.
Daily logs (memory/YYYY‑MM‑DD.md): record runtime context and observations, automatically loaded for the current and previous day.
OpenClaw automatically loads eight fixed files (SOUL.md, AGENTS.md, USER.md, TOOLS.md, IDENTITY.md, HEARTBEAT.md, BOOTSTRAP.md, MEMORY.md). Files with other names are ignored.
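The fixed-filename rule amounts to a whitelist loader, roughly like this (an illustrative sketch, not OpenClaw's code):

```python
from pathlib import Path

# The eight bootstrap filenames listed above; anything else is ignored.
BOOTSTRAP_FILES = ("SOUL.md", "AGENTS.md", "USER.md", "TOOLS.md",
                   "IDENTITY.md", "HEARTBEAT.md", "BOOTSTRAP.md", "MEMORY.md")

def load_bootstrap(workspace: Path) -> dict[str, str]:
    """Load only the fixed bootstrap filenames from the workspace."""
    return {name: (workspace / name).read_text()
            for name in BOOTSTRAP_FILES
            if (workspace / name).is_file()}
```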
Retrieval Mechanism: Hybrid Search
When an embedding provider is configured, the memory_search command performs hybrid search, combining vector similarity with keyword matching. Supported providers include OpenAI, Gemini, Voyage, and Mistral. Under the hood, a SQLite‑based store handles keyword, vector, and hybrid queries without extra dependencies.
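Hybrid scoring can be sketched as a weighted blend of keyword overlap and cosine similarity. This is a generic formulation, not OpenClaw's actual weighting, and the vectors here are supplied directly rather than computed by an embedding provider.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(x * x for x in b)) or 1.0
    return dot / (na * nb)

def hybrid_search(query: str, query_vec: list[float],
                  docs: list[tuple[str, list[float]]],
                  alpha: float = 0.5) -> list[str]:
    """Rank docs by a blend of keyword overlap and vector similarity,
    most relevant first. `alpha` trades keyword weight against semantic."""
    q_terms = set(query.lower().split())
    scored = []
    for text, vec in docs:
        keyword = len(q_terms & set(text.lower().split())) / (len(q_terms) or 1)
        scored.append((alpha * keyword + (1 - alpha) * cosine(query_vec, vec), text))
    return [text for _, text in sorted(scored, reverse=True)]
```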
Pre‑Compaction Memory Protection
Before compaction, OpenClaw runs a silent round that prompts the agent to flush important context to memory files. This default‑enabled flush prevents loss of critical information during summarisation.
File Constraints
Each memory file has a character limit (default 20,000 characters), and the aggregate of bootstrap files is capped at 150,000 characters (~50K tokens). Content beyond these limits never reaches the agent, making concise MEMORY.md files an engineering requirement.
Limitations
Semantic search excels at finding similar text but cannot infer relationships between facts. Cross‑project searches may return irrelevant results, motivating community exploration of knowledge‑graph‑style memory.
Design Summary
OpenClaw prioritises transparency and operability: all memory is stored as plain Markdown, the four‑layer hierarchy covers identity to daily logs, and a silent flush before compaction safeguards context.
4. Claude Code
Design Positioning
Claude Code offers the most complete memory implementation among the four frameworks, providing two complementary cross‑session mechanisms: the developer‑controlled CLAUDE.md instruction file and the model‑driven Auto Memory.
CLAUDE.md: Developer‑Controlled Instruction Memory
Claude Code walks up the directory tree to locate CLAUDE.md and CLAUDE.local.md files, concatenating them (with .local appended after the base file). When conflicts arise, the later .local entry wins. The file survives compaction; after executing /compact, Claude rereads CLAUDE.md and reinjects its contents.
Auto Memory: Model‑Driven Autonomous Memory
Auto Memory lets Claude autonomously record notes—commands, debugging insights, architectural explanations, coding style preferences—without developer intervention. It decides whether to store information based on its future usefulness. Each project has an independent memory directory under ~/.claude/projects/<project>/memory/, derived from the Git repository, so all worktrees share the same auto‑memory. MEMORY.md is truncated at 25 KB or 200 lines to prevent unbounded growth.
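The size caps can be sketched as a simple truncation pass. The 25 KB / 200-line limits come from the description above; the exact truncation policy is an assumption.

```python
def truncate_auto_memory(text: str, max_lines: int = 200,
                         max_bytes: int = 25_000) -> str:
    """Enforce the documented caps on MEMORY.md: keep at most `max_lines`
    lines and at most `max_bytes` bytes (illustrative policy)."""
    clipped = "\n".join(text.splitlines()[:max_lines])
    # Slice on bytes, then drop any partially-cut character at the end.
    return clipped.encode()[:max_bytes].decode("utf-8", errors="ignore")
```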
Five‑Stage Context Compression Pipeline (queryLoop)
1. Tool Result Budget: enforces size limits on aggregated tool results before micro‑compact, preventing cache interference.
2. Snip Compact (HISTORY_SNIP gate): removes old tool‑result pairs, feeding reclaimed tokens back to the auto‑compact threshold.
3. Microcompact: uses the cache_edits API to delete messages server‑side at zero API cost.
4. Auto‑compact: triggers when effectiveContextWindowSize - 13000 is exceeded (≈93.5% utilisation on a 200K model), performing full‑session summarisation.
5. Session Memory Compact: prunes using pre‑extracted session summaries, avoiding extra LLM calls for compression.
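The auto-compact trigger is simple arithmetic; as a sketch:

```python
def should_auto_compact(tokens_used: int,
                        effective_window: int = 200_000,
                        reserve: int = 13_000) -> bool:
    """Trigger full-session summarisation once usage exceeds
    effective_window - reserve: 187,000 tokens on a 200K window,
    i.e. ~93.5% utilisation."""
    return tokens_used > effective_window - reserve
```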
Pre‑Write Transcript and --resume
Source analysis shows Claude Code writes user messages to disk via recordTranscript before entering the query loop. If the process crashes before an API response, the missing transcript would cause getLastSessionLog to return null and --resume to report "No conversation found". Pre‑writing ensures the session can be resumed as soon as the user message is accepted, regardless of API latency.
Design Summary
Claude Code embodies an "active memory management" philosophy: CLAUDE.md provides stable, developer‑controlled instructions, while Auto Memory lets the model decide what to retain. The five‑stage compression pipeline finely controls context consumption, and pre‑write transcript guarantees reliable session recovery.
5. Relationship to the Agent Loop
Agent Memory is the essential substrate that allows the broader Agent Loop to run for extended periods. Within a loop, the LLM context window and working memory form short‑term memory that disappears at session end. Embedding models and persistent databases preserve key information across loops, overcoming the physical limits of a single context window.
Mechanisms such as Claude Code’s recordTranscript, OpenClaw’s pre‑compaction flush, and Claude Agent SDK’s session_id resume all address the same problem: preventing agents from "forgetting" after a loop finishes. Without memory, an agent is a stateless tool; with memory, it becomes a continuously learning autonomous system.
6. Overall Conclusions
Choice of Persistent Formats
OpenClaw and Claude Code both use plain Markdown for persistence; Codex CLI stores session transcripts in JSONL and project instructions in Markdown; Claude Agent SDK leaves format decisions to developers. Markdown’s advantages are human readability, version control friendliness, vendor neutrality, and zero learning cost for LLMs trained on Markdown corpora.
Boundary Between Memory and Context
All four frameworks distinguish between volatile session context (subject to compression loss) and durable memory (written to persistent storage). Engineering manifestations include OpenClaw’s silent flush before compaction, Claude Code’s post‑compaction CLAUDE.md reinjection, and Codex CLI’s per‑session AGENTS.md reload.
Memory Proactivity
Claude Code’s Auto Memory and OpenClaw’s daily logs represent two ends of the proactivity spectrum. Auto Memory relies on the model to decide what to keep, offering higher intelligence but less predictability. OpenClaw encourages explicit user/model collaboration to maintain memory files, providing more deterministic outcomes. Claude Agent SDK and Codex CLI delegate memory decisions entirely to developers or users.
Evolution of Retrieval Mechanisms
Retrieval evolves from Codex CLI’s direct repository file access, through Claude Agent SDK’s pluggable back‑ends, to OpenClaw’s built‑in hybrid search (vector + keyword). As frameworks broaden beyond code‑centric use cases, richer semantic retrieval becomes essential for complex knowledge management.
Nature of Compression
Compression is not mere deletion; it is a hierarchical promotion process that decides which content ascends to persistent memory, which is retained as a summary, and which can be discarded. The quality of this decision directly impacts long‑term knowledge accumulation efficiency and distinguishes the memory designs of the four frameworks.
References
https://platform.claude.com/docs/en/agents-and-tools/tool-use/memory-tool
https://platform.claude.com/cookbook/tool-use-context-engineering-context-engineering-tools
https://code.claude.com/docs/en/memory
https://github.com/anthropics/claude-code/blob/main/CHANGELOG.md
https://openai.com/index/introducing-codex/
https://developers.openai.com/codex/cli
https://developers.openai.com/codex/guides/agents-md
https://docs.openclaw.ai/concepts/memory
https://github.com/openclaw/openclaw/blob/main/docs/concepts/memory.md
https://zread.ai/instructkr/claude-code/
https://github.com/instructkr/claude-code
https://newclawtimes.com/guides/openclaw-memory-soul-md-agents-md-guide/
https://velvetshark.com/openclaw-memory-masterclass
https://blog.dailydoseofds.com/p/openclaws-memory-is-broken-heres
https://gaodalie.substack.com/p/i-studied-openclaw-memory-system
