Designing Agent Memory: Comparative Analysis of Claude Agent SDK, OpenAI Codex CLI, OpenClaw, and Claude Code

This article defines agent memory, outlines its three core components and memory classifications, then provides a detailed comparative analysis of the memory designs in Claude Agent SDK, OpenAI Codex CLI, OpenClaw, and Claude Code, highlighting trade‑offs, implementation details, and engineering implications.

AI Engineer Programming

What Is Agent Memory

Definition

Agent memory is a system that enables AI agents to persist, organize, retrieve, and reuse information across time, interactions, and execution contexts. It typically consists of architectural components, control mechanisms, tools, and a software harness, aiming to maintain continuity in both temporal and contextual dimensions even in fragmented interaction scenarios.

Three Core Components

Embedding Model converts unstructured data (text, dialogue, documents) into vector representations for semantic retrieval, allowing agents to match similarity rather than exact keywords. Without it, memory degrades to a keyword database with much lower recall quality.

Database Layer provides persistent storage and vector indexing, enabling cross‑session and cross‑time memory retention. Implementations range from databases with vector support (SQLite, Redis, Chroma, Milvus) to vector libraries such as Faiss.

Large Language Model (LLM) acts as the controller that decides what to store, when to retrieve, and how to integrate retrieved context into current reasoning. The quality ceiling of memory is set by the LLM’s judgment; the capacity floor is set by the database layer.

The core of Agent Memory is the Data‑Infrastructure Layer, distinct from the inference layer (LLM) and the tool/action layer. It manages the full lifecycle of memory across three dimensions:

Persistent storage : writes each interaction to durable media so that information survives session termination or process restart.

Efficient retrieval : supports low‑latency, semantically relevant lookup, combining speed and accuracy.

Memory operations : extends CRUD semantics with LLM‑driven capabilities such as adapting to new information, learning from interactions, and maintaining cross‑session consistency.
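The three dimensions above can be sketched together in a few lines. The example below is a toy illustration, not any framework's implementation: `embed` is a bag‑of‑words stand‑in for a real embedding model, and `MemoryStore` stands in for the database layer; all names are hypothetical.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model: bag-of-words counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Similarity between two sparse vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class MemoryStore:
    """Minimal memory lifecycle: store interactions, retrieve by similarity."""

    def __init__(self):
        self.records = []  # list of (text, vector) pairs

    def store(self, text: str) -> None:
        self.records.append((text, embed(text)))

    def retrieve(self, query: str, k: int = 1) -> list[str]:
        qv = embed(query)
        ranked = sorted(self.records, key=lambda r: cosine(qv, r[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

mem = MemoryStore()
mem.store("the user prefers tabs over spaces")
mem.store("deploy runs via GitHub Actions")
print(mem.retrieve("indentation preference: tabs or spaces?"))
# → ['the user prefers tabs over spaces']
```

A production system would replace `embed` with a real embedding model and `MemoryStore` with a persistent vector store, but the store/retrieve contract stays the same.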

Memory Classification

Agents often categorize memory by lifespan and function:

Short‑term memory (within a single session):

Semantic Cache – caches recent query results to avoid repeated inference.

Working Memory – temporary workspace for the current task.

Long‑term memory (across sessions):

Procedural – stores skills, workflows, and toolbox state.

Semantic – factual knowledge, often via Entity Memory or a Knowledge Base.

Episodic – logs past interactions, personas, summaries, and conversational continuity.

1. Claude Agent SDK

Design Positioning

Claude Agent SDK does not provide automatic cross‑session memory; each new session starts with a fresh context unless the developer explicitly uses the resume feature. Memory is supplied by two orthogonal mechanisms: session‑level context management and a Memory Tool for cross‑session file persistence.

Session‑Level Memory: Compaction + Context Editing

Compaction summarises the context window into an abstract, keeping agents efficient as dialogues grow. Tool‑result clearing drops stale tool outputs when they can be recomputed, a cheaper alternative to full summarisation. Together with the Memory Tool, these three mechanisms complement each other: compaction handles accumulated dialogue, tool‑result clearing handles recomputable tool data, and the Memory Tool persists structured knowledge across sessions.

The instructions parameter of compaction can replace the default summarisation prompt, allowing developers to control which content survives compression and thereby improve quality for specific agents.

Cross‑Session Memory: Memory Tool

The Memory Tool lets Claude create, read, update, and delete persistent files between sessions, enabling knowledge accumulation without keeping everything in the context window. Developers can subclass BetaAbstractMemoryTool (Python) or use betaMemoryTool (TypeScript) to implement custom back‑ends such as file systems, databases, cloud storage, or encrypted files.

For long‑running workflows, compaction and the Memory Tool can be combined: compaction keeps the active context manageable, while the Memory Tool persists important information beyond the summary.
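A custom back‑end reduces to implementing the create/read/update/delete operations against some storage medium. The sketch below shows a plain file‑system variant; it is an illustrative stand‑in, not the SDK's actual `BetaAbstractMemoryTool` interface, and the method names are hypothetical.

```python
from pathlib import Path

class FileMemoryBackend:
    """Illustrative file-system back-end for a cross-session memory tool.
    A sketch only; the real SDK class and its method signatures may differ."""

    def __init__(self, root: str):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, name: str) -> Path:
        # Basic guard against path traversal outside the memory root.
        p = (self.root / name).resolve()
        if self.root.resolve() not in p.parents:
            raise ValueError("path escapes memory root")
        return p

    def create(self, name: str, text: str) -> None:
        self._path(name).write_text(text)

    def read(self, name: str) -> str:
        return self._path(name).read_text()

    def update(self, name: str, text: str) -> None:
        self._path(name).write_text(text)

    def delete(self, name: str) -> None:
        self._path(name).unlink()
```

The same four operations could instead target a database table, cloud object storage, or an encrypted container, as the docs suggest.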

Session Continuity

Claude Agent SDK uses session_id to enable explicit continuity. ResultMessage.session_id supports two modes: resume (continue with the full prior context) and fork (branch from a historical node). Configuration can be persisted at user, project, and local levels, and manual context files (Markdown, plain text) can be loaded at session start.

Design Summary

The SDK follows a "responsibility‑downshift" philosophy: the framework supplies primitives (compaction, context editing, memory tool) while developers implement storage back‑ends and memory strategies, offering flexibility at the cost of added complexity.

2. OpenAI Codex CLI

Design Positioning

Codex CLI treats the repository as the primary source of context. Most context is derived from the codebase itself rather than from dialogue history, shifting context management to repository preparation and configuration files.

Persistent Configuration: AGENTS.md Hierarchy

Codex looks for AGENTS.md files in the repository, similar to README.md, to learn navigation, test commands, and project conventions. A default persistent configuration in the Codex home directory (~/.codex/AGENTS.override.md) can globally override settings without deleting the base files.

During discovery, Codex walks upward from the working directory until it finds a directory containing .git, which it treats as the project root. Parameters such as project_doc_max_bytes and project_doc_fallback_filenames control the maximum content read from each AGENTS.md and fallback filenames when a file is missing.

Session History Persistence

By default Codex stores local session transcripts under CODEX_HOME (e.g., ~/.codex/history.jsonl). The history.max_bytes setting caps file size; when exceeded, the oldest entries are discarded and the file is compressed, preserving recent records.
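The cap‑and‑discard behaviour amounts to trimming a JSONL file from the front. The sketch below shows that pattern in plain Python; it is an illustration of the described policy, not Codex's implementation (the compression step is omitted).

```python
import json
from pathlib import Path

def append_history(path: Path, entry: dict, max_bytes: int = 1024) -> None:
    """Append a JSONL record; when the file exceeds max_bytes, drop the
    oldest lines until it fits again (cf. the history.max_bytes setting)."""
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    if path.stat().st_size > max_bytes:
        lines = path.read_text(encoding="utf-8").splitlines(keepends=True)
        while lines and sum(len(l.encode("utf-8")) for l in lines) > max_bytes:
            lines.pop(0)  # discard the oldest entry first
        path.write_text("".join(lines), encoding="utf-8")
```

JSONL suits this policy well: each line is an independent record, so truncation from the head never corrupts the remaining entries.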

Context Management: /compact and Automatic Compression

After long interactions, the /compact command triggers Codex to summarise early dialogue rounds, replacing them with concise abstracts while retaining key details, thus freeing context space.

For each task, Codex clones the repository into a sandbox, dynamically assembling context from the repository, AGENTS.md, and any persisted project memory.

Cross‑Session Memory

Persisted project memory allows agents to retain project history and context across sessions, eliminating the need to rebuild context each time. Multiple agents can run in parallel, each in its own Git worktree.

Design Summary

Codex CLI’s memory design centres on the repository; AGENTS.md is the core carrier of cross‑session knowledge, session history is stored in JSONL, and context compression is explicitly invoked via /compact.

3. OpenClaw

Design Positioning

OpenClaw follows a "file‑system is memory" principle, persisting memory as plain Markdown files within the agent workspace. The model only "remembers" what is written to disk, providing full transparency, auditability, and version control.

Memory Hierarchy

SOUL.md (Identity Layer) : defines the agent’s personality, communication style, core values, and boundaries.

AGENTS.md (Behavior Layer) : defines operational rules and workflow control.

MEMORY.md (Long‑Term Layer) : stores persistent facts, preferences, and decisions, loaded at the start of each DM session.

Daily logs (memory/YYYY‑MM‑DD.md) : record runtime context and observations, automatically loaded for the current and previous day.

OpenClaw automatically loads eight fixed files (SOUL.md, AGENTS.md, USER.md, TOOLS.md, IDENTITY.md, HEARTBEAT.md, BOOTSTRAP.md, MEMORY.md). Files with other names are ignored.
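The loading rule is easy to express as a path filter. The sketch below is a simplified model of that behaviour, not OpenClaw's loader; the `bootstrap_paths` name is hypothetical.

```python
from datetime import date, timedelta
from pathlib import Path

# The eight fixed bootstrap files OpenClaw loads; other names are ignored.
BOOTSTRAP_FILES = ["SOUL.md", "AGENTS.md", "USER.md", "TOOLS.md",
                   "IDENTITY.md", "HEARTBEAT.md", "BOOTSTRAP.md", "MEMORY.md"]

def bootstrap_paths(workspace: Path, today: date) -> list[Path]:
    """Paths an OpenClaw-style loader would read: the fixed bootstrap files
    plus the daily logs for yesterday and today."""
    paths = [workspace / name for name in BOOTSTRAP_FILES]
    for day in (today - timedelta(days=1), today):
        paths.append(workspace / "memory" / f"{day.isoformat()}.md")
    return [p for p in paths if p.is_file()]
```

Anything outside this allow‑list is invisible to the agent, which is why the file names themselves are part of the contract.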

Retrieval Mechanism: Hybrid Search

When an embedding provider is configured, the memory_search command performs hybrid search, combining vector similarity with keyword matching. Supported providers include OpenAI, Gemini, Voyage, and Mistral. Under the hood, a SQLite‑based store handles keyword, vector, and hybrid queries without extra dependencies.
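Hybrid search needs a way to merge the keyword ranking and the vector ranking into one result list. One common fusion technique is reciprocal rank fusion (RRF), sketched below; OpenClaw's actual scoring formula is not documented here, so treat this as an illustrative choice.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked result lists (e.g. keyword and vector search)
    into one. Each document scores 1/(k + rank) per list it appears in;
    k=60 is the conventional RRF smoothing constant."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["notes.md", "log.md", "plan.md"]
vector_hits = ["plan.md", "notes.md", "todo.md"]
print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
# → ['notes.md', 'plan.md', 'log.md', 'todo.md']
```

Documents that appear high in both lists dominate, which is exactly the behaviour hybrid search wants: agreement between lexical and semantic signals.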

Pre‑Compaction Memory Protection

Before compaction, OpenClaw runs a silent round that prompts the agent to flush important context to memory files. This default‑enabled flush prevents loss of critical information during summarisation.

File Constraints

Each memory file has a character limit (default 20,000 characters), and the aggregate of bootstrap files is capped at 150,000 characters (~50K tokens). Content beyond these limits is truncated and never reaches the agent, making concise MEMORY.md files an engineering requirement.
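The two caps compose as a simple budget check, shown below as a minimal sketch (the function names are illustrative, not OpenClaw's):

```python
PER_FILE_LIMIT = 20_000       # default per-file character cap
BOOTSTRAP_BUDGET = 150_000    # aggregate cap across bootstrap files (~50K tokens)

def truncate_memory(text: str, limit: int = PER_FILE_LIMIT) -> str:
    """Content past the per-file limit is cut and never reaches the agent."""
    return text[:limit]

def within_bootstrap_budget(contents: list[str],
                            budget: int = BOOTSTRAP_BUDGET) -> bool:
    """Check the combined size of all bootstrap files against the budget."""
    return sum(len(c) for c in contents) <= budget
```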

Limitations

Semantic search excels at finding similar text but cannot infer relationships between facts. Cross‑project searches may return irrelevant results, motivating community exploration of knowledge‑graph‑style memory.

Design Summary

OpenClaw prioritises transparency and operability: all memory is stored as plain Markdown, the four‑layer hierarchy covers identity to daily logs, and a silent flush before compaction safeguards context.

4. Claude Code

Design Positioning

Claude Code offers the most complete memory implementation among the four frameworks, providing two complementary cross‑session mechanisms: the developer‑controlled CLAUDE.md instruction file and the model‑driven Auto Memory.

CLAUDE.md: Developer‑Controlled Instruction Memory

Claude Code walks up the directory tree to locate CLAUDE.md and CLAUDE.local.md files, concatenating them (with .local appended after the base file). When conflicts arise, the later .local entry wins. The file survives compaction; after executing /compact, Claude rereads CLAUDE.md and reinjects its contents.

Auto Memory: Model‑Driven Autonomous Memory

Auto Memory lets Claude autonomously record notes—commands, debugging insights, architectural explanations, coding style preferences—without developer intervention. It decides whether to store information based on its future usefulness. Each project has an independent memory directory under ~/.claude/projects/<project>/memory/, derived from the Git repository, so all worktrees share the same auto‑memory. MEMORY.md is truncated at 25 KB or 200 lines to prevent unbounded growth.

Five‑Stage Context Compression Pipeline (queryLoop)

Tool Result Budget : enforces size limits on aggregated tool results before micro‑compact, preventing cache interference.

Snip Compact (HISTORY_SNIP gate) : removes old tool‑result pairs, feeding reclaimed tokens back to the auto‑compact threshold.

Microcompact : uses the cache_edits API to delete messages server‑side at zero API cost.

Auto‑compact : triggers when effectiveContextWindowSize - 13000 is exceeded (≈93.5 % utilisation on a 200K model), performing full‑session summarisation.

Session Memory Compact : prunes using pre‑extracted session summaries, avoiding extra LLM calls for compression.
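The auto‑compact trigger in stage four reduces to one arithmetic check, sketched here for clarity (the function name is illustrative):

```python
def should_auto_compact(used_tokens: int, context_window: int = 200_000,
                        reserve: int = 13_000) -> bool:
    """Trigger full-session summarisation once usage exceeds
    context_window - reserve: 187,000 tokens, i.e. 93.5% of a 200K window."""
    return used_tokens > context_window - reserve

print(should_auto_compact(186_000))  # → False (below the 187,000 threshold)
print(should_auto_compact(188_000))  # → True
```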

Pre‑Write Transcript and --resume

Source analysis shows Claude Code writes user messages to disk via recordTranscript before entering the query loop. If the process crashes before an API response, the missing transcript would cause getLastSessionLog to return null and --resume to report "No conversation found". Pre‑writing ensures the session can be resumed as soon as the user message is accepted, regardless of API latency.
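The pre‑write pattern itself is simple: persist the user turn before the fallible API call, so a crash mid‑request still leaves a resumable transcript. The sketch below models that ordering; it is not Claude Code's actual recordTranscript code, and the function names are hypothetical.

```python
import json
from pathlib import Path

def record_transcript(path: Path, message: dict) -> None:
    """Append one message to the on-disk transcript."""
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(message) + "\n")
        f.flush()

def run_turn(path: Path, user_text: str, call_model) -> str:
    # Pre-write: the user message is durable before the API is contacted.
    record_transcript(path, {"role": "user", "content": user_text})
    reply = call_model(user_text)  # may raise; the user turn already survives
    record_transcript(path, {"role": "assistant", "content": reply})
    return reply
```

If `call_model` raises, the transcript still contains the user message, so a resume facility can find the session instead of reporting that no conversation exists.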

Design Summary

Claude Code embodies an "active memory management" philosophy: CLAUDE.md provides stable, developer‑controlled instructions, while Auto Memory lets the model decide what to retain. The five‑stage compression pipeline finely controls context consumption, and pre‑write transcript guarantees reliable session recovery.

5. Relationship to the Agent Loop

Agent Memory is the essential substrate that allows the broader Agent Loop to run for extended periods. Within a loop, the LLM context window and working memory form short‑term memory that disappears at session end. Embedding models and persistent databases preserve key information across loops, overcoming the physical limits of a single context window.

Mechanisms such as Claude Code’s recordTranscript, OpenClaw’s pre‑compaction flush, and Claude Agent SDK’s session_id resume all address the same problem: preventing agents from "forgetting" after a loop finishes. Without memory, an agent is a stateless tool; with memory, it becomes a continuously learning autonomous system.

6. Overall Conclusions

Choice of Persistent Formats

OpenClaw and Claude Code both use plain Markdown for persistence; Codex CLI stores session transcripts in JSONL and project instructions in Markdown; Claude Agent SDK leaves format decisions to developers. Markdown’s advantages are human readability, version control friendliness, vendor neutrality, and zero learning cost for LLMs trained on Markdown corpora.

Boundary Between Memory and Context

All four frameworks distinguish between volatile session context (subject to compression loss) and durable memory (written to persistent storage). Engineering manifestations include OpenClaw’s silent flush before compaction, Claude Code’s post‑compaction CLAUDE.md reinjection, and Codex CLI’s per‑session AGENTS.md reload.

Memory Proactivity

Claude Code’s Auto Memory and OpenClaw’s daily logs represent two ends of the proactivity spectrum. Auto Memory relies on the model to decide what to keep, offering higher intelligence but less predictability. OpenClaw encourages explicit user/model collaboration to maintain memory files, providing more deterministic outcomes. Claude Agent SDK and Codex CLI delegate memory decisions entirely to developers or users.

Evolution of Retrieval Mechanisms

Retrieval evolves from Codex CLI’s direct repository file access, through Claude Agent SDK’s pluggable back‑ends, to OpenClaw’s built‑in hybrid search (vector + keyword). As frameworks broaden beyond code‑centric use cases, richer semantic retrieval becomes essential for complex knowledge management.

Nature of Compression

Compression is not mere deletion; it is a hierarchical promotion process that decides which content ascends to persistent memory, which is retained as a summary, and which can be discarded. The quality of this decision directly impacts long‑term knowledge accumulation efficiency and distinguishes the memory designs of the four frameworks.


Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
