How Clawdbot Achieves Persistent, Local Memory for LLM Agents
Clawdbot implements a fully local, persistent memory system for LLM agents: context and long‑term knowledge live in editable Markdown files, indexed with sqlite‑vec and FTS5. Multi‑agent isolation, compression, pruning, and configurable session lifecycles keep interactions efficient and cost‑effective.
Overview
Clawdbot is an LLM‑driven autonomous agent that can manage real‑world tasks such as email, calendar events, and flight check‑ins. Its distinguishing feature is a 24/7 persistent memory system that stores knowledge locally, giving users full ownership of context and skills.
How Context Is Built
Each request to the model receives four components:
[0] System prompt (static + conditional instructions)
[1] Project context (files like AGENTS.md, SOUL.md, etc.)
[2] Conversation history (messages, tool calls, compressed summaries)
[3] Current user message
The project context consists of editable Markdown files that live in the agent’s workspace:
AGENTS.md – agent directives and memory guidelines
SOUL.md – personality and tone
USER.md – user‑specific information
TOOLS.md – external tool usage instructions
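Assembling those four components can be sketched as a simple concatenation step. This is a minimal illustration, not Clawdbot's actual code; the function name is hypothetical, but the file names follow the workspace layout above.

```python
from pathlib import Path

def build_context(workspace: Path, system_prompt: str,
                  history: list[str], user_message: str) -> str:
    """Assemble the four context components for a single model request."""
    # [1] Project context: editable Markdown files in the agent's workspace.
    project_files = ["AGENTS.md", "SOUL.md", "USER.md", "TOOLS.md"]
    project_context = "\n\n".join(
        (workspace / name).read_text()
        for name in project_files
        if (workspace / name).exists()
    )
    # [0] system prompt, [2] conversation history, [3] current user message.
    return "\n\n".join([system_prompt, project_context, *history, user_message])
```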
Context vs. Memory
Context is the transient view the model has for a single request. It is short‑lived, bounded by the model’s token window (e.g., 200 K tokens), and each token incurs API cost.
context = system_prompt + conversation_history + tool_results + attachments
Memory is the persistent data stored on disk. It is durable across restarts, unbounded in size, cheap (no API cost), and searchable.
memory = MEMORY.md + memory/*.md + session_records
Persistent – survives days or months
Unbounded – can grow indefinitely
Cheap – no token fees
Searchable – indexed for semantic retrieval
Memory Access Tools
memory_search
Finds relevant memory entries across all files.
memory_get
Retrieves the specific content of a previously found entry.
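The two tools compose into a search-then-fetch flow. A toy sketch of that pattern (the `ToyMemory` class and `recall` helper are illustrative stand‑ins, not Clawdbot's real API):

```python
class ToyMemory:
    """Stand-in for the indexed memory store (not Clawdbot's real API)."""
    def __init__(self, entries: dict[str, str]):
        self.entries = entries

    def search(self, query: str) -> list[tuple[str, float]]:
        # memory_search: return ranked (entry_id, score) pairs.
        return [(eid, 1.0) for eid, text in self.entries.items() if query in text]

    def get(self, entry_id: str) -> str:
        # memory_get: fetch the full content of one found entry.
        return self.entries[entry_id]

def recall(memory: ToyMemory, query: str, top_k: int = 3) -> list[str]:
    """Two-step recall: search for entry ids, then fetch their content."""
    hits = memory.search(query)[:top_k]
    return [memory.get(eid) for eid, _score in hits]
```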
Writing Memory
There is no dedicated memory_write tool. Instead, the standard write and edit tools modify Markdown files, which are then automatically re‑indexed. The destination is driven by AGENTS.md directives:
Daily notes → memory/YYYY-MM-DD.md
Long‑term facts, preferences, decisions → MEMORY.md
Experience & lessons → AGENTS.md or TOOLS.md
Memory Storage Architecture
All memory files reside in the agent’s workspace (default ~/clawd/). The system uses a two‑layer approach:
Layer 1 – Daily Logs
Append‑only files named memory/YYYY-MM-DD.md capture day‑by‑day notes whenever the agent is instructed to remember something.
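An append‑only daily log can be sketched as follows; the path convention matches the memory/YYYY-MM-DD.md naming above, while the helper name is hypothetical.

```python
from datetime import date
from pathlib import Path

def append_daily_note(workspace: Path, note: str) -> Path:
    """Append a note to today's daily log, creating the file if needed."""
    log_dir = workspace / "memory"
    log_dir.mkdir(parents=True, exist_ok=True)
    log_file = log_dir / f"{date.today():%Y-%m-%d}.md"
    with log_file.open("a") as f:
        f.write(f"- {note}\n")  # append-only: earlier entries are never rewritten
    return log_file
```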
Layer 2 – Long‑Term Knowledge
Curated, durable knowledge is stored in MEMORY.md. Important events, ideas, decisions, and lessons are written here.
How Agents Locate Memory
The AGENTS.md file contains directives that tell the agent where to write based on trigger conditions (e.g., “remember this”, “persistent fact”, “experience”).
Indexing and Search
Memory files are indexed using two SQLite extensions:
sqlite‑vec enables vector similarity search directly inside SQLite.
FTS5 provides BM25 keyword matching. The combination allows mixed semantic + keyword search from a single lightweight database file.
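As a rough illustration of the FTS5 half of this hybrid index, Python's built‑in sqlite3 module can create a BM25‑ranked full‑text table, assuming your SQLite build includes FTS5. The schema and contents here are illustrative, not Clawdbot's actual ones.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Full-text index over memory entries; FTS5 ranks MATCH results with BM25.
conn.execute("CREATE VIRTUAL TABLE mem USING fts5(path, content)")
conn.executemany(
    "INSERT INTO mem VALUES (?, ?)",
    [
        ("MEMORY.md", "User prefers aisle seats on every flight"),
        ("memory/2025-01-10.md", "Booked flight to SFO; flight check-in opens 24h before"),
    ],
)
# bm25() returns lower-is-better scores, so ORDER BY ascending.
rows = conn.execute(
    "SELECT path, bm25(mem) FROM mem WHERE mem MATCH ? ORDER BY bm25(mem)",
    ("flight",),
).fetchall()
```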
Search Strategy
When a query is issued, Clawdbot runs both vector and BM25 searches in parallel and combines the scores:
finalScore = (0.7 * vectorScore) + (0.3 * textScore)
Results with a combined score below minScore (default 0.35) are filtered out. The weighting favors semantic similarity while still capturing exact term matches.
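A minimal sketch of that score fusion, with the 0.7/0.3 weights and the 0.35 default cutoff taken from the formula above (the function and parameter names are hypothetical):

```python
def fuse_scores(vector_hits: dict[str, float],
                text_hits: dict[str, float],
                min_score: float = 0.35) -> list[tuple[str, float]]:
    """Combine vector and BM25 scores per document; drop weak matches."""
    ids = set(vector_hits) | set(text_hits)
    combined = {
        doc: 0.7 * vector_hits.get(doc, 0.0) + 0.3 * text_hits.get(doc, 0.0)
        for doc in ids
    }
    # Filter below the threshold, then rank best-first.
    return sorted(
        ((doc, s) for doc, s in combined.items() if s >= min_score),
        key=lambda pair: pair[1],
        reverse=True,
    )
```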
Multi‑Agent Memory Isolation
Each agent runs in its own workspace with separate Markdown files and a distinct SQLite index. The key agentId + workspaceDir ensures that memory searches are scoped to the current agent, preventing accidental cross‑agent leakage.
Agents can read each other’s memory only if a strict sandbox is disabled and absolute paths are used.
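One way to picture that scoping: every index lookup is keyed by (agentId, workspaceDir), so a search can only ever see entries written under the current agent's key. A toy sketch (class and method names are illustrative):

```python
class MemoryIndexRegistry:
    """Maps (agent_id, workspace_dir) to that agent's private index."""

    def __init__(self) -> None:
        self._indexes: dict[tuple[str, str], dict[str, str]] = {}

    def index_for(self, agent_id: str, workspace_dir: str) -> dict[str, str]:
        # Each distinct key gets its own isolated index.
        return self._indexes.setdefault((agent_id, workspace_dir), {})

    def search(self, agent_id: str, workspace_dir: str, term: str) -> list[str]:
        # Only entries written under this exact key are visible.
        index = self.index_for(agent_id, workspace_dir)
        return [path for path, text in index.items() if term in text]
```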
Compression
LLMs have context windows (Claude 200 K tokens, GPT‑5.1 1 M tokens). When the window is approached, Clawdbot compresses older dialogue into a concise entry while keeping recent messages intact.
Automatic Compression
Triggered near the token limit
Original request is retried with the compressed context
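The trigger logic can be sketched as a threshold check that folds older messages into one summary entry; the threshold fraction, function names, and summarizer stub are assumptions.

```python
def summarize(messages: list[str]) -> str:
    """Stand-in for the LLM summarizer the real system would call."""
    return f"{len(messages)} earlier messages condensed"

def maybe_compress(history: list[str], token_count: int,
                   window: int = 200_000, threshold: float = 0.9,
                   keep_recent: int = 10) -> list[str]:
    """Near the token limit, replace older dialogue with one compressed entry."""
    if token_count < int(window * threshold):
        return history  # plenty of room left; keep everything
    older, recent = history[:-keep_recent], history[-keep_recent:]
    # Compress old turns into one entry; recent messages stay intact.
    return [f"[compressed summary] {summarize(older)}", *recent]
```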
Manual Compression
Users can invoke /compact with a custom prompt, e.g.: /compact focus on decisions and unresolved questions
Compressed summaries are persisted to disk as JSONL records, so future sessions start from the compressed history.
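Persisting a compressed summary as a JSONL record might look like this; the record fields are illustrative, since the actual schema is not documented here.

```python
import json
import time
from pathlib import Path

def persist_summary(session_file: Path, summary: str) -> None:
    """Append one compression record to the session's JSONL log."""
    record = {"type": "compressed_summary", "ts": time.time(), "text": summary}
    with session_file.open("a") as f:
        f.write(json.dumps(record) + "\n")  # one JSON object per line
```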
Memory Refresh
Because LLM‑based compression is lossy, Clawdbot can refresh memory before compression. The behavior is configured in clawdbot.yaml or clawdbot.json.
Pruning
Tool results can be massive (e.g., a 50 k‑character log from exec). Pruning trims these outputs without rewriting history, though it is a lossy operation.
Cache TTL Pruning
Anthropic caches prompt prefixes for up to five minutes to reduce cost. After the TTL expires, the cache is cleared and the next request incurs the full cost. TTL pruning detects expiration and removes stale tool results before the next request.
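TTL pruning can be pictured as a timestamp sweep over tool results, using the five‑minute cache window mentioned above; the message shape here is an assumption.

```python
CACHE_TTL_SECONDS = 5 * 60  # Anthropic prompt-cache lifetime

def prune_stale_tool_results(messages: list[dict], now: float) -> list[dict]:
    """Drop tool results older than the cache TTL; keep everything else."""
    return [
        m for m in messages
        if m.get("role") != "tool_result"
        or now - m["ts"] <= CACHE_TTL_SECONDS
    ]
```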
Session Lifecycle
Sessions reset according to configurable rules, providing natural memory boundaries. The default is a daily reset at 4 AM local time, but other modes exist:
daily – reset at a fixed time
idle – reset after N minutes of inactivity
daily+idle – whichever occurs first
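Evaluating those reset rules can be sketched as follows; the mode names match the list above, while the function itself and its defaults are illustrative.

```python
from datetime import datetime, timedelta

def should_reset(mode: str, now: datetime, last_activity: datetime,
                 last_reset: datetime, reset_hour: int = 4,
                 idle_minutes: int = 30) -> bool:
    """Decide whether the session should roll over under the given mode."""
    todays_reset = now.replace(hour=reset_hour, minute=0, second=0, microsecond=0)
    # Daily rule: the fixed reset time has passed and we have not reset since.
    daily_due = now >= todays_reset and last_reset < todays_reset
    # Idle rule: no activity for longer than the idle window.
    idle_due = now - last_activity > timedelta(minutes=idle_minutes)
    if mode == "daily":
        return daily_due
    if mode == "idle":
        return idle_due
    if mode == "daily+idle":
        return daily_due or idle_due  # whichever occurs first
    raise ValueError(f"unknown session mode: {mode}")
```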
Session Memory Hook
Running /new starts a fresh session and can automatically save the current context.
Conclusion
Transparency beats black‑box – memory is plain Markdown, editable and version‑controlled.
Search beats injection – agents retrieve relevant snippets instead of stuffing everything into the prompt, keeping context focused and cheap.
Persistence beats session‑only storage – important information lives on disk and survives compression.
Hybrid search beats single‑mode – combining vector similarity with BM25 captures both semantic meaning and exact term matches.
blog: https://manthanguptaa.in/posts/clawdbot_memory/