Turning AI’s Short‑Term Memory into a Persistent Knowledge Base with memdir
This article examines Claude Code’s memdir system, explaining how it transforms fleeting AI conversation context into a durable, file‑based knowledge base by using markdown files as memories, a lightweight index, AI‑driven relevance selection, parallel prefetching, and careful type‑specific guidelines.
Problem: Why Not Use CLAUDE.md?
CLAUDE.md stores static project conventions but cannot capture dynamic, session‑level knowledge such as user roles, feedback, project updates, or external references. It requires manual edits, lacks semantic search, and forces full‑context loading each time, causing the AI to start from scratch on every run.
memdir Design Overview
memdir stores four categories of information that should persist across sessions: user role/preferences, feedback, project dynamics, and reference pointers.
Source Code Location
The core implementation resides in dist/cli.js within the modules Pe, H9, and U7q, which map back to the source directory src/memdir/. Key functions and constants include:
// Path and toggle
YJ(): string // memoized, returns ~/.claude/projects/<cwd-hash>/memory/
d5(): boolean // checks if autoMemory is enabled (default true)
// Content loading
EBK(opts): string // builds system‑prompt fragment (memory + writing guide)
D57(...): string[] // builds instructions for "how to save memory"
// Core retrieval
selectRelevantMemories(query, files, signal, loaded): Promise<string[]>
Rh_(messages, deps): PrefetchHandle | undefined // prefetch entry point
// Important constants
const HD = "MEMORY.md" // index file name (always fully loaded)
const $M6 = 200 // max lines for MEMORY.md
const uC6 = 25000 // max bytes per memory (~25KB)
Core Design: Files as Memories, Index as Map
Each memory is a separate markdown file; MEMORY.md acts as a lightweight index. The directory layout looks like:
~/.claude/projects/<cwd-hash>/memory/
├── MEMORY.md ← always fully loaded
├── user_role.md ← user role memory
├── feedback_no_mock.md ← feedback memory
├── project_merge_freeze.md ← project dynamics
└── ref_linear_ingest.md ← external reference
Every memory file uses a front‑matter block whose description field serves as the "eye" for relevance filtering; only the filename and description are considered during search.
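As a rough illustration of the front‑matter idea, the sketch below parses a memory file and extracts its description. Only the description field is confirmed by the article; the name and type fields, the sample file content, and the parseMemoryFile helper are assumptions made for this example, and the parser handles simple key: value pairs rather than full YAML.

```typescript
// Hypothetical shape of a parsed memory file; only `description` is named in the article.
interface ParsedMemory {
  fields: Record<string, string>;
  body: string;
}

// Minimal front-matter parser: expects a leading block delimited by `---` lines
// that contains simple `key: value` pairs. This is not a full YAML parser.
function parseMemoryFile(text: string): ParsedMemory {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(text);
  if (!match) {
    return { fields: {}, body: text }; // no front matter: treat everything as body
  }
  const fields: Record<string, string> = {};
  for (const line of (match[1] ?? "").split(/\r?\n/)) {
    const idx = line.indexOf(":");
    if (idx === -1) continue; // skip lines that are not key: value pairs
    fields[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return { fields, body: match[2] ?? "" };
}

// Illustrative file content (invented for this example).
const sample = [
  "---",
  "name: No mocks in integration tests",
  "description: User wants integration tests to hit a real database, not mocks",
  "type: feedback",
  "---",
  "Why: mocked tests passed while a real migration failed.",
  "How to apply: use the test database fixture in integration suites.",
].join("\n");

const parsed = parseMemoryFile(sample);
// The retrieval step would see only the filename plus this description.
const descriptorLine = `- feedback_no_mock.md: ${parsed.fields["description"]}`;
```

Because retrieval sees only the filename and description, a vague description effectively hides the file no matter how useful its body is.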
MEMORY.md Constraints
Line limit of 200 (constant $M6). Exceeding this inserts a warning and truncates the index.
Entries must be single‑line pointers of the form - [Title](file.md) — description; no front‑matter or full content is allowed.
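The two constraints above can be sketched as a small validator and a truncation helper. This is an illustration under assumptions, not the shipped code: the names MAX_INDEX_LINES, isValidIndexEntry, and clampIndex are hypothetical, and the warning wording is invented, since the article only says that a warning is inserted.

```typescript
// Mirrors the 200-line limit described above; the constant name is illustrative.
const MAX_INDEX_LINES = 200;

// Matches "- [Title](file.md) <em dash> description" (the separator is U+2014).
const INDEX_ENTRY = /^- \[[^\]]+\]\([^)\s]+\.md\) \u2014 \S.*$/;

function isValidIndexEntry(line: string): boolean {
  return INDEX_ENTRY.test(line);
}

// Truncates an over-long index and appends a warning line.
// The warning text is an assumption made for this sketch.
function clampIndex(text: string, maxLines: number = MAX_INDEX_LINES): string {
  const lines = text.split("\n");
  if (lines.length <= maxLines) return text;
  const kept = lines.slice(0, maxLines);
  const dropped = lines.length - maxLines;
  kept.push(`WARNING: MEMORY.md exceeded ${maxLines} lines; ${dropped} line(s) were dropped.`);
  return kept.join("\n");
}
```

Cutting from the end is the simplest policy, but it means the newest entries are the first to disappear if the index is append-only.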
Relevance Retrieval: AI‑in‑the‑Loop
When a user query arrives, the system builds a list of filename: description strings and calls the model with the system prompt RELEVANCE_SELECTOR_PROMPT and a JSON‑schema output format to obtain a list of relevant memory filenames. The call uses max_tokens: 256 and is marked with querySource: "memdir_relevance" so it does not count toward the user’s conversation turn.
async function selectRelevantMemories(query, memoryFiles, signal, alreadyLoaded) {
const fileDescriptions = memoryFiles
.filter(f => !alreadyLoaded.has(f.filename))
.map(f => `- ${f.filename}: ${f.description}`)
.join('\n');
const response = await callModel({
system: RELEVANCE_SELECTOR_PROMPT,
messages: [{ role: 'user', content: `Query: ${query}
Available memories:
${fileDescriptions}` }],
max_tokens: 256,
output_format: { type: 'json_schema', schema: { selected_memories: { type: 'array', items: { type: 'string' } } } },
querySource: 'memdir_relevance'
});
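// Keep only filenames that actually exist, in case the model returns an unknown name
const fileSet = new Set(memoryFiles.map(f => f.filename));
return response.selected_memories.filter(f => fileSet.has(f));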
}
Prefetch Optimization: Running Ahead of the Main Request
To avoid latency, memdir launches memory retrieval in parallel with the main request. The prefetch entry point Rh_ performs several checks before starting:
autoMemory flag (d5()) must be true.
Feature flag tengu_moth_copse must be enabled.
Query must be longer than a single token.
Session size must not exceed MAX_SESSION_BYTES.
If all checks pass, Rh_ returns a PrefetchHandle containing a promise for the memory results, a settledAt timestamp, a consumedOnIteration marker, and a [Symbol.dispose] method (explicit resource management, supported since TypeScript 5.2) to ensure proper cleanup.
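The gate checks and the handle can be sketched as follows. This is a minimal illustration under stated assumptions rather than the real Rh_: the PrefetchContext fields, the Retrieve callback, and startPrefetch are hypothetical names, the "longer than a single token" check is approximated by counting whitespace-separated words, and a plain dispose() method stands in for [Symbol.dispose] so the sketch does not depend on runtime support for that symbol.

```typescript
// Inputs to the gate; all field names are hypothetical stand-ins.
interface PrefetchContext {
  autoMemoryEnabled: boolean;  // stands in for d5()
  featureFlagEnabled: boolean; // stands in for the tengu_moth_copse check
  query: string;
  sessionBytes: number;
  maxSessionBytes: number;     // stands in for MAX_SESSION_BYTES
}

interface PrefetchHandle {
  promise: Promise<string[]>;
  settledAt: number | null;           // set once the promise settles
  consumedOnIteration: number | null; // set by the caller when results are used
  dispose(): void;                    // the article describes [Symbol.dispose] instead
}

type Retrieve = (query: string, signal: AbortSignal) => Promise<string[]>;

function startPrefetch(ctx: PrefetchContext, retrieve: Retrieve): PrefetchHandle | undefined {
  if (!ctx.autoMemoryEnabled || !ctx.featureFlagEnabled) return undefined;
  // Crude approximation of "longer than a single token": at least two words.
  if (ctx.query.trim().split(/\s+/).length < 2) return undefined;
  if (ctx.sessionBytes > ctx.maxSessionBytes) return undefined;

  const controller = new AbortController();
  const handle: PrefetchHandle = {
    promise: Promise.resolve<string[]>([]),
    settledAt: null,
    consumedOnIteration: null,
    dispose: () => controller.abort(), // cancels retrieval that is still in flight
  };
  // Start retrieval immediately; the caller awaits handle.promise only when needed.
  handle.promise = retrieve(ctx.query, controller.signal)
    .catch((): string[] => []) // a failed lookup degrades to "no memories"
    .then((files) => {
      handle.settledAt = Date.now();
      return files;
    });
  return handle;
}
```

A caller would start the prefetch before issuing the main request and check settledAt later to decide whether the results are ready to inject without waiting.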
Deduplication: Avoid Re‑reading the Same Memory
A Map<string, { content, timestamp }> tracks already loaded memories. The function Lh_ checks this map and only injects unseen memories, ensuring each file contributes tokens at most once per conversation.
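The deduplication step can be illustrated with a short sketch. The function name filterUnseen and the candidate shape are hypothetical; the article attributes this role to Lh_ without showing its signature.

```typescript
interface LoadedMemory {
  content: string;
  timestamp: number;
}

interface Candidate {
  filename: string;
  content: string;
}

// Returns only memories not yet seen in this conversation and records them as loaded.
function filterUnseen(
  loaded: Map<string, LoadedMemory>,
  candidates: Candidate[],
  now: number = Date.now(),
): Candidate[] {
  const fresh: Candidate[] = [];
  for (const candidate of candidates) {
    if (loaded.has(candidate.filename)) continue; // already injected earlier
    loaded.set(candidate.filename, { content: candidate.content, timestamp: now });
    fresh.push(candidate);
  }
  return fresh;
}
```

Keying the map on the filename means a memory edited mid-conversation is not re-injected; whether the real implementation refreshes changed files is not stated in the article.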
Four Precise Memory Types
user: stores user role, preferences, knowledge level. The description must state facts only, without negative evaluation.
feedback: stores corrections and acknowledgments. The entry must include Why: and How to apply: sections.
project: stores project dynamics, deadlines, decision context. Relative times must be converted to absolute timestamps.
reference: stores external resource pointers. Only the location is stored, not the content itself.
Code patterns, git history, debug fixes, content already in CLAUDE.md, and temporary task details are explicitly excluded. Both successful and failed outcomes must be recorded.
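Two of the type rules above lend themselves to a mechanical check, sketched below. The checkMemoryBody helper and the list of relative-time words are assumptions made for illustration; in the real system these guidelines are given to the model as writing instructions, and the article does not describe a validator.

```typescript
type MemoryType = "user" | "feedback" | "project" | "reference";

// Returns a list of guideline violations; an empty list means these checks pass.
// Only two rules are checked here; the others require judgment.
function checkMemoryBody(type: MemoryType, body: string): string[] {
  const problems: string[] = [];
  if (type === "feedback") {
    if (!body.includes("Why:")) {
      problems.push("feedback memory is missing a 'Why:' section");
    }
    if (!body.includes("How to apply:")) {
      problems.push("feedback memory is missing a 'How to apply:' section");
    }
  }
  if (type === "project") {
    // Heuristic word list (an assumption): flags relative time phrases.
    const relative = /\b(today|tomorrow|yesterday|next week|last week|this week)\b/i;
    if (relative.test(body)) {
      problems.push("project memory uses a relative time; use an absolute date");
    }
  }
  return problems;
}
```

For example, a project memory saying "merge freeze starts tomorrow" would be flagged, because the word "tomorrow" loses its meaning once the session that wrote it has ended.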
Team vs. Personal Memory Modes
Personal mode (default): storage path ~/.claude/projects/<hash>/memory/, index maintained manually via MEMORY.md, private to the user.
Team mode (feature flag tengu_moth_copse enabled): custom shared path, automatic directory scan replaces manual index, files can be git‑committed for team sharing.
Access Tracking and Hook Registration
memdir registers session‑file‑access hooks via registerSessionFileAccessHooks(). Reads, edits, and writes inside the memory directory emit telemetry events tengu_memdir_file_read, tengu_memdir_file_edit, and tengu_memdir_file_write, allowing usage monitoring.
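The mapping from file operations to telemetry events might look like the sketch below. The event names come from the article; the memdirEventFor helper and its plain string-prefix check are assumptions, since the signature of registerSessionFileAccessHooks() is not shown. The check assumes normalized POSIX-style paths and does not resolve ".." segments or symlinks.

```typescript
type FileOp = "read" | "edit" | "write";

// Event names as listed in the article.
const MEMDIR_EVENTS: Record<FileOp, string> = {
  read: "tengu_memdir_file_read",
  edit: "tengu_memdir_file_edit",
  write: "tengu_memdir_file_write",
};

// Returns the telemetry event for an operation, or undefined when the path
// lies outside the memory directory.
function memdirEventFor(op: FileOp, filePath: string, memoryDir: string): string | undefined {
  // Ensure a trailing slash so "/memory-other/" does not match "/memory".
  const root = memoryDir.endsWith("/") ? memoryDir : memoryDir + "/";
  return filePath.startsWith(root) ? MEMDIR_EVENTS[op] : undefined;
}
```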
Design Insights
File system as the primary database for AI. Files are directly manipulable by read/write/edit tools, making them first‑class citizens for the model.
AI‑driven relevance is more robust than keyword matching. A cheap 256‑token structured call yields semantic selection.
Separate index and content. A tiny MEMORY.md provides a fast map; heavy content loads on demand.
Record both failures and successes. Confirmation signals are quieter but equally important to prevent the model from becoming overly conservative.
Critical Perspective
Reliance on a single selectRelevantMemories call makes relevance a single point of failure; a failed API call returns an empty list, silently disabling memory.
The hard 200‑line limit can cause frequent truncation; a smarter strategy, such as sorting entries by mtimeMs so that the most recently modified ones survive, would degrade more gracefully.
Team mode lacks conflict resolution; concurrent edits to shared memory files may produce git merge conflicts.
No built‑in expiration mechanism; stale memories depend on the model’s self‑judgment.
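The mtimeMs idea mentioned above can be sketched as a recency-aware alternative to cutting the index at a fixed line. This is a proposal-level illustration, not existing memdir behavior; IndexEntry and keepMostRecent are hypothetical names, and mtimeMs is assumed to come from a file stat such as Node's fs.Stats.mtimeMs.

```typescript
interface IndexEntry {
  line: string;    // the full pointer line from MEMORY.md
  mtimeMs: number; // last-modified time of the target file, in milliseconds
}

// Keeps the `limit` most recently modified entries, preserving their original order.
function keepMostRecent(entries: IndexEntry[], limit: number): IndexEntry[] {
  if (entries.length <= limit) return entries.slice();
  const survivors = new Set(
    [...entries].sort((a, b) => b.mtimeMs - a.mtimeMs).slice(0, limit),
  );
  return entries.filter((entry) => survivors.has(entry));
}
```

Sorting by modification time also gives a crude form of expiration: memories that nobody has touched for a long time are the first to fall out of the index.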
Practical Recommendations
Use a markdown file per memory with front‑matter; the description field is crucial for retrieval quality.
Structure the relevance model’s output as { selected_memories: string[] } with max_tokens = 256 to keep the call cheap and the response easy to validate.
Keep the index under 100‑200 lines, evicting the least‑recently accessed entries.
Start memory retrieval in parallel with the main request and await the promise only when needed.
Conclusion
The memdir philosophy is: use the file system to store AI memories, let the AI decide relevance, and parallelize retrieval to hide latency. Six key takeaways are index/content separation, AI‑driven relevance, file‑system primacy, prefetch benefits, clear memory type boundaries, and the need for better expiration handling.
James' Growth Diary
I am James, focusing on AI Agent learning and growth. I continuously update two series: “AI Agent Mastery Path,” which systematically outlines core theories and practices of agents, and “Claude Code Design Philosophy,” which deeply analyzes the design thinking behind top AI tools. Helping you build a solid foundation in the AI era.
