Turning AI’s Short‑Term Memory into a Persistent Knowledge Base with memdir

This article examines Claude Code’s memdir system, explaining how it transforms fleeting AI conversation context into a durable, file‑based knowledge base by using markdown files as memories, a lightweight index, AI‑driven relevance selection, parallel prefetching, and careful type‑specific guidelines.

James' Growth Diary

Problem: Why Not Just Use CLAUDE.md?

CLAUDE.md stores static project conventions but cannot capture dynamic, session‑level knowledge such as user roles, feedback, project updates, or external references. It requires manual edits, lacks semantic search, and forces full‑context loading each time, causing the AI to start from scratch on every run.

memdir Design Overview

memdir stores four categories of information that should persist across sessions: user role/preferences, feedback, project dynamics, and reference pointers.

Source Code Location

The core implementation resides in dist/cli.js within the modules Pe, H9, and U7q, which map back to the source directory src/memdir/. Key functions and constants include:

// Path and toggle
YJ(): string          // memoized, returns ~/.claude/projects/<cwd-hash>/memory/
d5(): boolean         // checks if autoMemory is enabled (default true)

// Content loading
EBK(opts): string     // builds system‑prompt fragment (memory + writing guide)
D57(...): string[]    // builds instructions for "how to save memory"

// Core retrieval
selectRelevantMemories(query, files, signal, loaded): Promise<string[]>
Rh_(messages, deps): PrefetchHandle | undefined // prefetch entry point

// Important constants
const HD = "MEMORY.md"               // index file name (always fully loaded)
const $M6 = 200                       // max lines for MEMORY.md
const uC6 = 25000                     // max bytes per memory (~25KB)

Core Design: Files as Memories, Index as Map

Each memory is a separate markdown file; MEMORY.md acts as a lightweight index. The directory layout looks like:

~/.claude/projects/<cwd-hash>/memory/
├── MEMORY.md               ← always fully loaded
├── user_role.md            ← user role memory
├── feedback_no_mock.md     ← feedback memory
├── project_merge_freeze.md ← project dynamics
└── ref_linear_ingest.md    ← external reference

Every memory file uses a front‑matter block where the description field serves as the "eye" for relevance filtering; only the filename and description are considered during search.
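To make the role of the description field concrete, here is a minimal sketch of reading a memory file's front-matter. The article only confirms that a description field exists; the name and type fields, the parseMemoryHeader helper, and the sample file contents are assumptions for illustration, and the parser handles flat key: value lines only, not full YAML.

```typescript
// Hypothetical shape of a parsed memory header. Only `description` is
// confirmed by the article; `name` and `type` are illustrative assumptions.
interface MemoryHeader {
  filename: string;
  name: string | undefined;
  description: string;
  type: string | undefined;
}

// Parse a front-matter block delimited by `---` lines.
// Handles flat, single-line `key: value` pairs only (not a YAML parser).
function parseMemoryHeader(filename: string, text: string): MemoryHeader | undefined {
  const lines = text.split(/\r?\n/);
  if (lines[0]?.trim() !== "---") return undefined;
  const end = lines.findIndex((line, i) => i > 0 && line.trim() === "---");
  if (end === -1) return undefined;

  const fields = new Map<string, string>();
  for (const line of lines.slice(1, end)) {
    const sep = line.indexOf(":");
    if (sep === -1) continue;
    fields.set(line.slice(0, sep).trim(), line.slice(sep + 1).trim());
  }

  // Without a description the file cannot take part in relevance selection.
  const description = fields.get("description");
  if (!description) return undefined;
  return { filename, name: fields.get("name"), description, type: fields.get("type") };
}

// Hypothetical file contents for feedback_no_mock.md.
const sample = [
  "---",
  "name: No mocks in integration tests",
  "description: User wants integration tests to hit a real database instead of mocks",
  "type: feedback",
  "---",
  "",
  "Why: mocked tests passed while the real migration failed.",
  "How to apply: run integration tests against a real database.",
].join("\n");

const header = parseMemoryHeader("feedback_no_mock.md", sample);
```

Because only the filename and description reach the relevance step, a memory whose description is missing or vague is effectively invisible, which is why this sketch rejects such files outright.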

MEMORY.md Constraints

Line limit of 200 (constant $M6). Exceeding this inserts a warning and truncates the index.

Entries must be single‑line pointers of the form - [Title](file.md) — description; no front‑matter or full content is allowed.
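The two constraints above can be sketched as a small loader. This is an illustration under assumptions, not the actual implementation: the article does not say which lines are dropped or where the warning is placed, so this version keeps the first 200 lines and appends the warning at the end. The loadIndex helper, the warning text, and the entry pattern are all hypothetical.

```typescript
const MAX_INDEX_LINES = 200; // mirrors the 200-line limit described above

// Assumed pattern for "- [Title](file.md) <dash> description".
// \u2014 is the dash character used in the entry format.
const INDEX_ENTRY = /^- \[[^\]]+\]\([^)\s]+\.md\) \u2014 .+$/;

interface IndexLoadResult {
  text: string;        // what would be placed into the prompt
  truncated: boolean;  // true when the index exceeded the limit
  malformed: string[]; // non-blank lines that are not single-line pointers
}

function loadIndex(raw: string, maxLines: number = MAX_INDEX_LINES): IndexLoadResult {
  const lines = raw.split(/\r?\n/);
  const truncated = lines.length > maxLines;
  // Assumption: keep the first maxLines lines and drop the rest.
  const kept = truncated ? lines.slice(0, maxLines) : lines;
  const malformed = kept.filter((l) => l.trim() !== "" && !INDEX_ENTRY.test(l));
  const warning = `WARNING: MEMORY.md exceeds ${maxLines} lines; only the first ${maxLines} were loaded.`;
  const text = truncated ? `${kept.join("\n")}\n\n${warning}` : kept.join("\n");
  return { text, truncated, malformed };
}

// Usage: an oversized index with 201 well-formed entries.
const oversized = Array.from(
  { length: 201 },
  (_, i) => `- [Memory ${i}](memory_${i}.md) \u2014 placeholder description ${i}`,
).join("\n");
const indexResult = loadIndex(oversized);
```

Reporting malformed lines separately from truncation keeps the two failure modes distinguishable: one means the index is too long, the other means full content has leaked into what should be a list of pointers.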

Relevance Retrieval: AI‑in‑the‑Loop

When a user query arrives, the system builds a list of filename: description strings and calls the model with a system prompt (RELEVANCE_SELECTOR_PROMPT) and a JSON‑schema output format to obtain a list of relevant memory filenames. The call uses max_tokens: 256 and is marked with querySource: "memdir_relevance" so it does not count toward the user’s conversation turn.

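async function selectRelevantMemories(query, memoryFiles, signal, alreadyLoaded) {
  // Skip memories that were already injected earlier in the conversation.
  const candidates = memoryFiles.filter(f => !alreadyLoaded.has(f.filename));
  // Filenames the model may legitimately return (assumed to be the candidate set).
  const fileSet = new Set(candidates.map(f => f.filename));
  const fileDescriptions = candidates
    .map(f => `- ${f.filename}: ${f.description}`)
    .join('\n');

  // Note: `signal` is not forwarded to callModel in this excerpt.
  const response = await callModel({
    system: RELEVANCE_SELECTOR_PROMPT,
    messages: [{ role: 'user', content: `Query: ${query}\n\nAvailable memories:\n${fileDescriptions}` }],
    max_tokens: 256,
    output_format: { type: 'json_schema', schema: { selected_memories: { type: 'array', items: { type: 'string' } } } },
    querySource: 'memdir_relevance'
  });
  // Discard any filename the model invented.
  return response.selected_memories.filter(f => fileSet.has(f));
}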

Prefetch Optimization: Running Ahead of the Main Request

To avoid latency, memdir launches memory retrieval in parallel with the main request. The prefetch entry point Rh_ performs several checks before starting:

autoMemory flag (d5()) must be true.

Feature flag tengu_moth_copse must be enabled.

Query must be longer than a single token.

Session size must not exceed MAX_SESSION_BYTES.

If all checks pass, Rh_ returns a PrefetchHandle containing a promise for the memory results, a settledAt timestamp, a consumedOnIteration marker, and a [Symbol.dispose] method (the explicit resource management feature supported since TypeScript 5.2) to ensure proper cleanup.
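The gating checks and the handle shape can be sketched as follows. This is a simplified illustration, not the code behind Rh_: the PrefetchDeps fields and the startPrefetch name are hypothetical, the "single token" check is approximated by counting whitespace-separated words, and a plain dispose() method stands in for [Symbol.dispose] so the sketch also runs on runtimes without explicit resource management.

```typescript
interface PrefetchHandle {
  promise: Promise<string[]>;              // resolves to selected memory filenames
  settledAt: number | undefined;           // set once the promise settles
  consumedOnIteration: number | undefined; // set by the caller when results are used
  dispose(): void;                         // stand-in for [Symbol.dispose]
}

// Hypothetical dependency bundle; each field stands in for a check named above.
interface PrefetchDeps {
  autoMemoryEnabled: boolean;  // stands in for d5()
  featureFlagEnabled: boolean; // stands in for the tengu_moth_copse check
  sessionBytes: number;
  maxSessionBytes: number;     // stands in for MAX_SESSION_BYTES (value unknown)
  retrieve: (query: string, signal: AbortSignal) => Promise<string[]>;
}

function startPrefetch(query: string, deps: PrefetchDeps): PrefetchHandle | undefined {
  if (!deps.autoMemoryEnabled) return undefined;
  if (!deps.featureFlagEnabled) return undefined;
  // Approximation: treat whitespace-separated words as tokens.
  if (query.trim().split(/\s+/).length <= 1) return undefined;
  if (deps.sessionBytes > deps.maxSessionBytes) return undefined;

  const controller = new AbortController();
  const handle: PrefetchHandle = {
    promise: Promise.resolve([]),
    settledAt: undefined,
    consumedOnIteration: undefined,
    dispose: () => controller.abort(), // cancel retrieval if it is no longer needed
  };
  // Start retrieval immediately; the caller awaits handle.promise only when needed.
  handle.promise = deps.retrieve(query, controller.signal).then(
    (files) => { handle.settledAt = Date.now(); return files; },
    () => { handle.settledAt = Date.now(); return [] as string[]; },
  );
  return handle;
}

// Usage with a stubbed retrieval function.
const prefetchDeps: PrefetchDeps = {
  autoMemoryEnabled: true,
  featureFlagEnabled: true,
  sessionBytes: 1024,
  maxSessionBytes: 1024 * 1024, // hypothetical limit
  retrieve: (_query, _signal) => Promise.resolve(["user_role.md"]),
};
const prefetchHandle = startPrefetch("how should I write integration tests", prefetchDeps);
```

Returning undefined rather than an already-resolved handle lets the caller skip the memory step entirely when any check fails.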

Deduplication: Avoid Re‑reading the Same Memory

A Map<string, { content, timestamp }> tracks already loaded memories. The function Lh_ checks this map and only injects unseen memories, ensuring each file contributes tokens at most once per conversation.
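A minimal sketch of this deduplication step, assuming the map is keyed by filename. The injectUnseen name and the candidate shape are hypothetical; only the Map<string, { content, timestamp }> structure comes from the description above.

```typescript
interface LoadedMemory {
  content: string;
  timestamp: number;
}

// Returns the filenames that were newly injected and records them in `loaded`
// so that later calls skip them.
function injectUnseen(
  loaded: Map<string, LoadedMemory>,
  candidates: Array<{ filename: string; content: string }>,
  now: number = Date.now(),
): string[] {
  const injected: string[] = [];
  for (const { filename, content } of candidates) {
    if (loaded.has(filename)) continue; // already contributed tokens in this conversation
    loaded.set(filename, { content, timestamp: now });
    injected.push(filename);
  }
  return injected;
}

// Usage: the second call only injects the memory that was not seen before.
const loadedMemories = new Map<string, LoadedMemory>();
const roleMemory = { filename: "user_role.md", content: "hypothetical role notes" };
const feedbackMemory = { filename: "feedback_no_mock.md", content: "hypothetical feedback notes" };
const firstPass = injectUnseen(loadedMemories, [roleMemory]);
const secondPass = injectUnseen(loadedMemories, [roleMemory, feedbackMemory]);
```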

Four Precise Memory Types

user: stores user role, preferences, knowledge level. The description must state facts only, without negative evaluation.

feedback: stores corrections and acknowledgments. The entry must include Why: and How to apply: sections.

project: stores project dynamics, deadlines, decision context. Relative times must be converted to absolute timestamps.

reference: stores external resource pointers. Only the location is stored, not the content itself.

Code patterns, git history, debug fixes, content already in CLAUDE.md, and temporary task details are explicitly excluded. Both successful and failed outcomes must be recorded.
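Some of these rules are mechanical enough to check before a memory is saved. The sketch below is a hypothetical linter, not part of memdir, which expresses these rules as writing guidance to the model. It verifies the memory type, the two required sections of a feedback memory, and a small illustrative list of relative-time words in project memories.

```typescript
type MemoryType = "user" | "feedback" | "project" | "reference";
const MEMORY_TYPES: readonly string[] = ["user", "feedback", "project", "reference"];

// Assumed, deliberately incomplete list of relative-time phrases.
const RELATIVE_TIME = /\b(today|tomorrow|yesterday|next week|last week)\b/i;

// Returns a list of problems; an empty list means the memory passes these checks.
function checkMemory(type: string, body: string): string[] {
  const problems: string[] = [];
  if (MEMORY_TYPES.indexOf(type) === -1) {
    problems.push(`unknown memory type: ${type}`);
    return problems;
  }
  if (type === "feedback") {
    if (!/^Why:/m.test(body)) problems.push("feedback memory is missing a Why: section");
    if (!/^How to apply:/m.test(body)) problems.push("feedback memory is missing a How to apply: section");
  }
  if (type === "project" && RELATIVE_TIME.test(body)) {
    problems.push("project memory contains a relative time; use an absolute date");
  }
  return problems;
}

// Usage with hypothetical memory bodies.
const goodFeedback = checkMemory(
  "feedback",
  "Why: mocked tests hid a failing migration.\nHow to apply: run integration tests against a real database.",
);
const badFeedback = checkMemory("feedback", "Use a real database.");
const badProject = checkMemory("project", "Merge freeze starts tomorrow.");
```

The relative-time rule matters because a memory is read in later sessions, where "tomorrow" no longer refers to the day the author meant.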

Team vs. Personal Memory Modes

Personal mode (default) : storage path ~/.claude/projects/<hash>/memory/, index maintained manually via MEMORY.md, private to the user.

Team mode (feature flag tengu_moth_copse enabled) : custom shared path, automatic directory scan replaces manual index, files can be git‑committed for team sharing.

Access Tracking and Hook Registration

memdir registers session‑file‑access hooks via registerSessionFileAccessHooks(). Reads, edits, and writes inside the memory directory emit telemetry events tengu_memdir_file_read, tengu_memdir_file_edit, and tengu_memdir_file_write, allowing usage monitoring.
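As an illustration of what such a hook has to decide, the sketch below maps a file operation to one of the three event names when the path lies inside the memory directory. The memdirEventFor helper is hypothetical and does not reflect the signature of registerSessionFileAccessHooks, which the article does not show. It assumes absolute POSIX-style paths that are already normalized (no ".." segments).

```typescript
type FileOp = "read" | "edit" | "write";

// Event names as listed above.
const EVENT_BY_OP: Record<FileOp, string> = {
  read: "tengu_memdir_file_read",
  edit: "tengu_memdir_file_edit",
  write: "tengu_memdir_file_write",
};

// Returns the telemetry event name, or undefined when the file is outside
// the memory directory and should not be tracked.
function memdirEventFor(memoryDir: string, filePath: string, op: FileOp): string | undefined {
  // A trailing separator prevents "/memory_backup" from matching "/memory".
  const root = memoryDir.endsWith("/") ? memoryDir : memoryDir + "/";
  if (!filePath.startsWith(root)) return undefined;
  return EVENT_BY_OP[op];
}

// Usage with a hypothetical project directory.
const memoryRoot = "/home/u/.claude/projects/abc/memory";
const insideEvent = memdirEventFor(memoryRoot, memoryRoot + "/user_role.md", "read");
const outsideEvent = memdirEventFor(memoryRoot, "/home/u/.claude/projects/abc/memory_backup/x.md", "write");
```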

Design Insights

File system as the primary database for AI. Files are directly manipulable by read/write/edit tools, making them first‑class citizens for the model.

AI‑driven relevance is more robust than keyword matching. A cheap 256‑token structured call yields semantic selection.

Separate index and content. A tiny MEMORY.md provides a fast map; heavy content loads on demand.

Record both failures and successes. Confirmation signals are quieter but equally important to prevent the model from becoming overly conservative.

Critical Perspective

Reliance on a single selectRelevantMemories call makes relevance a single point of failure; a failed API call returns an empty list, silently disabling memory.

The hard 200‑line limit can cause frequent truncation; a smarter strategy, such as sorting entries by mtimeMs and keeping the most recently modified ones, would degrade more gracefully.

Team mode lacks conflict resolution; concurrent edits to shared memory files may cause git merge conflicts.

No built‑in expiration mechanism; stale memories depend on the model’s self‑judgment.

Practical Recommendations

Use a markdown file per memory with front‑matter; the description field is crucial for retrieval quality.

Structure the relevance model’s output as a small JSON object such as { selected_memories: string[] } and cap the call at max_tokens = 256 to keep cost low.

Keep the index under 100‑200 lines, evicting the least‑recently accessed entries.

Start memory retrieval in parallel with the main request and await the promise only when needed.
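For the index-size recommendation, a least-recently-accessed eviction pass might look like the sketch below. The IndexEntry shape and evictToLimit name are hypothetical, and where lastAccessedMs comes from is left open; a file's access or modification time from the file system is one possible source.

```typescript
interface IndexEntry {
  line: string;           // the single-line pointer stored in the index
  lastAccessedMs: number; // assumed to come from file metadata or a usage log
}

interface EvictionResult {
  kept: IndexEntry[];    // survivors, in their original index order
  evicted: IndexEntry[]; // least-recently accessed entries that were dropped
}

function evictToLimit(entries: IndexEntry[], maxLines: number): EvictionResult {
  if (entries.length <= maxLines) return { kept: entries.slice(), evicted: [] };
  // Rank a copy by recency (newest first) without disturbing the index order.
  const byRecency = entries.slice().sort((a, b) => b.lastAccessedMs - a.lastAccessedMs);
  const survivors = new Set(byRecency.slice(0, maxLines));
  return {
    kept: entries.filter((e) => survivors.has(e)),
    evicted: entries.filter((e) => !survivors.has(e)),
  };
}

// Usage: with a limit of 2, the entry accessed longest ago is evicted.
const eviction = evictToLimit(
  [
    { line: "- [A](a.md)", lastAccessedMs: 30 },
    { line: "- [B](b.md)", lastAccessedMs: 10 },
    { line: "- [C](c.md)", lastAccessedMs: 20 },
  ],
  2,
);
```

Evicting an index entry does not have to delete the memory file itself; the file can remain on disk and be re-indexed later if it becomes relevant again.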

Conclusion

The memdir philosophy is: use the file system to store AI memories, let the AI decide relevance, and parallelize retrieval to hide latency. Six key takeaways are index/content separation, AI‑driven relevance, file‑system primacy, prefetch benefits, clear memory type boundaries, and the need for better expiration handling.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
