How to Classify and Manage Agent Memories for Better Retrieval
This article dissects Claude Code's memory system, explains why unstructured memory degrades performance, introduces four distinct memory types with concrete examples and schema, shows how to handle expiration and retrieval strategies, and provides step‑by‑step implementation code to improve agent reliability.
1. Unstructured memory is a heap, not true memory
Many developers start with a few key facts, then keep appending every user utterance, ending up with hundreds of entries after a few months. The author identifies three problems: (1) retrieval noise grows because unrelated facts share the same vector space, (2) stale information pollutes decisions, e.g., a deadline from three months ago is still used, and (3) duplicate entries waste resources. All three stem from the lack of classification.
2. Claude Code's four memory categories
Claude Code defines four types in sections.py:183-246 and stores the type in the YAML front‑matter of each memory file.
---
name: feedback_no_mock
description: Do not mock the database in tests
type: feedback  # the type field
---
The four types are user, feedback, project, and reference. Each type has its own design for when to write it, how to use it, and how long it is expected to live.
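To make the front-matter convention concrete, here is a minimal sketch of reading the type field from a memory file. It is not Claude Code's actual parser, which the article does not show. The names parse_front_matter, memory_type_of, and VALID_TYPES are hypothetical, and the parser only handles flat `key: value` lines; real YAML would need a proper YAML library.

```python
VALID_TYPES = {"user", "feedback", "project", "reference"}


def parse_front_matter(text: str) -> tuple[dict[str, str], str]:
    """Split a memory file into (metadata, body).

    Assumes the file starts with a '---' line, followed by flat
    'key: value' lines, followed by a closing '---' line.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta: dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Everything after the closing delimiter is the memory body.
            body = "\n".join(lines[i + 1:]).strip()
            return meta, body
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    # No closing delimiter found: treat the file as having no front matter.
    return {}, text


def memory_type_of(text: str) -> str:
    """Return the declared memory type, rejecting unknown or missing values."""
    meta, _ = parse_front_matter(text)
    mtype = meta.get("type", "")
    if mtype not in VALID_TYPES:
        raise ValueError(f"unknown or missing memory type: {mtype!r}")
    return mtype
```

Rejecting files with a missing or unknown type, rather than defaulting silently, keeps unclassified entries from slipping back into the store.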
User type: Who the person is
The user type stores long‑term personal attributes such as profession, technical depth, interests, and goals. Example entries:
"User is a data scientist interested in observability and logging"
"User has ten years of Go experience and is new to the React part of the project"
"User has strong LLM background but cares about system reliability, so we can compare to distributed‑system concepts"
These memories are always active; they are loaded at the start of every session because the model must know the user's identity to adjust tone and depth. Their lifespan is effectively indefinite unless the user changes jobs.
Write rule: Save only information that continuously influences Claude's answers, not every trivial detail. Negative list: Do not store negative judgments or transient facts like "User is debugging a bug today".
Feedback type: User‑issued constraints
Feedback memories have the highest priority. They capture explicit corrections, rules, and behavioral constraints.
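---
name: feedback_no_mock
description: Do not mock the database in tests
type: feedback
---
Do not mock the database. Last quarter, a discrepancy between the mock and production caused a serious migration failure.
Integration tests must hit a real database; mocks are not accepted.
**Why:** Last quarter, differences between the mock and production masked a migration failure and caused a production incident.
**How to apply:** Every test that touches the database, whether unit or integration, must hit a real database.
The author stresses adding a Why field (the reason behind the rule) and a How to apply field (when the rule applies) so the model can reason about edge cases, e.g., "Is mocking a pure-frontend UI component a violation?".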
Feedback memories stay valid until the user explicitly revokes them.
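A write-time check can enforce that every feedback memory carries both fields. The sketch below assumes each field sits on a single line starting with `**Why:**` or `**How to apply:**`, as in the example; extract_fields and missing_fields are hypothetical names, not part of Claude Code.

```python
import re

# Matches lines such as "**Why:** some reason" or "**How to apply:** some scope".
FIELD_PATTERN = re.compile(r"^\*\*(Why|How to apply):\*\*\s*(.*)$")


def extract_fields(body: str) -> dict[str, str]:
    """Split a memory body into the rule text, the Why, and the How to apply."""
    fields = {"rule": "", "why": "", "how_to_apply": ""}
    rule_lines = []
    for raw_line in body.splitlines():
        line = raw_line.strip()
        match = FIELD_PATTERN.match(line)
        if match:
            key = "why" if match.group(1) == "Why" else "how_to_apply"
            fields[key] = match.group(2).strip()
        elif line:
            # Any other non-empty line is treated as part of the rule itself.
            rule_lines.append(line)
    fields["rule"] = " ".join(rule_lines)
    return fields


def missing_fields(body: str) -> list[str]:
    """Return the names of required fields that are absent or empty."""
    fields = extract_fields(body)
    return [name for name in ("why", "how_to_apply") if not fields[name]]
```

A writer could refuse to save a feedback memory while missing_fields returns a non-empty list, and prompt for the reason or scope instead.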
Project type: Current work context
The project type records ongoing decisions, constraints, and milestones, and it is time‑sensitive.
"Merge freeze starts on 2026‑03‑05 (Thursday); the mobile team moves to the release branch"
"Auth middleware rewrite is required because the legal team found a compliance issue; compliance overrides technical elegance"
"Current sprint focuses on the retrieval module; do not modify parsing code"
Because these facts can become obsolete quickly, the system converts relative dates to absolute dates. For example, a user says on 2026‑04‑15, "We need the retrieval module done by next Thursday." Storing it as "by next Thursday" would be ambiguous later; instead it is stored as:
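Project requirement: the retrieval module must be working by 2026-04-23 (Thursday)
**Why:** This is the core milestone of this iteration, and the product side needs it for a demo.
**How to apply:** When prioritizing any change, the first consideration is whether it affects the retrieval module milestone.
Reference type: Where to find external information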
The reference type stores locations of external systems or documents.
"pipeline bug is tracked in Linear project INGEST"
"API latency dashboard lives at grafana.internal/d/api-latency; check it when modifying request handling"
"User feedback is organized in Notion's Feedback database"
When Claude encounters a question that requires external data, it can retrieve the appropriate reference without asking the user.
3. What should NOT be stored
The author lists several categories that belong in the negative list: code style and architecture decisions (they are already visible in the code), Git history and recent changes (authoritative source is git log or git blame), debugging processes (store the root cause as a project entry instead), transient task details, and content already covered by CLAUDE.md.
4. Retrieval strategy differences
Each memory type uses a different retrieval pattern:
- user: loaded in full at session start; no semantic search needed.
- feedback: trigger-based retrieval; before performing an action, Claude checks for relevant feedback rules.
- project: task-related retrieval plus an expiration check; only memories whose absolute deadline has not passed are considered.
- reference: on-demand retrieval when an external lookup is required.
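The four patterns can be sketched as a single routing function. This is an illustration under stated assumptions, not the author's implementation: MemoryRecord, matches, and select_memories are hypothetical names, the word-overlap test stands in for whatever relevance search a real agent uses, and the ordering after feedback is an arbitrary choice.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class MemoryRecord:
    content: str
    memory_type: str  # "user", "feedback", "project", or "reference"
    expires_at: Optional[datetime] = None


def matches(memory: MemoryRecord, query: str) -> bool:
    """Placeholder relevance test: true if the texts share any lowercase word."""
    query_words = set(query.lower().split())
    return bool(query_words & set(memory.content.lower().split()))


def select_memories(
    memories: list[MemoryRecord],
    task: str,
    now: datetime,
    needs_external_lookup: bool = False,
) -> list[MemoryRecord]:
    """Pick the memories to load for a task, using one rule per type."""
    selected = []
    for m in memories:
        if m.memory_type == "user":
            selected.append(m)  # always loaded, no search
        elif m.memory_type == "feedback":
            if matches(m, task):  # checked before acting on the task
                selected.append(m)
        elif m.memory_type == "project":
            expired = m.expires_at is not None and m.expires_at < now
            if not expired and matches(m, task):
                selected.append(m)
        elif m.memory_type == "reference":
            if needs_external_lookup and matches(m, task):
                selected.append(m)
    # Feedback comes first because it has the highest priority; the order of
    # the remaining types is an assumption made for this sketch.
    priority = {"feedback": 0, "user": 1, "project": 2, "reference": 3}
    return sorted(selected, key=lambda m: priority[m.memory_type])
```

Keeping the per-type rules in one place makes it easy to see that only project memories are subject to the expiration check.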
5. Practical steps to adopt the design
1️⃣ Identify the information categories in your own agent.
2️⃣ Define a schema for each type (the author shows a Python dataclass with fields like expires_at, why, how_to_apply, location).
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional

class MemoryType(Enum):
    USER = "user"
    FEEDBACK = "feedback"
    PROJECT = "project"
    REFERENCE = "reference"

@dataclass
class Memory:
    id: str
    content: str
    memory_type: MemoryType
    description: str
    created_at: datetime
    expires_at: Optional[datetime] = None    # required for project
    why: Optional[str] = None                # required for feedback/project
    how_to_apply: Optional[str] = None       # required for feedback/project
    location: Optional[str] = None           # required for reference
    last_verified: Optional[datetime] = None

3️⃣ Implement classification logic that forces the LLM to pick a type instead of guessing.
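Rules for deciding the memory type:
- user: long-term traits of the user themselves (role, professional background, preferences)
- feedback: rules or corrections given by the user ("Don't ..." / "Our rule is ...")
- project: the state, constraints, and decisions of the current work (time-sensitive; must use absolute dates)
- reference: location information for external systems
When unsure, prefer project over user.
Project memories expire, so they will not pollute the memory store in the long run.

4️⃣ Periodically purge expired project memories.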
from pathlib import Path

# load_all_memories and log_expired are helper functions the article does not show.
def load_active_memories(memory_dir: Path) -> list[Memory]:
    memories = load_all_memories(memory_dir)
    now = datetime.now()
    active = []
    expired_ids = []
    for m in memories:
        # Only project memories expire; skip any whose deadline has passed.
        if m.memory_type == MemoryType.PROJECT:
            if m.expires_at and m.expires_at < now:
                expired_ids.append(m.id)
                continue
        active.append(m)
    # Expired entries are logged here rather than deleted from disk.
    if expired_ids:
        log_expired(expired_ids)
    return active

In the author's own project, applying this classification reduced the memory store size by about 40% and improved retrieval accuracy, because user memories are always loaded and feedback memories have the highest priority.
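The expiration check only works if expires_at holds an absolute date, which is why project memories convert phrases like "next Thursday" when they are written. Here is a minimal sketch of that conversion, assuming "next <weekday>" means that weekday in the calendar week after the utterance, with weeks starting on Monday. Other readings are possible and an agent may need to confirm with the user. resolve_next_weekday and format_deadline are hypothetical names.

```python
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def resolve_next_weekday(spoken_on: date, weekday_name: str) -> date:
    """Resolve 'next <weekday>' relative to the day the user said it.

    Assumption: the phrase refers to the named weekday in the week
    following spoken_on, where weeks run Monday to Sunday.
    """
    target = WEEKDAYS.index(weekday_name.lower())
    # date.weekday() is 0 for Monday, so this lands on next week's Monday.
    start_of_next_week = spoken_on + timedelta(days=7 - spoken_on.weekday())
    return start_of_next_week + timedelta(days=target)


def format_deadline(deadline: date) -> str:
    """Render a date as 'YYYY-MM-DD (Weekday)' for storage in a memory."""
    weekday = WEEKDAYS[deadline.weekday()].capitalize()
    return f"{deadline.isoformat()} ({weekday})"
```

For instance, if a user said "next Thursday" on 2026-02-25 (a hypothetical utterance date), format_deadline(resolve_next_weekday(date(2026, 2, 25), "Thursday")) yields "2026-03-05 (Thursday)". Storing the weekday next to the date gives a later reader a cheap consistency check.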
6. Interview answer outline
When asked about memory design, the candidate should (1) state the four-type classification in ~20 seconds, (2) explain the expiration handling for project memories in ~15 seconds, (3) mention what should not be stored in ~15 seconds, and (4) cite the roughly 40% size reduction and the accuracy improvement as concrete results in the final ~20 seconds.
7. Closing
This is the third part of the Claude Code Memory series: the first covered "what to store", the second "when to store", and this article focuses on "what type to store". Understanding the three layers (classification, timing, and lifespan) completes a robust memory design.
Wu Shixiong's Large Model Academy
