How to Classify and Manage Agent Memories for Better Retrieval

This article dissects Claude Code's memory system, explains why unstructured memory degrades retrieval, introduces four distinct memory types with concrete examples and a schema, shows how to handle expiration and per-type retrieval, and provides step‑by‑step implementation code to improve agent reliability.

Wu Shixiong's Large Model Academy

Apr 22, 2026

1. Unstructured memory is a heap, not true memory

Many developers start with a few key facts, then keep appending every user utterance, ending up with hundreds of entries after a few months. The author identifies three problems: (1) retrieval noise grows because unrelated facts share the same vector space, (2) stale information pollutes decisions, e.g., a deadline from three months ago is still used, and (3) duplicate entries waste resources. All three stem from the lack of classification.

2. Claude Code's four memory categories

Claude Code defines four types in sections.py:183-246 and stores the type in the YAML front‑matter of each memory file.

---
 name: feedback_no_mock
 description: Do not mock the database in tests
 type: feedback   ← type field
---

The four types are user, feedback, project, and reference. Each type has its own "when to write, how to use, expected lifespan" design.
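Reading the type back out of a memory file can be sketched as follows. This is a minimal illustration that assumes the front‑matter contains only flat `key: value` lines; `parse_memory_file` and `VALID_TYPES` are hypothetical names, not part of Claude Code, and a real implementation would more likely use a YAML library.

```python
from typing import Dict, Tuple

# The four type names come from the article; the constant name is an assumption.
VALID_TYPES = {"user", "feedback", "project", "reference"}


def parse_memory_file(text: str) -> Tuple[Dict[str, str], str]:
    """Split a memory file into (front-matter fields, body).

    Only flat `key: value` lines are handled; this is not a full YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---' front-matter delimiter")

    fields: Dict[str, str] = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Everything after the closing delimiter is the memory body.
            body = "\n".join(lines[index + 1:]).strip()
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    else:
        raise ValueError("missing closing '---' front-matter delimiter")

    if fields.get("type") not in VALID_TYPES:
        raise ValueError(f"unknown memory type: {fields.get('type')!r}")
    return fields, body


example = """---
name: feedback_no_mock
description: Do not mock the database in tests
type: feedback
---
Integration tests must hit a real database.
"""
fields, body = parse_memory_file(example)
print(fields["type"])
```

Rejecting unknown types at parse time keeps unclassified entries out of the store.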

User type: Who the person is

The user type stores long‑term personal attributes such as profession, technical depth, interests, and goals. Example entries:

"User is a data scientist interested in observability and logging"

"User has ten years of Go experience and is new to the React part of the project"

"User has strong LLM background but cares about system reliability, so we can compare to distributed‑system concepts"

These memories are always active; they are loaded at the start of every session because the model must know the user's identity to adjust tone and depth. Their lifespan is effectively indefinite unless the user changes jobs.

Write rule: Save only information that continuously influences Claude's answers, not every trivial detail. Negative list: Do not store negative judgments or transient facts like "User is debugging a bug today".

Feedback type: User‑issued constraints

Feedback memories have the highest priority. They capture explicit corrections, rules, and behavioral constraints.

---
 name: feedback_no_mock
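 description: Do not mock the database in tests
 type: feedback
---

Do not mock the database. Last quarter, a mismatch between the mock and the production environment caused a serious migration failure.
Integration tests must hit a real database; mocks are not accepted.

**Why:** Last quarter, differences between the mock and production masked a migration failure and caused a production incident.

**How to apply:** Every test that touches the database, whether unit or integration, must hit a real database.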

The author stresses the importance of adding a Why field (reason) and a How to apply field (when to use) so the model can reason about edge cases, e.g., "Is mocking a pure‑frontend UI component a violation?".

Feedback memories stay valid until the user explicitly revokes them.
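One way to make the Why and How to apply fields mandatory is to generate feedback files through a small helper that refuses to write without them. The sketch below is an assumption about how that could look; `render_feedback_memory` is a hypothetical function, and the file layout simply mirrors the example above.

```python
def render_feedback_memory(name: str, description: str, rule: str,
                           why: str, how_to_apply: str) -> str:
    """Render a feedback memory file, refusing empty rule/why/how fields."""
    for label, value in (("rule", rule), ("why", why),
                         ("how_to_apply", how_to_apply)):
        if not value.strip():
            raise ValueError(f"feedback memory needs a non-empty {label}")
    return (
        "---\n"
        f"name: {name}\n"
        f"description: {description}\n"
        "type: feedback\n"
        "---\n\n"
        f"{rule.strip()}\n\n"
        f"**Why:** {why.strip()}\n\n"
        f"**How to apply:** {how_to_apply.strip()}\n"
    )


text = render_feedback_memory(
    name="feedback_no_mock",
    description="Do not mock the database in tests",
    rule="Do not mock the database.",
    why="A mock/production mismatch masked a migration failure.",
    how_to_apply="Every test that touches the database must hit a real one.",
)
print(text)
```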

Project type: Current work context

The project type records ongoing decisions, constraints, and milestones, and it is time‑sensitive.

"Merge freeze starts on 2026‑03‑05 (Thursday); the mobile team moves to the release branch"

"Auth middleware rewrite is required because the legal team found a compliance issue; compliance overrides technical elegance"

"Current sprint focuses on the retrieval module; do not modify parsing code"

Because these facts can become obsolete quickly, the system converts relative dates to absolute dates. For example, a user says on 2026‑04‑15, "We need the retrieval module done by next Thursday." Storing it as "by next Thursday" would be ambiguous later; instead it is stored as:

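Project requirement: the retrieval module must be working by 2026-04-23 (Thursday)

**Why:** This is the core milestone of the current iteration, and the product side needs it for a demo.

**How to apply:** When prioritizing any change, consider first whether it affects the retrieval module milestone.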

Figure: Project memory, relative vs absolute dates
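The relative-to-absolute conversion can be sketched as below. "Next Thursday" is ambiguous in everyday speech; this sketch assumes it means the Thursday of the following calendar week, with weeks starting on Monday. Under the other common reading (the nearest upcoming Thursday) the result would differ, so a real agent may need to confirm with the user. `resolve_next_weekday` and `format_absolute` are hypothetical helper names.

```python
from datetime import date, timedelta

WEEKDAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"]


def resolve_next_weekday(spoken_on: date, weekday_name: str) -> date:
    """Resolve 'next <weekday>' to that weekday in the week after spoken_on.

    Assumption: weeks start on Monday and 'next' means the following week.
    """
    target = WEEKDAY_NAMES.index(weekday_name.capitalize())
    monday_this_week = spoken_on - timedelta(days=spoken_on.weekday())
    return monday_this_week + timedelta(days=7 + target)


def format_absolute(day: date) -> str:
    """Format a date with its weekday so a mismatch is easy to spot later."""
    return f"{day.isoformat()} ({WEEKDAY_NAMES[day.weekday()]})"


deadline = resolve_next_weekday(date(2026, 4, 15), "Thursday")
print(format_absolute(deadline))  # 2026-04-23 (Thursday)
```

Deriving the weekday from the date, rather than typing it by hand, guards against storing a date and weekday that disagree.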

Reference type: Where to find external information

The reference type stores locations of external systems or documents.

"pipeline bug is tracked in Linear project INGEST"

"API latency dashboard lives at grafana.internal/d/api-latency; check it when modifying request handling"

"User feedback is organized in Notion's Feedback database"

When Claude encounters a question that requires external data, it can retrieve the appropriate reference without asking the user.

3. What should NOT be stored

The author lists several categories that belong in the negative list: code style and architecture decisions (they are already visible in the code), Git history and recent changes (authoritative source is git log or git blame), debugging processes (store the root cause as a project entry instead), transient task details, and content already covered by CLAUDE.md.

4. Retrieval strategy differences

Each memory type uses a different retrieval pattern:

- user: loaded in full at session start; no semantic search needed.
- feedback: trigger‑based retrieval; before performing an action, Claude checks for relevant feedback rules.
- project: task‑related retrieval plus expiration check; only memories whose absolute deadline has not passed are considered.
- reference: on‑demand retrieval when an external lookup is required.
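The four patterns can be sketched as a single dispatch function. This is an illustration under stated assumptions, not Claude Code's implementation: `Mem`, `select_memories`, and `needs_external_lookup` are hypothetical names, and plain keyword overlap stands in for whatever relevance or semantic search a real agent would use.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Set


@dataclass
class Mem:
    type: str                              # "user", "feedback", "project", "reference"
    content: str
    expires_at: Optional[datetime] = None  # only meaningful for project memories


def _overlaps(content: str, keywords: Set[str]) -> bool:
    # Keyword overlap is a stand-in for real semantic search (assumption).
    words = {w.strip(".,;:").lower() for w in content.split()}
    return bool(words & keywords)


def select_memories(memories: List[Mem], task_keywords: Set[str],
                    now: datetime, needs_external_lookup: bool) -> List[Mem]:
    keywords = {k.lower() for k in task_keywords}
    selected = []
    for m in memories:
        if m.type == "user":
            selected.append(m)  # always loaded, no search
        elif m.type == "feedback":
            if _overlaps(m.content, keywords):  # checked before acting
                selected.append(m)
        elif m.type == "project":
            expired = m.expires_at is not None and m.expires_at < now
            if not expired and _overlaps(m.content, keywords):
                selected.append(m)
        elif m.type == "reference":
            if needs_external_lookup and _overlaps(m.content, keywords):
                selected.append(m)
    return selected


store = [
    Mem("user", "User has ten years of Go experience"),
    Mem("feedback", "Do not mock the database in tests"),
    Mem("project", "Retrieval module milestone", datetime(2026, 4, 23)),
    Mem("project", "Old retrieval deadline", datetime(2026, 1, 1)),
    Mem("reference", "Retrieval latency dashboard lives in Grafana"),
]
picked = select_memories(store, {"retrieval", "database"},
                         now=datetime(2026, 4, 20), needs_external_lookup=False)
print([m.content for m in picked])
```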

5. Practical steps to adopt the design

1️⃣ Identify the information categories in your own agent.

2️⃣ Define a schema for each type (the author shows a Python dataclass with fields like expires_at, why, how_to_apply, location).

from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional

class MemoryType(Enum):
    USER = "user"
    FEEDBACK = "feedback"
    PROJECT = "project"
    REFERENCE = "reference"

@dataclass
class Memory:
    id: str
    content: str
    memory_type: MemoryType
    description: str
    created_at: datetime
    expires_at: Optional[datetime] = None  # required for project
    why: Optional[str] = None            # required for feedback/project
    how_to_apply: Optional[str] = None   # required for feedback/project
    location: Optional[str] = None       # required for reference
    last_verified: Optional[datetime] = None
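The "required for ..." comments in the dataclass are not enforced by Python itself. A minimal sketch of a check that could run before a memory is saved is shown below; `REQUIRED_FIELDS` and `missing_fields` are hypothetical names, and the per-type requirements are taken directly from those comments.

```python
from datetime import datetime
from typing import Any, Dict, List

# Per-type required fields, copied from the comments in the schema.
REQUIRED_FIELDS: Dict[str, List[str]] = {
    "user": [],
    "feedback": ["why", "how_to_apply"],
    "project": ["expires_at", "why", "how_to_apply"],
    "reference": ["location"],
}


def missing_fields(memory_type: str, fields: Dict[str, Any]) -> List[str]:
    """Return the required field names that are absent or empty."""
    required = REQUIRED_FIELDS[memory_type]
    return [name for name in required if fields.get(name) in (None, "")]


draft = {"why": "Demo milestone", "how_to_apply": "Prioritize retrieval work"}
print(missing_fields("project", draft))  # ['expires_at']

draft["expires_at"] = datetime(2026, 4, 23)
print(missing_fields("project", draft))  # []
```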

3️⃣ Implement classification logic that forces the LLM to pick a type instead of guessing.

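Rules for deciding the memory type:
 - user: long-term traits of the user (role, professional background, preferences)
 - feedback: rules or corrections given by the user ("Don't ..." / "Our rule is ...")
 - project: state, constraints, and decisions of the current work (time-sensitive; must use absolute dates)
 - reference: locations of external systems

When unsure, prefer project over user.
project memories expire, so they do not pollute the memory store long-term.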
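On the code side, the classifier's free-text answer still has to be mapped onto the enum. The sketch below assumes the LLM replies with a single type name; `parse_memory_type` is a hypothetical helper. Falling back to project when the answer cannot be parsed is an extension of the "prefer project when unsure" rule, not something the source states.

```python
from enum import Enum


class MemoryType(Enum):
    USER = "user"
    FEEDBACK = "feedback"
    PROJECT = "project"
    REFERENCE = "reference"


def parse_memory_type(raw_answer: str) -> MemoryType:
    """Map the classifier's raw text to a MemoryType.

    Unrecognized answers fall back to PROJECT (assumption), because project
    memories expire and so a wrong guess does not linger in the store.
    """
    cleaned = raw_answer.strip().strip("\"'.").lower()
    try:
        return MemoryType(cleaned)
    except ValueError:
        return MemoryType.PROJECT


print(parse_memory_type(" Feedback. "))  # MemoryType.FEEDBACK
print(parse_memory_type("not sure"))     # MemoryType.PROJECT
```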

4️⃣ Periodically purge expired project memories.

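from datetime import datetime
from pathlib import Path

# load_all_memories and log_expired are assumed to be defined elsewhere.
def load_active_memories(memory_dir: Path) -> list[Memory]:
    """Return all memories except project memories whose deadline has passed."""
    memories = load_all_memories(memory_dir)
    now = datetime.now()  # naive local time; expires_at must also be naive
    active = []
    expired_ids = []
    for m in memories:
        # Only project memories are checked for expiry; other types are always kept.
        if m.memory_type == MemoryType.PROJECT:
            if m.expires_at and m.expires_at < now:
                expired_ids.append(m.id)
                continue
        active.append(m)
    if expired_ids:
        log_expired(expired_ids)  # record which memories were skipped
    return active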
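The same expiry rule can also be written as a pure function that takes the current time as a parameter, which makes it easy to test without touching the file system. This is a sketch under assumptions: `Item` is a stripped-down stand-in for the Memory dataclass, `split_expired` is a hypothetical name, and timezone-aware datetimes are used here by choice.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional, Tuple


@dataclass
class Item:
    id: str
    memory_type: str                       # "user", "feedback", "project", "reference"
    expires_at: Optional[datetime] = None


def split_expired(memories: List[Item],
                  now: datetime) -> Tuple[List[Item], List[str]]:
    """Return (active memories, ids of expired project memories)."""
    active: List[Item] = []
    expired_ids: List[str] = []
    for m in memories:
        is_expired = (
            m.memory_type == "project"
            and m.expires_at is not None
            and m.expires_at < now
        )
        if is_expired:
            expired_ids.append(m.id)
        else:
            active.append(m)
    return active, expired_ids


items = [
    Item("u1", "user"),
    Item("p1", "project", datetime(2026, 4, 23, tzinfo=timezone.utc)),
    Item("p2", "project", datetime(2026, 1, 1, tzinfo=timezone.utc)),
    Item("p3", "project"),  # no deadline recorded, so it is kept
]
active, expired = split_expired(items, now=datetime(2026, 4, 20, tzinfo=timezone.utc))
print([m.id for m in active], expired)
```

The returned id list can then drive whatever cleanup the agent performs, such as logging or deleting the corresponding files.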

In the author's own project, applying this classification reduced the memory store size by about 40% and improved retrieval accuracy because user memories are always loaded and feedback memories have the highest priority.

Figure: Memory design improvement, from heap to layered management

6. Interview answer outline

When asked about memory design, the candidate should (1) state the four‑type classification in ~20 seconds, (2) explain the expiration handling for project memories in ~15 seconds, (3) mention what should not be stored in ~15 seconds, and (4) cite the roughly 40% size reduction and the accuracy improvement as concrete results in the final ~20 seconds.

7. Closing

This is the third part of the Claude Code Memory series: the first covered "what to store", the second "when to store", and this article focuses on "what type to store". Understanding the three layers (classification, timing, and lifespan) completes a robust memory design.
