Why Memory Is the Next Critical Infrastructure for AI Agents

This survey reviews over 200 papers to propose a three‑dimensional classification framework for foundation‑agent memory, analyzes paradigm shifts from model‑centric to utility‑centric AI, and outlines memory substrates, cognitive mechanisms, operation strategies, learning paradigms, evaluation metrics, applications, and future research directions.

PaperAgent

Feb 15, 2026

AI Enters the "Second Half" – Memory Becomes a Core Infrastructure

The authors argue that AI research is undergoing a paradigm shift: the first half focused on model architecture innovation and benchmark scores, while the second half emphasizes problem definition, real‑world evaluation, and long‑term, dynamic, user‑dependent utility.

"Memory emerges as the critical solution to fill the utility gap." In this framing, memory is the bridge between idealized benchmarks and practical applications.

Three‑Dimensional Unified Classification Framework

The survey proposes a framework that examines agent memory from three complementary dimensions:

1. Memory Substrate – How Information Is Stored

Type      Definition                                     Typical Implementations                          Pros / Cons
Internal  Stored in model weights, states, or KV cache   Parameterized knowledge, latent state, KV cache  Pros: fast access, tight integration. Cons: costly updates, catastrophic forgetting
External  Stored in vector indexes or structured stores  Vector DBs, knowledge graphs, text logs          Pros: scalable, easy to update. Cons: retrieval latency, possible noise

2. Cognitive Mechanism – How Memory Is Used

Memory Type   Function                                   Research Trend
Sensory       Short‑term retention for attention         Rapid growth (2025, multimodal/embodied)
Working       Temporary storage for task‑relevant data   Core research focus
Episodic      Stores specific experiences (time, place)  Explosive growth (2025)
Semantic      Stores abstract knowledge and facts        Steady growth
Procedural    Stores skills and operation flows          Emerging hotspot

3. Memory Subject – Who Benefits

User‑Centric Memory : Stores user preferences, interaction history, personalization. Challenges: dialogue memory management, long‑term personalization, privacy.

Agent‑Centric Memory : Stores the agent’s accumulated knowledge and skills. Challenges: long‑term task execution, domain‑specific solutions, cross‑task knowledge transfer.
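
To make the three dimensions concrete, here is a minimal Python sketch that tags each memory record with a substrate, a cognitive mechanism, and a subject. The class and function names (MemoryRecord, select) are illustrative assumptions, not something the survey defines.

```python
from dataclasses import dataclass
from enum import Enum


class Substrate(Enum):
    INTERNAL = "internal"
    EXTERNAL = "external"


class Mechanism(Enum):
    SENSORY = "sensory"
    WORKING = "working"
    EPISODIC = "episodic"
    SEMANTIC = "semantic"
    PROCEDURAL = "procedural"


class Subject(Enum):
    USER = "user-centric"
    AGENT = "agent-centric"


@dataclass(frozen=True)
class MemoryRecord:
    """Hypothetical record type: one memory tagged along all three dimensions."""
    content: str
    substrate: Substrate
    mechanism: Mechanism
    subject: Subject


def select(records, **dims):
    """Return the records whose dimensions match every keyword given."""
    return [r for r in records
            if all(getattr(r, name) == value for name, value in dims.items())]


records = [
    MemoryRecord("Prefers short answers", Substrate.EXTERNAL,
                 Mechanism.SEMANTIC, Subject.USER),
    MemoryRecord("Steps to run the test suite", Substrate.EXTERNAL,
                 Mechanism.PROCEDURAL, Subject.AGENT),
]
user_memories = select(records, subject=Subject.USER)
```

Tagging records this way keeps the dimensions independent: the same external store can hold both user‑centric and agent‑centric entries, and a query can filter on any combination.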

Memory Operation Mechanisms: From Single to Multi‑Agent Systems

Single‑agent systems perform five core operations:

Store & Index : Organize information in vector, structured, or text formats for efficient retrieval.

Load & Retrieve : Filter and rank relevant memories, inject them into the current context.

Update & Refresh : Dynamically revise memory entries to incorporate new information.

Compress & Summarize : Collapse detailed interaction histories into compact abstractions to control memory growth.

Forget & Retain : Remove outdated data while preserving high‑value knowledge.
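
The five operations can be sketched as one small class. This is a toy under stated assumptions: word overlap stands in for embedding similarity, the importance score and logical clock are hypothetical fields, and the summarizer is passed in by the caller rather than produced by a model.

```python
import re
from dataclasses import dataclass, field


def tokenize(text):
    """Lowercased word set; a crude stand-in for embedding similarity."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


@dataclass
class MemoryEntry:
    key: str
    text: str
    importance: float  # hypothetical score in [0, 1]
    updated_at: int    # logical clock value of the last write


@dataclass
class MemoryStore:
    entries: dict = field(default_factory=dict)
    clock: int = 0

    def store(self, key, text, importance=0.5):
        """Store & Index: keep the entry under a key."""
        self.clock += 1
        self.entries[key] = MemoryEntry(key, text, importance, self.clock)

    def retrieve(self, query, k=3):
        """Load & Retrieve: filter by word overlap, rank, return the top k."""
        q = tokenize(query)
        scored = [(len(q & tokenize(e.text)), e.updated_at, e.key)
                  for e in self.entries.values()]
        ranked = sorted((s for s in scored if s[0] > 0), reverse=True)
        return [self.entries[key] for _, _, key in ranked[:k]]

    def update(self, key, text):
        """Update & Refresh: revise an entry and bump its timestamp."""
        self.clock += 1
        self.entries[key].text = text
        self.entries[key].updated_at = self.clock

    def compress(self, keys, summary_key, summarize):
        """Compress & Summarize: replace several entries with one summary."""
        removed = [self.entries.pop(k) for k in keys]
        importance = max(e.importance for e in removed)
        self.store(summary_key, summarize([e.text for e in removed]), importance)

    def forget(self, max_age, min_importance):
        """Forget & Retain: drop entries that are both old and low-importance."""
        stale = [k for k, e in self.entries.items()
                 if self.clock - e.updated_at > max_age
                 and e.importance < min_importance]
        for k in stale:
            del self.entries[k]
        return stale


memory = MemoryStore()
memory.store("diet", "User is vegetarian", importance=0.9)
memory.update("diet", "User is vegan")
top = memory.retrieve("what does the user eat, vegan or not?")
```

Forgetting here requires an entry to be both old and unimportant, so a high‑value preference survives however long ago it was written.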

Multi‑agent systems face an additional architectural choice about how memory is shared among agents. Designs range from fully private memories, through shared workspaces and hybrid private‑plus‑shared layers, to orchestrated designs with a central controller. Representative works include RecAgent and TradingGPT (private); MetaGPT and InteRecAgent (shared); Collaborative Memory and MirrorMind (hybrid); and ChatDev and MIRIX (orchestrated).
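
As a rough illustration of these layouts, the sketch below implements the hybrid case: each agent has private notes, and all agents see one shared workspace. The HybridMemory class and its method names are assumptions made for this example and are not taken from any of the systems listed above.

```python
from collections import defaultdict


class HybridMemory:
    """Private per-agent notes plus one shared workspace (hypothetical layout)."""

    def __init__(self):
        self.private = defaultdict(list)  # agent_id -> that agent's notes
        self.shared = []                  # notes visible to every agent

    def write(self, agent_id, note, share=False):
        """Append a note to the shared workspace or to the agent's private list."""
        target = self.shared if share else self.private[agent_id]
        target.append(note)

    def read(self, agent_id):
        """An agent sees its own private notes followed by everything shared."""
        return list(self.private.get(agent_id, [])) + list(self.shared)


memory = HybridMemory()
memory.write("planner", "draft plan v1")               # private to the planner
memory.write("coder", "API spec agreed", share=True)   # visible to all agents
planner_view = memory.read("planner")
```

The other layouts fall out as special cases of the same sketch: never passing share=True gives fully private memory, always passing it gives a shared workspace, and an orchestrated design would have a central controller decide the flag instead of the agents themselves.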

Memory Learning Strategies: From Prompting to Reinforcement Learning

Level 1 – Prompt‑Based Learning

Static Prompts : Pre‑defined rules such as hierarchical memory management in MemGPT.

Dynamic Prompts : Adjusted at inference time based on feedback, e.g., self‑reflection in Reflexion.
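
The difference between the two prompt styles can be shown in a few lines. In this sketch the fixed rules play the role of a static prompt, and the reflections appended at inference time play the role of a dynamic one. It is an illustration only, not the actual MemGPT or Reflexion implementation; the function name, rule text, and prompt layout are assumptions.

```python
STATIC_RULES = "Keep answers concise. Consult stored notes before answering."


def build_prompt(task, reflections, max_reflections=3):
    """Combine fixed rules with the most recent lessons from earlier attempts."""
    recent = reflections[-max_reflections:] if max_reflections > 0 else []
    lines = [STATIC_RULES]
    if recent:
        lines.append("Lessons from earlier attempts:")
        lines.extend(f"- {lesson}" for lesson in recent)
    lines.append(f"Task: {task}")
    return "\n".join(lines)


reflections = [
    "The date format must be ISO 8601.",
    "Check the unit of the input before converting.",
]
prompt = build_prompt("Convert the report timestamps", reflections)
```

Capping the number of reflections keeps the prompt from growing without bound, which is the same pressure that motivates the compression and forgetting operations described earlier.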

Level 2 – Fine‑Tuning Parameterized Strategies

Internalize memory behavior into model parameters.

Key challenges: strategy stabilization, boundary control, retrieval optimization.

Level 3 – Reinforcement Learning

Step‑Level Decisions : Learn when to store, update, or delete (e.g., Memory‑R1).

Trajectory‑Level Representations : Learn compression and summarization strategies (e.g., MemSearcher).

Cross‑Episode Memory : Accumulate reusable policies for continual learning.
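
To give a feel for step‑level decisions, here is a toy tabular epsilon‑greedy learner that picks a memory operation for each incoming fact. It is a deliberately simplified stand‑in and not the method of Memory‑R1 or MemSearcher: the three states, the action set, and the reward table are all assumptions invented for this example.

```python
import random

ACTIONS = ("ADD", "UPDATE", "NOOP")

# Hypothetical states describing how an incoming fact relates to memory,
# mapped to the operation that this toy environment rewards.
CORRECT = {"new": "ADD", "conflicting": "UPDATE", "duplicate": "NOOP"}


def train(steps=300, epsilon=0.1, alpha=0.5, seed=0):
    """Learn action values with epsilon-greedy exploration."""
    rng = random.Random(seed)
    states = list(CORRECT)
    q = {(s, a): 0.0 for s in states for a in ACTIONS}
    for t in range(steps):
        s = states[t % len(states)]          # cycle through the states
        if rng.random() < epsilon:
            a = rng.choice(ACTIONS)          # explore
        else:
            a = max(ACTIONS, key=lambda act: q[(s, act)])  # exploit
        reward = 1.0 if a == CORRECT[s] else -1.0
        q[(s, a)] += alpha * (reward - q[(s, a)])
    return q


def greedy_policy(q):
    """Pick the highest-valued operation for each state."""
    states = {s for s, _ in q}
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in states}


policy = greedy_policy(train())
```

In a real agent the reward would come from downstream task success rather than a lookup table, and the state would be derived from the memory contents and the incoming message, but the loop has the same shape: act, observe a reward, adjust the value of the chosen operation.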

Evaluation System – Beyond Accuracy

The survey categorizes evaluation metrics into three groups (shown in Figure 2) and highlights that current benchmarks mainly test static recall. Future evaluation should measure dynamic adaptation, preference drift, and safety boundaries.
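
A small sketch shows why static recall and preference drift are different tests. The event format and both recall functions below are assumptions made for illustration, not a benchmark from the survey: a memory that always returns the first value it saw answers correctly until the user changes a preference.

```python
def latest_value_accuracy(events, recall):
    """Fraction of "ask" events answered with the most recently set value."""
    truth, history = {}, []
    asked = correct = 0
    for event in events:
        if event[0] == "set":
            _, key, value = event
            truth[key] = value
            history.append((key, value))
        else:  # ("ask", key)
            _, key = event
            asked += 1
            correct += recall(history, key) == truth.get(key)
    return correct / asked if asked else 0.0


def recall_first(history, key):
    """Static recall: return the first value ever stored for the key."""
    return next((v for k, v in history if k == key), None)


def recall_latest(history, key):
    """Drift-aware recall: return the most recent value for the key."""
    return next((v for k, v in reversed(history) if k == key), None)


events = [
    ("set", "cuisine", "italian"),
    ("ask", "cuisine"),
    ("set", "cuisine", "thai"),   # the preference drifts
    ("ask", "cuisine"),
]
static_score = latest_value_accuracy(events, recall_first)   # 0.5
drift_score = latest_value_accuracy(events, recall_latest)   # 1.0
```

Both strategies pass the first question, so a benchmark that never changes the preference would not tell them apart.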

Application Scenarios – 12 Major Domains Empowered by Memory

Education, scientific research, gaming & simulation, robotics, healthcare, dialogue systems, workflow automation, software engineering, information flow & recommendation, information retrieval, finance & accounting, legal consulting.

Six Future Directions

Memory ≠ storage – modern agent memory is an active cognitive architecture involving selection, compression, forgetting, and reasoning.

Context explosion drives memory design as tasks shift from single‑turn QA to long‑term interaction.

Learning memory management itself via RL will replace hand‑crafted heuristics.

Evaluation must evolve to assess dynamic adaptation, preference drift, and safety.

Hybrid architectures dominate, combining fast internal memory with scalable external stores.

Memory becomes core infrastructure for reliable, efficient, personalized AI agents.

Key Takeaway

This comprehensive review of 200+ papers provides a unified roadmap for understanding and building memory systems in foundation agents, and positions memory as indispensable infrastructure for future AI agents.