How to Add Persistent Long‑Term Memory to LangGraph Agents with Trustcall
This article explains how to integrate durable long‑term memory into LangGraph agents, covering memory types, their coordination, limitations of native LangGraph storage, and a step‑by‑step implementation using Trustcall’s schema‑driven extractors for both user profiles and paper collections.
Why Long‑Term Memory Matters for Agents
Most LLM agents are stateless: they reason well in the moment but forget everything after execution. When agents run longer workflows and repeat interactions, this statelessness becomes a structural bottleneck. Persistent memory enables personalization, continuous improvement, and coordination over time, turning reactive pipelines into adaptive systems.
Core Memory Types in Agent Architectures
Four complementary memory categories are commonly discussed in cognitive‑inspired frameworks such as CoALA:
Working memory (context window) : The immediate token window used for current reasoning, recent dialogue turns, task instructions, and retrieved facts. It is the most limited and expensive resource.
Semantic memory (facts & knowledge) : Refined knowledge stored as embeddings, structured facts, or knowledge graphs. It does not retain the full provenance of how the knowledge was learned.
Episodic memory (interaction history) : Ordered sequences of events—full dialogues, task trajectories, observations—useful for audit trails, debugging, and reflective loops.
Procedural memory (skills & behavior) : Knowledge about how to act, encoded in model weights, system prompts, or tool/function registries. It changes slowly and usually requires explicit updates.
Coordinating Memory Types
Effective agents move information between these stores:
Semantic → Working : Retrieve relevant facts before reasoning (classic RAG pattern).
Working → Episodic : Persist events that occurred during an interaction.
Episodic → Semantic : Summarize experiences into stable facts (e.g., Generative Agents’ reflection step).
Procedural + Semantic : Apply skills to knowledge (e.g., using a tool to act on a retrieved requirement).
Time dimension : Track when facts are valid to avoid stale information.
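As a toy illustration of these flows (plain Python, not any specific framework; the `AgentMemory` class and its methods are invented for this sketch), an agent can retrieve semantic facts into working context, log events episodically, and periodically reflect episodes into stable facts:

```python
from dataclasses import dataclass, field

@dataclass
class AgentMemory:
    semantic: dict = field(default_factory=dict)  # stable facts, keyed by topic
    episodic: list = field(default_factory=list)  # ordered event log

    def retrieve(self, topic: str) -> list:
        # Semantic -> Working: pull relevant facts into the prompt context
        return [fact for key, fact in self.semantic.items() if topic in key]

    def record(self, event: str) -> None:
        # Working -> Episodic: persist what happened during this turn
        self.episodic.append(event)

    def reflect(self) -> None:
        # Episodic -> Semantic: distill experiences into stable facts
        # (a real agent would use an LLM here; we just match a keyword)
        for event in self.episodic:
            if "likes" in event:
                key = event.split("likes ")[-1]
                self.semantic[f"preference:{key}"] = event

memory = AgentMemory()
memory.record("user likes transformers")
memory.reflect()
print(memory.retrieve("transformers"))  # ['user likes transformers']
```

The reflection step is where most real systems invest effort, since it decides which ephemeral events become durable knowledge.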
Practical Limits of LangGraph’s Native Memory
LangGraph provides Checkpointers for short‑term memory, persisting conversation state across turns within a single thread or session. For cross‑session persistence, LangGraph offers a Store API (e.g., InMemoryStore) that handles basic key‑value persistence but lacks:
Structured extraction from raw messages.
Intelligent merging or deduplication.
Schema validation, without which stored memories drift over time.
Automated pipelines for prompting, parsing, validation, retrieval, and merging.
These gaps are where Trustcall adds value.
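A minimal stand‑in for such a key‑value store (plain Python, deliberately not LangGraph's actual `InMemoryStore` API) makes the gap concrete: the store faithfully persists whatever it is given, so a later write silently clobbers earlier memories instead of merging them:

```python
class NaiveKVStore:
    """Bare key-value persistence: no extraction, merging, or validation."""

    def __init__(self):
        self._data = {}

    def put(self, namespace: tuple, key: str, value: dict) -> None:
        self._data[(namespace, key)] = value  # blindly overwrites

    def get(self, namespace: tuple, key: str):
        return self._data.get((namespace, key))

store = NaiveKVStore()
store.put(("users", "u1"), "profile", {"interests": ["transformers"]})
# A later write drops the earlier interest -- no merge, no schema check:
store.put(("users", "u1"), "profile", {"interests": ["quantum"]})
print(store.get(("users", "u1"), "profile"))  # {'interests': ['quantum']}
```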
Using Trustcall for Production‑Grade Memory
Profile‑Based Memory (User Preferences)
Define a Pydantic model for the user profile:
from pydantic import BaseModel, Field
from typing import List
class UserProfile(BaseModel):
    interests: List[str] = Field(default_factory=list, description="User's topics of interest")
    expertise_level: str = Field(description="User's AI expertise level (beginner, intermediate, expert)")
Create a Trustcall extractor that enforces this schema:
from trustcall import create_extractor
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o")
profile_extractor = create_extractor(
    llm,
    tools=[UserProfile],
    tool_choice="UserProfile",
)
Extract and merge profiles across interactions:
# First interaction
query1 = "I'm interested in Transformer architecture and I'm an intermediate AI learner."
res1 = profile_extractor.invoke({"messages": [("user", query1)]})
# res1["responses"][0] is a validated UserProfile instance:
# UserProfile(interests=["Transformer architecture"], expertise_level="intermediate")
# Subsequent interaction
query2 = "I also want to learn about quantum computing."
existing_profile = res1["responses"][0]
res2 = profile_extractor.invoke({
    "messages": [("user", query2)],
    # Trustcall expects existing data keyed by schema name
    "existing": {"UserProfile": existing_profile.model_dump()},
})
# res2["responses"][0] merges the new interest into the existing profile
Collection‑Based Memory (Growing Paper Lists)
Define a collection model:
class PaperCollection(BaseModel):
    topic: str = Field(description="Topic of the collection")
    papers: List[str] = Field(description="List of paper titles in the collection")
Create an extractor that allows new collections to be inserted:
collection_extractor = create_extractor(
    llm,
    tools=[PaperCollection],
    tool_choice="PaperCollection",
    enable_inserts=True,  # permits creation of new collection instances
)
Update existing collections while adding new ones:
# Existing collections are passed as (id, schema_name, value) tuples so
# Trustcall can patch them in place while inserting new instances
existing_collections = [
    ("0", "PaperCollection", PaperCollection(topic="NLP", papers=["Attention is All You Need"]).model_dump())
]
query = "Add 'BERT: Pre-training of Deep Bidirectional Transformers' to my NLP collection and create a new 'Quantum' collection with 'Quantum Computing in the NISQ era'."
res = collection_extractor.invoke({
    "messages": [("user", query)],
    "existing": existing_collections,
})
Full Integration Workflow
Profile Management Node : Uses the profile extractor to update user interests and expertise level; stores the result in the graph state and persists it via LangGraph checkpointers.
Recommendation Node : Reads the updated profile, queries an external source (e.g., arXiv) for relevant papers, and returns candidates.
Collection Management Node : Uses the collection extractor with enable_inserts=True and the existing parameter to merge new papers into existing topics or create new collections, keeping a growing knowledge base.
The graph state acts as the durable layer, while Trustcall guarantees that every update conforms to the defined Pydantic schemas, performs validation, and handles merging logic.
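A framework‑free sketch of that three‑node flow may help; the node functions and the hard‑coded data here are invented placeholders for the extractor and arXiv calls a real build would wire in as LangGraph nodes:

```python
def profile_node(state: dict) -> dict:
    # Placeholder for profile_extractor.invoke(...)
    state["profile"] = {"interests": ["transformers"], "expertise_level": "intermediate"}
    return state

def recommendation_node(state: dict) -> dict:
    # Placeholder for querying an external source such as arXiv
    state["candidates"] = [f"Recent paper on {t}" for t in state["profile"]["interests"]]
    return state

def collection_node(state: dict) -> dict:
    # Placeholder for collection_extractor.invoke(...) with existing collections
    state.setdefault("collections", {}).setdefault("NLP", []).extend(state["candidates"])
    return state

# The shared state dict plays the role of the durable graph state
state = {}
for node in (profile_node, recommendation_node, collection_node):
    state = node(state)
print(state["collections"])  # {'NLP': ['Recent paper on transformers']}
```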
Key Takeaways
Long‑term memory is an architectural concern, not just prompt engineering.
Working, semantic, episodic, and procedural memories each have distinct lifecycles and must be coordinated.
LangGraph makes memory explicit but does not solve structured extraction or schema enforcement.
Schema‑driven updates (via Pydantic) prevent memory drift and reduce noise.
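For instance, Pydantic rejects writes that do not match the declared schema, so malformed memory updates fail loudly at the boundary instead of accumulating silently (a sketch reusing a `UserProfile`-style model):

```python
from typing import List

from pydantic import BaseModel, Field, ValidationError

class UserProfile(BaseModel):
    interests: List[str] = Field(default_factory=list)
    expertise_level: str

# A well-formed update validates cleanly
ok = UserProfile(interests=["transformers"], expertise_level="intermediate")

try:
    # Wrong type for `interests` -- a schema-less store would accept this
    UserProfile(interests=123, expertise_level="beginner")
except ValidationError:
    print("rejected: interests must be a list of strings")
```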
Production‑grade memory—accurate retrieval, freshness, and controlled growth—is often harder than the core agent logic.