Which Open‑Source Agent Memory Engine Wins? Deep Dive into Mem0, Graphiti & Cognee

This article examines the limitations of LLM short‑term context windows and compares three open‑source long‑term memory frameworks—Mem0, Graphiti, and Cognee—by detailing their architectures, storage modes, integration steps, code examples, strengths, drawbacks, and practical selection guidance for building smarter AI agents.

AI Large Model Application Practice

Why Long‑Term Memory Matters for LLM‑Based Agents

Large language models (LLMs) are stateless and can only attend to a limited context window; once information falls outside this window, the model forgets user preferences or earlier conversation details, degrading the user experience in multi‑turn or multi‑day interactions. Adding a persistent memory layer lets agents store and retrieve important facts, effectively giving LLMs a writable notebook that overcomes context limits.

Memory Types and Their Relation to RAG

AI memory can be split into declarative memory (facts such as a user’s name or preferences) and procedural memory (how to perform tasks, e.g., a step‑by‑step ticket‑booking flow). Compared with Retrieval‑Augmented Generation (RAG), memory acts like a constantly editable notebook, while RAG resembles a relatively static reference library; most real‑world systems combine both.

Memory as a Pillar of Context Engineering

When constructing prompts, long‑term memory becomes one of the key context sources alongside system instructions, tool definitions, few‑shot examples, recent dialogue, and RAG knowledge. Designers must decide what to persist, when to store it, who decides, how to retrieve it, and how to prune or merge entries.
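To make the idea concrete, here is a minimal sketch of assembling a prompt from those context sources. The section names and the assemble_prompt helper are illustrative, not any framework's API:

```python
# Illustrative sketch: combine context sources into one prompt string.
def assemble_prompt(system: str, memories: list[str], rag_chunks: list[str],
                    dialogue: list[str], query: str) -> str:
    """Concatenate system instructions, long-term memory, RAG knowledge,
    recent dialogue, and the user query into a single prompt."""
    sections = [
        "## System\n" + system,
        "## Long-term memory\n" + "\n".join(f"- {m}" for m in memories),
        "## Retrieved knowledge\n" + "\n".join(f"- {c}" for c in rag_chunks),
        "## Recent dialogue\n" + "\n".join(dialogue),
        "## User\n" + query,
    ]
    return "\n\n".join(sections)

prompt = assemble_prompt(
    system="You are a travel assistant.",
    memories=["User prefers natural scenery over cultural attractions."],
    rag_chunks=["Yosemite National Park is open year-round."],
    dialogue=["user: Any suggestions for next month?"],
    query="Please plan a three-day trip.",
)
```

The pruning and merging decisions mentioned above would operate on the memories list before it reaches this assembly step.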

PART 01 – Mem0: Plug‑and‑Play Memory Engine

Mem0 provides a lightweight memory service with two storage modes: vector storage for semantic similarity search and graph storage for entity‑relationship data. The two can be combined into a hybrid system.

Key workflow:

When adding a memory, the LLM extracts key facts and stores them as vectors.

If graph mode is enabled, entities and relationships are also extracted and saved to a graph database.

Mem0 performs conflict detection to avoid duplicate or contradictory entries.

Based on detection results, it adds, updates, or deletes memories.

Important notes:

Conflict detection requires LLM‑based fact extraction; raw text cannot be stored directly.

Graph extraction needs explicit graph‑DB configuration.

In hybrid retrieval, vector results narrow the graph search space.
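As a rough sketch of enabling hybrid mode (provider names and config fields follow Mem0's documented pattern but may differ across versions; the hosts and credentials are placeholders, so check the Mem0 docs before copying):

```python
from mem0 import Memory

# Placeholder connection details for a local Qdrant and Neo4j instance.
config = {
    "vector_store": {
        "provider": "qdrant",
        "config": {"host": "localhost", "port": 6333},
    },
    "graph_store": {
        "provider": "neo4j",
        "config": {
            "url": "bolt://localhost:7687",
            "username": "neo4j",
            "password": "password",
        },
    },
}

# With both stores configured, add/search operate in hybrid mode.
mem = Memory.from_config(config)
```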

Quick start example (Python):

from mem0 import Memory

# Initialize client (API key assumed set)
mem = Memory()

# Add a user preference
mem.add([
    {"role": "user", "content": "I prefer natural scenery over cultural attractions"}
], user_id="user_123")

# Retrieve later
results = mem.search("Please help me design a travel itinerary", user_id="user_123")
print(results["results"][0]["memory"])

The retrieved fact can be injected into the LLM prompt, enabling the agent to act as if it remembers the user’s preference.
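One way to do that injection, as illustrative glue code rather than part of Mem0's API (only the results shape matches the search call above):

```python
# Illustrative: format Mem0 search results into a prompt section.
def inject_memories(results: dict, user_query: str) -> str:
    facts = [item["memory"] for item in results.get("results", [])]
    memory_block = "\n".join(f"- {fact}" for fact in facts)
    return f"Known facts about this user:\n{memory_block}\n\nUser request: {user_query}"

# Example input with the same shape as mem.search(...) output.
sample = {"results": [{"memory": "Prefers natural scenery over cultural attractions"}]}
prompt = inject_memories(sample, "Please design a travel plan for me.")
```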

PART 02 – Graphiti: Temporal Knowledge‑Graph Memory

Graphiti (open‑sourced by Zep) treats each incoming piece of information as an episode. Episodes are parsed by an LLM to extract entities, relationships, and timestamps, which are then incrementally merged into a time‑aware knowledge graph.

Core steps:

Episode recording: Add raw text, JSON, or other data via add_episode.

Incremental parsing: LLM extracts nodes, edges, and temporal markers.

Graph update: New facts are added; conflicting facts are marked as expired rather than overwritten, preserving provenance.

Mixed search: Queries combine semantic vector search, keyword matching, and graph‑based reasoning.
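The "mark expired, don't overwrite" update rule can be sketched in pure Python (this is a conceptual model, not Graphiti's actual internals; the Edge and TemporalGraph names are invented for illustration):

```python
# Conceptual sketch of temporal edge invalidation, not Graphiti's real code.
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass
class Edge:
    subject: str
    predicate: str
    obj: str
    valid_at: datetime
    invalid_at: Optional[datetime] = None  # None means still considered true

class TemporalGraph:
    def __init__(self):
        self.edges: list[Edge] = []

    def assert_fact(self, subject: str, predicate: str, obj: str) -> None:
        now = datetime.now(timezone.utc)
        # Expire conflicting edges (same subject/predicate, different object)
        # instead of deleting them, so provenance is preserved.
        for e in self.edges:
            if (e.subject == subject and e.predicate == predicate
                    and e.obj != obj and e.invalid_at is None):
                e.invalid_at = now
        self.edges.append(Edge(subject, predicate, obj, valid_at=now))

    def current_facts(self) -> list[Edge]:
        return [e for e in self.edges if e.invalid_at is None]

g = TemporalGraph()
g.assert_fact("Kamala Harris", "holds_office", "Attorney General of California")
g.assert_fact("Kamala Harris", "holds_office", "U.S. Senator")
# Both edges remain stored; only the Senator edge is still current.
```

The expired edge keeps its valid_at/invalid_at window, which is what lets the graph answer time‑scoped questions like "who held this office in 2015?".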

Quick start snippet:

import json
from datetime import datetime, timezone

from graphiti_core import Graphiti
from graphiti_core.nodes import EpisodeType

# Assumes a running Neo4j instance; the URI and credentials are placeholders.
graphiti = Graphiti("bolt://localhost:7687", "neo4j", "password")

episodes = [
    {"content": "Kamala Harris was the Attorney General of California.", "type": EpisodeType.text, "description": "podcast transcript"},
    {"content": "As Attorney General, Harris served from January 3, 2011 to January 3, 2017.", "type": EpisodeType.text, "description": "podcast transcript"},
    {"content": {"name": "Gavin Newsom", "position": "Governor", "term_start": "January 7, 2019", "term_end": "present"}, "type": EpisodeType.json, "description": "podcast metadata"}
]
for i, episode in enumerate(episodes):
    await graphiti.add_episode(
        name=f'Freakonomics Radio {i}',
        episode_body=episode['content'] if isinstance(episode['content'], str) else json.dumps(episode['content']),
        source=episode['type'],
        source_description=episode['description'],
        reference_time=datetime.now(timezone.utc)
    )

results = await graphiti.search('Who was the Attorney General of California?')

This demonstrates how Graphiti builds a temporal knowledge graph that can answer time‑sensitive questions.

PART 03 – Cognee: Vector + Graph + Ontology Platform

Cognee combines three storage layers:

Relation store: tracks documents, data chunks, and their provenance.

Vector store: holds text embeddings for semantic similarity.

Graph store: records entities and their relationships.

Its architecture revolves around Tasks (e.g., entity extraction), Pipelines (chaining tasks), and DataPoints (Pydantic models representing entities).
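The Task/Pipeline idea can be sketched in plain Python (this is not Cognee's real API; the State, extract_entities, and link_entities names are invented, and the extraction logic is a trivial stand‑in for what Cognee does with an LLM):

```python
# Conceptual sketch of tasks chained into a pipeline over shared state.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class State:
    text: str
    entities: list[str] = field(default_factory=list)
    relations: list[tuple[str, str, str]] = field(default_factory=list)

Task = Callable[[State], State]

def extract_entities(state: State) -> State:
    # Stand-in for LLM-based entity extraction: pick capitalized words.
    state.entities = [w.strip(".,") for w in state.text.split() if w[0].isupper()]
    return state

def link_entities(state: State) -> State:
    # Stand-in for relation extraction: pair consecutive entities.
    state.relations = [(a, "related_to", b)
                      for a, b in zip(state.entities, state.entities[1:])]
    return state

def run_pipeline(tasks: list[Task], state: State) -> State:
    for task in tasks:
        state = task(state)
    return state

result = run_pipeline([extract_entities, link_entities],
                      State("Zhang Wei leads Sales at Acme."))
```

In Cognee itself, each task would be an LLM‑backed step and the extracted entities would be typed DataPoints persisted to the graph store, but the chaining pattern is the same.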

Typical workflow:

Add: cognee.add(...) ingests raw data and prepares it for processing.

Cognify: cognee.cognify([...]) extracts entities/relations and builds the knowledge graph.

Memify (optional): enriches the graph with semantic vectors.

Search: cognee.search(...) performs hybrid vector‑graph queries.

Quick start example:

import cognee

# 1. Add a user profile
await cognee.add("My name is Zhang Wei; I am the company's sales director.", dataset_name="user_profile")

# 2. Build the graph
await cognee.cognify(["user_profile"])

# 3. Query the knowledge
results = await cognee.search("Who is the company's sales director?")

Cognee also supports RDF/OWL ontologies, multimodal inputs (text, images, audio), and extensive customization, making it suitable for enterprise‑scale knowledge bases.

PART 04 – Comparative Guidance

All three projects follow an “open‑core + commercial hosting” model, but they differ in architecture, focus, and complexity:

Mem0: lightweight, vector‑first with optional graph mode; fastest to integrate, minimal configuration, but limited customizability.

Graphiti: centered on a temporal knowledge graph; excels at event‑driven, relational, and time‑based reasoning; requires graph‑DB expertise.

Cognee: most comprehensive, combining vector, graph, and ontology layers; highly flexible for large‑scale, multi‑modal knowledge brains; higher learning curve and operational overhead.

Selection recommendations:

For quick addition of memory to a conversational agent, choose Mem0.

For scenarios involving events, timelines, or complex relational queries (e.g., CRM, HR), choose Graphiti.

For building an enterprise‑level knowledge hub with ontology support and multimodal data, choose Cognee.

Ultimately, the decision should balance data type, business complexity, team expertise, and maintenance cost.

Tags: LLM, Agent Memory, long-term memory, Mem0, Cognee, Graphiti
Written by AI Large Model Application Practice

Focused on deep research and development of large-model applications. Authors of "RAG Application Development and Optimization Based on Large Models" and "MCP Principles Unveiled and Development Guide". Primarily B2B, with B2C as a supplement.
