Build a Graph‑Based Memory ChatBot with MemOS and LangChain
This guide walks through the MemOS open‑source memory framework, showing how to create a graph‑based memory ChatBot, add and retrieve memories incrementally, enable asynchronous memory reorganization, and seamlessly integrate the system into LangChain 1.x agents using middleware.
MemOS is an open‑source memory engine that stores information as a graph (TreeTextMemory) and can be plugged into LangChain 1.x agents to provide long‑term, structured memory.
Part 01 – Quick start: a memory‑enabled ChatBot
The core components are:
Memory framework: MemOS (open source)
LLM / embedding model: any public‑cloud model API
Development framework: none required for a simple bot
Graph database: Neo4j Desktop
First, initialise a SingleCubeView (the memory container) using the configuration from .env:
EXAMPLE_CUBE_ID = "example_cube_id"
components = init_server() # initialise components
cube = SingleCubeView(
    cube_id=EXAMPLE_CUBE_ID,
    naive_mem_cube=components["naive_mem_cube"],
    mem_reader=components["mem_reader"],
    mem_scheduler=components["mem_scheduler"],
    logger=logger,
    searcher=components["searcher"],
)
Adding a conversation to memory is done by creating an APIADDRequest and calling cube.add_memories:
EXAMPLE_USER_ID = "example_user_id"
writable_cube_ids = [EXAMPLE_CUBE_ID]
conversation = [
{"role": "user", "content": "我叫张明,在北京做程序员"},
{"role": "assistant", "content": "你好张明!程序员是个不错的职业"},
]
add_req = APIADDRequest(
    user_id=EXAMPLE_USER_ID,
    messages=conversation,
    writable_cube_ids=writable_cube_ids,
)
cube.add_memories(add_req)
Memory retrieval uses cube.search_memories, and the results can be injected into a system prompt:
query = "用户是做什么工作的?"
results = cube.search_memories(
    APISearchRequest(
        user_id=EXAMPLE_USER_ID,
        readable_cube_ids=[EXAMPLE_CUBE_ID],
        query=query,
    )
)
for mem_item in results["text_mem"][0]["memories"]:
    print(f"- {mem_item.get('memory', '')}")
The resulting ChatBot can keep user‑specific memories across sessions, as demonstrated by a sample command‑line interaction.
Incremental memory addition
To avoid re‑processing the whole conversation each turn, a pointer records the last memorised position; only new turns after the pointer are added, reducing latency and avoiding duplicate entries.
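A minimal sketch of this pointer pattern, reusing the cube and request objects from the quick start (the variable and function names are illustrative):
memorized_until = 0  # index of the first turn not yet written to memory

def memorize_new_turns(history: list) -> None:
    """Add only the turns that appeared since the last call."""
    global memorized_until
    new_turns = history[memorized_until:]
    if not new_turns:
        return  # nothing new to memorise
    cube.add_memories(
        APIADDRequest(
            user_id=EXAMPLE_USER_ID,
            messages=new_turns,
            writable_cube_ids=[EXAMPLE_CUBE_ID],
        )
    )
    memorized_until = len(history)  # advance the pointer past the stored turns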
Part 02 – Automatic memory reorganisation
MemOS decouples memory addition from graph reorganisation. Enabling the background reorganisation is as simple as setting MOS_ENABLE_REORGANIZE=true in .env. When enough nodes (default 20) are present, a background thread clusters related memories, creates summary nodes, and adds edges such as PARENT, MERGE_TO, and FOLLOWS to build a richer knowledge graph.
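For reference, the only change needed in .env is the single flag mentioned above:
MOS_ENABLE_REORGANIZE=true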
Part 03 – Integrating MemOS into a LangChain 1.x Agent
LangChain’s built‑in InMemoryStore cannot represent graph relationships or perform complex reasoning. MemOS provides a true graph‑based memory that can be injected non‑intrusively via LangChain’s middleware mechanism.
The middleware defines three hooks:
before_agent: retrieve relevant memories once at the start of an agent run.
wrap_model_call: inject the retrieved memories into the system prompt before the LLM is called.
after_agent: store the new conversation turn into MemOS.
Helper class (simplified):
class MemosMemoryHelper:
    def __init__(self, user_id: str, top_k: int = 5):
        self.user_id = user_id  # owner of the memories
        self.top_k = top_k      # number of memories to retrieve per query
        self._init_memos()      # initialise Memory and MemCube

    def search_memories(self, query: str) -> List[str]:
        """Retrieve memories matching the query"""
        ...

    def add_conversation(self, user_message: str, assistant_message: str):
        """Add a turn to the memory (incremental)"""
        ...
Middleware skeleton:
class MemosMiddleware(AgentMiddleware):
    def __init__(self, user_id: str, top_k: int = 5, auto_memorize: bool = True):
        self.memory_helper = MemosMemoryHelper(user_id, top_k)
        ...

    def before_agent(self, state: AgentState, runtime: Runtime):
        """Agent start – retrieve memories"""
        ...

    def wrap_model_call(self, request: ModelRequest, handler) -> ModelResponse:
        """Inject memories into the system prompt"""
        ...

    def after_agent(self, state: AgentState, runtime: Runtime):
        """Store the turn into MemOS"""
        ...
Creating the agent with the middleware:
agent = create_agent(
model="gpt-4o-mini",
tools=[tavily_search],
system_prompt="你是一个拥有长期记忆的智能助手...",
middleware=[memos_middleware], # inject memory middleware
)Effect tests show that after a conversation ends, the agent can retrieve the stored memories in a new session and use them to answer follow‑up questions without re‑invoking external tools.
Alternative tool‑based integration
Instead of middleware, memory operations can be exposed as LangChain tools ( add_memory, search_memory) so the agent decides when to call them. This gives more transparency but relies on the model’s tool‑calling ability.
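A minimal sketch of this variant, assuming LangChain's @tool decorator and the MemosMemoryHelper from Part 03 (tool bodies and return strings are illustrative):
from langchain_core.tools import tool

memory_helper = MemosMemoryHelper(user_id="example_user_id")

@tool
def search_memory(query: str) -> str:
    """Search long-term memory for facts about the user."""
    memories = memory_helper.search_memories(query)
    return "\n".join(memories) if memories else "No relevant memories found."

@tool
def add_memory(user_message: str, assistant_message: str) -> str:
    """Store one conversation turn into long-term memory."""
    memory_helper.add_conversation(user_message, assistant_message)
    return "Memory stored."

agent = create_agent(
    model="gpt-4o-mini",
    tools=[search_memory, add_memory, tavily_search],
    system_prompt="You are an assistant with long-term memory; use the memory tools when relevant.",
)
The agent now decides for itself when to read or write memory, at the cost of depending on the model to actually make those tool calls.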
Conclusion
MemOS adds graph‑structured, long‑term memory to LangChain agents, enabling non‑intrusive integration, asynchronous reorganisation, and richer reasoning over hierarchical memory graphs. This turns a stateless LLM into an AI assistant that can remember user preferences, past interactions, and build evolving knowledge over time.