How to Build a Production‑Ready AI Memory System with Mem0 and Elasticsearch
This guide shows how to overcome the stateless nature of large language models by pairing the Mem0 memory framework with an Elasticsearch vector store to build a persistent, semantically searchable memory layer. It covers the architecture, target use cases, step‑by‑step deployment, and integration with the OpenClaw agent framework.
Background
Large language models (LLMs) are stateless; each request is processed independently without cross‑session memory. Combining the Mem0 framework for memory lifecycle management with Elasticsearch’s vector store enables a production‑grade AI memory system that provides persistent storage, semantic search, and intelligent updates.
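The cost of statelessness is easy to see in miniature: without a memory layer, every request must resend the full conversation history, so the prompt grows without bound; with retrieval, only the few most relevant memories are injected. The sketch below is purely illustrative (the function names and data are not Mem0's API):

```python
# Toy illustration of stateless prompting vs. memory-backed prompting.
# All names and data here are illustrative, not Mem0's actual API.

history = [f"turn {i}" for i in range(1000)]

def stateless_prompt(history, new_msg):
    # Without memory, the model sees nothing unless we resend everything.
    return history + [new_msg]

def memory_prompt(relevant_memories, new_msg, k=3):
    # With a memory layer, only the top-k retrieved memories are injected.
    return relevant_memories[:k] + [new_msg]

print(len(stateless_prompt(history, "hello")))                                # 1001
print(len(memory_prompt(["fact a", "fact b", "fact c", "fact d"], "hello")))  # 4
```

The stateless prompt grows linearly with conversation length; the memory-backed prompt stays bounded by k, which is what makes long-running and cross-session agents practical.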
Target Use Cases
Long Interaction Handling: Prevent context loss in extended conversations.
Cross‑Session Context Preservation: Remember user history and preferences across sessions.
Memory Persistence: Structure and store key information from dialogues.
Multi‑Agent Collaboration: Share a common memory store among multiple agents.
Solution Architecture
Memory Processing Pipeline
Fact Extraction: Call the LLM to extract factual statements from the input.
Vectorization: Use an embedding model to convert the text into vectors so that semantically similar memories are close in vector space.
Memory Retrieval: Perform a Top‑K similarity search in Elasticsearch to fetch the most relevant memory fragments.
Conflict Judgment: Let the LLM decide whether to update, merge, ignore, or create a new memory entry.
Write Execution: Persist the updated memory back to Elasticsearch.
An optional reranker can be added to re‑rank retrieved results for higher recall precision.
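The five-step pipeline above can be sketched end to end in plain Python. Here the LLM, the embedding model, and Elasticsearch are replaced by toy stand-ins (a bag-of-words "embedding", an in-memory list as the vector store, and a similarity cutoff in place of LLM conflict judgment); the function names and data are illustrative, not Mem0's actual API:

```python
import math

# Toy stand-ins for the real components (LLM, embedder, Elasticsearch).

def extract_facts(text):
    # Step 1: in production an LLM extracts facts; here we pass text through.
    return [text.strip()]

def embed(text, vocab=("water", "soft", "hard", "dishwasher")):
    # Step 2: toy bag-of-words embedding; a real system calls an embedding model.
    return [float(text.lower().count(w)) for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = math.sqrt(sum(x * x for x in a)), math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(store, query_vec, k=2):
    # Step 3: Elasticsearch would run this similarity search server-side.
    ranked = sorted(store, key=lambda m: cosine(m["vec"], query_vec), reverse=True)
    return ranked[:k]

def decide(fact, hits, threshold=0.9):
    # Step 4: an LLM normally judges conflicts; here we use a similarity cutoff.
    if hits and cosine(embed(fact), hits[0]["vec"]) >= threshold:
        return "update"
    return "create"

# Step 5: persist according to the decision.
store = [{"text": "prefers hard water", "vec": embed("prefers hard water")}]
fact = extract_facts("User prefers soft water for the dishwasher")[0]
hits = top_k(store, embed(fact))
action = decide(fact, hits)
if action == "create":
    store.append({"text": fact, "vec": embed(fact)})
print(action, len(store))
```

The new fact is not similar enough to the existing memory, so it is created rather than merged; in the real system, the reranker mentioned above would sit between steps 3 and 4.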
Integration Example – Adding Memory to OpenClaw
OpenClaw is an open‑source AI agent framework that combines LLMs with OS, web, and file capabilities. Its native memory is limited by context length, retrieval efficiency, and lack of cross‑session continuity. Integrating Mem0 + Elasticsearch resolves these limitations.
Step 1: Prepare Elasticsearch
Create an Elasticsearch instance (e.g., via Alibaba Cloud), set a password, configure the Kibana public‑network whitelist, and create an index named mem0:
PUT /mem0
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}

Step 2: Deploy Mem0 Server
Install Mem0 and Flask:

pip install mem0ai flask

Create a directory for the server and add server.py with the following content (replace placeholders with actual values):
from mem0 import Memory
from flask import Flask, request, jsonify

app = Flask(__name__)

config = {
    "llm": {
        "provider": "openai",
        "config": {
            "model": "qwen-plus",
            "api_key": "$API_KEY",
            "openai_base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1"
        }
    },
    "embedder": {
        "provider": "openai",
        "config": {
            "model": "text-embedding-v4",
            "api_key": "$API_KEY",
            "openai_base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1"
        }
    },
    "vector_store": {
        "provider": "elasticsearch",
        "config": {
            "host": "$ELASTICSEARCH_HOST",
            "port": "$ELASTICSEARCH_PORT",
            "user": "$ELASTICSEARCH_USER",
            "password": "$ELASTICSEARCH_PASSWORD",
            "collection_name": "mem0"
        }
    }
}

memory = Memory.from_config(config)

@app.route('/v1/memories', methods=['POST'])
def add_memory():
    data = request.json
    result = memory.add(messages=data['messages'], user_id=data['user_id'])
    return jsonify(result)

@app.route('/v2/memories/search', methods=['POST'])
def search_memories():
    data = request.json
    result = memory.search(query=data['query'], user_id=data['user_id'])
    return jsonify(result)

@app.route('/v1/memories', methods=['DELETE'])
def delete_memories():
    # Accept either scope; the skill's delete_by_run_id relies on run_id here.
    user_id = request.args.get('user_id')
    run_id = request.args.get('run_id')
    if run_id:
        memory.delete_all(run_id=run_id)
    else:
        memory.delete_all(user_id=user_id)
    return jsonify({"status": "success"})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8420)

Step 3: Configure OpenClaw Skill
Create a skill directory ~/.openclaw/workspace/skills/agentic-memory-es and add three files:

manifest.json – defines metadata and API schema.
handler.py – implements HTTP calls to the Mem0 server.
SKILL.md – documents usage.
manifest.json (excerpt):
{
"name": "agentic memory",
"id": "agentic-memory-es",
"version": "1.0.0",
"description_for_model": "Memory platform based on Mem0 + Elasticsearch. Supports add, search, delete_by_run_id, delete_by_user_id.",
"description_for_human": "Elasticsearch‑driven agent memory platform.",
"auth": {"type": "token", "token_header": "Authorization", "token_prefix": "Token"},
"api": {
"type": "python",
"main_file": "handler.py",
"functions": [
{"name": "add", "description": "Persist facts or preferences from a conversation.", "parameters": {"type": "object", "properties": {"user_id": {"type": "string"}, "context": {"type": "string"}}, "required": ["user_id", "context"]}},
{"name": "search", "description": "Retrieve user‑level historical memory across sessions.", "parameters": {"type": "object", "properties": {"user_id": {"type": "string"}, "query": {"type": "string"}}, "required": ["user_id", "query"]}},
{"name": "delete_by_run_id", "description": "Clear memory for a specific run id.", "parameters": {"type": "object", "properties": {"run_id": {"type": "string"}}, "required": ["run_id"]}},
{"name": "delete_by_user_id", "description": "Clear all memory for a user id.", "parameters": {"type": "object", "properties": {"user_id": {"type": "string"}}, "required": ["user_id"]}}
]
}
}handler.py (excerpt):
import json, subprocess

HOST = "$Mem0_HOST"

def _run_safe_curl(url, payload, method='POST'):
    data = json.dumps(payload) if payload is not None else ""
    cmd = ["curl", "-s", "-X", method, url,
           "-H", "Content-Type: application/json",
           "--data-binary", "@-", "--max-time", "15", "--no-buffer"]
    result = subprocess.run(cmd, input=data, capture_output=True, text=True, check=True)
    out = result.stdout.strip()
    return json.loads(out) if out else {"status": "success"}

def add(user_id, context):
    url = f"{HOST}/v1/memories"
    payload = {"messages": [{"role": "user", "content": context}], "user_id": str(user_id)}
    return _run_safe_curl(url, payload, method='POST')

def search(user_id, query):
    url = f"{HOST}/v2/memories/search"
    payload = {"query": query, "user_id": str(user_id)}
    return _run_safe_curl(url, payload, method='POST')

def delete_by_run_id(run_id):
    url = f"{HOST}/v1/memories?run_id={run_id}"
    return _run_safe_curl(url, None, method='DELETE')

def delete_by_user_id(user_id):
    url = f"{HOST}/v1/memories?user_id={user_id}"
    return _run_safe_curl(url, None, method='DELETE')

SKILL.md outlines the four operations (add, search, delete_by_run_id, delete_by_user_id) and their parameter schemas.
Step 4: Refresh Skills / Restart OpenClaw Gateway
After deploying the Mem0 server and adding the skill files, reload the skill catalog or restart the OpenClaw gateway to make the new memory capability available.
Step 5: Verify the Integration
Test memory write and retrieval via the OpenClaw console or API. Example commands:
# Write a memory
curl -X POST http://127.0.0.1:8420/v1/memories \
-H "Content-Type: application/json" \
-d '{"messages":[{"role":"user","content":"I prefer soft water for the dishwasher"}],"user_id":"user123"}'
# Search memory
curl -X POST http://127.0.0.1:8420/v2/memories/search \
-H "Content-Type: application/json" \
-d '{"query":"soft water","user_id":"user123"}'

For local deployment, see the OpenClaw repository:
https://github.com/openclaw/openclaw
Relevant references:
Elasticsearch quick‑start guide: https://help.aliyun.com/zh/es/user-guide/getting-started
Kibana public‑network whitelist configuration: https://help.aliyun.com/zh/es/user-guide/configure-kibana-public-network-or-private-network
Mem0 project: https://github.com/mem0ai/mem0
Alibaba Cloud Big Data AI Platform
The Alibaba Cloud Big Data AI Platform builds on Alibaba’s leading cloud infrastructure, big‑data and AI engineering capabilities, scenario algorithms, and extensive industry experience to offer enterprises and developers a one‑stop, cloud‑native big‑data and AI capability suite. It boosts AI development efficiency, enables large‑scale AI deployment across industries, and drives business value.