How to Build a Production‑Ready AI Memory System with Mem0 and Elasticsearch

This guide explains how to overcome the stateless nature of large language models by using the Mem0 framework together with Elasticsearch to create a persistent, vector‑searchable memory layer, covering architecture, real‑world scenarios, step‑by‑step deployment, and integration with the OpenClaw agent framework.

Alibaba Cloud Big Data AI Platform

Background

Large language models (LLMs) are stateless; each request is processed independently without cross‑session memory. Combining the Mem0 framework for memory lifecycle management with Elasticsearch’s vector store enables a production‑grade AI memory system that provides persistent storage, semantic search, and intelligent updates.

Target Use Cases

Long Interaction Handling: Prevent context loss in extended conversations.

Cross‑Session Context Preservation: Remember user history and preferences across sessions.

Memory Persistence: Structure and store key information from dialogues.

Multi‑Agent Collaboration: Share a common memory store among multiple agents.

Solution Architecture

[Figure: ES + Mem0 solution architecture diagram]

Memory Processing Pipeline

Fact Extraction: Call the LLM to extract factual statements from the input.

Vectorization: Use an embedding model to convert the text into vectors so that semantically similar memories are close in vector space.

Memory Retrieval: Perform a Top‑K similarity search in Elasticsearch to fetch the most relevant memory fragments.

Conflict Judgment: Let the LLM decide whether to update, merge, ignore, or create a new memory entry.

Write Execution: Persist the updated memory back to Elasticsearch.

An optional reranker can be added to re‑rank retrieved results for higher recall precision.
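The retrieval and conflict-judgment steps can be sketched in plain Python. This is a toy in-memory version: the Top-K search runs over a Python list instead of Elasticsearch, and a simple similarity threshold stands in for the LLM's update/merge/ignore/create decision.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec, store, k=2):
    """Step 3: Top-K similarity search (Elasticsearch does this server-side)."""
    scored = sorted(((cosine(query_vec, m["vector"]), m) for m in store),
                    key=lambda t: t[0], reverse=True)
    return scored[:k]

def process_fact(fact, fact_vec, store, threshold=0.9):
    """Steps 3-5: retrieve neighbours, judge the conflict, persist the result.

    The threshold check is a stand-in for the LLM's conflict judgment.
    """
    neighbours = top_k(fact_vec, store)
    if neighbours and neighbours[0][0] >= threshold:
        neighbours[0][1]["text"] = fact          # update the closest memory
        return "update"
    store.append({"text": fact, "vector": fact_vec})  # create a new entry
    return "create"

# Toy run with 2-d "embeddings".
store = [{"text": "user likes tea", "vector": [1.0, 0.0]}]
print(process_fact("user prefers green tea", [0.99, 0.1], store))  # update
print(process_fact("user lives in Berlin", [0.0, 1.0], store))     # create
```

In production, Mem0 performs the search against the Elasticsearch index and delegates the decision to the configured LLM; the control flow is the same.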

Integration Example – Adding Memory to OpenClaw

OpenClaw is an open‑source AI agent framework that combines LLMs with OS, web, and file capabilities. Its native memory is limited by context length, retrieval efficiency, and lack of cross‑session continuity. Integrating Mem0 + Elasticsearch resolves these limitations.

Step 1: Prepare Elasticsearch

Create an Elasticsearch instance (e.g., via Alibaba Cloud), set a password, configure the Kibana public‑network whitelist, and create an index named mem0:

PUT /mem0
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}
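The same request can be issued from Python instead of the Kibana console. A minimal sketch using only the standard library; the host, port, and credentials are placeholders, and the vector field mapping is left to Mem0's Elasticsearch store.

```python
import json
import urllib.request

# Placeholder endpoint -- substitute your instance's address and credentials.
ES_URL = "https://ELASTICSEARCH_HOST:9200"

def index_body():
    """Settings for the mem0 index, matching the PUT request above."""
    return {"settings": {"number_of_shards": 1, "number_of_replicas": 1}}

def create_index(url=ES_URL, name="mem0"):
    """PUT /mem0 -- equivalent to the Kibana console request."""
    req = urllib.request.Request(
        f"{url}/{name}",
        data=json.dumps(index_body()).encode(),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    return urllib.request.urlopen(req)  # raises on a non-2xx response
```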

Step 2: Deploy Mem0 Server

Install Mem0 and Flask:

pip install mem0ai flask

Create a directory for the server and add server.py with the following content (replace the placeholders with actual values):

from mem0 import Memory
from flask import Flask, request, jsonify
app = Flask(__name__)
config = {
  "llm": {
    "provider": "openai",
    "config": {
      "model": "qwen-plus",
      "api_key": "$API_KEY",
      "openai_base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1"
    }
  },
  "embedder": {
    "provider": "openai",
    "config": {
      "model": "text-embedding-v4",
      "api_key": "$API_KEY",
      "openai_base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1"
    }
  },
  "vector_store": {
    "provider": "elasticsearch",
    "config": {
      "host": "$ELASTICSEARCH_HOST",
      "port": "$ELASTICSEARCH_PORT",
      "user": "$ELASTICSEARCH_USER",
      "password": "$ELASTICSEARCH_PASSWORD",
      "collection_name": "mem0"
    }
  }
}
memory = Memory.from_config(config)

@app.route('/v1/memories', methods=['POST'])
def add_memory():
    data = request.json
    result = memory.add(messages=data['messages'], user_id=data['user_id'])
    return jsonify(result)

@app.route('/v2/memories/search', methods=['POST'])
def search_memories():
    data = request.json
    result = memory.search(query=data['query'], user_id=data['user_id'])
    return jsonify(result)

@app.route('/v1/memories', methods=['DELETE'])
def delete_memories():
    # Accept either user_id or run_id, matching the skill's two delete operations
    user_id = request.args.get('user_id')
    run_id = request.args.get('run_id')
    memory.delete_all(user_id=user_id, run_id=run_id)
    return jsonify({"status": "success"})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8420)
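With the server running, the endpoints can be exercised from Python as well as from curl. The helpers below only build the request bodies; the base URL assumes the defaults from server.py, and the actual requests calls are shown commented out so the snippet stands alone.

```python
# Base URL assumes server.py was started with its defaults.
BASE = "http://127.0.0.1:8420"

def add_payload(user_id, text):
    """Body for POST /v1/memories."""
    return {"messages": [{"role": "user", "content": text}], "user_id": user_id}

def search_payload(user_id, query):
    """Body for POST /v2/memories/search."""
    return {"query": query, "user_id": user_id}

# With the server up, for example:
#   import requests
#   requests.post(f"{BASE}/v1/memories",
#                 json=add_payload("user123", "I prefer soft water"))
#   requests.post(f"{BASE}/v2/memories/search",
#                 json=search_payload("user123", "soft water"))
```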

Step 3: Configure OpenClaw Skill

Create a skill directory ~/.openclaw/workspace/skills/agentic-memory-es and add three files:

manifest.json – defines metadata and the API schema.

handler.py – implements HTTP calls to the Mem0 server.

SKILL.md – documents usage.

manifest.json (excerpt):

{
  "name": "agentic memory",
  "id": "agentic-memory-es",
  "version": "1.0.0",
  "description_for_model": "Memory platform based on Mem0 + Elasticsearch. Supports add, search, delete_by_run_id, delete_by_user_id.",
  "description_for_human": "Elasticsearch‑driven agent memory platform.",
  "auth": {"type": "token", "token_header": "Authorization", "token_prefix": "Token"},
  "api": {
    "type": "python",
    "main_file": "handler.py",
    "functions": [
      {"name": "add", "description": "Persist facts or preferences from a conversation.", "parameters": {"type": "object", "properties": {"user_id": {"type": "string"}, "context": {"type": "string"}}, "required": ["user_id", "context"]}},
      {"name": "search", "description": "Retrieve user‑level historical memory across sessions.", "parameters": {"type": "object", "properties": {"user_id": {"type": "string"}, "query": {"type": "string"}}, "required": ["user_id", "query"]}},
      {"name": "delete_by_run_id", "description": "Clear memory for a specific run id.", "parameters": {"type": "object", "properties": {"run_id": {"type": "string"}}, "required": ["run_id"]}},
      {"name": "delete_by_user_id", "description": "Clear all memory for a user id.", "parameters": {"type": "object", "properties": {"user_id": {"type": "string"}}, "required": ["user_id"]}}
    ]
  }
}

handler.py (excerpt):

import json, subprocess
HOST = "$Mem0_HOST"

def _run_safe_curl(url, payload, method='POST'):
    data = json.dumps(payload) if payload is not None else ""
    cmd = ["curl", "-s", "-X", method, url, "-H", "Content-Type: application/json", "--data-binary", "@-", "--max-time", "15", "--no-buffer"]
    result = subprocess.run(cmd, input=data, capture_output=True, text=True, check=True)
    out = result.stdout.strip()
    return json.loads(out) if out else {"status": "success"}

def add(user_id, context):
    url = f"{HOST}/v1/memories"
    payload = {"messages": [{"role": "user", "content": context}], "user_id": str(user_id)}
    return _run_safe_curl(url, payload, method='POST')

def search(user_id, query):
    url = f"{HOST}/v2/memories/search"
    payload = {"query": query, "user_id": str(user_id)}
    return _run_safe_curl(url, payload, method='POST')

def delete_by_run_id(run_id):
    url = f"{HOST}/v1/memories?run_id={run_id}"
    return _run_safe_curl(url, None, method='DELETE')

def delete_by_user_id(user_id):
    url = f"{HOST}/v1/memories?user_id={user_id}"
    return _run_safe_curl(url, None, method='DELETE')

SKILL.md outlines the four operations (add, search, delete_by_run_id, delete_by_user_id) and their parameter schemas.
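A minimal SKILL.md might look like the following; the wording is illustrative, and the exact format is whatever OpenClaw's skill loader expects:

```markdown
# agentic memory (Elasticsearch)

Persistent agent memory backed by Mem0 + Elasticsearch.

## Operations
- add(user_id, context) – persist facts or preferences from a conversation.
- search(user_id, query) – retrieve user-level memory across sessions.
- delete_by_run_id(run_id) – clear memory for a specific run.
- delete_by_user_id(user_id) – clear all memory for a user.
```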

Step 4: Refresh Skills / Restart OpenClaw Gateway

After deploying the Mem0 server and adding the skill files, reload the skill catalog or restart the OpenClaw gateway to make the new memory capability available.

Step 5: Verify the Integration

Test memory write and retrieval via the OpenClaw console or API. Example commands:

# Write a memory
curl -X POST http://127.0.0.1:8420/v1/memories \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"I prefer soft water for the dishwasher"}],"user_id":"user123"}'

# Search memory
curl -X POST http://127.0.0.1:8420/v2/memories/search \
  -H "Content-Type: application/json" \
  -d '{"query":"soft water","user_id":"user123"}'

For local deployment, see the OpenClaw repository:

https://github.com/openclaw/openclaw

Relevant references:

Elasticsearch quick‑start guide: https://help.aliyun.com/zh/es/user-guide/getting-started

Kibana public‑network whitelist configuration: https://help.aliyun.com/zh/es/user-guide/configure-kibana-public-network-or-private-network

Mem0 project: https://github.com/mem0ai/mem0

Written by

Alibaba Cloud Big Data AI Platform

The Alibaba Cloud Big Data AI Platform builds on Alibaba’s leading cloud infrastructure, big‑data and AI engineering capabilities, scenario algorithms, and extensive industry experience to offer enterprises and developers a one‑stop, cloud‑native big‑data and AI capability suite. It boosts AI development efficiency, enables large‑scale AI deployment across industries, and drives business value.
