Demystifying OpenClaw: Agents, RAG, Memory & Skills Explained
This article explains the OpenClaw AI agent framework, detailing how its core Agent follows an Observe‑Plan‑Act loop, how Memory uses SQLite for short‑ and long‑term storage, how RAG retrieves external knowledge, and how Skills replace MCP with modular tool workflows, plus security tips and deployment links.
Architecture Overview
OpenClaw is an open‑source AI agent framework that implements a four‑layer architecture:
Gateway layer: unified inbound/outbound communication (e.g., Telegram, Feishu, DingTalk).
Reasoning layer: connects to large language models and runs the Observe‑Plan‑Act loop.
Memory & State layer: persistent SQLite‑backed memory.
Skills & Execution layer: invokes Skills to perform concrete actions.
Layer 1 – Agent (Core Brain)
The Agent is an autonomous system that perceives the environment, decides what to do, and executes actions. It follows the classic Observe‑Plan‑Act cycle:
Observe: understand user intent and current state.
Plan: decompose the task and select tools.
Act: invoke the chosen tool(s) and collect results.
Loop: repeat until the task is completed.
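The cycle above can be sketched as a small loop. This is a minimal illustration rather than OpenClaw's actual implementation: runAgent, the planner callback, and the tools map are hypothetical names, and the planner is a hard-coded stub standing in for an LLM call.

```javascript
// Minimal sketch of an Observe-Plan-Act loop.
// All names here (runAgent, planner, tools) are hypothetical, not OpenClaw APIs.
async function runAgent(task, planner, tools, maxSteps = 5) {
  const history = []; // results observed so far
  for (let step = 0; step < maxSteps; step++) {
    // Observe + Plan: the planner sees the task and everything observed so far.
    const decision = await planner(task, history);
    if (decision.done) {
      return { answer: decision.answer, history };
    }
    // Act: run the chosen tool and record its result.
    const tool = tools[decision.tool];
    if (!tool) {
      history.push({ tool: decision.tool, error: "unknown tool" });
      continue;
    }
    const result = await tool(decision.args);
    history.push({ tool: decision.tool, result });
  }
  // Loop guard: give up instead of running forever if the task never completes.
  return { answer: null, history };
}

// Stub planner standing in for an LLM: call `add` once, then finish.
const stubPlanner = async (task, history) =>
  history.length === 0
    ? { done: false, tool: "add", args: { a: 2, b: 3 } }
    : { done: true, answer: `Result: ${history[0].result}` };

const demoTools = { add: async ({ a, b }) => a + b };
```

Calling runAgent("add 2 and 3", stubPlanner, demoTools) resolves after one tool call. The maxSteps guard is a design choice of this sketch: it bounds cost when a planner never signals completion.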
Each Agent works in its own workspace containing configuration files that make the system transparent and auditable:
AGENTS.md # Agent responsibilities and tool permissions
SOUL.md # Personalized system prompt
TOOLS.md # Whitelist/blacklist of tools (security boundary)
IDENTITY.md # Identity for different chat channels
USER.md # User preferences and context priors
MEMORY.md # Persistent memory documents (RAG source)
Layer 2 – Memory System
Memory provides persistent state for otherwise stateless LLM inference. It consists of:
Short‑term memory: raw recent conversation turns.
Long‑term memory: a background mini‑model compresses history into summaries and extracts entity features (e.g., “user is a programmer in Shanghai”), storing them in a SQLite database.
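The split between the two tiers can be illustrated with a rolling window that hands its oldest turns to a summarizer. This is a sketch under stated assumptions: TwoTierMemory and joinSummarizer are hypothetical names, plain arrays stand in for the SQLite store, and the summarizer is a trivial string join rather than a background model.

```javascript
// Sketch of a two-tier memory. Hypothetical names throughout:
// arrays stand in for SQLite, `summarize` stands in for the background mini-model.
class TwoTierMemory {
  constructor(summarize, windowSize = 4) {
    this.summarize = summarize;
    this.windowSize = windowSize;
    this.shortTerm = []; // raw recent turns
    this.longTerm = [];  // compressed summaries of older turns
  }

  addTurn(role, text) {
    this.shortTerm.push({ role, text });
    // When the window overflows, compress the oldest half into one summary.
    if (this.shortTerm.length > this.windowSize) {
      const evicted = this.shortTerm.splice(0, Math.ceil(this.windowSize / 2));
      this.longTerm.push(this.summarize(evicted));
    }
  }

  // What would be placed in front of the model on the next turn.
  context() {
    return { summaries: [...this.longTerm], recent: [...this.shortTerm] };
  }
}

// Trivial stand-in summarizer: joins the evicted texts.
const joinSummarizer = (turns) => turns.map((t) => t.text).join(" | ");

const memory = new TwoTierMemory(joinSummarizer, 4);
["t1", "t2", "t3", "t4", "t5"].forEach((t) => memory.addTurn("user", t));
```

After the fifth turn the two oldest turns have been folded into a single summary and three raw turns remain in the window.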
Retrieval is implemented with SQLite (using the sqlite‑vec extension when available) and falls back to pure JavaScript vector similarity when the extension is missing:
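// Pseudo‑code for memory retrieval.
// Assumes `db` is an open SQLite handle whose all() returns a promise.
const cosineSimilarity = (a, b) => {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB) || 1);
};

async function searchMemory(queryVector, limit = 5) {
  try {
    // Fast path: native vector search via sqlite‑vec
    return await db.all(`
      SELECT c.text, vec_distance_cosine(v.embedding, ?) AS dist
      FROM chunks_vec v
      JOIN chunks c ON c.id = v.id
      ORDER BY dist ASC LIMIT ?
    `, [queryVector, limit]);
  } catch (err) {
    // Safe path: compute cosine distance (1 - similarity) in JS,
    // so that a smaller `dist` means a closer match, as in the fast path
    const allChunks = await db.all("SELECT text, embedding FROM chunks");
    return allChunks
      .map(chunk => ({
        text: chunk.text,
        dist: 1 - cosineSimilarity(queryVector, JSON.parse(chunk.embedding)),
      }))
      .sort((a, b) => a.dist - b.dist)
      .slice(0, limit);
  }
}
Layer 3 – Retrieval‑Augmented Generation (RAG)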
RAG solves the “knowledge freeze” problem by fetching up‑to‑date information before generation. The workflow is:
User asks a question.
The system searches a local SQLite‑based vector store for relevant chunks.
The retrieved snippets are combined with the original query.
The LLM generates an answer based on both.
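The four steps above can be sketched end to end with an in-memory store. This is an illustration only: retrieve, buildPrompt, and the three-dimensional toy embeddings are assumptions made for the example, and a real system would obtain embeddings from an embedding model and pass the assembled prompt to the LLM.

```javascript
// Sketch of the RAG steps with an in-memory store.
// Hypothetical names and toy vectors; not OpenClaw's implementation.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Step 2: rank stored chunks by similarity to the query vector.
function retrieve(queryVector, chunks, limit = 2) {
  return chunks
    .map((c) => ({ text: c.text, score: cosineSimilarity(queryVector, c.embedding) }))
    .sort((a, b) => b.score - a.score) // highest similarity first
    .slice(0, limit);
}

// Step 3: combine the retrieved snippets with the original question.
function buildPrompt(question, snippets) {
  const context = snippets.map((s, i) => `[${i + 1}] ${s.text}`).join("\n");
  return `Answer using the context below.\n\nContext:\n${context}\n\nQuestion: ${question}`;
}

const demoChunks = [
  { text: "The user prefers dark mode.", embedding: [1, 0, 0] },
  { text: "The user is a programmer in Shanghai.", embedding: [0, 1, 0] },
  { text: "The office closes at 18:00.", embedding: [0, 0, 1] },
];

const hits = retrieve([0.1, 0.9, 0], demoChunks, 1);
const ragPrompt = buildPrompt("Where does the user live?", hits);
// Step 4 would send `ragPrompt` to the LLM.
```

Note that this sketch sorts by similarity in descending order, whereas a distance-based query sorts ascending; mixing the two conventions silently returns the worst matches.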
Layer 4 – Tool Layer
Function Call
Function Call lets the model output a structured request that the developer executes. Example:
{
  "function": "get_weather",
  "parameters": {"city": "Beijing"}
}
The developer's code calls the real API and passes the data back to the model, which then replies with a natural‑language answer.
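The developer-side half of that exchange might look like the following sketch. The handlers registry and the canned weather data are hypothetical stand-ins for a real API client; only the get_weather request shape comes from the example above.

```javascript
// Sketch of executing a model-issued function call.
// `handlers` and the canned response are hypothetical; no real weather API is called.
const handlers = {
  get_weather: ({ city }) => ({ city, condition: "sunny", tempC: 25 }),
};

function dispatchFunctionCall(rawModelOutput) {
  const call = JSON.parse(rawModelOutput);
  // Only functions registered above may run; anything else is rejected.
  if (!Object.prototype.hasOwnProperty.call(handlers, call.function)) {
    throw new Error(`Unknown function: ${call.function}`);
  }
  return handlers[call.function](call.parameters ?? {});
}

const modelOutput = '{"function": "get_weather", "parameters": {"city": "Beijing"}}';
const toolResult = dispatchFunctionCall(modelOutput);
// `toolResult` would be sent back to the model so it can phrase the final answer.
```

Looking the function up in an explicit registry, rather than evaluating whatever name the model emits, keeps the set of callable operations under the developer's control.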
Model Context Protocol (MCP)
Although MCP standardises tool‑calling interfaces, OpenClaw deliberately does not support it. The design choice is motivated by three factors:
Security & privacy – multi‑model collaboration could leak data.
Technical flexibility – avoiding lock‑in to a fixed protocol.
Resource optimisation – fewer dependencies and faster responses.
OpenClaw uses its own lightweight Skills mechanism instead.
Layer 5 – Skills (Process Layer)
Skills encapsulate complete workflows rather than single tools, answering “when, in what order, and how to combine tools”. Built‑in skills include:
memory: persists user preferences and history.
web_search: performs internet searches for real‑time data.
browser: opens pages and extracts content.
file: creates, reads and modifies files.
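To make the distinction between a tool and a skill concrete, the sketch below models a skill as an ordered list of steps over tools. The { name, steps } shape, researchSkill, and the stub tools are all hypothetical and are not the ClawHub skill format.

```javascript
// Sketch of a skill as an ordered workflow over tools.
// The skill shape and the stub tools are hypothetical, not the ClawHub format.
const stubTools = {
  search: (query) => [`https://example.com/result-for-${encodeURIComponent(query)}`],
  fetchPage: (url) => `Content of ${url}`,
  summarize: (text) => text.slice(0, 40),
};

const researchSkill = {
  name: "research",
  // Each step receives the previous step's output, which fixes both
  // the order of the tools and how results are handed from one to the next.
  steps: [
    (query, tools) => tools.search(query),
    (urls, tools) => urls.map((u) => tools.fetchPage(u)),
    (pages, tools) => pages.map((p) => tools.summarize(p)),
  ],
};

function runSkill(skill, input, tools) {
  return skill.steps.reduce((value, step) => step(value, tools), input);
}

const summaries = runSkill(researchSkill, "sqlite vec", stubTools);
```

The individual tools know nothing about each other; the sequencing lives entirely in the skill definition.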
Installation is performed via the ClawHub CLI:
clawhub install memory # install the memory skill
clawhub install browser # install the browser control skill
The ClawHub skill registry hosts thousands of plugins covering office automation, code management, data processing, etc.
Typical Execution Flow
User sends a message (e.g., “Help me organise my desktop files”).
Gateway forwards the request to the Agent.
Agent analyses the task and decides to call the file skill.
The skill performs the file operation and returns the result.
Memory records the operation for future reference.
Agent replies, “Desktop organised – images moved to Pictures folder”.
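The flow above can be traced in a few lines of code. Every name in this sketch (gatewayReceive, agentHandle, fileSkill, memoryLog) is hypothetical, a keyword check stands in for the LLM's decision, and the file skill only simulates a move without touching the filesystem.

```javascript
// Sketch of the execution flow above. Hypothetical names throughout;
// the file skill only simulates moving files.
const memoryLog = [];

const fileSkill = {
  organise(files) {
    const images = files.filter((f) => /\.(png|jpe?g|gif)$/i.test(f));
    return { moved: images, target: "Pictures" };
  },
};

function agentHandle(message, desktopFiles) {
  // A real agent would ask the LLM; here a keyword check selects the file skill.
  if (!/organi[sz]e/i.test(message)) {
    return "Sorry, this sketch can only organise files.";
  }
  const result = fileSkill.organise(desktopFiles);    // skill performs the operation
  memoryLog.push({ skill: "file", message, result }); // memory records it
  return `Desktop organised: ${result.moved.length} image(s) moved to ${result.target} folder`;
}

// The gateway receives the message and forwards it to the agent.
function gatewayReceive(message, desktopFiles) {
  return agentHandle(message, desktopFiles);
}

const reply = gatewayReceive("Help me organise my desktop files", [
  "a.png",
  "notes.txt",
  "b.JPG",
]);
```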
Security Recommendations
Do not run OpenClaw on your primary machine; use a VM, an old PC, or a dedicated system account.
Never expose API keys; keep them out of Git repositories and screenshots.
For long‑running services, deploy inside Docker or a VPS to isolate permissions.
Assume the agent has the same filesystem permissions as the user; never treat it as a simple file‑transfer tool.
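One common way to keep keys out of repositories and screenshots is to read them from the environment and mask them before logging. The sketch below assumes a Node.js runtime; the variable name LLM_API_KEY and both helper functions are hypothetical, not OpenClaw settings.

```javascript
// Sketch: read secrets from the environment instead of hard-coding them.
// LLM_API_KEY is an assumed variable name, not an OpenClaw setting.
function requireSecret(name, env = process.env) {
  const value = env[name];
  if (!value) {
    // Fail fast so a missing key is noticed at startup, not mid-task.
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Mask a secret before it can reach logs or screenshots:
// only the last four characters stay visible.
function maskSecret(secret) {
  return secret.length <= 4
    ? "****"
    : `${"*".repeat(secret.length - 4)}${secret.slice(-4)}`;
}
```

At startup this would be used as requireSecret("LLM_API_KEY"), with the value supplied by the shell, a container runtime, or an untracked local file rather than by source code.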
Repository
Source code and quick‑start guides are available at:
GitHub: https://github.com/openclaw
Java Tech Enthusiast
