Understanding OpenClaw: Inside the AI Agent Framework Explained by Prof. Li Hongyi

In this detailed lecture, Prof. Li Hongyi of National Taiwan University dissects the OpenClaw AI Agent, explaining its system prompts, tool usage, memory handling, sub‑agents, security risks like prompt injection, and practical safeguards for deploying autonomous agents on personal computers.

Machine Learning Algorithms & Natural Language Processing

OpenClaw is an AI Agent that sits between a user (via messaging apps such as WhatsApp, LINE, Discord) and a large language model (LLM) like Claude, GPT, or Gemini. When a user sends a command, OpenClaw first augments the message with a system prompt that defines the agent’s identity, goals, and available tools, then appends the full conversation history before forwarding the request to the LLM.

The system prompt typically contains statements such as:

You are an AI Agent
Your name is Xiao Jin (小金)
Your goal is to assist the user in completing tasks
You can use the following tools……
You should proactively think about how to achieve your goal

and the previous dialogue is concatenated as system prompt + past conversation history + your current message. This explains why the model appears to “remember” earlier messages: the entire context is re-sent on every turn.
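The assembly described above can be sketched in a few lines. This is an illustrative reconstruction, not OpenClaw's actual code; the names `SYSTEM_PROMPT` and `build_request` are assumptions.

```python
# Hypothetical sketch of how an agent assembles each LLM request:
# system prompt + past conversation history + the new user message.

SYSTEM_PROMPT = (
    "You are an AI Agent. Your name is Xiao Jin. "
    "Your goal is to assist the user in completing tasks."
)

def build_request(history, user_message):
    """Concatenate the system prompt, the full prior dialogue, and the new message."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    messages.extend(history)  # the entire history is re-sent every turn
    messages.append({"role": "user", "content": user_message})
    return messages

history = [
    {"role": "user", "content": "My name is Da Jin."},
    {"role": "assistant", "content": "Nice to meet you, Da Jin!"},
]
request = build_request(history, "What is my name?")
```

Because the whole history rides along with every request, the model can answer "What is my name?" without any server-side memory, which is exactly why context length becomes a cost concern later in the article.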

OpenClaw can invoke external tools using a special token. For example, to read a file:

Use tool: read
Parameter: question.txt

and to write a result:

Use tool: write
Parameter: answer.txt
Content: Da Jin (大金)

When the LLM decides to use a tool, it returns a response prefixed with the “Use tool” (使用工具) marker, and OpenClaw executes the corresponding command on the host machine. This capability makes the agent powerful but also introduces security concerns.
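A minimal dispatcher for this marker protocol might look like the following. The exact marker text and field names (`Use tool:`, `Parameter:`, `Content:`) mirror the translated examples above but are assumptions about the wire format, not OpenClaw's documented syntax.

```python
def handle_response(text):
    """Detect the 'Use tool:' marker in an LLM reply and run the named tool."""
    lines = text.strip().splitlines()
    if not lines or not lines[0].startswith("Use tool:"):
        return None  # plain chat reply, no tool call to execute
    tool = lines[0].split(":", 1)[1].strip()
    # Remaining lines are 'Key: value' pairs such as Parameter / Content.
    args = {k.strip(): v.strip()
            for k, v in (line.split(":", 1) for line in lines[1:])}
    if tool == "read":
        with open(args["Parameter"], encoding="utf-8") as f:
            return f.read()
    if tool == "write":
        with open(args["Parameter"], "w", encoding="utf-8") as f:
            f.write(args["Content"])
        return "ok"
    raise ValueError(f"unknown tool: {tool}")
```

A real deployment would route `tool` through an allowlist rather than executing arbitrary commands, which is precisely the security issue the next paragraph raises.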

Security risks include prompt injection, where malicious input (e.g., commands hidden as white text in a document) can cause the agent to run destructive commands such as rm -rf /. OpenClaw mitigates this by offering an optional approval step for every tool execution, which must be explicitly confirmed by the human operator.
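The approval step amounts to a human-in-the-loop gate in front of every tool call. A sketch of such a gate (the function and prompt wording are hypothetical):

```python
def approve_and_run(tool, args, run_tool, ask=input):
    """Require explicit operator confirmation before any tool executes."""
    answer = ask(f"Agent wants to run '{tool}' with {args}. Allow? [y/N] ")
    if answer.strip().lower() != "y":
        return "(tool call denied by operator)"  # fed back to the LLM as the result
    return run_tool(tool, args)
```

Returning the denial message to the LLM, instead of silently dropping the call, lets the agent explain to the user why the action did not happen.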

Another risk is the distribution of malicious Skill files. A Skill is a plain‑text SOP (e.g., a script for creating a video) stored as skill.md. If a Skill instructs the agent to download a password‑protected zip and execute it, the agent will obey, potentially installing malware. Scanning services like Coin Security have found hundreds of such malicious Skills in public repositories.

OpenClaw’s memory is kept in simple markdown files: soul.md (identity), memory.md (long‑term memory), habit.md (periodic tasks), and system.txt (tool definitions). The agent reads these files on demand; for instance, it looks for skill.md files, extracts the description, and adds a reference to the system prompt without loading the full content, preserving the context window.
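The lazy-loading trick for skills can be illustrated with a small indexer: only each skill's description line is surfaced in the system prompt, while the full body stays on disk until invoked. The `description:` line format is an assumption for illustration.

```python
from pathlib import Path

def index_skills(root):
    """Find skill.md files and pull only the description line into the prompt.

    Assumes each skill.md begins with a 'description:' line; the full body
    is read later, only when the agent actually invokes that skill.
    """
    entries = []
    for path in Path(root).rglob("skill.md"):
        for line in path.read_text(encoding="utf-8").splitlines():
            if line.lower().startswith("description:"):
                entries.append(f"- {path}: {line.split(':', 1)[1].strip()}")
                break
    return "Available skills:\n" + "\n".join(entries)
```

Keeping only one line per skill in the prompt is what preserves the context window even when many skills are installed.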

To enable autonomous behavior, OpenClaw implements a heartbeat mechanism that periodically (e.g., every 15 minutes) asks the LLM “What should I do now?” based on a habit.md entry. It also supports Cron Jobs for scheduled actions, allowing the agent to wait for external events (e.g., waiting for a video to finish rendering) by creating a future task.
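The heartbeat is essentially a timed loop that re-prompts the model. A minimal sketch, assuming the LLM call and action dispatch are injected as callables:

```python
import time

def heartbeat(ask_llm, act, interval_seconds=900, max_beats=None):
    """Wake up periodically (default every 15 minutes) and ask the LLM what to do.

    ask_llm: callable taking a prompt and returning the model's plan (or "").
    act:     callable that executes the plan, e.g. dispatching a tool call.
    """
    beats = 0
    while max_beats is None or beats < max_beats:
        plan = ask_llm("Based on habit.md, what should you do now?")
        if plan:
            act(plan)
        beats += 1
        time.sleep(interval_seconds)
```

A "wait for the render to finish" behavior falls out naturally: the model answers "nothing yet" on most beats and acts only when the awaited condition holds.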

Because the conversation can grow beyond the LLM’s context window, OpenClaw uses context compaction combined with retrieval (RAG). It chunks memory.md, computes lexical and embedding similarity scores, selects the most relevant chunks, and includes only those in the system prompt. Older history can be summarized by the LLM (soft trim) or replaced with a placeholder like “previous tool output omitted” (hard clear) to keep token usage low.
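The chunk-selection step can be sketched with a toy scorer. Here simple word overlap stands in for the lexical and embedding similarity the article describes; only the top-scoring chunks would enter the system prompt.

```python
def top_chunks(chunks, query, k=2):
    """Pick the k memory chunks most similar to the query.

    Word overlap is a toy stand-in for the lexical/embedding similarity
    scores a real RAG pipeline would compute.
    """
    query_words = set(query.lower().split())
    scored = sorted(
        chunks,
        key=lambda c: len(query_words & set(c.lower().split())),
        reverse=True,
    )
    return scored[:k]
```

In production the same selection shape holds, but with vector embeddings and cosine similarity instead of set intersection.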

OpenClaw also supports Subagents (or Spawn). For a large task such as summarizing ten articles, the main agent can spawn ten sub‑agents, each handling one article and returning a concise summary. The parent agent then aggregates the results, dramatically reducing context consumption.
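The fan-out/fan-in pattern behind sub-agents can be sketched as follows; `summarize_one` stands in for a child agent with its own fresh context, and only its short summary returns to the parent.

```python
from concurrent.futures import ThreadPoolExecutor

def summarize_all(articles, summarize_one):
    """Spawn one sub-agent per article; the parent sees only the short summaries."""
    with ThreadPoolExecutor(max_workers=len(articles)) as pool:
        summaries = list(pool.map(summarize_one, articles))  # order preserved
    return "\n".join(f"{i + 1}. {s}" for i, s in enumerate(summaries))
```

The parent's context holds ten short summaries instead of ten full articles, which is the source of the context savings the article describes.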

Practical safety recommendations include:

Run OpenClaw on a dedicated machine separate from daily work devices.

Configure the approval‑before‑tool flag to require human confirmation for every action.

Avoid sharing personal credentials; give the agent its own accounts (e.g., a separate Gmail or GitHub repo).

Regularly audit .md files for unintended changes.

Prefer explicit Skill files stored locally and verify their content before use.

When used responsibly, OpenClaw demonstrates how autonomous AI Agents can combine LLM reasoning with real‑world tool execution, offering a glimpse of the next generation of personal assistants while highlighting the importance of robust security and context management.

Tags: security, AI Agent, prompt injection, tool use, Context Engineering, OpenClaw, subagent
Written by

Machine Learning Algorithms & Natural Language Processing

Focused on frontier AI technologies, empowering AI researchers' progress.
