How to Cut Token Costs When Using OpenClaw Agents

This guide shares practical ways to reduce token consumption in OpenClaw: monitoring agent actions, stopping runaway tasks, trimming oversized markdown configuration files, applying concise agent rules, and using free models for testing. Together, these steps can roughly halve AI expenses.

Black & White Path

OpenClaw’s token usage can quickly become a cost issue, especially when agents drift into error handling or endless retries; the author shares personal experiences and concrete steps to keep token spending under control.

Experience 1: Monitor What the Agent Is Doing

By opening the local UI at http://127.0.0.1:18789/ you can see the agent’s current activity. If the agent does not reply for a long time, check this page to understand why tokens are being spent.

When an agent gets stuck (for example, it tries to send a file but fails repeatedly), you can intervene with the /stop command in the chat window to abort its thinking. If there is still no response, restart the gateway:

$ openclaw gateway restart

These actions prevent tokens from being wasted on endless retries or “neural glitches” that produce no useful output.
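
Before deciding between /stop and a gateway restart, it can help to confirm that the local UI is responding at all. The following is a minimal sketch, not part of OpenClaw: the function name `ui_is_reachable` and the timeout value are assumptions made for illustration, and only the URL comes from the article.

```python
import urllib.error
import urllib.request

UI_URL = "http://127.0.0.1:18789/"  # local UI address given in the article

# Ignore any system proxy settings, since the UI listens on the loopback interface.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def ui_is_reachable(url: str = UI_URL, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with any HTTP response within the timeout."""
    try:
        with _opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with an error status, so it is still running.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a similar network failure.
        return False


if __name__ == "__main__":
    if ui_is_reachable():
        print("UI is up: open it to see what the agent is doing.")
    else:
        print("UI not reachable: consider `openclaw gateway restart`.")
```

The check only tells you whether the page responds. Judging whether the agent is actually stuck still requires looking at its activity in the UI.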

Experience 2: Trim the Markdown Configuration Files

The default markdown files (USER.md, TOOLS.md, SOUL.md, IDENTITY.md, HEARTBEAT.md, BOOTSTRAP.md, AGENTS.md) each exceed 2,000 tokens. Deleting redundant sections and keeping only the essential lines can lower token consumption considerably. The author also recommends writing these files in English, for better compatibility with English-language models.

USER.md: stores personal info, preferences, common commands.
TOOLS.md: defines external tools (search, code executor, APIs) and their parameters.
SOUL.md: holds long‑term memory and core knowledge.
IDENTITY.md: sets the agent’s role, personality, and response style.
HEARTBEAT.md: configures health checks and periodic tasks.
BOOTSTRAP.md: contains startup settings and load order.
AGENTS.md: lists specialized agents and their responsibilities.

After trimming, the author reduced a monitoring agent from several thousand tokens to just 700 tokens, enough for its purpose.
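
To decide which files to trim first, a rough size audit can help. The sketch below estimates tokens with a characters-divided-by-four heuristic, which is only an approximation for English text and not a real tokenizer. The function names, the 2,000-token budget, and the assumption that the files sit together in one workspace directory are illustrative; only the file names come from the article.

```python
import math
from pathlib import Path

# File names listed in the article.
CONFIG_FILES = ["USER.md", "TOOLS.md", "SOUL.md", "IDENTITY.md",
                "HEARTBEAT.md", "BOOTSTRAP.md", "AGENTS.md"]

CHARS_PER_TOKEN = 4  # rough heuristic for English text, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate the token count of a string from its character length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def audit_workspace(workspace: Path, budget: int = 2000) -> list[tuple[str, int, bool]]:
    """Return (file name, estimated tokens, over budget?) for each file found."""
    report = []
    for name in CONFIG_FILES:
        path = workspace / name
        if not path.is_file():
            continue  # skip files that this workspace does not have
        tokens = estimate_tokens(path.read_text(encoding="utf-8"))
        report.append((name, tokens, tokens > budget))
    return report


if __name__ == "__main__":
    # Assumes the script is run from inside an agent's workspace directory.
    for name, tokens, over in audit_workspace(Path(".")):
        flag = "TRIM" if over else "ok"
        print(f"{name:<14} ~{tokens:>6} tokens  {flag}")
```

Running it before and after editing gives a quick sense of how much each file shrank. For exact figures, use the tokenizer of the model you actually call.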

Agent Rules Example (AGENTS.md)

# AGENTS.md -
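## General Rules (Boss Commands)
1. **Session length reminder**: after more than 50 messages in a conversation with Boss, remind Boss to start a new session
2. **Language switching**: use English when talking to agents, and Chinese when talking to the owner (Boss)
3. **Concise replies**: when answering Boss during work, get to the point and keep it as short as possible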

These concise rules prevent agents from sending long, unnecessary replies that waste tokens.

Monitoring Token Usage

Run the provided inspector.sh script to see each agent’s token consumption, e.g.:

$ ./inspector.sh
Agent: workspace-Emailadmin | Token: 2042 | Status: Normal
Agent: workspace-coder1 | Token: 1889 | Status: Normal
…

By comparing token usage before and after optimizations (e.g., monitoring every five minutes for a day), you can quantify cost savings.
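
The before-and-after comparison can be sketched as follows. This assumes the inspector output keeps the `Agent: … | Token: … | Status: …` line format shown above. The function names are hypothetical, and the "after" figures in the demo are invented purely for illustration.

```python
import re

# Matches lines such as:
# Agent: workspace-coder1 | Token: 1889 | Status: Normal
LINE_RE = re.compile(
    r"Agent:\s*(?P<agent>\S+)\s*\|\s*Token:\s*(?P<tokens>\d+)\s*\|\s*Status:\s*(?P<status>.+)"
)


def parse_inspector(output: str) -> dict[str, int]:
    """Map each agent name to its token count; non-matching lines are ignored."""
    usage = {}
    for line in output.splitlines():
        match = LINE_RE.search(line)
        if match:
            usage[match.group("agent")] = int(match.group("tokens"))
    return usage


def savings(before: dict[str, int], after: dict[str, int]) -> dict[str, float]:
    """Percentage reduction for every agent present in both snapshots."""
    result = {}
    for agent, old in before.items():
        if agent in after and old > 0:
            result[agent] = round(100 * (old - after[agent]) / old, 1)
    return result


if __name__ == "__main__":
    before = parse_inspector(
        "Agent: workspace-Emailadmin | Token: 2042 | Status: Normal\n"
        "Agent: workspace-coder1 | Token: 1889 | Status: Normal\n"
    )
    # Invented "after" numbers, for illustration only.
    after = {"workspace-Emailadmin": 1021, "workspace-coder1": 1889}
    print(savings(before, after))
```

Sampling every five minutes for a day yields 288 snapshots per agent, so in practice you would average each period before comparing rather than rely on a single pair of readings.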

Using Free Models for Debugging

For early development, the author suggests testing with free models such as nvidia/moonshotai/kimi-k2.5, zai/GLM-4.7-Flash, qwen-portal/coder-model, and various Gemini-flash variants, which offer generous context windows and can be used without paying per-token fees.

Applying these practices (monitoring agent activity, stopping stalled tasks, trimming markdown files, enforcing concise agent rules, and testing with free models) can reduce token expenses by roughly half. More advanced techniques will be covered in future posts.

Written by

Black & White Path
