How Hermes-Agent Enables Self‑Learning Skills for Autonomous AI Agents
Hermes‑Agent introduces a self‑learning Skill system that lets AI agents automatically capture, refine, and patch reusable knowledge from complex tasks. It pairs front‑end self‑awareness with a back‑end inspection loop, reinforced by safety guards and a reinforcement‑learning training pipeline.
Overview
Hermes‑Agent is a next‑generation AI agent framework that goes beyond static skill libraries. Its core philosophy is to shift from merely expanding the breadth of skills to enabling agents to learn, evolve, and improve themselves through practical experience, achieving "getting stronger with use".
Part 1: Why Self‑Learning Agents?
Modern AI agents have progressed from passive chatbots to tool‑calling agents and now to systems with persistent memory. However, most still lack the ability to autonomously extract and codify the tacit knowledge gained during real‑world problem solving—what humans call "muscle memory" or standard operating procedures (SOPs). Hermes‑Agent addresses this gap by allowing agents to write their own skill manuals after successful task completion.
Part 2: Demonstration – Hermes‑Agent Self‑Learning a Skill
Installation & Configuration
Hermes‑Agent simplifies setup compared with OpenClaw. In minimal mode, three steps suffice: one‑click install, model configuration, and running the hermes command (see the GitHub documentation for details).
The agent also supports direct access to common models via API keys and can borrow subscriptions from tools such as GitHub Copilot or Codex without extra configuration.
Designing a Trigger Task
To provoke self‑learning, a task must satisfy three conditions:
The task cannot be solved with existing skills.
It requires a relatively high number of tool calls (e.g., more than five).
It is likely to be repeated in the future.
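The three conditions can be read as a simple predicate. The helper below is purely illustrative (Hermes‑Agent exposes no such function); it just makes the qualification logic explicit:

```python
def qualifies_as_trigger_task(
    solvable_by_existing_skill: bool,
    expected_tool_calls: int,
    likely_to_repeat: bool,
) -> bool:
    """Hypothetical check mirroring the article's three trigger conditions."""
    return (
        not solvable_by_existing_skill    # no existing Skill covers it
        and expected_tool_calls > 5       # relatively high tool-call count
        and likely_to_repeat              # worth codifying for reuse
    )
```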
An example multi‑step code‑audit task is provided:
Run a codebase quality audit on the ~/hermes-agent source code. Complete the following:
1. Count the total number of .py files and total lines; identify the 5 largest files.
2. Search for TODO/FIXME/HACK comments and count them grouped by subdirectory.
3. Read the first 20 and last 20 lines of the largest file, and read the surrounding context of each of its TODO comments.
4. Search non-test files for lines containing password=, secret=, or api_key=, and judge whether each is a placeholder or a hard-coded value.
5. Find "orphan" modules under tools/ that are imported ≤1 time across the whole project.
6. Find the 5 most-referenced modules and read the first 50 lines of the top one.
7. Write the full report to /tmp/source_checkreport_$(date +%Y%m%d).md, including a summary table of the data from each step and a technical-debt list of at least 8 items ordered by high/medium/low priority.
Finally, tell me which step you would start from the next time you perform a similar audit.

When this task is submitted to Hermes‑Agent’s TUI, the agent runs the steps, generates a detailed report, and automatically creates a new Skill named software-development. The new Skill appears in ~/.hermes/skills/ as a SKILL.md file containing the exact audit procedure.
Part 3: Full Dissection of the Skill Mechanism
Skill Self‑Learning Awareness
During agent startup (see run_agent.py), a system‑prompt pipeline injects instructions that embed the following awareness:
SKILLS_GUIDANCE = (
    "When you complete a complex task (for example, one requiring multiple tool calls, typically 5 or more), "
    "solve a tricky error, or discover a non-obvious but reusable workflow, "
    "use skill_manage to save that method as a skill "
    "so it can be reused the next time a similar scenario arises.\n"
    "While using a skill, if you find it outdated, incomplete, or incorrect, "
    "patch it immediately with skill_manage(action='patch'); "
    "do not wait to be asked. "
    "A skill that is not maintained eventually turns from an asset into a liability."
)

This prompt ensures the agent knows when to create, update, or patch a Skill.
Full Chain & Trigger Timing
Hermes‑Agent employs a dual mechanism:
Front‑end Self‑Awareness: While executing a complex task, the model may proactively invoke skill_manage to create a Skill, resetting the internal counter _iters_since_skill to zero.
Back‑end Inspection: If no Skill is created during the task, a counter accumulates the number of tool calls across tasks. When the counter reaches a configurable threshold (default 10), an asynchronous review agent runs, analyses the dialogue and tool‑call history, and decides whether to synthesize a new Skill.
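The interplay of the two triggers can be sketched roughly as follows. The counter name _iters_since_skill comes from the article; the surrounding class, method names, and reset behavior are assumptions, not Hermes‑Agent's actual implementation:

```python
SKILL_REVIEW_THRESHOLD = 10  # configurable default named in the article

class SkillLearningLoop:
    """Minimal sketch of the dual front-end/back-end trigger logic."""

    def __init__(self, threshold: int = SKILL_REVIEW_THRESHOLD):
        self.threshold = threshold
        self._iters_since_skill = 0  # tool calls since the last saved Skill

    def on_tool_call(self) -> None:
        self._iters_since_skill += 1

    def on_skill_created(self) -> None:
        # Front-end self-awareness: the model called skill_manage itself.
        self._iters_since_skill = 0

    def maybe_run_backend_review(self, run_review) -> bool:
        # Back-end inspection: fires only once the counter hits the threshold.
        if self._iters_since_skill < self.threshold:
            return False
        created = run_review()  # asynchronous review agent in the real system
        self._iters_since_skill = 0
        return created
```

A short usage trace: the agent counts tool calls, and only when the threshold is reached does the back‑end review run and the counter reset.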
The back‑end review uses the following prompt:
_SKILL_REVIEW_PROMPT = (
    "Review the conversation above and decide whether any skill should be saved or updated.\n"
    "Focus on: during the task, was a non-obvious method used, "
    "was there repeated trial and error, did the approach change in response to feedback gained in practice, "
    "or did the user expect or need a different method or outcome?\n"
    "If a related skill already exists, update it based on the current experience; "
    "if none exists and the method is reusable, create a new skill.\n"
    "If there is nothing worth saving, output exactly 'Nothing to save.' and stop."
)

Only experiences that involve genuine trial and error, workaround discovery, or significant process adjustments are turned into Skills.
Automatic Patch Mechanism
When a Skill itself becomes faulty, the agent can invoke skill_manage(action='patch') to replace erroneous strings in the Skill’s definition files. Example:
skill_manage(
    action="patch",
    old_string="https://old-registry.xx.xx",
    new_string="https://registry.xx.xx"
)

This performs an in‑place string replacement, effectively “patching” the Skill without interrupting the ongoing workflow.
Conditional Activation & Safety Guard
As the number of Skills and Tools grows, Hermes‑Agent filters the Skill index to avoid redundancy and unsafe usage. The filtering rules include:
If a primary tool is available, its backup Skill is hidden.
If a Skill’s prerequisite tool is missing, the Skill is not loaded.
Additionally, each Skill undergoes a security scan that checks for:
Hard‑coded secrets (API keys, passwords).
Suspicious code execution patterns (potential backdoors).
Prompt injection attempts.
Dangerous commands (e.g., rm -rf, chmod 777).
Based on the detected risk level (safe, caution, dangerous) and the Skill’s provenance (built‑in, official, community), Hermes‑Agent decides whether to allow the Skill.
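The scan and gating could look something like the sketch below. The regex patterns and the allow matrix are illustrative assumptions; the article only names the risk levels, provenance classes, and the four check categories:

```python
import re

# Illustrative patterns for the risk categories named in the article.
_CHECKS = {
    "hardcoded_secret": re.compile(
        r"(api_key|password|secret)\s*=\s*['\"][^'\"]+['\"]", re.IGNORECASE
    ),
    "suspicious_exec": re.compile(r"\beval\s*\(|\bexec\s*\("),
    "prompt_injection": re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
    "dangerous_command": re.compile(r"rm\s+-rf|chmod\s+777"),
}
_DANGEROUS = {"suspicious_exec", "prompt_injection", "dangerous_command"}

def scan_skill(text: str) -> str:
    """Return 'safe', 'caution', or 'dangerous' for a Skill's content."""
    hits = {name for name, pat in _CHECKS.items() if pat.search(text)}
    if hits & _DANGEROUS:
        return "dangerous"
    return "caution" if hits else "safe"

def allow_skill(risk: str, provenance: str) -> bool:
    """Gate on risk level x provenance (built-in, official, community)."""
    if risk == "safe":
        return True
    if risk == "caution":
        return provenance in ("built-in", "official")
    return False  # dangerous Skills are always rejected
```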
Summary of Mechanisms
In summary, the two mechanisms compare as follows:
Front‑end Self‑Awareness – Triggered during task execution when the model decides to call skill_manage. Immediate Skill creation; counter reset.
Back‑end Inspection – Triggered after task completion when accumulated tool calls exceed the threshold (default ≥10). An asynchronous review agent evaluates the history and may create a Skill; counter then resets.
Together, these mechanisms enable agents to autonomously capture valuable knowledge, automatically repair outdated Skills, and enforce safety constraints, paving the way for production‑grade, self‑evolving AI agents.
Future Work
The next article in the series will explore Hermes‑Agent’s reinforcement‑learning training loop, which closes the loop by turning agent trajectories into training data for continual model improvement.
AI Large Model Application Practice
Focused on deep research and development of large-model applications. Authors of "RAG Application Development and Optimization Based on Large Models" and "MCP Principles Unveiled and Development Guide". Primarily B2B, with B2C as a supplement.