Why AI Agents Forget Their Work and How a Harness Can Fix It
The article analyzes why AI agents lose context during multi‑session web‑app development, outlines common failure patterns, and proposes a practical harness that records progress, uses Git commits, and enforces fine‑grained feature lists and end‑to‑end testing to keep development on track.
Problem: Agent Forgetting (Memory Loss)
When an AI agent is asked to build a complete web application from scratch, it reads the requirements, writes code, and creates files, but after a few hours the project appears half‑finished and inconsistent because the agent "forgets" its previous work.
Root Causes
The agent operates within a limited "context window" that can be imagined as a whiteboard. Once the board is full, the agent must erase some content to continue. More critically, each new session starts with a brand‑new empty board, so all previously written code, decisions, and discovered issues are lost.
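The whiteboard analogy can be made concrete with a toy model. This is only an illustrative sketch: real context windows are measured in tokens rather than discrete notes, and the `make_context` helper and the note strings are hypothetical names chosen for the example.

```python
from collections import deque

def make_context(max_items: int) -> deque:
    """A toy 'whiteboard' that holds at most max_items notes; the oldest is erased first."""
    return deque(maxlen=max_items)

board = make_context(3)
for note in ["requirements", "login code", "chat layout", "known bug"]:
    board.append(note)

# The earliest note has been pushed off the full board.
print(list(board))  # ['login code', 'chat layout', 'known bug']

# A new session starts from an empty board; nothing carries over on its own.
new_session_board = make_context(3)
print(list(new_session_board))  # []
```

Anything the agent needs in the next session therefore has to be written somewhere outside the board.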
Typical Failure Modes
One‑bite‑fits‑all: The agent tries to implement the entire application in one go, exhausts the context window, and leaves half‑implemented UI and backend components littered with TODO comments.
Premature completion: Near the end of a project, the agent declares the project finished even though many features remain untouched.
Solution: Progress File and Git Harness
Persist key information between sessions by writing a progress file before the agent’s context is cleared. The file records what has been completed, the current state, known issues, and next steps. Additionally, each change is committed to Git with a clear commit message.
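A minimal sketch of how a harness might render and persist such a file, assuming the four sections named above and a working directory that is already a Git repository. The function names `render_progress` and `save_and_commit` are hypothetical, and the commit step assumes Git is installed and there is something new to commit.

```python
import subprocess
from pathlib import Path

def render_progress(completed, current, issues, next_steps) -> str:
    """Build the text of progress.txt from four lists of short notes."""
    sections = [
        ("Completed", completed),
        ("Current status", current),
        ("Known issues", issues),
        ("Next steps", next_steps),
    ]
    lines = ["# progress.txt"]
    for title, items in sections:
        lines.append(f"## {title}")
        lines.extend(f"- {item}" for item in items)
    return "\n".join(lines) + "\n"

def save_and_commit(repo_dir: Path, text: str, message: str) -> None:
    """Write progress.txt, then stage and commit all changes.

    Assumes repo_dir is an initialized Git repository with a configured user.
    """
    (repo_dir / "progress.txt").write_text(text, encoding="utf-8")
    subprocess.run(["git", "add", "-A"], cwd=repo_dir, check=True)
    subprocess.run(["git", "commit", "-m", message], cwd=repo_dir, check=True)
```

Committing the progress file together with the code keeps the notes and the repository state in step, so the next session can cross-check one against the other.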
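# progress.txt
## Completed
- Set up the project scaffolding (React + Express)
- Implemented user login
- Built the basic layout of the chat interface
## Current status
- Working on message sending; the frontend part is done, the backend API is not yet implemented
## Known issues
- The login page has misaligned styles on mobile
## Next steps
- Implement the backend API for message sending
- Integrate WebSocket for real-time push
Feature List and Incremental Development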
Define a fine‑grained, verifiable feature list where each item includes a description, ordered steps, and a passes flag indicating completion. The agent is instructed to work on the highest‑priority item whose passes flag is false.
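Selecting the next item might look like the sketch below. It assumes the features are stored as a JSON array whose order encodes priority; that ordering, the helper names `next_feature` and `project_complete`, and the sample entries are assumptions made for illustration.

```python
import json
from typing import Optional

def next_feature(features: list[dict]) -> Optional[dict]:
    """Return the first feature whose 'passes' flag is false, or None if all pass."""
    for feature in features:
        if not feature.get("passes", False):
            return feature
    return None

def project_complete(features: list[dict]) -> bool:
    """The project counts as finished only when every feature passes."""
    return all(f.get("passes", False) for f in features)

features = json.loads("""
[
  {"description": "User can log in", "steps": ["Open login page"], "passes": true},
  {"description": "User can create a new chat", "steps": ["Click New Chat"], "passes": false},
  {"description": "User can send a message", "steps": ["Type and submit"], "passes": false}
]
""")

todo = next_feature(features)
print(todo["description"])         # User can create a new chat
print(project_complete(features))  # False
```

Deriving "done" from the flags rather than from the agent's own judgment is one way to guard against premature completion.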
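{
  "description": "The user clicks the New Chat button and a blank conversation is created",
  "steps": [
    "Open the app's main screen",
    "Click the New Chat button",
    "Verify that a new conversation is created",
    "Verify that the chat area shows the welcome page",
    "Verify that the new conversation appears in the sidebar"
  ],
  "passes": false
}
Typical Agent Workflow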
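1. Open the progress file to understand the project's current state
2. Review the git log to see recent changes
3. Start the development server
4. Run a basic functional test pass in the browser
   - If there are bugs, fix them first
   - If everything works, continue
5. Pick the next unfinished feature from the feature list
6. Implement the feature
7. Verify it end to end in the browser
8. Commit the code and update the progress file
Testing and Validation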
Before starting a new feature, the agent runs a set of end‑to‑end browser tests that simulate real user interactions (opening the app, clicking buttons, entering text, and checking rendering). This ensures that previously completed functionality remains stable.
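The gate itself can be sketched independently of any browser tool. In the example below each check is a plain callable that raises on failure; in practice each one would drive a real browser through an automation library, which is omitted here. The names `run_smoke_checks` and `next_action` and the two sample checks are hypothetical.

```python
from typing import Callable

Check = Callable[[], None]  # a check raises an exception when it fails

def run_smoke_checks(checks: dict[str, Check]) -> list[str]:
    """Run every check and return a description of each one that failed."""
    failures = []
    for name, check in checks.items():
        try:
            check()
        except Exception as exc:
            failures.append(f"{name}: {exc}")
    return failures

def next_action(failures: list[str]) -> str:
    """Regressions are fixed before any new feature work begins."""
    return "fix_bugs" if failures else "start_next_feature"

def app_loads() -> None:
    pass  # placeholder: a real check would open the app in a browser

def login_works() -> None:
    raise AssertionError("login button not found")  # simulated regression

failures = run_smoke_checks({"app loads": app_loads, "login works": login_works})
print(failures)               # ['login works: login button not found']
print(next_action(failures))  # fix_bugs
```

Running all checks instead of stopping at the first failure gives the agent a complete list of regressions to record in the progress file.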
Agent Harness Overview
The surrounding system that orchestrates these steps (recording progress, committing to Git, running end‑to‑end tests before each new feature, and managing the feature list) is referred to as an Agent Harness. Because the harness enforces a disciplined workflow rather than relying on a loose collection of prompts, the agent can recover from context loss and is steered away from over‑ambitious one‑shot development and false completion reports.
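One session of such a harness can be sketched as a fixed sequence of injected steps. This is a simplified illustration, not a description of any particular product: the `Harness` class, its field names, and the returned status strings are all hypothetical, and each step is a stub where a real system would call the agent, the test runner, or Git.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Harness:
    read_progress: Callable[[], str]            # load notes from the last session
    smoke_test: Callable[[], bool]              # True if existing features still work
    pick_feature: Callable[[], Optional[str]]   # next unfinished feature, or None
    implement: Callable[[str], None]            # let the agent build the feature
    verify: Callable[[str], bool]               # end-to-end check of the new feature
    commit: Callable[[str], None]               # commit code and update progress

    def run_session(self) -> str:
        """Run one session and report how it ended."""
        self.read_progress()
        if not self.smoke_test():
            return "fix_regressions"
        feature = self.pick_feature()
        if feature is None:
            return "all_features_pass"
        self.implement(feature)
        if not self.verify(feature):
            return "feature_failed_verification"
        self.commit(f"Implement: {feature}")
        return "committed"

log: list[str] = []
harness = Harness(
    read_progress=lambda: "## Completed\n- login",
    smoke_test=lambda: True,
    pick_feature=lambda: "send message",
    implement=lambda f: log.append(f"implemented {f}"),
    verify=lambda f: True,
    commit=lambda msg: log.append(f"commit: {msg}"),
)
print(harness.run_session())  # committed
```

Keeping the order of steps in code rather than in the prompt means the agent cannot skip the regression check or commit unverified work.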
IT Services Circle
