Building an AI Agent Orchestrator for 50 Daily Commits at $190/month
Independent developer Elvis built an OpenClaw‑based AI agent orchestration system that lets a Zoe orchestrator manage Codex, Claude Code, and Gemini agents to write code, open PRs, and perform cross‑review, achieving about 50 commits per day for roughly $190 a month while highlighting cost, hardware bottlenecks, and failure‑handling strategies.
Elvis built an AI‑agent orchestration system on top of OpenClaw. A high‑level orchestrator named Zoe receives business‑level requests, enriches them with context from an Obsidian knowledge base, and translates them into precise prompts for coding agents (Codex, Claude Code, Gemini). The system produces roughly 50 commits per day at a monthly cost of about $190.
Core Idea: AI manages AI
Zoe holds the global business context while the coding agents focus solely on code generation. Zoe accesses client data, meeting notes, and past decisions stored in Obsidian, then injects the relevant information into prompts for the agents. The agents do not need to know the client identity; they only receive the concrete coding task (e.g., “add a template system in src/types/template.ts”).
8‑Step Workflow
The developer interacts only with Zoe; Zoe decomposes the request, assigns tasks to isolated agents, and merges the results.
Agent Isolation
Each agent runs in its own git worktree and tmux session, allowing concurrent work on separate branches without interference.
git worktree add ../feat-custom-templates -b feat/custom-templates origin/main
cd ../feat-custom-templates && pnpm install
tmux new-session -d -s "codex-templates" "$HOME/.codex-agent/run-agent.sh templates gpt-5.3-codex high"
Running the coding models:
codex --model gpt-5.3-codex -c "model_reasoning_effort=high" --dangerously-bypass-approvals-and-sandbox "Your prompt here"
claude --model claude-opus-4.5 --dangerously-skip-permissions -p "Your prompt here"
Mid‑Process Correction
Because agents run inside tmux, corrective commands can be sent without restarting the process:
tmux send-keys -t codex-templates "Stop. Focus on the API layer first, not the UI." Enter
Task Tracking
Active tasks are recorded in .clawdbot/active-tasks.json:
{
"id": "feat-custom-templates",
"tmuxSession": "codex-templates",
"agent": "codex",
"description": "Custom email templates for agency customer",
"repo": "medialyst",
"worktree": "feat-custom-templates",
"branch": "feat/custom-templates",
"startedAt": 1740268800000,
"status": "running",
"notifyOnComplete": true
}
When a task finishes, the status is updated automatically:
{
"status": "done",
"pr": 341,
"completedAt": 1740275400000,
"checks": { "prCreated": true, "ciPassed": true, "claudeReviewPassed": true, "geminiReviewPassed": true },
"note": "All checks passed. Ready to merge."
}
Automated Monitoring
A cron job runs .clawdbot/check-agents.sh every ten minutes. The script makes no AI calls; it verifies that each tmux session is alive, that an open PR exists, and that CI passes (via the gh CLI), and it restarts failed agents up to three times.
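The watchdog's shape might look like the sketch below. The function names and paths are illustrative assumptions; only the tmux liveness check, the gh-based CI check, and the three-restart limit come from the description above.

```shell
#!/bin/sh
# Sketch of a check-agents.sh-style watchdog (illustrative; helper names
# and paths are assumptions, not the author's actual script).

MAX_RESTARTS=3

# Pure helper: may we restart an agent that has already failed $1 times?
can_restart() {
  if [ "$1" -lt "$MAX_RESTARTS" ]; then echo yes; else echo no; fi
}

# Is the agent's tmux session still alive?
session_alive() {
  tmux has-session -t "$1" 2>/dev/null
}

# Did all CI checks pass on the task's PR? (uses the GitHub CLI)
ci_passed() {
  gh pr checks "$1" >/dev/null 2>&1
}

# Example loop (commented out; a real script would iterate over the
# task records in .clawdbot/active-tasks.json):
# for session in codex-templates; do
#   session_alive "$session" || echo "restart needed: $session"
# done
```

Keeping the watchdog free of AI calls makes it cheap enough to run every ten minutes without touching the monthly model budget.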
PR Completion Definition
A PR is considered “done” only when every one of the following criteria is satisfied:
PR created
Branch synchronized with main (no merge conflicts)
All CI checks (lint, type‑check, unit tests, E2E) pass
Codex review passes
Claude Code review passes
Gemini review passes
If UI changes are involved, the PR description includes a screenshot (CI fails otherwise)
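The machine-checkable subset of these criteria is exactly what the checks object in active-tasks.json records, so the “done” decision can be a mechanical test. A minimal sketch (the function name is an assumption, and grep stands in for a real JSON parser such as jq):

```shell
# Return success only if every required flag is `true` in the task record.
# Matching JSON with grep is a simplification; a real script would use jq.
all_checks_pass() {
  json="$1"
  for key in prCreated ciPassed claudeReviewPassed geminiReviewPassed; do
    printf '%s' "$json" | grep -q "\"$key\": true" || return 1
  done
  return 0
}
```

With the record shown earlier, all_checks_pass succeeds; a single false flag keeps the task out of the “done” state.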
Cross‑AI Review
Codex – strong on boundary cases, race conditions, logic errors; described as “most reliable, low false‑positive rate”.
Gemini Code Assist – focuses on security and scalability; noted as “free and useful”.
Claude Code – tends to give overly cautious suggestions; described as “basically useless unless marked critical”.
Human review shrinks to 5–10 minutes: after CI and all AI reviews pass, a Telegram notification arrives, and the screenshot in the PR description often allows merging without reading the code.
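The notification step can be a one-liner against the standard Telegram Bot API sendMessage endpoint. A sketch, where the environment variable names and the message format are assumptions:

```shell
# Sketch: notify the human once CI and all AI reviews have passed.
# TELEGRAM_TOKEN / TELEGRAM_CHAT_ID and the message wording are assumptions;
# the sendMessage endpoint is the standard Telegram Bot API.

# Pure helper: build the notification text for a finished PR.
build_pr_message() {
  pr="$1"; repo="$2"
  printf 'PR #%s in %s: CI and all AI reviews passed. Ready to merge.' "$pr" "$repo"
}

notify_telegram() {
  curl -s "https://api.telegram.org/bot${TELEGRAM_TOKEN}/sendMessage" \
    -d "chat_id=${TELEGRAM_CHAT_ID}" \
    --data-urlencode "text=$(build_pr_message "$1" "$2")"
}

# Example (requires the two env vars to be set):
# notify_telegram 341 medialyst
```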
Ralph Loop V2: Intelligent Failure Handling
When a task fails, Zoe analyses the failure, rewrites the prompt with relevant business context, and retries. Example rewrites include limiting context to specific files, restating the exact customer request, or adding supplemental emails and business details. Successful prompt patterns are persisted (e.g., “for billing features, first provide type definitions”). Over time Zoe’s prompts become more accurate because it records which structures ship successfully.
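The skeleton of such a retry loop might look like the sketch below. run_agent and rewrite_prompt are placeholders for the real agent invocation and for Zoe's context-aware rewriting, neither of which is shown in the source.

```shell
# Sketch of a Ralph-Loop-style retry: on failure, rewrite the prompt with
# narrower context and try again. run_agent/rewrite_prompt are placeholders.

rewrite_prompt() {
  # Real rewriting is done by Zoe with business context; this placeholder
  # merely narrows the instruction on each attempt.
  printf '%s\n\n(Retry %s: limit changes to the files already named; restate the original customer request.)' "$1" "$2"
}

run_agent() {
  # Placeholder: would launch codex/claude/gemini with "$1" as the prompt.
  false
}

retry_with_rewrite() {
  prompt="$1"; max="$2"; attempt=1
  while [ "$attempt" -le "$max" ]; do
    run_agent "$prompt" && return 0
    prompt=$(rewrite_prompt "$prompt" "$attempt")
    attempt=$((attempt + 1))
  done
  return 1
}
```

The important design choice is that each retry carries a *different*, more constrained prompt rather than replaying the failed one, which is what lets success rates improve over time.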
Zoe also proactively creates tasks:
Scans Sentry logs each morning, generates agents to fix new bugs.
After meetings, extracts feature requests from notes and spawns agents to implement them.
At night, scans git logs to generate agents that update changelogs and client documentation.
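The time-driven runs above could be wired up with ordinary cron entries (the meeting-notes trigger is event-driven and would live elsewhere). The times and script names below are illustrative assumptions:

```cron
# Illustrative crontab; times and command names are assumptions.
0 7 * * *    /path/to/zoe scan-sentry     # morning: spawn agents for new bugs
*/10 * * * * $HOME/.clawdbot/check-agents.sh  # watchdog, every ten minutes
0 23 * * *   /path/to/zoe update-docs     # night: changelogs and client docs
```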
Model Selection
Codex (GPT‑5.3) – primary engine (≈90% of work); used for backend logic, complex bugs, cross‑file refactoring.
Claude Code – fast; used for frontend tasks and git operations.
Gemini – design‑focused; generates HTML/CSS mockups which are then handed to Claude for implementation.
Zoe routes tasks automatically based on type: billing‑system bugs to Codex, UI tweaks to Claude, new dashboard designs to Gemini.
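Zoe presumably routes with the model itself, but the examples in the text reduce to a simple keyword mapping. A sketch (the function and keywords are assumptions mirroring the routing examples above):

```shell
# Sketch of keyword-based task routing; only mirrors the examples in the
# text, not Zoe's actual decision logic.
route_task() {
  case "$1" in
    *billing*|*backend*|*refactor*) echo codex ;;   # complex logic, bugs
    *ui*|*frontend*|*git*)          echo claude ;;  # fast UI and git work
    *mockup*|*dashboard*)           echo gemini ;;  # design-first tasks
    *)                              echo codex ;;   # default: primary engine
  esac
}
```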
Cost and Bottlenecks
Claude Code: ≈ $100 / month
Codex: ≈ $90 / month
Starter tier can run for ≈ $20 / month
The primary bottleneck is local RAM. Each agent requires its own node_modules, TypeScript compilation, and test execution. On a 16 GB Mac Mini, 4–5 concurrent agents cause swapping; Elvis upgraded to a 128 GB Mac Studio M4 Max to eliminate the issue. This illustrates that in 2026 the limiting factor for AI‑augmented productivity is often hardware rather than model capability.
Production Use Case
Elvis runs the system for Medialyst, a B2B SaaS that offers AI‑generated public‑relations content to startups, replacing a $10 k / month PR agency. The business model relies on same‑day delivery of customer requests: a request received in the morning can be shipped by the afternoon. The workflow yields about 50 commits per day and merges seven PRs within 30 minutes with minimal manual intervention.
References
Elvis’ original X post: https://x.com/elvissun/status/2025920521871716562
OpenClaw documentation: https://docs.openclaw.ai
OpenClaw GitHub repository: https://github.com/openclaw/openclaw
ShiZhen AI
Tech blogger with over 10 years of experience at leading tech firms; an AI efficiency and delivery expert focused on AI productivity. Covers tech gadgets, AI-driven efficiency, and the AI hobbyist community. 🛰 szzdzhp001
