Building an AI Agent Orchestrator for 50 Daily Commits at $190/month

Independent developer Elvis built an OpenClaw‑based AI agent orchestration system in which a Zoe orchestrator manages Codex, Claude Code, and Gemini agents to write code, open PRs, and cross‑review each other's work. The setup produces about 50 commits per day for roughly $190 a month; the write‑up also covers cost, hardware bottlenecks, and failure‑handling strategies.


Elvis built an OpenClaw‑based AI‑agent orchestration system. A high‑level orchestrator named Zoe receives business‑level requests, enriches them with context from an Obsidian knowledge base, and translates them into precise prompts for coding agents (Codex, Claude Code, Gemini). The system produces roughly 50 commits per day at a monthly cost of about $190.

Core Idea: AI manages AI

Zoe holds the global business context while the coding agents focus solely on code generation. Zoe accesses client data, meeting notes, and past decisions stored in Obsidian, then injects the relevant information into prompts for the agents. The agents do not need to know the client identity; they only receive the concrete coding task (e.g., “add a template system in src/types/template.ts”).

8‑Step Workflow

The developer interacts only with Zoe; Zoe decomposes the request, assigns tasks to isolated agents, and merges the results.

Agent Isolation

Each agent runs in its own git worktree and tmux session, allowing concurrent work on separate branches without interference.

# Create an isolated worktree on its own branch and install dependencies
git worktree add ../feat-custom-templates -b feat/custom-templates origin/main
cd ../feat-custom-templates && pnpm install
# Start the agent in a detached tmux session
tmux new-session -d -s "codex-templates" "$HOME/.codex-agent/run-agent.sh templates gpt-5.3-codex high"

Running the coding models:

# Codex: high reasoning effort, sandbox and approval prompts disabled
codex --model gpt-5.3-codex -c "model_reasoning_effort=high" --dangerously-bypass-approvals-and-sandbox "Your prompt here"
# Claude Code: non-interactive (-p) run with permission prompts skipped
claude --model claude-opus-4.5 --dangerously-skip-permissions -p "Your prompt here"

Mid‑Process Correction

Because agents run inside tmux, corrective commands can be sent without restarting the process:

tmux send-keys -t codex-templates "Stop. Focus on the API layer first, not the UI." Enter

Task Tracking

Active tasks are recorded in .clawdbot/active-tasks.json:

{
  "id": "feat-custom-templates",
  "tmuxSession": "codex-templates",
  "agent": "codex",
  "description": "Custom email templates for agency customer",
  "repo": "medialyst",
  "worktree": "feat-custom-templates",
  "branch": "feat/custom-templates",
  "startedAt": 1740268800000,
  "status": "running",
  "notifyOnComplete": true
}

When a task finishes, the status is updated automatically:

{
  "status": "done",
  "pr": 341,
  "completedAt": 1740275400000,
  "checks": { "prCreated": true, "ciPassed": true, "claudeReviewPassed": true, "geminiReviewPassed": true },
  "note": "All checks passed. Ready to merge."
}

Automated Monitoring

A cron job runs .clawdbot/check-agents.sh every ten minutes. The script makes no AI calls: it checks that each tmux session is alive, that an open PR exists, and that CI passes (via the gh CLI), and it restarts failed agents up to three times.
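The watchdog itself is not published; a minimal sketch of its shape, assuming a simple restart counter and a tmux liveness check (function names and the restart bookkeeping are assumptions), might look like:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a check-agents.sh watchdog. The real script
# also inspects PRs and CI via the gh CLI; that part is omitted here.
MAX_RESTARTS=3

# Returns 0 if the named tmux session is still alive.
session_alive() {
  tmux has-session -t "$1" 2>/dev/null
}

# Pure bookkeeping: given the current restart count, print the new
# count and return 0 if another restart is allowed, 1 to give up.
should_restart() {
  local count=$1
  if [ "$count" -lt "$MAX_RESTARTS" ]; then
    echo $((count + 1))
    return 0
  fi
  echo "$count"
  return 1
}
```

Keeping the restart decision in a pure function like `should_restart` makes the "no AI calls, at most three restarts" policy trivially testable.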

PR Completion Definition

A PR is considered “done” only when all of the following criteria are satisfied:

PR created

Branch synchronized with main (no merge conflicts)

All CI checks (lint, type‑check, unit tests, E2E) pass

Codex review passes

Claude Code review passes

Gemini review passes

If UI changes are involved, the PR description includes a screenshot (CI fails otherwise)
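The gate above amounts to ANDing every check result. A tiny hypothetical helper (the helper name and the "true"/"false" convention are assumptions; in practice the values would come from `gh pr view` output, CI status, and the AI reviews):

```shell
# Succeeds only when every completion flag passed in is "true".
pr_is_done() {
  for check in "$@"; do
    [ "$check" = "true" ] || return 1
  done
  return 0
}

# Example: prCreated, branchSynced, ciPassed, codexReview,
# claudeReview, geminiReview, screenshotPresent
# pr_is_done true true true true true true true
```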

Cross‑AI Review

Codex – strong on boundary cases, race conditions, logic errors; described as “most reliable, low false‑positive rate”.

Gemini Code Assist – focuses on security and scalability; noted as “free and useful”.

Claude Code – tends to give overly cautious suggestions; described as “basically useless unless marked critical”.

Human review shrinks to 5–10 minutes: a Telegram notification arrives only after CI and all AI reviews have passed, and a screenshot in the PR description lets Elvis merge without reading the code.

Ralph Loop V2: Intelligent Failure Handling

When a task fails, Zoe analyses the failure, rewrites the prompt with relevant business context, and retries. Example rewrites include limiting context to specific files, restating the exact customer request, or adding supplemental emails and business details. Successful prompt patterns are persisted (e.g., “for billing features, first provide type definitions”). Over time Zoe’s prompts become more accurate because it records which structures ship successfully.
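A stripped-down sketch of the retry-with-rewrite loop (the agent call and the rewriting rule here are placeholders; the real Ralph Loop V2 has Zoe analyse the failure and rewrite the prompt with business context rather than appending a fixed suffix):

```shell
MAX_ATTEMPTS=3

# Stand-in for the real invocation, e.g.:
#   codex --model gpt-5.3-codex -c "model_reasoning_effort=high" "$1"
run_agent() {
  false
}

# Retry a task, narrowing the prompt after each failure instead of
# replaying it verbatim.
run_with_rewrites() {
  local prompt=$1 attempt=1
  while [ "$attempt" -le "$MAX_ATTEMPTS" ]; do
    if run_agent "$prompt"; then
      echo "succeeded on attempt $attempt"
      return 0
    fi
    # e.g. limit context to specific files, restate the exact request
    prompt="$prompt (attempt $attempt failed: limit context to the touched files)"
    attempt=$((attempt + 1))
  done
  return 1
}
```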

Zoe also proactively creates tasks:

Each morning, scans Sentry logs and spawns agents to fix new bugs.

After meetings, extracts feature requests from notes and spawns agents to implement them.

At night, scans git logs to generate agents that update changelogs and client documentation.
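Assuming standard cron, the schedules described above might look like the following crontab fragment (the zoe subcommands and the $HOME prefix are hypothetical; only the ten-minute watchdog and its script name come from the article):

```
# every ten minutes: health watchdog
*/10 * * * * $HOME/.clawdbot/check-agents.sh
# 07:00: scan Sentry logs and spawn bug-fix agents (command name assumed)
0 7 * * * zoe scan-sentry --spawn-fix-agents
# 23:00: scan git logs and update changelogs and client docs (command name assumed)
0 23 * * * zoe scan-gitlog --update-docs
```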

Model Selection

Codex (GPT‑5.3) – primary engine (≈90% of work); used for backend logic, complex bugs, cross‑file refactoring.

Claude Code – fast; used for frontend tasks and git operations.

Gemini – design‑focused; generates HTML/CSS mockups which are then handed to Claude for implementation.

Zoe routes tasks automatically based on type: billing‑system bugs to Codex, UI tweaks to Claude, new dashboard designs to Gemini.
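That routing could be sketched as a simple lookup (task-type names and the mapping are illustrative; the real Zoe decides from richer business context):

```shell
# Map a task type to the agent that handles it.
route_task() {
  case "$1" in
    backend|bugfix|refactor) echo "codex"  ;;  # primary engine, ~90% of work
    frontend|git|ui-tweak)   echo "claude" ;;  # fast frontend/git tasks
    design|mockup)           echo "gemini" ;;  # HTML/CSS mockups
    *)                       echo "codex"  ;;  # default to the primary engine
  esac
}
```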

Cost and Bottlenecks

Claude Code: ≈ $100 / month

Codex: ≈ $90 / month

Starter tier can run for ≈ $20 / month

The primary bottleneck is local RAM. Each agent requires its own node_modules, TypeScript compilation, and test execution. On a 16 GB Mac Mini, 4–5 concurrent agents cause swapping; Elvis upgraded to a 128 GB Mac Studio M4 Max to eliminate the issue. This illustrates that in 2026 the limiting factor for AI‑augmented productivity is often hardware rather than model capability.

Production Use Case

Elvis runs the system for Medialyst, a B2B SaaS that offers AI‑generated public‑relations content to startups, replacing a $10 k / month PR agency. The business model relies on same‑day delivery of customer requests: a request received in the morning can be shipped by the afternoon. The workflow yields about 50 commits per day and merges seven PRs within 30 minutes with minimal manual intervention.

References

Elvis’ original X post: https://x.com/elvissun/status/2025920521871716562

OpenClaw documentation: https://docs.openclaw.ai

OpenClaw GitHub repository: https://github.com/openclaw/openclaw

Tags: automation, AI agents, software development, cost optimization, Codex, Claude Code, OpenClaw
Written by

ShiZhen AI

Tech blogger with over 10 years of experience at leading tech firms; AI efficiency and delivery expert focusing on AI productivity. Covers tech gadgets, AI-driven efficiency, and the AI leisure community. 🛰 szzdzhp001
