Inside the 18‑Day Evolution of an Open‑Source AI Agent Orchestrator

In the 18 days after it was open‑sourced, Agent Orchestrator, a TypeScript‑based system built by AI agents themselves, collected more than 3,800 GitHub stars. In one overnight run it launched 15 parallel sessions and merged six PRs while also hitting authentication crashes and dead sessions. Those failures led to a three‑tier escalation protocol and an OpenClaw integration that enables real‑time Telegram interaction and continuous self‑improvement.

High Availability Architecture

Mar 13, 2026

Agent Orchestrator Overview

Agent Orchestrator (AO) is a TypeScript‑based system that orchestrates multiple AI coding agents. Each agent runs in its own Git worktree, has an isolated tmux terminal session, and owns a single task. AO tracks all session activity, monitors CI, creates pull requests, and restarts itself on the new version after a PR is merged.
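The one‑worktree, one‑tmux‑session‑per‑agent layout can be sketched as a pair of shell commands per session. This is a minimal illustration, not AO's actual code: the AgentSession shape, the planSession and commandsFor helpers, the agent/<id> branch naming, and the worktree root are all assumptions made for the example. Only the git worktree add and tmux new-session flags are standard.

```typescript
// Hypothetical sketch: types, helper names, and the branch scheme are assumptions.
interface AgentSession {
  id: string;           // session name, e.g. "ao-7"
  task: string;         // the single task this agent owns
  worktreePath: string; // directory holding the agent's private checkout
  branch: string;       // branch created for this agent
}

type Command = readonly string[];

// Derive an isolated worktree path and branch from the session id.
function planSession(id: string, task: string, worktreeRoot: string): AgentSession {
  return { id, task, worktreePath: `${worktreeRoot}/${id}`, branch: `agent/${id}` };
}

// Build the commands that would isolate one agent; nothing is executed here.
function commandsFor(session: AgentSession, agentCommand: string): Command[] {
  return [
    // Creates a new branch checked out in its own directory.
    ["git", "worktree", "add", "-b", session.branch, session.worktreePath],
    // Starts a detached tmux session whose working directory is that worktree.
    ["tmux", "new-session", "-d", "-s", session.id, "-c", session.worktreePath, agentCommand],
  ];
}
```

Keeping command construction separate from execution makes the plan easy to inspect or log before anything runs; each array could then be handed to Node's execFileSync as the executable plus its arguments.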

Parallel Night Run

During a single overnight run, 15 agent sessions were launched. Six pull requests were merged; the remaining sessions suffered authentication crashes, missing tmux processes, misuse of the exec tool instead of ao spawn, and duplicate work on the same bug.

The workflow runs as follows: an agent creates a task and spawns a coding agent with repository context; that agent writes code and tests and opens a PR; a second agent then reviews the PR.

If the PR passes, the code is merged and AO restarts with the updated version.

Key Failures

Authentication crashes caused by a hardening of the gh CLI that moved auth.json into a user‑only directory, preventing new Codex sessions from reading credentials.

Session ao‑9 appeared in AO’s session list without an actual tmux process, leading to endless reconnection attempts.

Using exec instead of ao spawn for sessions ao‑10 – ao‑14 caused silent termination.

Duplicate work when two agents (ao‑6 and ao‑7) attempted to fix the same bug.
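Two of these failures, the phantom ao‑9 session and the duplicated bug fix, could in principle be caught by a periodic reconciliation pass. The sketch below is an assumption about how such a check might look, not AO's implementation: ListedSession, findOrphans, and findDuplicateTasks are hypothetical names, and the set of live session names is assumed to come from something like tmux list-sessions -F '#{session_name}'.

```typescript
// Hypothetical sketch: the record shape and function names are assumptions.
interface ListedSession {
  id: string;     // session id as shown in the orchestrator's list, e.g. "ao-9"
  taskId: string; // identifier of the task the session claims to own
}

// Sessions that are listed but have no live tmux session behind them.
function findOrphans(listed: ListedSession[], liveTmuxNames: Set<string>): string[] {
  return listed.filter((s) => !liveTmuxNames.has(s.id)).map((s) => s.id);
}

// Tasks claimed by more than one session, mapped to the competing session ids.
function findDuplicateTasks(listed: ListedSession[]): Map<string, string[]> {
  const byTask = new Map<string, string[]>();
  for (const s of listed) {
    const ids = byTask.get(s.taskId) ?? [];
    ids.push(s.id);
    byTask.set(s.taskId, ids);
  }
  const duplicates = new Map<string, string[]>();
  byTask.forEach((ids, taskId) => {
    if (ids.length > 1) duplicates.set(taskId, ids);
  });
  return duplicates;
}
```

An orphan found this way can be dropped from the list or respawned instead of being reconnected to forever, and a duplicate claim can be resolved before a second agent starts work.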

OpenClaw Integration

OpenClaw is a persistent, local‑first process capable of running shell commands, reading files, calling APIs, and handling events. The notifier plugin @composio/ao-plugin-notifier-openclaw connects AO to OpenClaw via a webhook at /hooks/agent. Communication occurs over a local 127.0.0.1 connection with token authentication. The integration provides three things: ambient availability through Telegram, natural‑language access to commands (ao spawn, ao send), and cross‑session context.

Three‑Tier Escalation Protocol

Agent Self‑Healing: On CI failure, the error output is fed back to the agent, which patches the code and retries up to five times (e.g., PR #354). This layer is fully implemented.

Orchestrator Mediation: If self‑healing fails, the orchestrator (itself a Codex instance) reviews the failure and either suggests fixes or redesigns the task. This layer is designed but not yet implemented.

Human‑In‑The‑Loop via Telegram: Unresolved issues are reported to a developer through OpenClaw‑Telegram messages for manual intervention.
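The three tiers amount to a small decision function. The following is a sketch under stated assumptions rather than AO's code: the Tier labels, the FailureState fields, and nextTier are hypothetical names. The five‑retry limit comes from the description above, and the mediationAvailable flag models the fact that the second tier is not yet implemented.

```typescript
// Hypothetical sketch: type and field names are assumptions for illustration.
type Tier = "agent-self-healing" | "orchestrator-mediation" | "human-via-telegram";

const MAX_SELF_HEAL_RETRIES = 5; // retry limit stated in the article

interface FailureState {
  ciRetries: number;           // self-healing attempts already made
  mediationAvailable: boolean; // false while tier 2 is unimplemented
  mediationFailed: boolean;    // true once the orchestrator has given up
}

// Pick the tier that should handle the next step of a failing task.
function nextTier(state: FailureState): Tier {
  if (state.ciRetries < MAX_SELF_HEAL_RETRIES) return "agent-self-healing";
  if (state.mediationAvailable && !state.mediationFailed) return "orchestrator-mediation";
  return "human-via-telegram";
}
```

With mediationAvailable set to false, a task that exhausts its retries goes straight to the human tier, which matches the behavior implied while the orchestrator tier remains a design.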

Connector Implementation

The released connector consists of a notifier plugin that routes AO events to OpenClaw’s /hooks/agent endpoint. It uses a local 127.0.0.1 transport with token authentication, prefixes session identifiers with hook:ao:, and applies exponential back‑off for HTTP 429/5xx responses.
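The retry behavior described here can be illustrated with a small sender that backs off exponentially on HTTP 429 and 5xx. This is a sketch, not the plugin's source. The payload fields, the Bearer header scheme, the 500 ms base delay, the 30 s cap, and the names notify, Post, and Sleep are all assumptions; only the /hooks/agent path, the hook:ao: prefix, and the retryable status classes come from the text. The HTTP call and the sleep are injected so the logic can be exercised without a running OpenClaw.

```typescript
// Hypothetical sketch: payload shape, header scheme, and delays are assumptions.
interface HookResponse { status: number; }
type Post = (url: string, headers: Record<string, string>, body: string) => Promise<HookResponse>;
type Sleep = (ms: number) => Promise<void>;

interface NotifyOptions {
  baseUrl: string;     // local OpenClaw address on 127.0.0.1 (port not specified in the article)
  token: string;       // shared secret for token authentication
  sessionId: string;   // AO session id, e.g. "ao-7"
  message: string;     // event text to deliver
  maxAttempts: number; // total tries, including the first
}

// Retry only on rate limiting (429) and server errors (5xx).
function isRetryable(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Delay doubles with each attempt (0-based) and is capped.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 30000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Session identifiers are prefixed with hook:ao: as the article describes.
function sessionKey(sessionId: string): string {
  return `hook:ao:${sessionId}`;
}

// Sends one event and returns the final HTTP status after any retries.
async function notify(post: Post, sleep: Sleep, opts: NotifyOptions): Promise<number> {
  const url = `${opts.baseUrl}/hooks/agent`;
  const headers = {
    "Content-Type": "application/json",
    Authorization: `Bearer ${opts.token}`, // assumed scheme
  };
  const body = JSON.stringify({ sessionKey: sessionKey(opts.sessionId), message: opts.message });
  for (let attempt = 0; ; attempt++) {
    const res = await post(url, headers, body);
    if (!isRetryable(res.status) || attempt + 1 >= opts.maxAttempts) return res.status;
    await sleep(backoffDelayMs(attempt));
  }
}
```

In real use, post would wrap fetch and sleep would wrap setTimeout. Other 4xx responses are returned immediately because repeating an identical rejected request is unlikely to help.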

Observed Metrics

6 merged pull requests (e.g., #337, #338, #339, #343, #346) and 2 open PRs.

4 active sessions, 6 sessions that died and were revived.

1 authentication crash that blocked new sessions for ~20 minutes.

No regressions introduced into the main branch.

Self‑Improvement Loop

Failures are turned into concrete backlog items (e.g., add session health monitoring, bind session IDs to tasks, improve auth handling). OpenClaw spawns agents to implement these items, and the results feed back into AO for the next iteration.

Conclusion

The 18‑day evolution of AO demonstrates that large‑scale AI‑agent orchestration can achieve rapid, observable self‑improvement when combined with transparent reporting, human oversight, and a robust three‑tier escalation mechanism. The source code is available at https://github.com/ComposioHQ/agent-orchestrator.
