CuaBot v1.0: A Third Way for AI Agents to Control Your Computer

CuaBot v1.0 is an open-source tool that lets AI agents work on a desktop through their own cursors and sandboxed windows, so they neither capture the full screen nor hijack the user's mouse. It supports multiple agents running in parallel, H.265 video, audio, and clipboard sharing, and ships as a CLI built on Xpra and Docker.

AI Engineering

Current methods for letting AI agents operate a computer each have drawbacks. Giving an agent direct control of the desktop through screenshots and mouse input hijacks the user's mouse and exposes the whole screen, while cloud-sandboxed desktops are secure but cumbersome to monitor and interact with.

CuaBot’s Solution

CuaBot, part of the open‑source Cua project, provides independent mouse cursors for the user and the AI within the same window, eliminating the need for full‑screen captures and preventing control conflicts.

The AI sees only the designated sandbox window, which supports H.265 video, audio, and clipboard sharing, delivering an experience close to native interaction.

Example Usage

$ npx cuabot claude
> "Write a two‑player tic‑tac‑toe game, then we play. I start first"

Claude launches a sandbox window where the user clicks their moves and the AI clicks its own, without interfering with each other. The AI can observe the game window but cannot affect other parts of the desktop.

Additional Capabilities

CuaBot can run multiple agents in parallel:

# Run agents concurrently
$ npx cuabot -n research openclaw
$ npx cuabot -n coding codex

# Control via CLI scripts
$ npx cuabot chromium &
$ npx cuabot --click 150 48
$ npx cuabot --type "I ❤️ Cua!"
$ npx cuabot --screenshot
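
The scripted commands above could be wrapped in a small shell helper so that a sequence of steps stops at the first failure. This is only a sketch under stated assumptions: the function name click_type_capture and the CUABOT variable are hypothetical, the helper uses just the --click, --type, and --screenshot flags shown above, and it assumes cuabot signals failure with a non-zero exit status.

```shell
#!/bin/sh
# Hypothetical helper, not part of CuaBot itself.
# CUABOT holds the command to run; override it (e.g. CUABOT=echo) for a dry run.
CUABOT="${CUABOT:-npx cuabot}"

# Click at (x, y), type some text, then take a screenshot.
# Each step runs only if the previous one succeeded (assumes cuabot
# reports failure through a non-zero exit status).
click_type_capture() {
  x="$1"; y="$2"; text="$3"
  # $CUABOT is left unquoted on purpose so "npx cuabot" splits into two words.
  $CUABOT --click "$x" "$y" &&
    $CUABOT --type "$text" &&
    $CUABOT --screenshot
}
```

With a sandboxed app already started (for example via npx cuabot chromium &), calling click_type_capture 150 48 "I ❤️ Cua!" would issue the same three commands as the listing above. Setting CUABOT=echo first prints the commands instead of running them.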

Technical Implementation

The cuabot command starts the cuabotd daemon, which orchestrates an Ubuntu + Xpra Docker container, a multi‑cursor overlay, the Xpra MCP server, and a seamless Xpra client. Agents connect through MCP and exchange terminal I/O over WebSocket.

Only the required window is exposed to the agent, so unrelated desktop content cannot interfere with its work. All of the agent's interactions occur inside the sandbox, which keeps them isolated from the rest of the desktop.

Resources

GitHub: https://github.com/trycua/cua

Documentation: https://cua.ai/docs/cuabot/cuabot

npm package: https://www.npmjs.com/package/cuabot

Written by AI Engineering
