How AI Can Control Your Desktop: Inside the Open‑Source TuriX‑CUA Agent

TuriX‑CUA is an open‑source AI desktop agent that captures screen content, uses multimodal large models to decide actions, and automatically moves the mouse or types, offering cross‑platform support, multi‑model architecture, and detailed setup instructions for Windows and macOS.

Old Meng AI Explorer
Old Meng AI Explorer
Old Meng AI Explorer
How AI Can Control Your Desktop: Inside the Open‑Source TuriX‑CUA Agent

TuriX‑CUA (Computer Use Agent) is an open‑source Python‑based desktop automation agent that can control mouse clicks, keyboard input, and complex cross‑application workflows on both macOS and Windows. It follows a three‑step See‑Think‑Act loop:

See : captures the screen at regular intervals to obtain the current UI state.

Think : sends the screenshot to a multimodal large model, which decides the next action (e.g., click a button, type text).

Act : receives coordinates or keystroke instructions from the model and programmatically moves the mouse, clicks, or types.

The agent uses a “Planner + Executor” architecture. The planner (decision maker) decomposes a high‑level task into concrete steps, while the executor focuses on precise UI interactions, reducing erroneous clicks and improving overall task quality.

Cross‑Platform Support

Originally macOS‑only, Windows support was added in a later release. Users select the appropriate branch for their OS. Example capabilities include:

macOS: automate Safari searches, generate Pages documents, extract data from Discord, create charts, insert them into PowerPoint, and handle travel‑booking workflows.

Windows: automate YouTube searches and likes, and integrate with the MCP protocol to allow voice‑driven tools such as Claude for Desktop or Cursor to trigger full browser, Word, and WeChat automation.

Installation and Setup

1. Environment Preparation

conda create -n turix_env python=3.12  # create isolated environment
conda activate turix_env
git clone https://github.com/TurixAI/TuriX-CUA.git
cd TuriX-CUA
pip install -r requirements.txt

2. Model Configuration

Edit examples/config.json to choose a model. The default Turix API provides a free quota. To use a custom endpoint (e.g., a locally‑deployed Qwen3‑VL or an OpenAI‑compatible service), modify the build_llm function in main.py. Qwen3‑VL has been reported to perform well on UI element recognition.

3. System Permissions

Enable accessibility for the terminal and your IDE (e.g., PyCharm, VS Code) via System Settings → Privacy & Security → Accessibility.

For Safari automation, enable “Allow Remote Automation” in Safari’s Develop menu.

When the agent first runs, approve the system prompt that grants control of the computer; otherwise mouse movement will fail.

4. Running the Agent

Define a task in examples/config.json. Example for macOS:

{
  "agent": {
    "task": "打开Safari,搜索iPhone 17 Pro当前价格,打开备忘录记录结果"
  }
}

Start the agent: python examples/main.py The agent will automatically open Safari, enter the search query, retrieve the result, and record it in the Notes app without any manual intervention.

Demo of TuriX-CUA in action
Demo of TuriX-CUA in action

Community and Repository

The project is free and open source for personal and research use. Community support is available via Discord and email.

Project repository: https://github.com/TurixAI/TuriX-CUA

Pythonopen-sourcemultimodalAI automationcross‑platformdesktop agent
Old Meng AI Explorer
Written by

Old Meng AI Explorer

Tracking global AI developments 24/7, focusing on large model iterations, commercial applications, and tech ethics. We break down hardcore technology into plain language, providing fresh news, in-depth analysis, and practical insights for professionals and enthusiasts.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.