Can Screen‑Recording Teach AI to Replace $3M‑Salary Forward Deployment Engineers?

The article examines how AI desktop agents like Agivar use screen‑recording teaching to let ordinary users train AI to execute complex workflow tasks, offering speed gains, deterministic execution, and a potential alternative to costly Forward Deployment Engineers.

Machine Heart

AI is moving from answering questions to performing real work on computers. Agents such as Anthropic’s Claude Cowork and the desktop version of OpenAI’s Codex attempt to automate form‑filling, data entry into business systems, and file organization. However, the prevailing model, "write a prompt, AI executes," fails for most users because detailed, multi‑step internal processes are hard to describe in text.

To bridge this gap, a new role called Forward Deployment Engineer (FDE) emerged: specialists who translate ambiguous human workflows into AI‑executable tasks. Senior FDEs command median salaries of $485,000, which shows how expensive it is to do this translation by hand.

Non‑technical users need a different approach. The Tsinghua‑affiliated startup Fei Shi Technology (非十科技) released Agivar, a desktop‑agent product that learns directly from a screen‑recorded demonstration. Users simply record themselves performing a workflow; the AI then observes, extracts the underlying tasks and logic, and can replay the process autonomously.

Unlike traditional screen‑recording tools that capture only coordinates and clicks, Agivar captures the task and its logic: it learns why a page is opened, why a value is entered, and under what conditions a step can be skipped. Consequently, it understands the workflow’s rules rather than merely reproducing mouse movements, allowing it to adapt to UI changes.
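The difference between replaying coordinates and replaying intent can be illustrated with a small data structure. This is a minimal sketch, not Agivar's actual representation: `SemanticStep`, `plan`, the `State` dictionary, and the example steps are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical stand-in for what the agent currently observes on screen.
State = Dict[str, str]


@dataclass
class SemanticStep:
    """One learned step: what to do, why, and when it can be skipped."""
    intent: str                                       # purpose of the step
    target_label: str                                 # element found by label, not pixel position
    value_from: Optional[str] = None                  # key of the input record to type, if any
    skip_if: Optional[Callable[[State], bool]] = None  # condition under which the step is skipped


def plan(steps: List[SemanticStep], state: State) -> List[str]:
    """Return the intents that would actually run for the given screen state."""
    return [s.intent for s in steps if not (s.skip_if and s.skip_if(state))]


# Illustrative workflow; none of these steps come from the article.
steps = [
    SemanticStep("log in", "Sign in",
                 skip_if=lambda st: st.get("session") == "active"),
    SemanticStep("enter applicant name", "Name", value_from="name"),
    SemanticStep("submit the form", "Submit"),
]

print(plan(steps, {}))
print(plan(steps, {"session": "active"}))
```

Because each step refers to an element by label and carries its own skip condition, the same recording yields a shorter plan when the user is already logged in, which a pure coordinate replay could not express.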

In a real‑world case, a staff member of a Guangdong provincial government office spent 1–2 hours daily on repetitive form entry in a system without APIs. After a single three‑minute recording with Agivar, the same process ran automatically, saving up to two hours per day.

Performance benchmarks show Agivar completing the same backend data‑entry task in 57 seconds, compared with 2 minutes 12 seconds (132 seconds) for a competing product, which makes it roughly 2.3 times faster. While a gain of about 75 seconds per task may seem modest, scaling to hundreds of forms or daily batch approvals translates into hours of saved time.
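The arithmetic behind these figures is easy to check. The two timings below are the ones reported in the article; the function names and the run counts are illustrative assumptions.

```python
AGIVAR_SECONDS = 57               # reported time for Agivar
COMPETITOR_SECONDS = 2 * 60 + 12  # reported time for the competing product (132 s)


def speedup(baseline_s: float, improved_s: float) -> float:
    """How many times faster the improved run is than the baseline."""
    return baseline_s / improved_s


def hours_saved(runs: int,
                baseline_s: float = COMPETITOR_SECONDS,
                improved_s: float = AGIVAR_SECONDS) -> float:
    """Total time saved over `runs` executions of the task, in hours."""
    return runs * (baseline_s - improved_s) / 3600


print(f"speedup: {speedup(COMPETITOR_SECONDS, AGIVAR_SECONDS):.2f}x")
for runs in (48, 200):  # hypothetical daily workloads
    print(f"{runs} runs save {hours_saved(runs):.2f} hours")
```

Each run saves 132 − 57 = 75 seconds, so 48 runs save exactly one hour (48 × 75 s = 3600 s).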

Speed is not the only concern; deterministic execution is crucial for production use. General‑purpose multimodal models introduce variability (e.g., clicking different buttons on successive runs), which is unacceptable for finance or contract processing. Agivar addresses determinism through a three‑layer design:

Training convergence: massive desktop‑task data is used to stabilize the mapping from interface state to user intent and action.

Multi‑stage verification: separate agents cross‑check planning, execution, observation, and review, each asking “Is the click correct? Is the interface in the expected state?”

Rule constraints: critical steps are encoded as hard‑coded programmatic rules that the system must follow without deviation.
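The second and third layers can be sketched as a guarded execution step: hard rules are checked before an action runs, and the resulting interface state is compared with what was expected afterwards. This is a minimal sketch under stated assumptions, not Agivar's implementation. `Action`, `run_step`, `no_submit_with_empty_amount`, and `fake_execute` are hypothetical, and the training layer is not modeled at all.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, str]  # hypothetical snapshot of the interface


@dataclass
class Action:
    kind: str        # "click" or "type"
    target: str      # label of the UI element
    value: str = ""  # text to enter for "type" actions


# A hard rule returns an error message, or "" if the action is allowed.
Rule = Callable[[Action, State], str]


def no_submit_with_empty_amount(action: Action, state: State) -> str:
    """Illustrative rule: never click Submit while the amount field is empty."""
    if action.kind == "click" and action.target == "Submit" and not state.get("amount"):
        return "amount must be filled before Submit"
    return ""


def run_step(action: Action, state: State, rules: List[Rule],
             execute: Callable[[Action, State], State],
             expected: State) -> State:
    """Check rules, execute the action, then verify the observed state."""
    for rule in rules:
        error = rule(action, state)
        if error:
            raise RuntimeError(f"rule violation: {error}")
    new_state = execute(action, state)
    mismatches = {k: v for k, v in expected.items() if new_state.get(k) != v}
    if mismatches:
        raise RuntimeError(f"unexpected interface state: {mismatches}")
    return new_state


def fake_execute(action: Action, state: State) -> State:
    """Stand-in for real UI automation so the sketch runs on its own."""
    new_state = dict(state)
    if action.kind == "type":
        new_state[action.target] = action.value
    elif action.kind == "click" and action.target == "Submit":
        new_state["page"] = "confirmation"
    return new_state


rules = [no_submit_with_empty_amount]
state = run_step(Action("type", "amount", "120.00"), {}, rules,
                 fake_execute, expected={"amount": "120.00"})
state = run_step(Action("click", "Submit"), state, rules,
                 fake_execute, expected={"page": "confirmation"})
print(state)
```

Raising on a rule violation or state mismatch, rather than continuing, is the design choice that matters for finance or contract workflows: a failed run is preferable to a silently wrong one.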

The architecture mirrors the human nervous system: a large “brain” model handles understanding, task decomposition, planning, and exception handling, while a specialized “cerebellum” model performs UI recognition, mouse clicks, keyboard input, and high‑frequency actions. This dual‑model system runs on the Jittor (计图) deep‑learning framework, which the team developed in‑house, enabling tight control over inference scheduling and low‑latency execution.
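The division of labor between the two models can be sketched as two small components with a narrow interface between them. Everything here is a hypothetical illustration: the `Planner` and `Executor` classes, the label-to-coordinate `layout` dictionary standing in for UI recognition, and the example record are assumptions, not Agivar's design.

```python
from typing import Dict, List, Tuple

Step = Tuple[str, str, str]  # (kind, element label, value)


class Planner:
    """'Brain': turns a task and an input record into label-level steps."""

    def decompose(self, record: Dict[str, str]) -> List[Step]:
        steps: List[Step] = [("type", field, value) for field, value in record.items()]
        steps.append(("click", "Submit", ""))
        return steps


class Executor:
    """'Cerebellum': locates each element on screen and performs the action."""

    def __init__(self, layout: Dict[str, Tuple[int, int]]):
        self.layout = layout      # stand-in for a UI-recognition model
        self.log: List[str] = []  # stand-in for real mouse and keyboard events

    def perform(self, step: Step) -> None:
        kind, label, value = step
        x, y = self.layout[label]
        suffix = f" <- {value!r}" if value else ""
        self.log.append(f"{kind} {label!r} at ({x}, {y}){suffix}")


planner = Planner()
executor = Executor({"Name": (100, 200), "Submit": (100, 260)})
for step in planner.decompose({"Name": "Li Wei"}):
    executor.perform(step)
print("\n".join(executor.log))
```

Because the planner only emits labels, moving the Submit button changes the executor's `layout` but not the plan, which is one way to read the article's claim that the system adapts to UI changes.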

Agivar’s development is fully in‑house at Fei Shi Technology, whose core team includes Tsinghua University computer‑science PhDs and primary contributors to Jittor. Their previous product, the AI coding assistant Fitten Code, achieved over 1.5 million downloads and top ratings on major plugin platforms, demonstrating the team’s capability in large‑model development and product deployment.

By allowing users to teach AI through a single screen‑recording, Agivar removes the need to learn prompt engineering or alter existing habits, offering a more accessible path for ordinary workers to integrate AI into their daily workflows. The product is currently in public beta for Windows and macOS, with a download link provided.

Overall, the shift from prompting to demonstration‑based learning could democratize AI‑assisted automation, turning every employee into their own “AI FDE” and potentially sparking a broader efficiency revolution.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
