DataFunTalk
Feb 5, 2024 · Artificial Intelligence
Mobile-Agent: An Autonomous Multi‑Modal Mobile Device Agent with Visual Perception
The Mobile-Agent paper presents a vision-only, autonomous multi-modal agent for smartphones. It interprets user instructions, locates UI elements on the screen, and carries out multi-step tasks such as browsing, commenting, and content creation. It works through a defined operation space combined with self-planning and self-reflection mechanisms, and the paper reports high success rates across diverse Chinese and English scenarios.
Mobile Automation · autonomous operation · mobile agent
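The summary mentions three mechanisms: a defined operation space, self-planning, and self-reflection. The sketch below is one way such a loop might be organized, not the paper's implementation. Every name in it (`OPERATIONS`, `Step`, `AgentState`, `run_agent`, and the `observe`/`plan`/`execute` callbacks) is hypothetical, and the operation names are illustrative rather than the paper's actual list. Reflection is assumed here to mean two simple checks: the planned operation is outside the operation space, or the screen did not change after executing it.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical operation space: illustrative names only, not the paper's list.
OPERATIONS = {"open_app", "click", "type", "scroll", "back", "stop"}


@dataclass
class Step:
    operation: str
    argument: str = ""


@dataclass
class AgentState:
    instruction: str
    history: List[Step] = field(default_factory=list)  # steps that took effect
    reflections: int = 0  # times the agent noticed a failed or invalid step


def run_agent(
    instruction: str,
    observe: Callable[[], str],
    plan: Callable[[AgentState, str], Step],
    execute: Callable[[Step], None],
    max_steps: int = 10,
) -> AgentState:
    """Plan one step at a time, execute it, and reflect on the outcome."""
    state = AgentState(instruction)
    screen = observe()
    for _ in range(max_steps):
        # Self-planning: choose the next step from the instruction,
        # the history so far, and the current screen.
        step = plan(state, screen)
        if step.operation not in OPERATIONS:
            # Self-reflection (assumed check 1): reject operations
            # outside the operation space and plan again.
            state.reflections += 1
            continue
        if step.operation == "stop":
            state.history.append(step)
            break
        execute(step)
        new_screen = observe()
        if new_screen == screen:
            # Self-reflection (assumed check 2): the step had no visible
            # effect, so it is not recorded as progress.
            state.reflections += 1
        else:
            state.history.append(step)
        screen = new_screen
    return state
```

In a real system, `observe` would return a screenshot and `plan` would call a multi-modal model. Here both are plain callbacks so the control flow can be exercised with a simulated device.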