Xiaomi Agent Technology: Architecture, Prompt Management, and Evaluation
This article presents Xiaomi's work on LLM‑based Agent technology, covering its perception‑thinking‑action pipeline, technical framework, prompt management, executor and API platform, workflow, optimization strategies, evaluation metrics, and future directions for AI assistants.
The presentation introduces Xiaomi's Agent technology, describing how the system perceives multimodal inputs, performs reasoning with knowledge injection and memory recall, and executes actions via tool calls, forming a perception‑thinking‑action loop for the Xiao AI assistant.
It then details the technical framework, including an NLU parser, prompt construction, a memory bank, and an API platform that isolates large models from business logic, enabling modular plugin integration.
The Prompt Manager assembles five prompt parts (system, scenario, user request, summarized history, and output format), using JSON schemas and multi‑way retrieval to guide the model.
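As a rough illustration of that assembly step, the sketch below joins the five prompt parts in a fixed order and serializes the output format as a JSON schema. The class name `PromptParts`, the function `build_prompt`, the section titles, and the schema fields are all hypothetical; the talk summary does not specify Xiaomi's actual prompt layout, and the retrieval step is left out.

```python
import json
from dataclasses import dataclass


@dataclass
class PromptParts:
    """The five prompt parts named in the text (field names are assumptions)."""
    system: str
    scenario: str
    user_request: str
    history_summary: str
    output_schema: dict


def build_prompt(parts: PromptParts) -> str:
    """Join the non-empty parts into one prompt string, in a fixed order."""
    sections = [
        ("System", parts.system),
        ("Scenario", parts.scenario),
        ("User request", parts.user_request),
        ("Conversation summary", parts.history_summary),
        # The schema tells the model which JSON shape to emit.
        ("Output format (JSON schema)",
         json.dumps(parts.output_schema, indent=2, sort_keys=True)),
    ]
    return "\n\n".join(f"## {title}\n{body}" for title, body in sections if body)


# Hypothetical schema for a single tool call.
TOOL_CALL_SCHEMA = {
    "type": "object",
    "properties": {
        "tool": {"type": "string"},
        "arguments": {"type": "object"},
    },
    "required": ["tool", "arguments"],
}

if __name__ == "__main__":
    print(build_prompt(PromptParts(
        system="You are a helpful voice assistant.",
        scenario="The user is asking about the weather.",
        user_request="Will it rain in Beijing tomorrow?",
        history_summary="",  # empty parts are skipped
        output_schema=TOOL_CALL_SCHEMA,
    )))
```

Keeping the parts in a typed structure rather than one hand-written string makes it easier to swap a single part, such as the scenario text returned by retrieval, without touching the rest.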
The executor parses the model's JSON output, normalizes the data, and handles post‑processing, while the API platform registers plugins and provides a unified interface for tool invocation.
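The division of labor between the two components might look like the following minimal sketch. Everything here is an assumption made for illustration: the `ApiPlatform` class, the `parse_tool_call` function, the `tool`/`arguments` field names, and the toy `weather` plugin are not taken from Xiaomi's system.

```python
import json
from typing import Any, Callable, Dict


class ApiPlatform:
    """Hypothetical plugin registry with one uniform invoke() entry point."""

    def __init__(self) -> None:
        self._plugins: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str) -> Callable:
        def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
            self._plugins[name] = fn
            return fn
        return decorator

    def invoke(self, name: str, arguments: Dict[str, Any]) -> Any:
        if name not in self._plugins:
            raise KeyError(f"unknown tool: {name}")
        return self._plugins[name](**arguments)


def parse_tool_call(raw: str) -> Dict[str, Any]:
    """Extract and normalize the JSON tool call from raw model output."""
    # Models often wrap JSON in extra text, so keep only the outermost braces.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    call = json.loads(raw[start:end + 1])
    name = str(call["tool"]).strip().lower()  # normalize the tool name
    arguments = call.get("arguments") or {}
    if not isinstance(arguments, dict):
        raise ValueError("'arguments' must be a JSON object")
    return {"tool": name, "arguments": arguments}


platform = ApiPlatform()


@platform.register("weather")
def weather(city: str) -> str:
    # Stand-in for a real business API; returns canned text.
    return f"Forecast for {city}: sunny"


if __name__ == "__main__":
    raw_output = 'Sure. {"tool": " Weather ", "arguments": {"city": "Beijing"}}'
    call = parse_tool_call(raw_output)
    print(platform.invoke(call["tool"], call["arguments"]))
```

Because the model only ever emits a tool name and arguments, business logic stays behind `invoke()`, which is one way to read the isolation between large models and plugins described earlier.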
The Agent workflow is an iterative cycle of planning, tool selection, execution, summarization, and memory update that resolves user queries.
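The cycle can be sketched as a bounded loop in which a planner either picks a tool or returns a final answer, and each tool result is summarized into memory before the next planning step. The function `run_agent`, the decision dictionary format, and the scripted planner that stands in for the LLM are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Decision = Dict[str, Any]
Planner = Callable[[str, List[str]], Decision]


def run_agent(
    query: str,
    plan: Planner,
    tools: Dict[str, Callable[..., Any]],
    max_steps: int = 5,
) -> Tuple[Optional[str], List[str]]:
    """Plan, call a tool, record the result, and repeat until an answer."""
    memory: List[str] = []
    for _ in range(max_steps):
        decision = plan(query, memory)           # planning / tool selection
        if decision["type"] == "final":
            return decision["answer"], memory
        tool = tools[decision["tool"]]
        result = tool(**decision["arguments"])   # execution
        # "Summarization" here is just a one-line record of the step.
        memory.append(f"{decision['tool']} -> {result}")  # memory update
    return None, memory                          # step budget exhausted


def scripted_planner(query: str, memory: List[str]) -> Decision:
    """Stand-in for the LLM: search once, then answer from memory."""
    if not memory:
        return {"type": "tool", "tool": "search", "arguments": {"q": query}}
    return {"type": "final", "answer": f"Based on {memory[-1]}"}


def search(q: str) -> str:
    # Toy tool that returns canned text instead of calling a search engine.
    return f"top result for '{q}'"


if __name__ == "__main__":
    answer, trace = run_agent("best hiking trails", scripted_planner, {"search": search})
    print(answer)
    print(trace)
```

The `max_steps` cap is a design choice of this sketch: it keeps a planner that never converges from looping indefinitely.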
Optimization efforts include model size reduction, LoRA adapters for task‑specific experts, and speculative sampling to accelerate JSON‑heavy generation.
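To see why draft-and-verify decoding suits JSON-heavy output, consider the toy sketch below. It implements the greedy simplification of speculative decoding rather than full speculative sampling with probabilistic acceptance, and both "models" are lookup tables over a fixed token sequence. All names (`speculative_decode`, `target`, `draft`, the token lists) are hypothetical. A cheap draft model guesses the predictable structural tokens correctly, so the expensive target model only needs to correct the occasional value token.

```python
from typing import Callable, List, Sequence, Tuple

NextToken = Callable[[Sequence[str]], str]
EOS = "<eos>"


def speculative_decode(
    target: NextToken, draft: NextToken, prompt: Sequence[str],
    max_new: int, k: int = 4,
) -> Tuple[List[str], int]:
    """Greedy draft-and-verify decoding; returns (new tokens, verify rounds)."""
    out = list(prompt)
    generated, rounds, done = 0, 0, False
    while not done and generated < max_new:
        rounds += 1
        # 1) The cheap draft model proposes k tokens autoregressively.
        proposal: List[str] = []
        for _ in range(k):
            proposal.append(draft(out + proposal))
        # 2) The target checks them. A real system scores all k positions in
        #    one batched forward pass; this toy calls target() per position.
        for i in range(k + 1):
            expected = target(out)
            out.append(expected)  # always emit the target's own token
            generated += 1
            if expected == EOS or generated >= max_new:
                done = True
                break
            if i == k or proposal[i] != expected:
                break  # bonus token after k accepts, or first mismatch
    return out[len(prompt):], rounds


# Toy "models": the draft gets JSON structure right but guesses values wrong.
TARGET_SEQ = ["{", '"tool"', ":", '"weather"', ",", '"arguments"', ":", "{",
              '"city"', ":", '"Beijing"', "}", "}", EOS]
DRAFT_SEQ = [t if t not in ('"weather"', '"Beijing"') else '"?"' for t in TARGET_SEQ]


def target(prefix: Sequence[str]) -> str:
    return TARGET_SEQ[len(prefix)] if len(prefix) < len(TARGET_SEQ) else EOS


def draft(prefix: Sequence[str]) -> str:
    return DRAFT_SEQ[len(prefix)] if len(prefix) < len(DRAFT_SEQ) else EOS


if __name__ == "__main__":
    tokens, rounds = speculative_decode(target, draft, [], max_new=32)
    print(tokens, rounds)
```

The output is identical to decoding with the target alone, because every emitted token is the target's own choice; the potential saving comes from needing fewer verification rounds than tokens.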
Evaluation metrics such as success rate, relative efficiency, and successful‑step ratio are introduced, along with benchmark results on real‑world tasks like search‑based Q&A, travel planning, and complex comparison queries.
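The summary names these metrics without defining them, so the sketch below uses plausible definitions that should be read as assumptions: success rate is the fraction of tasks solved, successful‑step ratio is the fraction of all executed steps that succeeded, and relative efficiency is the mean ratio of a reference step count to the steps actually taken on solved tasks. The `Episode` fields and the sample numbers are illustrative, not results from the talk.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Episode:
    """One evaluated task (all field names are assumptions)."""
    success: bool          # did the agent solve the task?
    steps_taken: int       # tool calls the agent made
    steps_succeeded: int   # tool calls that returned a usable result
    reference_steps: int   # steps in a reference (e.g. human) solution


def success_rate(episodes: List[Episode]) -> float:
    return sum(e.success for e in episodes) / len(episodes)


def successful_step_ratio(episodes: List[Episode]) -> float:
    total = sum(e.steps_taken for e in episodes)
    return sum(e.steps_succeeded for e in episodes) / total


def relative_efficiency(episodes: List[Episode]) -> float:
    """Mean reference/actual step ratio over solved tasks (assumed definition)."""
    solved = [e for e in episodes if e.success and e.steps_taken > 0]
    if not solved:
        return 0.0
    return sum(e.reference_steps / e.steps_taken for e in solved) / len(solved)


if __name__ == "__main__":
    # Made-up episodes, only to exercise the functions.
    episodes = [
        Episode(success=True, steps_taken=4, steps_succeeded=4, reference_steps=4),
        Episode(success=True, steps_taken=8, steps_succeeded=6, reference_steps=4),
        Episode(success=False, steps_taken=5, steps_succeeded=2, reference_steps=3),
    ]
    print(success_rate(episodes))
    print(successful_step_ratio(episodes))
    print(relative_efficiency(episodes))
```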
Future directions focus on tighter system integration, cross‑domain and cross‑device collaboration, and extending the agent to multimodal robot platforms.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.