
How Xiaomi’s XiaoAI Harnesses Large Models for Intent Routing and Response Generation

This article explains how Xiaomi’s XiaoAI assistant integrates large language models for intent distribution, vertical intent understanding, and response generation, detailing the architecture, challenges such as knowledge requirements and sub‑200 ms latency, and the shift from prompt engineering to model fine‑tuning that boosted user retention by 10% and query satisfaction by 8%.

DataFunSummit

Oct 5, 2025


1. XiaoAI Overview

XiaoAI is a ubiquitous AI assistant with product lines such as suggestions, voice, vision, translation, and calls, running on phones, speakers, TVs, and Xiaomi cars.

Following the launch of ChatGPT, the team decided to rebuild XiaoAI using large models. The new architecture improved the product experience, raising next‑day active‑user retention by 10% and query satisfaction by 8%.

2. Large‑Model Intent Distribution

When a user query arrives, a large‑model intent‑distribution component determines the intent category and routes the query to a vertical agent specialized in that domain.
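The routing step described above can be sketched roughly as follows. This is a minimal illustration, not XiaoAI's implementation: the category names, the agent functions, and `classify_intent` are all hypothetical, and the keyword matching only stands in for the large-model classifier.

```python
from typing import Callable, Dict


def classify_intent(query: str) -> str:
    """Hypothetical stand-in for the large-model intent classifier.

    A real system would call a model here; keyword matching is used only
    so the sketch runs on its own.
    """
    q = query.lower()
    if "air-conditioner" in q or "light" in q:
        return "smart_home"
    if "settings" in q:
        return "device_control"
    return "chat"


# Hypothetical vertical agents, one per intent category.
def smart_home_agent(query: str) -> str:
    return f"[smart_home] handling: {query}"


def device_control_agent(query: str) -> str:
    return f"[device_control] handling: {query}"


def chat_agent(query: str) -> str:
    return f"[chat] handling: {query}"


AGENTS: Dict[str, Callable[[str], str]] = {
    "smart_home": smart_home_agent,
    "device_control": device_control_agent,
    "chat": chat_agent,
}


def route(query: str) -> str:
    """Classify the query, then hand it to the matching vertical agent."""
    category = classify_intent(query)
    # Fall back to the general chat agent for unknown categories.
    agent = AGENTS.get(category, chat_agent)
    return agent(query)
```

Calling `route("open settings")` dispatches to the device-control agent, while `route("open air-conditioner")` goes to the smart-home agent. Keeping the router and the agents behind a plain mapping means a new vertical can be added without touching the others.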

Separating routing from the vertical agents makes each model easier to iterate on and speeds up development. The routing model faces two main challenges:

The model must possess sufficient knowledge (e.g., distinguishing “open settings” from “open air‑conditioner”).

Latency must stay below 200 ms.
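One way to check the latency requirement during development is to time each routing call against the budget. The sketch below assumes nothing about XiaoAI's serving stack: `timed_call`, `within_budget`, and the two toy classifiers are hypothetical, and only the 200 ms figure comes from the article.

```python
import time
from typing import Callable, Tuple

LATENCY_BUDGET_MS = 200.0  # from the article: routing must stay below 200 ms


def timed_call(fn: Callable[[str], str], query: str) -> Tuple[str, float]:
    """Run fn(query) and return its result with the elapsed time in ms."""
    start = time.perf_counter()
    result = fn(query)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


def within_budget(elapsed_ms: float, budget_ms: float = LATENCY_BUDGET_MS) -> bool:
    """True if the call finished strictly inside the latency budget."""
    return elapsed_ms < budget_ms


# Hypothetical classifiers used only to exercise the timer.
def fast_classifier(query: str) -> str:
    return "device_control"


def slow_classifier(query: str) -> str:
    time.sleep(0.25)  # simulate a model call that takes about 250 ms
    return "device_control"
```

In a real deployment the measurement would wrap the model call itself, and percentile latency over many queries would matter more than any single timing.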

Initial attempts used prompt engineering with few‑shot examples, but encountered mismatched intents and poor instruction following, especially with sub‑billion‑parameter models.
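For illustration, a few-shot routing prompt of the kind described above might be assembled like this. The category list, the example pairs, and the prompt wording are assumptions made for the sketch; the article does not publish the actual prompt.

```python
from typing import List, Tuple

# Hypothetical intent categories and labelled examples.
INTENT_CATEGORIES: List[str] = ["device_control", "smart_home", "weather", "chat"]

FEW_SHOT_EXAMPLES: List[Tuple[str, str]] = [
    ("open settings", "device_control"),
    ("open air-conditioner", "smart_home"),
    ("what's the weather tomorrow", "weather"),
]


def build_few_shot_prompt(query: str) -> str:
    """Build a classification prompt: instruction, examples, then the query."""
    lines = [
        "Classify the user query into exactly one of these intent categories: "
        + ", ".join(INTENT_CATEGORIES)
        + ".",
        "",
    ]
    for example_query, intent in FEW_SHOT_EXAMPLES:
        lines.append(f"Query: {example_query}")
        lines.append(f"Intent: {intent}")
        lines.append("")
    # The model is expected to complete the final "Intent:" line.
    lines.append(f"Query: {query}")
    lines.append("Intent:")
    return "\n".join(lines)
```

Every request carries the full instruction and all examples, so the prompt grows with the number of intents covered, which works against a tight latency budget.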

3. Large‑Model Vertical Intent Understanding

Each vertical agent hosts its own intent‑understanding large model, focusing on domain‑specific intents, which simplifies training and improves accuracy.
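A vertical agent with its own narrow intent set could look something like the sketch below. The `SmartHomeAgent` class, its intent names, and the slot fields are hypothetical, and the regular expressions merely stand in for the agent's intent-understanding model.

```python
import re
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ParsedIntent:
    intent: str
    slots: Dict[str, str] = field(default_factory=dict)


class SmartHomeAgent:
    """Hypothetical vertical agent that only knows smart-home intents."""

    INTENTS = ("turn_on", "turn_off", "set_temperature", "unknown")

    def understand(self, query: str) -> ParsedIntent:
        # Stand-in for the agent's own model: simple patterns over the
        # lowercased query, checked from most to least specific.
        q = query.lower()
        match = re.search(r"set (?:the )?(.+?) to (\d+) degrees", q)
        if match:
            return ParsedIntent(
                "set_temperature",
                {"device": match.group(1), "value": match.group(2)},
            )
        match = re.search(r"(?:open|turn on) (?:the )?(.+)", q)
        if match:
            return ParsedIntent("turn_on", {"device": match.group(1)})
        match = re.search(r"(?:close|turn off) (?:the )?(.+)", q)
        if match:
            return ParsedIntent("turn_off", {"device": match.group(1)})
        return ParsedIntent("unknown")
```

Because the agent only has to tell apart a handful of domain intents, its label space stays small, which is the property the article credits for simpler training and better accuracy.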

4. Large‑Model Response Generation

To address the limitations of prompt engineering, the team fine‑tuned the large model on the defined intents and example queries. With the intents learned during fine‑tuning rather than spelled out in every prompt, token length dropped and the latency requirement was met.
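As a rough illustration of this shift, the sketch below turns intents and example queries into training records and compares the per-request prompt with a few-shot equivalent. The `prompt`/`completion` record layout, the sample intents, and the whitespace word count used as a token proxy are all assumptions; the article does not describe the team's data format or tokenizer.

```python
import json
from typing import Dict, List

# Hypothetical intents with example queries.
INTENT_EXAMPLES: Dict[str, List[str]] = {
    "device_control": ["open settings", "turn up the volume"],
    "smart_home": ["open air-conditioner", "turn off the bedroom light"],
}


def to_training_records(intent_examples: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """One supervised record per (query, intent) pair."""
    records = []
    for intent, queries in intent_examples.items():
        for query in queries:
            records.append(
                {"prompt": f"Query: {query}\nIntent:", "completion": f" {intent}"}
            )
    return records


def to_jsonl(records: List[Dict[str, str]]) -> str:
    """Serialize records as JSON Lines, one record per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


def few_shot_prompt(query: str, intent_examples: Dict[str, List[str]]) -> str:
    """Prompt that repeats every labelled example on each request."""
    blocks = [
        f"Query: {q}\nIntent: {intent}"
        for intent, queries in intent_examples.items()
        for q in queries
    ]
    blocks.append(f"Query: {query}\nIntent:")
    return "\n\n".join(blocks)


def fine_tuned_prompt(query: str) -> str:
    """After fine-tuning, only the query itself needs to be sent."""
    return f"Query: {query}\nIntent:"


def rough_token_count(text: str) -> int:
    """Whitespace word count, used here as a crude proxy for tokens."""
    return len(text.split())
```

With these toy examples the fine-tuned prompt is a small fraction of the few-shot one, and the gap widens as more intents and examples are added to the few-shot version.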

