Didi's ChatBI: Evolution, Exploration, and Future of AI‑Powered Business Intelligence
This article details Didi's journey since early 2023 in building ChatBI, covering the evolution of BI platforms, the technical advances behind intelligent BI such as LLM‑driven NL2SQL, two main product paths, practical implementations, key challenges, and future directions for AI‑enhanced data analysis.
Didi's team has worked on large‑model research since early 2023 with the aim of upgrading its data products, and several of the resulting milestones are now in production.
The presentation is divided into four parts: the evolution of the ABI (analytics and business intelligence) direction and the current state of ChatBI, Didi's own ChatBI exploration and practice, the future outlook for ChatBI, and a Q&A session.
BI products have progressed from report‑centric to self‑service and now to intelligent BI, with the core goal of lowering the barrier and cost for users to leverage data through AI‑driven insights.
Before 2023, enhanced analytics (intelligent charts, data interpretation, prediction, anomaly analysis, and attribution) relied on machine‑learning and rule‑engine techniques but attracted limited attention.
Since 2023, many companies have explored intelligent QA and intent recognition; the rise of LLMs enables natural‑language data interpretation, prediction, and anomaly analysis, and Didi integrated many pre‑existing enhanced‑analytics features into ChatBI.
Two main exploration paths for ChatBI are identified: (1) building on a traditional BI platform with a Copilot layer, which benefits from existing data sets but suffers from non‑standardized data; (2) adopting an AI‑native metric platform that standardizes indicators, offering clearer query semantics at the cost of higher infrastructure investment.
Industry observations show that ChatBI is still at an early, exploratory stage. Technically, user intent recognition is reliable, NL2SQL accuracy still needs improvement, and deep‑analysis model capabilities require significant enhancement. Standardized vertical scenarios reach deployment faster than non‑standard ones.
Didi's ChatBI practice builds on the evolution of its BI platform: from visual reporting to a one‑stop reporting platform, then to self‑service analysis, and finally to an intelligent analysis platform. The product "Shu Xiao Zhi" is available as a Copilot, a PC site, and a mobile IM client, with core functions such as data search, analysis, and SQL assistance.
Key technical considerations include improving NL2SQL accuracy (upgrading the LLM from 10B to 33B to 72B parameters, fine‑tuning on tens of thousands of examples, and running weekly online inspections), building a unified metric platform with over 10k standardized business metrics, and exposing those metrics via APIs.
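To make the idea of a standardized metric exposed through an API more concrete, here is a minimal sketch. It is not Didi's implementation: the `Metric` class, the `CATALOG` entry, the table name `dwd_orders`, and the function `build_metric_query` are all hypothetical names chosen for illustration. The point it shows is that once a metric has a single registered definition, a caller only names the metric and its dimensions, and the SQL is derived from the definition rather than written by hand or guessed by a model.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Metric:
    name: str                     # standardized metric identifier
    expression: str               # aggregation that defines the metric
    table: str                    # physical table backing the metric
    dimensions: Tuple[str, ...]   # dimensions the metric may be sliced by


# Hypothetical catalog entry; real metric names and tables are not given in the talk.
CATALOG: Dict[str, Metric] = {
    "completed_orders": Metric(
        "completed_orders", "COUNT(order_id)", "dwd_orders", ("city", "dt")
    ),
}


def build_metric_query(
    name: str,
    group_by: Sequence[str] = (),
    filters: Optional[Dict[str, str]] = None,
) -> Tuple[str, List[str]]:
    """Return a parameterized SQL string and its bind values for one metric."""
    metric = CATALOG.get(name)
    if metric is None:
        raise KeyError(f"unknown metric: {name}")
    filters = filters or {}
    # Reject any dimension the metric definition does not declare.
    for dim in list(group_by) + list(filters):
        if dim not in metric.dimensions:
            raise ValueError(f"{dim!r} is not a dimension of {name}")
    columns = list(group_by) + [f"{metric.expression} AS {metric.name}"]
    sql = f"SELECT {', '.join(columns)} FROM {metric.table}"
    if filters:
        # Filter values travel as bind parameters, never inside the SQL text.
        sql += " WHERE " + " AND ".join(f"{dim} = ?" for dim in filters)
    if group_by:
        sql += " GROUP BY " + ", ".join(group_by)
    return sql, list(filters.values())
```

Calling `build_metric_query("completed_orders", group_by=("city",), filters={"dt": "2024-01-01"})` yields one SQL string plus its bind values, and asking for an undeclared dimension raises an error instead of producing a plausible but wrong query.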
Product trust is enhanced through model refusal and clarification mechanisms, visual query filters, SQL visualization, and industry‑specific prompt support.
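A refusal and clarification mechanism can be pictured as a three‑way decision made before any query runs. The sketch below is only an illustration under stated assumptions: it uses plain string similarity from Python's `difflib` as a stand‑in for whatever confidence signal the real system uses, and the metric names, thresholds, and the function `decide` are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

# Hypothetical metric names; the talk does not list the real ones.
KNOWN_METRICS = ["completed orders", "cancelled orders", "gmv", "active drivers"]


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1]; a stand-in for a real confidence score."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def decide(
    mention: str,
    answer_threshold: float = 0.85,   # assumed value, not from the talk
    clarify_threshold: float = 0.5,   # assumed value, not from the talk
) -> Tuple[str, List[str]]:
    """Return ("answer" | "clarify" | "refuse", candidate metrics)."""
    scored = sorted(
        ((similarity(mention, m), m) for m in KNOWN_METRICS), reverse=True
    )
    best_score, best_metric = scored[0]
    if best_score >= answer_threshold:
        return "answer", [best_metric]          # confident single match
    candidates = [m for score, m in scored if score >= clarify_threshold]
    if candidates:
        return "clarify", candidates            # ask the user which one they meant
    return "refuse", []                         # nothing plausible: decline to guess
```

With these hypothetical settings, `decide("gmv")` answers directly, `decide("orders")` asks the user to choose between the two order metrics, and an unrelated mention is refused rather than mapped to the nearest metric.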
The NL2SQL workflow involves multi‑round question merging (≈98% accuracy), intent classification (≈96% accuracy), LLM processing (72B model), and finally DSL conversion and chart generation, achieving an overall end‑to‑end solution rate of about 85%.
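The stage figures above can be related with a little arithmetic. The sketch assumes, and the talk does not state this, that stages fail independently and that a question counts as solved only when every stage succeeds. Under that assumption the two quoted upstream accuracies cap the end‑to‑end rate at about 94%, and the reported 85% implies that the remaining stages (LLM generation, DSL conversion, and chart generation) together succeed roughly 90% of the time. The function name is hypothetical.

```python
import math
from typing import Sequence


def implied_remaining_accuracy(end_to_end: float, upstream: Sequence[float]) -> float:
    """Accuracy the remaining stages must jointly reach, assuming independent stages."""
    return end_to_end / math.prod(upstream)


merge_accuracy = 0.98     # multi-round question merging, as quoted above
intent_accuracy = 0.96    # intent classification, as quoted above
end_to_end_rate = 0.85    # overall solution rate, as quoted above

upstream_ceiling = merge_accuracy * intent_accuracy
remaining = implied_remaining_accuracy(
    end_to_end_rate, [merge_accuracy, intent_accuracy]
)

print(f"ceiling set by upstream stages: {upstream_ceiling:.4f}")
print(f"implied accuracy of remaining stages: {remaining:.3f}")
```

If the independence assumption holds even roughly, the largest remaining gain lies in the generation and conversion steps, not in merging or intent classification.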
Data asset standardization remains a long‑term challenge; non‑standard report datasets and Hive tables hinder accurate QA, necessitating a push for standardized metric collections.
Product habit formation focuses on flexible analysis touchpoints (e.g., dynamic filters, attribution analysis, field exploration) and deep value delivery through scenario‑based agents that incorporate business context.
Future directions emphasize layered product value beyond NL2SQL, deep analysis enabled by agent architectures, and the importance of embedding domain knowledge for targeted insights.
The Q&A covers infrastructure choices for rapid ChatBI deployment, the meaning of metric interpretation, and the rationale for converting NL2SQL results to DSL before final processing.
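The summary does not reproduce the DSL itself or the speaker's rationale, so the following is only a sketch of how an intermediate DSL can sit between model output and execution. It assumes a small JSON structure with `metric` and `group_by` fields; the field names, the allowed values, and the functions `parse_dsl` and `pick_chart` are hypothetical. What it illustrates is that a structured intermediate form can be checked against known metrics and dimensions before anything runs, and can also drive the chart choice mentioned in the workflow.

```python
import json
from typing import Any, Dict

# Hypothetical whitelist; the real metric and dimension names are not in the summary.
ALLOWED_METRICS = {"completed_orders", "gmv"}
ALLOWED_DIMENSIONS = {"city", "dt"}


def parse_dsl(raw: str) -> Dict[str, Any]:
    """Parse model output as JSON and reject anything outside the whitelist."""
    dsl = json.loads(raw)
    if dsl.get("metric") not in ALLOWED_METRICS:
        raise ValueError(f"unknown metric: {dsl.get('metric')!r}")
    unknown = set(dsl.get("group_by", [])) - ALLOWED_DIMENSIONS
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    return dsl


def pick_chart(dsl: Dict[str, Any]) -> str:
    """Choose a chart type from the query shape (an illustrative rule, not Didi's)."""
    group_by = dsl.get("group_by", [])
    if not group_by:
        return "single_value"   # one number, no breakdown
    if "dt" in group_by:
        return "line"           # a date dimension suggests a trend
    return "bar"                # categorical breakdown


query = parse_dsl('{"metric": "completed_orders", "group_by": ["dt"]}')
chart = pick_chart(query)
```

Because the structure is validated field by field, a hallucinated metric or dimension fails at `parse_dsl` with a clear error instead of reaching the warehouse as SQL.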
In summary, Didi's ChatBI combines AI, large‑model techniques, and metric standardization to advance intelligent business intelligence while acknowledging current technical and adoption challenges.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.