Technical Exploration of Intelligent Dialogue Robots in Didi Ride-Hailing Scenarios
The talk presents Didi AI Labs' research on intelligent dialogue robots for ride‑hailing, covering single‑turn QA, multi‑turn conversation, multi‑task learning architectures, model experiments, active learning pipelines, and the overall system design that integrates intent detection, slot extraction, dialogue management, and response generation.
Delivered at the 2019 AI Science Frontier Conference, the talk is organized around three topics: single‑turn QA, multi‑turn conversation, and the overall system architecture.
Single‑turn QA aims to accurately recognize user questions and provide appropriate answers, but faces challenges such as limited annotated data, numerous business lines (over ten), and diverse user expressions.
To address data scarcity, the team investigated transferring data between similar business lines (e.g., fast‑car and premium‑car) and proposed a Multi‑Task Learning framework: a semantic model is shared across all lines, while each line keeps its own line‑specific model that is fine‑tuned on that line's data. Several model backbones (CNN, LSTM, Transformer, BERT) were evaluated.
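The shared-plus-specific layout can be illustrated with a minimal sketch. It is not the model from the talk: the encoder here is a toy average of hashed token embeddings standing in for a CNN/LSTM/Transformer/BERT backbone, the weights are random and untrained, and the label names are hypothetical. It only shows how one encoder can serve several business lines while each line keeps its own output head.

```python
import random


def stable_bucket(token, buckets):
    # Deterministic hash so results do not depend on Python's hash seed.
    return sum(ord(ch) for ch in token) % buckets


class SharedEncoder:
    """Shared semantic model: mean of hashed token embeddings (toy stand-in)."""

    def __init__(self, dim=8, buckets=64, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        self.buckets = buckets
        self.table = [[rng.uniform(-1, 1) for _ in range(dim)]
                      for _ in range(buckets)]

    def encode(self, text):
        tokens = text.lower().split()
        vec = [0.0] * self.dim
        for tok in tokens:
            row = self.table[stable_bucket(tok, self.buckets)]
            vec = [v + r for v, r in zip(vec, row)]
        return [v / len(tokens) for v in vec] if tokens else vec


class LineHead:
    """Line-specific output layer over one business line's label set."""

    def __init__(self, labels, dim, seed):
        rng = random.Random(seed)
        self.labels = list(labels)
        self.weights = [[rng.uniform(-1, 1) for _ in range(dim)]
                        for _ in self.labels]

    def top_k(self, vec, k):
        scores = [sum(w * x for w, x in zip(row, vec)) for row in self.weights]
        ranked = sorted(zip(scores, self.labels), reverse=True)
        return [label for _, label in ranked[:k]]


class MultiTaskClassifier:
    def __init__(self, labels_by_line, dim=8):
        self.encoder = SharedEncoder(dim=dim)  # one encoder for all lines
        self.heads = {line: LineHead(labels, dim, seed=i + 1)
                      for i, (line, labels) in enumerate(labels_by_line.items())}

    def predict(self, line, text, k=3):
        # Every line goes through the same encoder, then its own head.
        return self.heads[line].top_k(self.encoder.encode(text), k)


# Hypothetical label sets for two similar business lines.
model = MultiTaskClassifier({
    "fast-car": ["fare_dispute", "lost_item", "cancel_order"],
    "premium-car": ["fare_dispute", "invoice", "driver_feedback", "lost_item"],
})
print(model.predict("fast-car", "I left my phone in the car", k=3))
```

In a trained version, gradients from every line's examples would update the shared encoder, which is how a data-poor line can benefit from a data-rich one.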
Experimental results show that adding Multi‑Task learning improves Top‑1 and Top‑3 accuracy, especially for BERT, while complex models may overfit when data is insufficient.
Beyond classification, a search + semantic matching + ranking pipeline was built for emotion‑support scenarios, leveraging the DUA model (Deep Utterance Aggregation) to incorporate dialogue context.
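A search, match, and rank flow of this kind might look like the sketch below. The matcher is a plain bag-of-words cosine similarity used as a stand-in; it is not DUA, and the corpus, the `context_weight` parameter, and the function names are assumptions made for illustration. The point is only the shape of the pipeline: recall candidates cheaply, score them against both the query and the dialogue context, then return the best one.

```python
import math
from collections import Counter


def bow(text):
    # Bag-of-words vector as a token -> count mapping.
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query, corpus, limit=10):
    # Search step: keep (question, answer) pairs sharing a token with the query.
    q_tokens = set(bow(query))
    hits = [pair for pair in corpus if q_tokens & set(bow(pair[0]))]
    return hits[:limit]


def rank(context, query, candidates, context_weight=0.3):
    # Matching + ranking step: blend query similarity with context similarity.
    q_vec = bow(query)
    c_vec = bow(" ".join(context))
    scored = []
    for question, answer in candidates:
        cand = bow(question)
        score = ((1 - context_weight) * cosine(q_vec, cand)
                 + context_weight * cosine(c_vec, cand))
        scored.append((score, answer))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [answer for _, answer in scored]


def respond(context, query, corpus):
    ranked = rank(context, query, retrieve(query, corpus))
    return ranked[0] if ranked else None


# Hypothetical corpus of (stored question, reply) pairs.
CORPUS = [
    ("the driver was rude to me", "We are sorry about the driver's behavior."),
    ("i was charged too much", "We will review the fare for you."),
    ("the driver took a long route", "We will check the route and the fare."),
]
print(respond(["the trip was short"], "i think i was charged too much", CORPUS))
```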
The multi‑turn dialogue framework consists of three layers:
Intent Recognition – implemented with BERT + Multi‑Task Learning.
Slot Extraction – a hybrid rule‑based and model approach (BiLSTM + CRF).
Dialogue Management – state tracking, policy selection (state‑machine and reinforcement‑learning based), and response generation.
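The three layers can be sketched end to end as follows. Everything here is a simplified assumption: intent recognition is a keyword check standing in for the BERT classifier, slot extraction shows only the rule-based half of the hybrid approach (no BiLSTM + CRF), the policy is a small state machine, and the intent and slot names are hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical intent and the slots it needs before it can be handled.
REQUIRED_SLOTS = {"cancel_fee_dispute": ["order_type", "order_date"]}


def detect_intent(utterance):
    # Layer 1: keyword stand-in for a learned intent classifier.
    return "cancel_fee_dispute" if "cancel" in utterance.lower() else None


def extract_slots(utterance):
    # Layer 2: rule-based slot extraction with regular expressions.
    text = utterance.lower()
    slots = {}
    match = re.search(r"\b(real-time|scheduled)\b", text)
    if match:
        slots["order_type"] = match.group(1)
    match = re.search(r"\b(today|yesterday)\b", text)
    if match:
        slots["order_date"] = match.group(1)
    return slots


@dataclass
class DialogueState:
    intent: Optional[str] = None
    slots: dict = field(default_factory=dict)


def step(state, utterance):
    # Layer 3: track state, then pick the next action with a state machine.
    state.intent = detect_intent(utterance) or state.intent
    state.slots.update(extract_slots(utterance))
    if state.intent is None:
        return "How can I help you?"
    missing = [s for s in REQUIRED_SLOTS[state.intent] if s not in state.slots]
    if missing:
        return f"Could you tell me the {missing[0].replace('_', ' ')}?"
    return f"Handling {state.intent} with {state.slots}"


state = DialogueState()
print(step(state, "I was charged a cancel fee"))
print(step(state, "It was a scheduled order yesterday"))
```

The intent persists across turns, so the second utterance only has to supply the missing slots.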
When user intent is ambiguous, an Intelligent Questioning module uses knowledge‑graph queries and guided clarification (e.g., asking whether the order is real‑time or scheduled) to obtain precise information.
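One way to picture graph-guided clarification is a lookup that either resolves to an answer node or returns the question attached to the unresolved attribute. The graph layout, intent name, and placeholder answers below are hypothetical and far simpler than a production knowledge graph.

```python
# Hypothetical toy graph: an ambiguous intent branches on one attribute.
GRAPH = {
    "cancellation_fee": {
        "attribute": "order_type",
        "question": "Was the order real-time or scheduled?",
        "children": {
            "real-time": "<answer for real-time orders>",
            "scheduled": "<answer for scheduled orders>",
        },
    },
}


def answer_or_clarify(intent, known):
    """Return ("answer", text), ("clarify", question), or ("fallback", None)."""
    node = GRAPH.get(intent)
    if node is None:
        return ("fallback", None)
    value = known.get(node["attribute"])
    if value in node["children"]:
        return ("answer", node["children"][value])
    # The attribute is unknown or unrecognized, so ask the guiding question.
    return ("clarify", node["question"])


print(answer_or_clarify("cancellation_fee", {}))
print(answer_or_clarify("cancellation_fee", {"order_type": "scheduled"}))
```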
Chit‑chat handling (greetings, thanks, etc.) is currently based on classification and retrieval models, with ongoing exploration of generative approaches for more flexible responses.
The overall robot architecture routes user requests through a frontend that aggregates query and context, then a ranker selects among answer types (QA, task‑oriented, multi‑turn, chit‑chat, graph‑based) to deliver the final response.
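The routing idea can be sketched as a set of answer modules that each return a scored candidate, with a ranker choosing among them. The modules, their canned replies, and the hand-set scores are assumptions for illustration; the talk does not specify how the ranker scores candidates.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    query: str
    context: list = field(default_factory=list)


@dataclass
class Candidate:
    source: str
    text: str
    score: float


def qa_module(req):
    hit = "fee" in req.query.lower()
    return Candidate("qa", "Here is how fees are calculated.", 0.8 if hit else 0.1)


def chitchat_module(req):
    hit = any(w in req.query.lower() for w in ("hello", "thanks", "thank you"))
    return Candidate("chit-chat", "Happy to help!", 0.9 if hit else 0.2)


def multi_turn_module(req):
    # Only competes when there is earlier dialogue to continue.
    score = 0.5 if req.context else 0.0
    return Candidate("multi-turn", "Could you tell me more about the order?", score)


MODULES = [qa_module, chitchat_module, multi_turn_module]


def frontend(query, history):
    # Aggregate the current query with the most recent turns of context.
    return Request(query=query.strip(), context=list(history[-3:]))


def route(query, history=()):
    req = frontend(query, history)
    candidates = [module(req) for module in MODULES]
    # Ranker: pick the candidate with the highest score.
    return max(candidates, key=lambda c: c.score)


print(route("thanks a lot").source)
print(route("why is there a fee?").source)
```

Keeping each answer type behind the same `Candidate` interface means a new module (for example, a graph-based one) can be added to `MODULES` without touching the ranker.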
The presented solutions support several Didi services (taxi, fast‑car, premium‑car), driver assistants, and international customer service, and together form a comprehensive AI‑driven intelligent customer service system.
DataFunTalk
