How Alibaba’s MRC Platform Powers Smart Q&A Across Industries
This article details Alibaba’s Machine Reading Comprehension (MRC) platform, covering its background, real‑world deployments in e‑commerce, government and IoT, the technical challenges of large‑scale, multi‑modal, multi‑language reading, and the solutions—from model architecture to productization—that enable intelligent question answering at enterprise scale.
Background
Over the course of human civilization, data has become the next strategic resource after materials and energy. McKinsey predicts that knowledge automation will be a major disruptive technology. Yet building a question‑answering system typically requires massive manual effort to create knowledge bases, which is costly and does not scale.
Machine Reading in Xiaomi
Since 2017, the Xiaomi team at Alibaba DAMO Academy (Xiaomi here refers to Alibaba's customer‑service assistant, also known as AliMe, not the phone maker) has focused on Machine Reading Comprehension (MRC) and launched the first industrial MRC application. Its SLQA model ranked first on SQuAD and surpassed human exact‑match (EM) performance, and the technology was then applied to reading Chinese business data such as activity rules and tax regulations. Subsequent deployments cover manuals, retail, tourism, government, and other domains, and handle million‑scale document collections, long texts, multiple languages, multiple modalities, and multi‑turn dialogue.
Alibaba Xiaomi & Store Xiaomi Activity Q&A
Machine reading powers activity rule interpretation for e‑commerce events, reducing manual effort. The solution has been used for Double‑11 and supports thousands of merchant activities.
Store Xiaomi Detail‑Page Image Q&A
Image‑based intelligent Q&A combines OCR, image tagging and captioning to extract answers from product pictures, improving user experience and increasing merchant conversion across multiple industries.
Tmall Flagship Store 2.0 Appliance Manual Q&A
PDF manual Q&A extracts answers from large‑scale appliance manuals, helping users quickly resolve after‑sale issues across more than 20 official flagship stores.
Hema Pre‑Sale Xiaomi Product Detail Q&A
For high‑price items like seafood, the pre‑sale Xiaomi retrieves answers from product images and encyclopedic articles, improving satisfaction and conversion without requiring manual FAQ maintenance.
Fliggy Travel Assistant Scenic Spot Q&A
The travel assistant uses MRC to answer questions from millions of travel articles, achieving zero‑shot model cold‑start for the Fliggy project.
Lazada Multi‑Language Machine Reading
A multilingual MRC model supports English, Indonesian, Vietnamese, Thai and other languages for Southeast Asian e‑commerce activities, enabling a single model to handle diverse language inputs.
AliOS Car Manual Q&A
The car assistant extracts answers from vehicle manuals, supporting functions, fault diagnosis, maintenance, and usage tips through multimodal voice‑image interaction.
Zhejiang Government Million‑Scale Regulation Reading
MRC answers citizen queries from two million government service articles without pre‑configured FAQs, dramatically improving service efficiency.
Party‑Group Service Center Long‑Form Article Reading
A 5‑minute robot answers questions from party history and policy documents exceeding 30,000 words, achieving high accuracy with minimal annotation.
Challenges in Business Scenarios
Academic MRC focuses on limited datasets, while real‑world applications face data annotation difficulty, large‑scale multi‑document retrieval, heterogeneous data formats, multimodal sources, multi‑turn dialogue, cross‑language requirements, cold‑start with few labels, answer refusal control, and FAQ‑MRC integration.
How to lower application thresholds and construction costs
How to handle massive text reading
How to avoid semantic loss in dialogue scenarios
Balancing accuracy and efficiency
How to quickly support multiple languages
How to build a data closed‑loop for rapid iteration
MRC Platform Technical Overview
4.1 Multi‑Turn Machine Reading with Context
Four solutions are implemented: entity inheritance, flow_ops end‑to‑end, query‑rewrite, and pre‑trained language‑model based approaches. These improve answer rates by up to 12% in activity‑zone scenarios.
Entity Inheritance
Missing entities in a query are inherited from the previous turn, enabling basic multi‑turn recall.
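The inheritance rule above can be sketched in a few lines. This is a minimal illustration, not the platform's implementation: the entity lexicon, the substring matching, and the `inherit_entity` helper are all assumptions made for the example, and a production system would more likely rely on an NER model or a product catalog.

```python
from typing import Optional

# Hypothetical entity lexicon, used only for this illustration.
KNOWN_ENTITIES = {"coupon", "red envelope", "deposit"}


def find_entity(text: str) -> Optional[str]:
    """Return a known entity mentioned in the text (checked in lexicon order), if any."""
    lowered = text.lower()
    for entity in sorted(KNOWN_ENTITIES):
        if entity in lowered:
            return entity
    return None


def inherit_entity(previous_query: str, current_query: str) -> str:
    """Prepend the previous turn's entity when the current query mentions none."""
    if find_entity(current_query) is not None:
        return current_query  # the query is already self-contained
    previous_entity = find_entity(previous_query)
    if previous_entity is None:
        return current_query  # nothing to inherit
    return f"{previous_entity}: {current_query}"


if __name__ == "__main__":
    print(inherit_entity("How do I get a coupon?", "When does it expire?"))
```

The completed query can then be passed to a single‑turn reader unchanged, which is why this approach works as a simple baseline for multi‑turn recall.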
Flow Ops End‑to‑End Model
Integrating FlowQA‑style flow_ops into SLQA adds context‑aware answer extraction.
Query‑Rewrite Scheme
An encoder‑decoder model rewrites each multi‑turn query into a self‑contained one, improving downstream performance.
Pre‑trained Language Model Scheme
RoBERTa is used as the backbone; K previous QA pairs are concatenated with the current query and document for span extraction and rationale tagging.
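A rough sketch of how such an input might be assembled is shown here. The separator string, the ordering of the parts, and the `build_input` helper are assumptions for illustration; the article does not specify the exact layout, and the real special tokens depend on the tokenizer in use.

```python
from typing import List, Tuple

# Assumed separator string; actual special tokens depend on the tokenizer.
SEP = " </s> "


def build_input(history: List[Tuple[str, str]], query: str, document: str, k: int = 2) -> str:
    """Concatenate the last k (question, answer) pairs with the current query and document."""
    # Guard k == 0 explicitly, because history[-0:] would return the whole list.
    recent = history[-k:] if k > 0 else []
    parts: List[str] = []
    for question, answer in recent:
        parts.append(question)
        parts.append(answer)
    parts.append(query)
    parts.append(document)
    return SEP.join(parts)


if __name__ == "__main__":
    turns = [("q1", "a1"), ("q2", "a2"), ("q3", "a3")]
    print(build_input(turns, "q4", "doc", k=2))
```

Limiting the history to K turns keeps the sequence within the model's length budget while still exposing the most recent context to the span extractor.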
4.2 Multimodal Machine Reading Exploration
Modal conversion (OCR, image tagging, captioning) normalizes different modalities to text, while modal fusion integrates visual features (e.g., bounding boxes) into language models.
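Modal conversion can be pictured as flattening the outputs of vision components into a passage that a text‑only reader can consume. The sketch below assumes the OCR, tagging, and captioning results are already available; the `ImageSignals` container, its field names, and the passage layout are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ImageSignals:
    """Outputs of upstream vision components (hypothetical placeholders)."""
    ocr_lines: List[str] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)
    caption: str = ""


def image_to_passage(signals: ImageSignals) -> str:
    """Flatten caption, tags, and OCR text into one passage for a text-only MRC model."""
    parts: List[str] = []
    if signals.caption:
        parts.append(f"Caption: {signals.caption}")
    if signals.tags:
        parts.append("Tags: " + ", ".join(signals.tags))
    if signals.ocr_lines:
        parts.append("Text in image: " + " ".join(signals.ocr_lines))
    return "\n".join(parts)


if __name__ == "__main__":
    example = ImageSignals(
        ocr_lines=["Net weight 500 g"], tags=["seafood"], caption="A box of crab"
    )
    print(image_to_passage(example))
```

Modal fusion, by contrast, would feed visual features into the language model directly and is not covered by this sketch.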
4.3 Multilingual & Cross‑Language Machine Reading
Initially, multilingual word embeddings and a shared encoder mapped different languages into a common semantic space. Later, multilingual BERT (Multi‑BERT), combined with domain transfer and mixed‑language training, enabled a single model to handle Chinese, English, Indonesian, Vietnamese, and other languages.
4.4 Large‑Scale Multi‑Document Machine Reading
The pipeline consists of coarse document recall (BM25/Anserini), document re‑ranking (StructBERT), and answer extraction (cascade model). This approach achieved SOTA on MS‑MARCO and DuReader tasks.
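The first stage of this pipeline, coarse recall, can be illustrated with a small self‑contained BM25 scorer. This is a sketch under stated assumptions rather than the platform's code: it uses whitespace tokenization, default parameters `k1=1.5` and `b=0.75`, and a common non‑negative IDF variant, whereas the article's system relies on Anserini. The neural re‑ranking and answer‑extraction stages are not shown.

```python
import math
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokenization (an assumption; real systems use proper analyzers)."""
    return text.lower().split()


def bm25_scores(query: str, docs: List[str], k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score every document against the query with BM25."""
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    query_terms = tokenize(query)
    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in term_freq:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            tf = term_freq[term]
            norm = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


def recall_top_k(query: str, docs: List[str], k: int = 2) -> List[int]:
    """Return the indices of the k highest-scoring documents."""
    scores = bm25_scores(query, docs)
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    corpus = [
        "the coupon expires on november 11",
        "shipping takes three days",
        "coupon rules for the store",
    ]
    print(recall_top_k("when does the coupon expire", corpus, k=2))
```

Only the small candidate set returned by this stage is passed to the more expensive re‑ranker and reader, which is what makes million‑scale collections tractable.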
4.5 Leveraging Multi‑Domain & Multi‑Task Learning
Parameter‑sharing strategies (fully, partially, inter/intra‑domain) and dynamic OOV expansion reduce annotation needs by up to 50% when transferring between domains such as retail promotions and telecom packages.
4.6 Language‑Model‑Based Technical Practices
Pre‑training on large unlabeled corpora (BERT, RoBERTa) followed by task‑specific fine‑tuning dramatically improves MRC performance. Model compression techniques such as pruning, knowledge distillation (e.g., TinyBERT), and cross‑layer parameter sharing (e.g., ALBERT) reduce model size and latency for online services.
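To make the distillation idea concrete, the sketch below computes the generic soft‑label distillation loss: the KL divergence between temperature‑softened teacher and student distributions, scaled by the squared temperature. It is an illustration only. The temperature value is an assumption, and this is not TinyBERT's full objective, which also matches intermediate representations.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    teacher_logits: List[float], student_logits: List[float], temperature: float = 2.0
) -> float:
    """KL(teacher || student) on softened distributions, scaled by temperature squared."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


if __name__ == "__main__":
    print(distillation_loss([3.0, 1.0, 0.2], [2.0, 1.5, 0.1]))
```

In practice this term is usually combined with the ordinary supervised loss on labeled answers, so the smaller student learns from both the teacher's soft predictions and the ground truth.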
4.7 Domain Data & Model Consolidation
Extensive domain data (retail, encyclopedia, regulations, tourism, etc.) and pre‑trained domain models enable rapid incremental training for new businesses.
From Technology to Productization
Initially, each business required dedicated manpower for model development, training, testing, and launch. As the platform matured, a self‑service MRC middle‑platform was built, offering domain model selection, data annotation, training, and deployment, with an automatic feedback loop for continuous iteration.
5.1 New Business Onboarding
Domain model selection reduces onboarding time from weeks to days, and data annotation costs drop dramatically.
5.2 Automatic Feedback & Iteration
Active learning, user feedback, and unsupervised distillation form a closed loop that automatically retrains, evaluates, and deploys models without manual intervention.
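The active‑learning part of this loop is often implemented as uncertainty sampling: queries the model answers with the lowest confidence are routed to annotators first. The sketch below shows that selection step only. The `(query, confidence)` input format and the `select_for_annotation` helper are assumptions, not the platform's actual interface.

```python
from typing import List, Tuple


def select_for_annotation(predictions: List[Tuple[str, float]], budget: int) -> List[str]:
    """Pick up to `budget` queries whose top answer has the lowest confidence."""
    if budget <= 0:
        return []
    ranked = sorted(predictions, key=lambda item: item[1])
    return [query for query, _ in ranked[:budget]]


if __name__ == "__main__":
    logged = [
        ("when does the coupon expire", 0.91),
        ("can i stack two discounts", 0.23),
        ("is the deposit refundable", 0.48),
    ]
    print(select_for_annotation(logged, budget=2))
```

The newly labeled examples would then be added to the training set before the next automatic retraining and evaluation cycle.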
Development Timeline
2017: Academic research to industrial deployment (activity Q&A, tax interpretation).
2018: Expansion to multiple domains, multi‑language, multi‑modal, and full‑stack QA pipeline.
2019: Platformization, productization, and large‑scale multi‑document reading.
Technical Achievements
2017: First industrial deployment for activity Q&A.
2018: SQuAD v1.1 EM 82.44 (human‑level), TriviaQA first place, SQuAD v2.0 first place.
2018: Baidu DuReader Q&A first place, sub‑10 ms latency for activity rule Q&A.
2018: MS‑MARCO Q&A first place.
2019: MS‑MARCO Passage Retrieval and Q&A records broken.
2019: Multimodal reading launched for 7 industries, supporting Double‑11.
Publications in ACL 2018, AAAI 2019, EMNLP 2019, CIKM 2019, AAAI 2020.
Conclusion and Outlook
After three years, MRC technology is deployed across activity rules, regulations, product encyclopedias, and manuals, with transfer learning, multi‑task learning, and large‑scale pre‑training enhancing performance. Future work will focus on deeper multimodal understanding, reasoning, and expanding the platform to new domains, turning intelligent information retrieval into a foundational infrastructure for AI‑driven services.
References
Wang et al. 2018. Multi‑Granularity Hierarchical Attention Fusion Networks for Reading Comprehension and Question Answering.
Rajpurkar et al. 2016. SQuAD: 100,000+ Questions for Machine Comprehension of Text.
Reddy et al. 2018. CoQA: A Conversational Question Answering Challenge.
Bajaj et al. 2016. MS MARCO: A Human Generated Machine Reading Comprehension Dataset.
He et al. 2017. DuReader: a Chinese Machine Reading Comprehension Dataset from Real‑World Applications.
Nogueira et al. 2019. Document Expansion by Query Prediction.
Wang et al. 2019. StructBERT: Incorporating Language Structures into Pre‑training for Deep Language Understanding.
Devlin et al. 2018. BERT: Pre‑training of Deep Bidirectional Transformers for Language Understanding.
Yan et al. 2018. A Deep Cascade Model for Multi‑Document Reading Comprehension.
Huang et al. 2018. FlowQA: Grasping Flow in History for Conversational Machine Comprehension.
Liu et al. 2019. RoBERTa: A Robustly Optimized BERT Pretraining Approach.
Machine Learning Department Carnegie Mellon University: Multimodal Deep Learning Course Slide.
Joshi et al. 2019. SpanBERT: Improving Pre‑training by Representing and Predicting Spans.
Jiao et al. 2019. TinyBERT: Distilling BERT for Natural Language Understanding.
Lan et al. 2019. ALBERT: A Lite BERT for Self‑supervised Learning of Language Representations.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.