
Intelligent Question Answering Technology in Baidu Search: Development, Modeling, and Retrieval‑Enhanced Generation

The article surveys Baidu Search’s intelligent question‑answering system, tracing its evolution from feature‑engineered retrieval to large pre‑trained and generative models, and detailing hierarchical readers, multi‑teacher distillation, retrieval‑enhanced generation, and instruction decomposition as key techniques for delivering fast, accurate, citation‑rich answers.

Baidu Tech Salon

This article introduces the intelligent question answering (QA) technology deployed in Baidu Search, covering the evolution of machine QA, current applications, and future research directions.

What is Machine QA? Machine QA enables a software system to automatically answer descriptive human questions. For example, entering a natural‑language query about a TV program into Baidu Search returns the answer directly in the top result, without needing to click through web links.

Machine QA differs from traditional keyword‑based search by providing direct answers, which improves information‑retrieval efficiency. Approximately 40% of search demand and 30% of dialogue demand are related to machine QA.

Development Timeline of Machine QA

Before 2013: Feature‑engineering approaches (e.g., BM25) using lexical matching between question and candidate answers.

2014‑2015: Deep‑learning models (CNN, RNN) compute semantic similarity.

2016‑2017: Attention‑based networks capture deeper semantic relations.

2018‑2021: Large pre‑trained models (e.g., BERT, ERNIE) are fine‑tuned for complex matching tasks.

Since 2022: Focus shifts to generative models.
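The lexical-matching baseline from the pre-2013 era can be sketched with a plain BM25 scorer. This is a minimal pure-Python illustration of the ranking function, not Baidu's production retrieval code; `k1` and `b` are the standard BM25 hyperparameters.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query terms with BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    # document frequency of each query term across the collection
    df = {t: sum(1 for d in docs if t in d) for t in query_terms}
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for t in query_terms:
            if tf[t] == 0:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            norm = tf[t] + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

A document containing a query term scores positive; one with no overlap scores zero, which is exactly the brittleness that motivated the move to semantic matching.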

Dataset Evolution

2013: MCTest (multiple‑choice & cloze).

2016: SQuAD – large reading‑comprehension dataset.

2017: DuReader – first Chinese reading‑comprehension dataset.

2018 onward: HotpotQA, etc., for multi‑hop and commonsense reasoning.

Retriever + Reader Paradigm

Retriever fetches candidate documents based on the query; Reader extracts or generates the answer from those candidates. Baidu Search acts as a strong Retriever, so research focuses on improving the Reader.
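The Retriever + Reader split can be illustrated with a toy pipeline. Word-overlap scoring stands in for the real retriever and reader models here; this is a structural sketch only, not Baidu's components.

```python
def retrieve(query, corpus, top_k=2):
    """Toy retriever: rank documents by word overlap with the query."""
    q = set(query.lower().split())
    ranked = sorted(corpus, key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return ranked[:top_k]

def read(query, candidates):
    """Toy reader: pick the candidate sentence with the highest overlap."""
    q = set(query.lower().split())
    return max(candidates, key=lambda d: len(q & set(d.lower().split())))

def answer(query, corpus):
    """Compose the two stages: retrieve candidates, then read an answer."""
    return read(query, retrieve(query, corpus))
```

The key design point is the narrow interface between the stages: a strong retriever (here, the whole of Baidu Search) hands the reader a short candidate list, so reader research can proceed independently.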

Early readers relied on complex feature-engineered pipelines: query analysis → candidate retrieval → handcrafted matching features → ranking → answer extraction. Such pipelines suffer from error accumulation across stages and high maintenance cost.

Machine Reading Comprehension (MRC) replaces the pipeline with an end‑to‑end model that directly predicts answer spans (e.g., BiDAF, BERT‑based models).
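At prediction time, span-extraction models such as BiDAF or BERT-based readers emit per-token start and end scores, and the answer is the highest-scoring valid span. A minimal decoder for those scores (illustrative only; real systems batch this and mask special tokens) looks like:

```python
def best_span(start_logits, end_logits, max_len=10):
    """Return (i, j) maximizing start_logits[i] + end_logits[j],
    subject to i <= j < i + max_len."""
    best, best_score = (0, 0), float("-inf")
    for i, s in enumerate(start_logits):
        for j in range(i, min(i + max_len, len(end_logits))):
            score = s + end_logits[j]
            if score > best_score:
                best_score, best = score, (i, j)
    return best
```

The `i <= j` and length constraints are what make this a span decoder rather than two independent classifications.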

Challenges in Search‑Driven QA

Complex semantic understanding, reasoning, and context modeling.

High traffic and the need for fast response with large models.

Open‑domain web data is noisy; ensuring answer correctness and quality is difficult.

Proposed Solutions

Hierarchical Modeling: Long‑sequence input with sentence‑level CLS tokens, hierarchical layers to capture deep context, and dual output heads for summary‑style answers and entity extraction.
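One way to picture the sentence-level CLS layout is a preprocessing step that interleaves a CLS marker before each sentence, recording the marker positions so that a summary-style head can later score sentences while an entity head scores ordinary tokens. This is a hypothetical sketch of the input layout, not Baidu's actual implementation.

```python
def insert_sentence_cls(sentences, cls_token="[CLS]"):
    """Prefix each sentence's tokens with a sentence-level CLS marker.
    Returns the flat token sequence plus the CLS positions, which the
    summary head would attend to; the entity head uses the other tokens."""
    tokens, cls_positions = [], []
    for sent in sentences:
        cls_positions.append(len(tokens))
        tokens.append(cls_token)
        tokens.extend(sent)
    return tokens, cls_positions
```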

Knowledge Distillation: Multi‑teacher, multi‑stage distillation (teacher training → unsupervised voting‑based distillation → supervised teacher‑weighting) to obtain a compact student model with performance comparable to larger teachers.
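The two distillation stages can be sketched abstractly: majority voting across teacher predictions produces pseudo-labels for unlabeled data, and per-teacher accuracy on labeled data yields the supervised teacher weights. This is a simplified illustration; a real system would typically distill soft logits rather than hard labels.

```python
from collections import Counter

def vote_pseudo_labels(teacher_preds):
    """Unsupervised stage: majority vote across teachers gives pseudo-labels
    for unlabeled examples; ties fall back to the first teacher."""
    labels = []
    for preds in zip(*teacher_preds):
        label, count = Counter(preds).most_common(1)[0]
        labels.append(label if count > 1 else preds[0])
    return labels

def weight_teachers(teacher_dev_acc):
    """Supervised stage: normalize per-teacher dev accuracy into weights
    for mixing teacher targets when training the student."""
    total = sum(teacher_dev_acc)
    return [a / total for a in teacher_dev_acc]
```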

Retrieval‑Enhanced Generation (RAG): Combine search results with a generative LLM to mitigate hallucinations, improve timeliness, and increase trustworthiness. The workflow includes document retrieval, answer extraction, prompt construction (including source citations), and answer generation.
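The prompt-construction step of that workflow might assemble numbered sources so the generator can emit [n]-style citations. The format below is a hypothetical example, not Baidu's actual prompt template.

```python
def build_rag_prompt(question, passages):
    """Assemble a citation-aware prompt: an instruction, numbered source
    passages, then the user question."""
    lines = ["Answer using only the numbered sources; cite them as [n]."]
    for i, passage in enumerate(passages, 1):
        lines.append(f"[{i}] {passage}")
    lines.append(f"Question: {question}")
    return "\n".join(lines)
```

Because the source numbering is fixed before generation, the citations in the model's answer can be resolved back to concrete retrieved documents, which is what makes the output verifiable.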

Training Stages for the Large Generative Model

General pre‑training on diverse web, book, table, and dialogue corpora.

Instruction fine‑tuning to understand user commands.

Business‑specific instruction fine‑tuning for multi‑result answer organization in search scenarios.

Reinforcement learning and user‑feedback fine‑tuning to improve answer quality.

Instruction Decomposition (Chain‑of‑Thought for Commands) – Complex search‑QA instructions are broken into three simple steps: (1) select relevant search results, (2) organize and generate the answer, (3) add numbered citations. Training the model on many simple instructions enables it to handle complex commands without massive instruction‑annotation data.
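The three decomposed steps compose naturally as small functions. In this toy sketch, a word-overlap filter and a string join stand in for the model's selection and generation steps; only the decomposition structure is the point.

```python
def select_relevant(query, results):
    """Step 1: keep search results sharing at least one word with the query."""
    q = set(query.lower().split())
    return [r for r in results if q & set(r.lower().split())]

def organize(selected):
    """Step 2: stand-in for generation -- here, just join the selections."""
    return " ".join(selected)

def add_citations(answer, selected):
    """Step 3: append a numbered citation marker per source used."""
    refs = "".join(f"[{i}]" for i in range(1, len(selected) + 1))
    return f"{answer} {refs}"

def answer_with_citations(query, results):
    """Run the three simple steps in sequence."""
    selected = select_relevant(query, results)
    return add_citations(organize(selected), selected)
```

Each step is simple enough to supervise cheaply on its own, which is the article's argument for why decomposition sidesteps massive complex-instruction annotation.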

In summary, delivering comprehensive, efficient, and accurate answers in Baidu Search requires a combination of strong retrieval, advanced hierarchical readers, multi‑teacher distillation, retrieval‑enhanced generation, and instruction decomposition. The article concludes with an open question about the future shape of search engines and invites collaboration and resume submissions.

Tags: Large Language Models, Retrieval-Augmented Generation, knowledge distillation, Baidu Search, machine QA
Written by

Baidu Tech Salon

Baidu Tech Salon, organized by Baidu's Technology Management Department, is a monthly offline event that shares cutting‑edge tech trends from Baidu and the industry, providing a free platform for mid‑to‑senior engineers to exchange ideas.
