Design and Implementation of a Knowledge-Base Intelligent Q&A System for Database Operations Using Large Models
This article details Baidu Intelligent Cloud's design and deployment of a domain-specific knowledge-base Q&A system for database operations. The system combines prompt-engineered LLMs with hybrid vector search built on LangChain, the BES vector store, and a custom ingestion pipeline, and it addresses recall, token-limit, and hallucination challenges across dashboard and IM-bot interfaces.
This article, originating from Baidu Intelligent Cloud's database operations team, presents a detailed case study of building a knowledge‑base intelligent Q&A system powered by large language models (LLMs). It covers the overall technical solution, module designs, key challenges, and real‑world deployment scenarios.
1. Background – With the rapid development of large models, AI is becoming pervasive. In the database operations domain, the goal is to combine expert systems with native AI to help engineers quickly retrieve knowledge and make accurate operational decisions.
Traditional knowledge‑base systems rely on static rules, keyword search, and predefined tags, requiring users to have professional expertise. Such approaches no longer meet the needs of complex, dynamic operational environments.
2. Architecture Design and Implementation
2.1 Technical Approach Selection – The team evaluated three main approaches: fine-tuning, prompt engineering, and hybrid search plus LLM. They chose a combination of prompt engineering and hybrid search, using a vector database as external memory and LangChain as the development framework.
2.2 Module Design
Knowledge Ingestion – Documents (PDF, CSV, Markdown, web pages) are loaded via LangChain, Selenium, or BeautifulSoup. Text is split into short chunks using RecursiveCharacterTextSplitter and SpacyTextSplitter, preserving context.
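The core idea behind splitters like RecursiveCharacterTextSplitter is fixed-size chunking with overlap, so that context straddling a cut survives in both neighboring chunks. A minimal pure-Python sketch of that idea (not the LangChain API itself; chunk and overlap sizes are illustrative):

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into fixed-size chunks that overlap, so context
    straddling a boundary is preserved in both neighboring chunks."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, step = [], chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks
```

Real splitters additionally prefer to cut at paragraph and sentence boundaries (the "recursive" part), falling back to character positions only when no separator fits.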
Text Vectorization – Initially tried open‑source embeddings (GanymedeNil, m3e) but settled on Baidu’s Wenxin embeddings for better performance.
Storage – Vectors and metadata are stored in a vector database. After testing ElasticSearch, BES, Milvus, and PGVector, BES (Baidu ElasticSearch) was selected for its HNSW implementation and resource efficiency.
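BES exposes an Elasticsearch-compatible API, so an index for chunk storage would look roughly like a standard dense-vector mapping. The fragment below uses open-source Elasticsearch 8 syntax as a stand-in; BES's exact parameter names, the index name, and the embedding dimension (384) are assumptions, not taken from the article:

```json
PUT /kb_chunks
{
  "mappings": {
    "properties": {
      "title":     { "type": "keyword" },
      "content":   { "type": "text" },
      "embedding": {
        "type": "dense_vector",
        "dims": 384,
        "index": true,
        "similarity": "cosine",
        "index_options": { "type": "hnsw", "m": 16, "ef_construction": 128 }
      }
    }
  }
}
```

Storing title and content alongside the vector lets retrieval return the original text and metadata in one round trip.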
Data Retrieval – User queries are vectorized, cached with GPTCache, and the top‑10 similar chunks are retrieved from the vector DB.
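The retrieval step can be sketched as cosine-similarity ranking behind a query cache; the class below plays the role GPTCache fills in the article (GPTCache also supports semantic, not just exact-match, cache hits). All names here are illustrative:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 for a zero vector."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class CachedRetriever:
    """Top-k similarity search over (chunk_text, embedding) pairs,
    with a per-query cache so repeated questions skip the search."""
    def __init__(self, store: list[tuple[str, list[float]]]):
        self.store = store
        self._cache: dict[str, list[str]] = {}

    def top_k(self, query: str, query_vec: list[float], k: int = 10) -> list[str]:
        if query in self._cache:          # cache hit: no vector search needed
            return self._cache[query]
        ranked = sorted(self.store, key=lambda it: cosine(it[1], query_vec),
                        reverse=True)
        hits = [text for text, _ in ranked[:k]]
        self._cache[query] = hits
        return hits
```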
Result Integration – Retrieved chunks are assembled into a prompt (respecting token limits) and sent to the LLM. The LLM generates the final answer, which is also stored in MySQL for conversation history.
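Assembling retrieved chunks under a token budget is essentially greedy packing of the best-ranked chunks. A sketch of that step, with a character count standing in for a real tokenizer (the prompt wording and `estimate` heuristic are assumptions):

```python
def build_prompt(question: str, ranked_chunks: list[str],
                 max_tokens: int = 2000,
                 estimate=lambda s: len(s)) -> str:
    """Greedily pack the best-ranked chunks into the prompt without
    exceeding the token budget. `estimate` stands in for a tokenizer."""
    header = "Answer using only the context below.\nContext:\n"
    footer = f"\nQuestion: {question}"
    budget = max_tokens - estimate(header) - estimate(footer)
    picked = []
    for chunk in ranked_chunks:
        cost = estimate(chunk) + 1      # +1 for the joining newline
        if cost > budget:
            break                       # chunks are ranked: stop at the first miss
        picked.append(chunk)
        budget -= cost
    return header + "\n".join(picked) + footer
```

Because the chunks arrive ranked by similarity, truncation drops the least relevant context first.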
3. Technical Challenges and Solutions
3.1 Low recall in vector DB – Improved text splitting (using Spacy, hierarchical chunking, title compensation) and combined title+content vectorization to boost recall.
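The "title compensation" trick amounts to prefixing each chunk with its document and section titles before embedding, so a short chunk that never names its own topic still lands near topical queries. A minimal sketch (the `>` separator and function name are illustrative):

```python
def compensate_titles(doc_title: str, sections: dict[str, list[str]]) -> list[str]:
    """Prepend document and section titles to every chunk before
    embedding, so chunk vectors carry the topic even when the chunk
    text itself never mentions it."""
    return [f"{doc_title} > {heading}\n{chunk}"
            for heading, chunks in sections.items()
            for chunk in chunks]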
3.2 Token length limits – Addressed through model selection (ERNIE-Bot-turbo vs. ERNIE-Bot), prompt pruning, and MapReduce-style multi-turn LLM calls to handle long texts.
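The MapReduce-style pattern splits an over-long context into groups, asks the LLM to condense each group (map), then answers from the concatenated condensations (reduce). A sketch with any `prompt -> text` callable standing in for the LLM (prompt wording and group size are assumptions):

```python
def map_reduce_answer(llm, question: str, chunks: list[str],
                      group_size: int = 3) -> str:
    """Answer over context longer than the model window: condense each
    group of chunks (map), then answer from the summaries (reduce).
    `llm` is any callable taking a prompt string and returning text."""
    partials = []
    for i in range(0, len(chunks), group_size):
        group = "\n".join(chunks[i:i + group_size])
        partials.append(llm(f"Summarize what is relevant to '{question}':\n{group}"))
    summaries = "\n".join(partials)
    return llm(f"Answer '{question}' using these summaries:\n{summaries}")
```

The trade-off is extra latency and cost: each map group is a separate model call, plus one final reduce call.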
3.3 Stale knowledge and hallucinations – Integrated keyword extraction, official documentation search, and a hybrid pipeline that combines search results with LLM reasoning to mitigate outdated or fabricated answers.
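One simple way to combine the two sources is to interleave fresh documentation-search hits with vector-store hits while deduplicating, so stale indexed chunks cannot dominate the prompt. This merge strategy is an assumption of mine, not the article's stated algorithm:

```python
from itertools import zip_longest

def hybrid_merge(search_hits: list[str], vector_hits: list[str],
                 k: int = 10) -> list[str]:
    """Interleave fresh documentation-search results with vector-store
    hits, dropping duplicates, and keep the top k for the prompt."""
    merged, seen = [], set()
    for pair in zip_longest(search_hits, vector_hits):
        for item in pair:
            if item is not None and item not in seen:
                seen.add(item)
                merged.append(item)
    return merged[:k]
```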
4. Application Scenarios – The system is deployed in two ways: Database Chat (integrated into the DBSC dashboard) and IM bots (WeChat, Feishu, etc.) for quick knowledge access.
5. Summary – Building a domain‑specific knowledge‑base using vector databases and LLMs is feasible but still faces challenges such as retrieval accuracy and handling long contexts. Continuous model upgrades, better document management, and research on retrieval techniques are essential for future improvements.
Baidu Geek Talk