Building a Private Knowledge‑Base Q&A Agent with MCP: A Hands‑On Guide
This article explores the rapid rise of AI agents, explains the Model Context Protocol (MCP) standard, and provides a step‑by‑step implementation of a private knowledge‑base Q&A system using Python and Java, covering knowledge construction, retrieval, architecture, and deployment details.
Industry observers have dubbed 2025 the year of the AI agent: open‑source large models are catching up to commercial ones, shifting competition from raw model performance to innovative application scenarios.
AI applications have evolved from Chat to Retrieval‑Augmented Generation (RAG) and now to Agents, prompting the emergence of new development frameworks and standards such as the Model Context Protocol (MCP), which has recently been adopted by OpenAI.
Overall Process Design
The system consists of two main parts: knowledge‑base construction and knowledge retrieval.
1. Knowledge Base Construction
Text chunking: split documents while preserving completeness and semantic integrity.
FAQ extraction: generate frequently asked questions to supplement retrieval.
Import into the knowledge base and embed vectors for efficient search.
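The chunking step above can be sketched in Python. This is a minimal illustration, not the project's actual implementation; the paragraph separator, chunk size, and overlap are assumptions.

```python
def chunk_text(text: str, max_chars: int = 500, overlap: int = 50) -> list[str]:
    """Split a document into chunks on paragraph boundaries, falling back to
    a sliding character window with overlap when a paragraph is too long,
    so each chunk stays reasonably self-contained."""
    chunks: list[str] = []
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        if len(para) <= max_chars:
            chunks.append(para)  # paragraph fits: keep it whole
        else:
            # Oversized paragraph: window with overlap to preserve context
            # across chunk boundaries.
            start = 0
            while start < len(para):
                chunks.append(para[start:start + max_chars])
                start += max_chars - overlap
    return chunks
```

Each resulting chunk would then be embedded and written to the knowledge store, with FAQs extracted per chunk.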
2. Knowledge Retrieval (RAG)
Question decomposition: break the input query into atomic sub‑questions.
Retrieval: perform vector search for text and hybrid (full‑text + vector) search for FAQs.
Content filtering: keep the most relevant results for answering.
Compared with naive RAG, this approach adds optimizations such as improved chunking, FAQ extraction, query rewriting, and hybrid retrieval.
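When full‑text and vector search both return ranked lists, their results have to be merged. The article does not specify the fusion method; Reciprocal Rank Fusion is one common choice, sketched here as an assumption.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked result lists (e.g. one from vector search and
    one from full-text search) into a single ranking. Each document scores
    1 / (k + rank + 1) per list it appears in; k=60 is the conventional
    smoothing constant."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)
```

A document that appears near the top of both lists outranks one that is first in only a single list, which is the behavior hybrid retrieval wants.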
Agent Architecture
The architecture is divided into three components:
Knowledge Store and FAQ Store, supporting both vector and full‑text hybrid search.
MCP Server, which provides four tool APIs for reading and writing the stores.
Function implementation layer that uses prompts and a large language model (LLM) to perform knowledge import, retrieval, and Q&A.
Implementation Details
All code is open‑source and split into two parts:
Python client: interacts with the LLM via MCP, using prompts to build the knowledge base, retrieve information, and answer questions.
Java server: built on Spring AI, using Tablestore as the underlying storage.
Knowledge Store with Tablestore
Simple to use: a single instance creation step, serverless operation, no capacity management.
Low cost: pay‑as‑you‑go pricing, automatic scaling to petabyte levels.
Full‑featured: supports full‑text, vector, scalar, and hybrid search.
MCP Server Tools
The server implements four tools (see image below).
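The real server is implemented in Java on Spring AI, and the exact tool names are shown only in the image; the Python registry below is a hedged sketch of what a four‑tool surface for reading and writing the two stores might look like, with all names being assumptions.

```python
from typing import Callable

TOOLS: dict[str, Callable] = {}

def tool(fn: Callable) -> Callable:
    """Register a function as an MCP-style tool (illustrative only)."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def store_knowledge(content: str) -> str:
    """Write a text chunk (and, in the real server, its embedding) to the
    Knowledge Store."""
    return "ok"

@tool
def store_faq(question: str, answer: str) -> str:
    """Write a question/answer pair to the FAQ Store."""
    return "ok"

@tool
def search_knowledge(query: str, top_k: int = 5) -> list[str]:
    """Vector search over the Knowledge Store."""
    return []

@tool
def search_faq(query: str, top_k: int = 5) -> list[str]:
    """Hybrid (full-text + vector) search over the FAQ Store."""
    return []
```

Two tools write and two read, matching the article's split between knowledge‑base construction and retrieval.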
Knowledge Base Construction
Using prompts, the system segments text and extracts FAQs, ensuring semantic consistency and comprehensive coverage.
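A prompt for this step might look like the template below. The wording is an illustrative assumption; the project's actual prompts are in the open‑source repository.

```python
# Illustrative prompt template for FAQ extraction; not the project's
# actual prompt text.
FAQ_EXTRACTION_PROMPT = """\
You are building a private knowledge base.
Given the document chunk below, do two things:
1. Check that the chunk is semantically self-contained; note any sentence
   that appears cut off.
2. Extract up to {max_faqs} frequently asked questions this chunk can
   answer, each with a concise answer grounded only in the chunk.

Chunk:
{chunk}
"""

def build_faq_prompt(chunk: str, max_faqs: int = 3) -> str:
    """Fill the template for one chunk."""
    return FAQ_EXTRACTION_PROMPT.format(chunk=chunk, max_faqs=max_faqs)
```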
Importing Knowledge Base and FAQ
Prompt‑driven import is straightforward; the following example shows the process.
Knowledge Retrieval
Retrieval is also prompt‑driven, involving question decomposition, independent searches of the knowledge base and FAQ store, and aggregation of the most relevant results.
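The retrieval flow just described can be sketched as a small pipeline. The LLM and search calls are passed in as stand‑ins (assumptions), since the real versions run through the MCP tools.

```python
def answer_question(question, decompose, search_kb, search_faq, top_k=5):
    """Prompt-driven retrieval: decompose the question into atomic
    sub-questions, search both stores for each, then filter to the most
    relevant results. `decompose`, `search_kb`, and `search_faq` are
    stand-ins for the LLM call and the two store searches; each search
    returns (text, score) pairs."""
    sub_questions = decompose(question)
    results: dict[str, float] = {}
    for sq in sub_questions:
        for text, score in search_kb(sq, top_k) + search_faq(sq, top_k):
            # Keep the best score seen for each passage across sub-questions.
            results[text] = max(results.get(text, 0.0), score)
    # Content filtering: keep only the top passages for answering.
    ranked = sorted(results.items(), key=lambda kv: kv[1], reverse=True)
    return [text for text, _ in ranked[:top_k]]
```

The filtered passages would then be handed to the LLM as context for the final answer.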
Demo Steps
Create a Tablestore instance via the command‑line tool (see image).
Start the MCP Server after setting required environment variables (see image).
Import the knowledge base using knowledge_manager.py. Set the LLM API key first:
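A minimal invocation might look like the following; the environment‑variable name and the flag are assumptions, not the repository's documented interface, so check its README for the exact usage.

```shell
# Hypothetical invocation — variable name and flag are assumptions.
export LLM_API_KEY="sk-..."        # your model provider's API key
python knowledge_manager.py --import ./docs
```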
Run the retrieval command (see image).
Perform Q&A against the knowledge base (see image).
As noted in the introduction, this wave of AI‑agent technology mirrors the rapid growth of the Web 2.0 and mobile‑internet eras; as demand for new application forms explodes, new frameworks and standards will inevitably emerge, and AI agents look set to evolve quickly.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
