How to Build a High‑Quality AI Knowledge Base with FastGPT
This guide walks through the concepts, workflow, and practical steps for creating a high‑quality knowledge base in FastGPT using vector search, QA pair structuring, prompt engineering, and parameter tuning to improve LLM‑driven question answering.
Author: Yu Jinlong – FastGPT project lead, Sealos front‑end lead, former Shopee front‑end engineer. FastGPT repository: https://github.com/labring/FastGPT
Introduction
Since ChatGPT was released at the end of 2022, a wave of interactive applications has emerged, especially once the GPT‑3.5 API became available. However, concerns about controllability, randomness, and compliance still rule out many use cases.
Origin
In March, a tweet showed a low‑cost way to train GPT on the content of a personal blog and included a complete flowchart. Inspired by this, the author added vector search to FastGPT within a month and released an early video demo.
Initial Development
After three months, FastGPT’s vector search and linear LLM QA features are complete, but no tutorial on building a knowledge base exists. This article aims to fill that gap.
FastGPT Knowledge Base Logic
Before building a knowledge base, understand FastGPT’s retrieval mechanism and basic concepts.
Basic Concepts
Vector: a numeric array that represents human content (text, images, video) in a form a machine can compare.
Vector Similarity: a score computed between two vectors that indicates how alike the underlying content is.
LLM Characteristics: context understanding, summarization, and reasoning.
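As a small illustration of vector similarity, the sketch below computes cosine similarity, one common measure. The article does not say which measure FastGPT uses, and the three‑dimensional vectors are made up for the example; real embeddings have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy "embeddings" invented for illustration only.
deploy_question = [0.9, 0.1, 0.0]
install_question = [0.8, 0.2, 0.1]
pricing_question = [0.0, 0.2, 0.9]

related = cosine_similarity(deploy_question, install_question)
unrelated = cosine_similarity(deploy_question, pricing_question)
print(related > unrelated)
```

Vectors that point in nearly the same direction score close to 1, while unrelated ones score close to 0.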
These concepts combine into the formula “vector search + large model = knowledge‑base QA”. FastGPT stores QA pairs instead of raw text chunks, which shortens the text that has to be embedded and improves search precision. It also offers search testing and dialogue testing for adjusting the data.
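The idea of searching over QA pairs can be sketched as follows. This is a toy model, not FastGPT's implementation: `embed` is a word‑count stand‑in for a real embedding model, the assumption that only the question side is embedded is ours, and the sample answers are placeholders.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class QAPair:
    question: str
    answer: str


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().replace("?", " ").split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(query: str, pairs: list[QAPair]) -> QAPair:
    """Return the pair whose question is most similar to the query."""
    q = embed(query)
    return max(pairs, key=lambda p: cosine(q, embed(p.question)))


# Placeholder entries for illustration.
pairs = [
    QAPair("How do I deploy FastGPT?", "Follow the deployment tutorial."),
    QAPair("Where do I report a bug?", "Open an issue in the repository."),
]

best = search("how to deploy fastgpt", pairs)
print(best.answer)
```

Matching against a short question rather than a long text chunk keeps the compared text focused; the stored answer is what gets handed to the LLM.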
Creating a Knowledge‑Base Application
First, create a FastGPT FAQ knowledge base.
Acquiring Basic Knowledge
Extract existing FastGPT documentation (e.g., README) and split it into QA pairs.
QA Refinement
After extracting 11 QA groups from the README, some entries need manual correction, such as splitting combined resources into separate questions like “deployment tutorial” or “issue documentation”.
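The manual correction described above amounts to turning one combined entry into one QA pair per resource. A minimal sketch, assuming a hypothetical `q`/`a` dictionary layout and placeholder answers in place of the real links:

```python
def split_combined_entry(
    resources: dict[str, str], product: str = "FastGPT"
) -> list[dict[str, str]]:
    """Turn one 'list of resources' entry into one QA pair per resource."""
    return [
        {"q": f"Where is the {product} {name}?", "a": location}
        for name, location in resources.items()
    ]


# Placeholder answers; fill in the real links when curating.
combined = {
    "deployment tutorial": "<link to the deployment tutorial>",
    "issue documentation": "<link to the issue documentation>",
}

qa_pairs = split_combined_entry(combined)
for pair in qa_pairs:
    print(pair["q"], "->", pair["a"])
```

Each resulting question targets a single resource, so a user asking only about deployment is matched to the deployment entry rather than to a catch‑all list.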
Next, create an application, link the knowledge base, and set a prompt that defines the knowledge‑base scope.
Import Community FAQs
Manually enter Q&A pairs from the community FAQ. Adding several phrasings of the same question improves vector matching.
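One way to prepare several phrasings of the same question is to expand them into one row per phrasing, all sharing the same answer. The helper name and the `q`/`a` layout below are hypothetical, and the sample answer is a placeholder.

```python
def expand_phrasings(phrasings: list[str], answer: str) -> list[dict[str, str]]:
    """Create one QA row per distinct phrasing, all sharing one answer."""
    seen: set[str] = set()
    rows: list[dict[str, str]] = []
    for phrasing in phrasings:
        # Normalize case and whitespace so near-identical duplicates are skipped.
        key = " ".join(phrasing.lower().split())
        if key and key not in seen:
            seen.add(key)
            rows.append({"q": phrasing.strip(), "a": answer})
    return rows


rows = expand_phrasings(
    [
        "How do I deploy FastGPT?",
        "What is the way to install FastGPT?",
        "how do I  deploy fastgpt?",  # duplicate after normalization
    ],
    "See the deployment tutorial.",
)
print(len(rows))
```

Each phrasing gets its own vector, so a user's wording has more chances to land close to one of them.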
FastGPT also provides an OpenAPI for preprocessing special file formats before uploading.
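The preprocessing half of that workflow might look like the sketch below, which converts a CSV export into QA pairs. The column names `question` and `answer` and the `q`/`a` output layout are assumptions, and the upload call itself is left out because the article does not describe the API's endpoints.

```python
import csv
import io


def csv_to_qa_pairs(csv_text: str) -> list[dict[str, str]]:
    """Parse CSV text with 'question' and 'answer' columns into QA pairs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    pairs: list[dict[str, str]] = []
    for row in reader:
        question = (row.get("question") or "").strip()
        answer = (row.get("answer") or "").strip()
        # Skip incomplete rows instead of uploading half-empty entries.
        if question and answer:
            pairs.append({"q": question, "a": answer})
    return pairs


sample = (
    "question,answer\n"
    "Is there a deployment tutorial?,Yes. See the repository README.\n"
    ",orphan answer without a question\n"
)

print(csv_to_qa_pairs(sample))
```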
Knowledge‑Base Fine‑Tuning and Parameter Adjustment
FastGPT offers “search test” and “dialogue test” to fine‑tune the knowledge base.
Search Test
Enter a question to view returned QA data and assess retrieval quality.
If irrelevant results appear due to generic keywords, add those keywords to the relevant entries to boost similarity.
Prompt Setting
Prompt engineering here follows two principles:
Tell GPT the scope of content it should answer.
Give a brief description of the knowledge base so the model can judge relevance.
Limiting Model Conversation Range
Adjust the similarity threshold (e.g., 0.82) and the maximum number of results. Set an empty‑search response so the application replies with a preset message when no match is found.
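How these three settings interact can be sketched as a simple filter. The function and field names are hypothetical and the logic is a plain reading of the settings described above, not FastGPT's actual code.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    question: str
    answer: str
    similarity: float


def select_hits(
    hits: list[Hit], min_similarity: float = 0.82, max_results: int = 3
) -> list[Hit]:
    """Keep hits at or above the threshold, best first, capped at max_results."""
    kept = [h for h in hits if h.similarity >= min_similarity]
    kept.sort(key=lambda h: h.similarity, reverse=True)
    return kept[:max_results]


def respond(
    hits: list[Hit],
    empty_reply: str = "I'm not sure.",
    min_similarity: float = 0.82,
    max_results: int = 3,
) -> str:
    """Return the preset reply when nothing matches, else the context to use."""
    selected = select_hits(hits, min_similarity, max_results)
    if not selected:
        return empty_reply
    return "\n".join(f"Q: {h.question}\nA: {h.answer}" for h in selected)


hits = [
    Hit("pricing", "See the pricing page.", 0.50),
    Hit("deploy", "See the deployment tutorial.", 0.90),
    Hit("install", "See the deployment tutorial.", 0.85),
]
print(respond(hits, max_results=1))
```

Raising the threshold makes the empty‑search reply fire more often; lowering it lets weaker matches through to the model.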
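Because of bias in the OpenAI vector model, questions that contain Chinese keywords can score a high similarity even when they are off topic, so the threshold alone may not filter them out. In that case, add limiting words to the prompt, such as: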
If the question is not about FastGPT, reply: “I’m not sure.” Only answer based on the knowledge base.
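The prompt principles and the limiting words can be assembled into a single system prompt. The sketch below is one possible wording; the function name, parameters, and sample description are hypothetical.

```python
def build_system_prompt(
    kb_name: str, kb_description: str, refusal: str = "I'm not sure."
) -> str:
    """Combine a knowledge-base description, answer scope, and limiting words."""
    return "\n".join(
        [
            # Brief description so the model can judge relevance.
            f"The knowledge base is about {kb_name}: {kb_description}",
            # Answer scope.
            "Only answer based on the knowledge base.",
            # Limiting words for off-topic questions.
            f'If the question is not about {kb_name}, reply: "{refusal}"',
        ]
    )


prompt = build_system_prompt(
    "FastGPT",
    "FAQ entries extracted from the README and the community.",
)
print(prompt)
```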
Adjusting Knowledge Base via Dialogue
In the dialogue interface, click “Reference” to modify knowledge‑base entries on the fly.
Conclusion
Vector search compares text similarity.
Large models can summarize and reason to answer questions.
The most effective knowledge‑base construction combines QA pairs and manual curation.
Keep questions concise.
Prompt engineering guides the model to answer within the knowledge base.
Adjust similarity, result limits, and limiting words to control response scope.
