How to Keep Your AI Knowledge Base Accurate and Up‑to‑Date
This guide explains the standardized workflow, documentation standards, and quality‑control mechanisms that business and product teams should follow to maintain a high‑quality AI knowledge base, so that agents consistently give correct answers.
One‑Minute Overview of Your Responsibility
In a nutshell: Keep the Agent’s reference book correct, easy to find, and never outdated.
Agent answer quality = Knowledge‑base quality (60%) + Technical optimisation & other factors (40%)

The first term is your domain.

Three Roles in Knowledge‑Base Maintenance
Business Owner (Content Owner)
Provide original business materials.
Confirm accuracy and timeliness of content.
Initiate content change requests.
Product Owner (Quality Gatekeeper)
Define and maintain the classification system.
Review entries for compliance with writing standards.
Organise regular inspections and full‑cycle reviews.
Monitor knowledge‑base operational metrics.
Technical Owner (Platform Support)
Maintain the knowledge‑base platform and tools.
Handle chunking strategy and retrieval optimisation.
Support data analysis and anomaly investigation.
Document vs. QA Pair: Which to Choose?
Recommendation: Prefer whole documents, but make them “chunk‑friendly”.
Full Document (Preferred): Suitable for policies, SOPs, product manuals – promotes systematic knowledge.
QA Pair (Supplementary): Use for extremely high‑frequency issues or scenarios requiring specific phrasing (e.g., complaint handling).
Key trade‑offs:
Full Document
Low maintenance cost (direct updates).
Medium‑high retrieval accuracy (depends on document structure).
Risk: Poorly written docs may be split incorrectly, leading to partial answers.
QA Pair
High maintenance cost (manual extraction).
Very high retrieval accuracy (precise matching).
Drawback: entry‑count explosion – hundreds of pairs increase maintenance cost and risk mutual contradictions.
How the System Splits Documents
Split by headings ("#" or "##").
If a section is too long, split by empty lines (paragraphs).
If still too long, split by sentences.
One‑sentence takeaway: Titles are the primary chunking signal; your title hierarchy becomes the chunk hierarchy.
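The three‑step cascade above can be sketched in a few lines of Python. This is an illustration only, not the platform's actual implementation; the `MAX_LEN` threshold is an assumed value:

```python
import re

MAX_LEN = 500  # assumed chunk-size limit, for illustration only

def split_chunks(text: str, max_len: int = MAX_LEN) -> list[str]:
    """Split markdown text: headings first, then blank lines, then sentences."""
    chunks = []
    # 1. Split on "#" / "##" headings; each heading stays with its section.
    sections = re.split(r"(?m)^(?=#{1,2} )", text)
    for section in sections:
        section = section.strip()
        if not section:
            continue
        if len(section) <= max_len:
            chunks.append(section)
            continue
        # 2. Section too long: split on empty lines (paragraphs).
        for para in re.split(r"\n\s*\n", section):
            para = para.strip()
            if not para:
                continue
            if len(para) <= max_len:
                chunks.append(para)
            else:
                # 3. Paragraph still too long: fall back to sentence boundaries.
                chunks.extend(
                    s for s in re.split(r"(?<=[.!?。！？])\s+", para) if s
                )
    return chunks
```

Notice that a clean `#`/`##` hierarchy lets the splitter stop at step 1, which is exactly why well‑structured titles matter.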
Four Essentials for a Good Knowledge Entry (Accurate, Complete, Clear, Alive)
Accurate (准)
Provide exact numbers and conditions; avoid vague terms.
❌ "Refunds usually arrive in a few days"
✅ "Refund arrival time: 1‑3 business days for WeChat/Alipay, 3‑7 days for bank cards, 7‑15 days for credit cards"

Complete (全)
Each entry should answer a single user question without requiring external references.
❌ "Return policy: can return, time not too long, keep product intact."
✅ "Return policy:
- Applicable: all self‑operated goods
- Time limit: within 7 days after receipt
- Conditions: item unused, packaging intact, receipt provided
- Process: App → My Orders → Request Return → Choose reason → Submit"

Clear (清)
Use lists or tables; avoid pronouns that lose meaning when the chunk is isolated.
❌ "New users get some discounts"
✅ "New users enjoy the following benefits within 7 days after registration:
- First order over 100 ¥ → 20 ¥ off
- Free 30‑day membership trial
(Note: benefits cannot be stacked)"

Alive (活)
Write in the language users actually speak; include common user expressions.
❌ Title: "RMA Process" (users don’t know "RMA")
✅ Title: "Return & Refund Process"
Body: also mention "exchange", "refund", "RMA" as synonyms.

Title Writing Guidelines
Title is the first retrieval gate. Use the template:
[Target Audience] + [Core Topic] + [Optional Condition]

Examples:
Individual Users · App Registration Process
Enterprise Customers · Annual Contract Renewal Rules · 2026 Edition
All Channels · Return & Refund Policy · 7‑Day No‑Reason
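If titles are generated or checked programmatically, the template can be enforced with a small helper. The `·` separator and field order are taken from the examples above; the function itself is hypothetical:

```python
def build_title(audience: str, topic: str, condition: str = "") -> str:
    """Assemble a knowledge-base title: [Audience] · [Topic] (· [Condition])."""
    parts = [audience.strip(), topic.strip()]
    if condition:
        parts.append(condition.strip())
    return " · ".join(parts)
```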
Knowledge‑Base Classification System
Principles (MECE, max three levels, scene‑oriented, scalable) ensure fast matching.
MECE: Mutually exclusive, collectively exhaustive – each entry belongs to one category only.
Max three levels: Too deep makes lookup hard.
Scene‑oriented: Classify by user problem scenario, not internal department.
Scalable: Leave room for future categories.
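Two of these principles, max three levels and one category per entry, are mechanical enough to lint automatically. A sketch, assuming categories carry numeric codes like "3.2 Order/Payment/Logistics" as in the hierarchy below:

```python
def validate_category(path: str, max_depth: int = 3) -> None:
    """Check a numeric category code such as '3.2' against the max-depth rule."""
    code = path.split(" ", 1)[0]  # e.g. '3.2' from '3.2 Order/Payment/Logistics'
    depth = len(code.split("."))
    if depth > max_depth:
        raise ValueError(f"'{path}' is {depth} levels deep (max {max_depth})")

def assert_single_category(entry_categories: list[str]) -> None:
    """MECE: each entry must belong to exactly one category."""
    if len(entry_categories) != 1:
        raise ValueError(f"Expected exactly one category, got {entry_categories}")
```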
Sample top‑level hierarchy (excerpt):
Knowledge Base
├─ 1. Product Knowledge
│ ├─ 1.1 Product Overview
│ ├─ 1.2 Usage Guides
│ └─ 1.3 Version Changes
├─ 2. Business Rules
│ ├─ 2.1 Pricing & Fees
│ ├─ 2.2 Promotions
│ └─ 2.3 Contracts
├─ 3. Service Processes
│ ├─ 3.1 Pre‑sale Consultation
│ ├─ 3.2 Order/Payment/Logistics
│ └─ 3.3 After‑sale (Return/Repair/Complaint)
…

Maintaining the Classification
New product launch: Assess need for new category (Product Owner).
Quarterly review: Check balance of entries per category (Product Owner).
Many "hard‑to‑classify" entries: Re‑evaluate taxonomy (Product + Business Owners).
Adjustments should consider impact on existing entries and avoid frequent large‑scale changes.
Templates for Different Knowledge Types
Template A – Business Rule / Policy
# [Rule/Policy Name]
**Applicable Scope**: [Products/Channels/Customer Types]
**Effective Date**: YYYY‑MM‑DD
**Expiration Date**: YYYY‑MM‑DD or "Long‑term"
**Version**: vX.X
**Author**: [Name]
## Rule Description
- Applicable Conditions: …
- Specific Rules: …
- Exceptions: …
## Common Scenarios
| Scenario | Rule Applied | Handling |
|---|---|---|
| … | … | … |
## User Questions
- "[Typical Question 1]?"
- "[Typical Question 2]?"
- "[Typical Question 3]?"
## Standard Answer
**Q**: "[User Question]"
**A**: "[Agent Response]"

Template B – FAQ
# [Topic] FAQ
**Category**: [Category]
**Last Updated**: YYYY‑MM‑DD
**Author**: [Name]
---
**Q**: [Question 1]?
**A**: [Answer 1]
---
**Q**: [Question 2]?
**A**: [Answer 2]
---
**Q**: [Question 3]?
**A**: [Answer 3]

Template C – SOP
# [Process Name] Guide
**Scope**: [Who uses it and when]
**Effective Date**: YYYY‑MM‑DD
**Version**: vX.X
**Author**: [Name]
## Overview
[1‑2 sentences describing purpose]
## Preconditions
- …
- …
## Steps
### Step 1: …
[Detailed actions]
### Step 2: …
[Detailed actions]
…
## Completion Criteria
[How to know the process succeeded]
## Exception Handling
| Exception | Possible Cause | Remedy |
|---|---|---|
| … | … | … |

9‑Item Self‑Check Before Publishing
□ 1. Title clearly shows the topic
□ 2. Covers a single theme
□ 3. Information is accurate (numbers, dates, conditions)
□ 4. Understandable without external references
□ 5. Completes the user’s need in one read
□ 6. Uses lists or tables instead of large paragraphs
□ 7. Includes at least three common user phrasings
□ 8. Shows update date and author
□ 9. No contradictions with existing entries

Weekly 15‑Minute Inspection
Identify 10 cases where the Agent answered poorly (user complaints, low‑score Langfuse answers, colleague feedback).
Diagnose the error:
Is the knowledge missing?
Is the existing knowledge incorrect or outdated?
Is the entry hard to find (bad title or wording)?
Fix it:
Add missing entry using a template.
Correct wrong entry and update version/date.
Improve discoverability by adding user‑centric queries.
High‑Frequency FAQ
Q1: How long should a knowledge‑base entry be?
Answer: Keep the main body between 200 and 800 characters. Too short lacks completeness; too long risks being split and losing context. Split long topics into multiple entries.
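The length rule lends itself to an automated pre‑publish check; a minimal sketch, assuming each entry's body is available as a string:

```python
def check_entry_length(body: str, min_len: int = 200, max_len: int = 800) -> str:
    """Flag entry bodies outside the recommended 200-800 character range."""
    n = len(body)
    if n < min_len:
        return f"too short ({n} chars): may lack completeness"
    if n > max_len:
        return f"too long ({n} chars): risks being split and losing context"
    return "ok"
```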
Q2: How to handle a question with multiple scenarios?
Use conditional branches within the same entry, e.g.:
Refund arrival time depends on payment method:
- WeChat/Alipay: 1‑3 business days
- Bank card: 3‑7 business days
- Credit card: 7‑15 business days

If scenarios differ greatly (>500 characters), create separate entries.
Q3: Should old entries be deleted?
Answer: Archive instead of delete. Archived entries are removed from the Agent’s retrieval range but retained for audit.
Q4: Can internal documents be stored?
Only store user‑facing knowledge that the Agent needs to answer questions. Internal processes or org charts should be excluded unless directly required for user queries.
Q5: Are tables and images allowed?
Tables: Use Markdown or plain‑text tables – the Agent can parse them well.
Images: Most systems cannot retrieve image content; convert key information to text. If the answer itself is an image, it can be stored.
Q6: Do entries need a "standard answer"?
Strongly recommended for high‑frequency or scripted scenarios (e.g., complaints, price negotiations) to teach the Agent the exact phrasing.
Q7: What if business and product teams disagree?
Factual content (price, policy, process): follow business input.
Expression style (wording, structure): follow product guidelines.
Disputes: Escalate to joint leadership; prioritize the best outcome for the user.
Final reminder: Focus first on the 20% of questions that cover 80% of user needs. Write one solid entry at a time; the Agent will improve gradually.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
Yunqi AI+
Focuses on AI-powered enterprise digitalization, sharing product and technology practices. Covers AI use cases, technical architecture, product design examples, and industry trends. Aimed at developers, product managers, and digital transformation professionals, providing practical solutions and insights. Uses technology to drive digitization and AI to enable business innovation.