How IMA Scaled Its AI Knowledge Base from Monolith to Micro‑services
This article walks through the end‑to‑end design of IMA's AI‑driven knowledge base, covering its definition, core business flow, architecture evolution, data ingestion pipelines, management challenges, asynchronous processing, permission modeling, and the business value demonstrated by the prototype.
0. Introduction
In the era of Retrieval‑Augmented Generation (RAG) and Large Language Models (LLM), a knowledge base must evolve from a passive digital repository to an intelligent assistant that can understand and converse.
1. What Is a Knowledge Base?
A knowledge base is a digital warehouse for centralized information sharing, similar to wikis, shared documents, or project libraries. Traditional keyword‑search‑driven bases only retrieve static content, while AI‑enabled bases support semantic understanding and dialogue.
2. Core Business Process
The IMA knowledge base lifecycle consists of three stages: Knowledge Ingestion, Knowledge Management, and Knowledge Application.
3. Architecture Design
3.1 Knowledge Ingestion
The ingestion layer must be extensible and stable. Three major challenges were identified:
Support for diverse data formats (20+ types, e.g., PDF, Word, XMind, audio).
Avoid tight coupling between external formats and internal logic.
Handle bursty traffic ("ingestion spikes") without overloading parsers.
Solution 1 – Unified Internal Data Model
Define a standard internal representation that decouples external sources from the system:
Media  // user‑visible object stored in the Media Center
Chunk  // low‑level unit for RAG indexing and retrieval

All incoming files are first converted to Media, then parsed into one or more Chunk objects.
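A minimal sketch of this two-level model in Python. The field names and the `ingest` helper are illustrative assumptions, not IMA's actual schema; the point is that every external format collapses into the same Media-plus-Chunks shape:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Chunk:
    chunk_id: str
    media_id: str      # back-reference to the owning Media
    text: str          # normalized text used for RAG indexing

@dataclass
class Media:
    media_id: str
    source_format: str          # e.g. "pdf", "docx", "xmind", "audio"
    title: str
    chunks: List[Chunk] = field(default_factory=list)

def ingest(title: str, fmt: str, parsed_texts: List[str]) -> Media:
    """Convert an external file into one Media plus its Chunks."""
    media = Media(media_id=f"m-{title}", source_format=fmt, title=title)
    media.chunks = [
        Chunk(chunk_id=f"{media.media_id}-c{i}", media_id=media.media_id, text=t)
        for i, t in enumerate(parsed_texts)
    ]
    return media
```

Downstream services then only ever deal with Media and Chunk, never with PDF or XMind specifics.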
Solution 2 – Isolate Change with a Two‑Layer Ingestion Pipeline
Separate the stable "Unified Access Layer" (creates Media) from the flexible "Parsing Layer" (produces Chunk). This isolates format‑specific logic and enables independent evolution.
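One way to realize the split is a parser registry: the access layer stays fixed, while each format contributes a parser to the registry. This is a sketch under assumed names (`PARSERS`, `unified_access`), not IMA's real code:

```python
from typing import Callable, Dict, List

# Flexible layer: format-specific parsers registered by extension.
PARSERS: Dict[str, Callable[[bytes], List[str]]] = {}

def register_parser(fmt: str):
    def wrap(fn):
        PARSERS[fmt] = fn
        return fn
    return wrap

@register_parser("txt")
def parse_txt(data: bytes) -> List[str]:
    return [line for line in data.decode("utf-8").splitlines() if line]

def unified_access(filename: str, data: bytes) -> dict:
    """Stable layer: always produces a Media-like record first,
    then dispatches to whatever parser the registry holds."""
    fmt = filename.rsplit(".", 1)[-1].lower()
    media = {"title": filename, "format": fmt, "status": "PENDING"}
    parser = PARSERS.get(fmt)
    chunks = parser(data) if parser else []
    media["status"] = "PARSED" if parser else "UNSUPPORTED"
    return {"media": media, "chunks": chunks}
```

Adding a 21st format means registering one parser; the access layer never changes, which is exactly the isolation the two-layer design buys.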
Solution 3 – Asynchronous Spike‑Shaving
Introduce a message‑queue‑based async architecture to decouple front‑end ingestion requests from back‑end parsing. This smooths traffic spikes and prevents service overload.
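The spike-shaving pattern can be shown with an in-process queue standing in for the message broker: the front end only enqueues and returns, while a fixed worker pool drains at its own pace. A toy simulation, not the production broker setup:

```python
import queue
import threading

ingest_queue: "queue.Queue" = queue.Queue()
parsed = []
lock = threading.Lock()

def enqueue_upload(media_id: str) -> str:
    """Fast path: accept the request immediately, parse later."""
    ingest_queue.put(media_id)
    return "ACCEPTED"

def worker():
    while True:
        media_id = ingest_queue.get()
        if media_id is None:            # poison pill stops the worker
            break
        with lock:
            parsed.append(media_id)     # stand-in for the slow parse step
        ingest_queue.task_done()

t = threading.Thread(target=worker)
t.start()
for i in range(100):                    # simulated traffic spike
    enqueue_upload(f"m-{i}")
ingest_queue.join()                     # wait until the spike is drained
ingest_queue.put(None)
t.join()
```

The spike of 100 uploads is absorbed by the queue and processed by a single worker; in production the broker persists the backlog so parsers are never overloaded.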
3.2 Knowledge Management
Management operations (e.g., bulk edit, folder moves, deletions) involve multiple components and must remain consistent under high concurrency.
Solution – Service Decomposition
Split the system into atomic services (single‑purpose and stateless) and aggregated services (which orchestrate complex, multi‑step workflows on top of them). This reduces coupling and improves scalability.
Data Consistency
Because Media and Chunk are processed asynchronously, temporary inconsistencies can appear. A dual‑guard mechanism is used: the Media status provides immediate visibility, while an asynchronous reconciliation service guarantees eventual consistency.
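The reconciliation half of the dual guard can be illustrated as a periodic sweep that finds Media stuck mid-parse (for example, because a worker crashed) and re-enqueues them. Status names and table shapes here are assumptions for illustration:

```python
def reconcile(media_table: dict, chunk_index: dict, requeue) -> int:
    """Repair Media stuck in PARSING with no Chunks; return how many."""
    fixed = 0
    for media_id, media in media_table.items():
        if media["status"] == "PARSING" and not chunk_index.get(media_id):
            requeue(media_id)            # send back through the async pipeline
            media["status"] = "PENDING"  # visible state matches reality again
            fixed += 1
    return fixed
```

The Media status field gives users an immediate, honest view ("still processing"), while this sweep guarantees that transient failures converge to a consistent state.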
Permission Modeling
A multi‑level permission system protects data across personal, team, and enterprise scopes. The design follows deep modeling and a unified permission gateway, enabling fine‑grained access control and future extensibility.
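A hypothetical sketch of the unified gateway idea: every access funnels through one check, with scopes ordered personal < team < enterprise. The scope names, grant shape, and rule are assumptions, not IMA's actual model:

```python
SCOPE_RANK = {"personal": 0, "team": 1, "enterprise": 2}

def can_access(user: dict, resource: dict) -> bool:
    """Allow if the user owns the resource, or holds a grant in the
    resource's space at or above the resource's required scope."""
    if resource["owner"] == user["id"]:
        return True                       # owners always see their own data
    grant = user.get("grants", {}).get(resource["space"])
    if grant is None:
        return False
    return SCOPE_RANK[grant] >= SCOPE_RANK[resource["scope"]]
```

Centralizing the decision in one function (or service) is what makes the model extensible: adding a new scope level means editing one ranking table, not auditing every call site.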
3.3 Knowledge Application
After ingestion and management, the knowledge is consumed by AI‑driven services. The primary use case in IMA is RAG‑based Q&A, where user queries are answered by retrieving relevant Chunk data and feeding it to an LLM.
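The RAG flow can be reduced to two steps: score stored Chunks against the query and assemble the prompt the LLM answers from. This sketch uses naive keyword overlap where a real deployment would use vector embeddings, and it omits the LLM call itself:

```python
def retrieve(chunks: list, query: str, k: int = 2) -> list:
    """Return the k chunks sharing the most words with the query."""
    q = set(query.lower().split())
    scored = sorted(chunks, key=lambda c: -len(q & set(c.lower().split())))
    return scored[:k]

def build_prompt(context_chunks: list, query: str) -> str:
    """Assemble the grounded prompt handed to the LLM."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

Because retrieval runs against Chunk data rather than whole documents, the prompt stays small and the answer stays grounded in the knowledge base.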
4. Results & Business Value
The prototype demonstrates that a modular, async‑first backend can handle diverse data formats, bursty ingestion, and strict consistency requirements while supporting AI‑enhanced retrieval. Early metrics show stable throughput under peak loads and reduced latency for RAG queries.
5. Summary
Architecture must evolve continuously; a solid design isolates change, embraces async processing, and enforces clear service boundaries. Practical value is measured by reduced development friction, higher system reliability, and the ability to deliver AI‑powered knowledge experiences.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
Tencent Cloud Developer