Layered Knowledge Base Architecture: From RAG to Agent‑Native Knowledge Context Layer

The article analyses the structural shortcomings of naive Retrieval‑Augmented Generation (RAG), compares four knowledge‑base paradigms, proposes a five‑layer pyramid knowledge context layer that supports role‑aware navigation and incremental sync, and presents evaluation results in which the pyramid‑plus‑RAG hybrid reaches 89.0 % Hit@3 versus 75.0 % for plain RAG.

Alibaba Cloud Developer

Fundamental problems of naive RAG

RAG (Retrieval‑Augmented Generation) splits documents into chunks, embeds them, and retrieves the top‑K chunks for each query. In engineering knowledge bases this approach suffers from three structural defects:

Zero accumulation : each query re‑derives knowledge from scratch, discarding any intermediate results (Karpathy, 2025).

Inability to connect the dots : flat vector search cannot link dispersed information or understand large‑scale semantics (Microsoft GraphRAG).

Granularity confusion : chunks of vastly different abstraction levels (e.g., design principles vs. line‑range code) are treated equally.

Typical symptoms are frequently matched documents monopolizing the Top‑K results, mismatched abstraction levels, missing impact tracing, and the lack of a clear reading path for newcomers. The root cause is the absence of structure: vector search treats knowledge as a flat, unordered set of chunks rather than a tree or graph.
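To make the flat pipeline concrete, here is a minimal sketch of chunk → embed → top‑K retrieval. The `embed` function is a toy token‑count stand‑in for a real embedding model, and the chunk texts (including the ADR number and file name) are hypothetical examples, not taken from the article's knowledge base.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: list[str], k: int = 3) -> list[str]:
    """Rank every chunk against the query: no layers, no links, no memory."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


# Hypothetical chunks at three different abstraction levels.
chunks = [
    "Prefer composition over inheritance (design principle).",
    "ADR-007: the order service owns the payment retry queue.",
    "retry.py lines 40-62: exponential backoff for payment retry.",
]
print(top_k("payment retry backoff", chunks, k=2))
```

Note that the principle, the architecture decision, and the line‑range code chunk all compete on one similarity scale, and nothing computed for this query is kept for the next one.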

Knowledge‑base paradigms

Naive RAG – flat vector retrieval (chunk → embedding → vector DB → similarity search). Simple to implement but provides no accumulation, association, or hierarchy.

LLM Wiki – the LLM continuously compiles structured markdown pages (wiki) from source documents. Knowledge accumulates, but linking relies on manual wikilinks and lacks role adaptation.

Graphify – transforms heterogeneous artifacts (code, configs, docs) into a directed knowledge graph using two pipelines: an offline AST parser for code entities and an LLM‑driven semantic extractor for non‑code content.

GraphRAG – builds a knowledge graph, performs community clustering, generates hierarchical summaries, and combines graph‑based local search with global community summaries during query time.

Pyramid knowledge context layer

A five‑layer pyramid maps software‑engineering abstractions to knowledge nodes:

L1 Principles – SOLID / KISS / YAGNI (yearly stability, analogous to a constitution).

L2 Architecture – ADR records (quarterly stability, analogous to law).

L3 Standards – ESLint rules (monthly stability, analogous to regulations).

L4 Implementation – code templates, SDK docs (weekly to daily stability, analogous to manuals).

L5 Experience – post‑mortems, ops logs (daily stability, analogous to case law).

Each document becomes a node; seven directed edge types encode cross‑layer relationships:

governs (L1 → L2): principles constrain architecture decisions.

defines (L1 → L2/L3): concept definitions.

constrains (L2 → L3): architecture constraints on standards.

implements (L2/L3 → L4): concrete implementation of architecture or standards.

validates (L4 → L5): implementation generates operational experience.

feedback (L5 → L3/L4): experience feeds back to improve standards and implementations.

cross_ref (any → any): horizontal references across same or different layers.
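One way to picture the node and edge model is as a small schema with a validity check per edge type. The sketch below is an illustration under assumptions: the class names, the `EDGE_RULES` table, and the document IDs are hypothetical; only the edge names and their layer pairs come from the list above.

```python
from dataclasses import dataclass

# Allowed (source layer, target layer) pairs per edge type, as listed above.
# cross_ref may connect any two layers, so it maps to None (unrestricted).
EDGE_RULES = {
    "governs": {(1, 2)},
    "defines": {(1, 2), (1, 3)},
    "constrains": {(2, 3)},
    "implements": {(2, 4), (3, 4)},
    "validates": {(4, 5)},
    "feedback": {(5, 3), (5, 4)},
    "cross_ref": None,
}


@dataclass(frozen=True)
class Node:
    doc_id: str  # hypothetical identifier scheme
    layer: int   # 1 = Principles ... 5 = Experience


@dataclass(frozen=True)
class Edge:
    kind: str
    src: Node
    dst: Node


def is_valid(edge: Edge) -> bool:
    """Check that an edge connects layers its type is allowed to connect."""
    if edge.kind not in EDGE_RULES:
        return False
    allowed = EDGE_RULES[edge.kind]
    return allowed is None or (edge.src.layer, edge.dst.layer) in allowed


kiss = Node("principle/kiss", 1)
adr = Node("adr/single-queue", 2)
outage = Node("exp/queue-outage", 5)

print(is_valid(Edge("governs", kiss, adr)))     # principle -> architecture
print(is_valid(Edge("governs", adr, kiss)))     # wrong direction
print(is_valid(Edge("cross_ref", outage, kiss)))
```

Rejecting edges whose layers do not match their type at ingest time is one possible way to keep the graph consistent; the article does not say whether the original system enforces this.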

Key design points:

Upward trace : from implementation back to governing principles.

Downward drill : from principle to concrete implementation.

Feedback loop : ops experience updates standards and implementations.

Path navigation : predefined reading routes (e.g., “New‑Hire Onboarding: L1→L2→L3→L5”).
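The upward trace and downward drill can be read as the same graph walk run in opposite directions. The following is a minimal sketch, assuming edges are stored as `(source, kind, target)` triples and that only the top‑down edge types take part in tracing; the document IDs and the `walk` helper are hypothetical.

```python
from collections import deque

# Hypothetical edges, written as (source, kind, target).
EDGES = [
    ("principle/kiss", "governs", "adr/single-queue"),
    ("adr/single-queue", "constrains", "std/queue-naming"),
    ("std/queue-naming", "implements", "impl/queue-client"),
    ("adr/single-queue", "implements", "impl/queue-client"),
    ("impl/queue-client", "validates", "exp/queue-outage"),
]

# Assumption: tracing follows only the edge types that point down the pyramid.
DOWNWARD = {"governs", "defines", "constrains", "implements"}


def walk(start: str, reverse: bool = False) -> list[str]:
    """Breadth-first walk over downward edges; reverse=True follows them backwards."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        for src, kind, dst in EDGES:
            if kind not in DOWNWARD:
                continue
            here, nxt = (dst, src) if reverse else (src, dst)
            if here == node and nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


def upward_trace(doc_id: str) -> list[str]:
    """From an implementation back to the standards, decisions, and principles behind it."""
    return walk(doc_id, reverse=True)


def downward_drill(doc_id: str) -> list[str]:
    """From a principle down to everything that realizes it."""
    return walk(doc_id)


print(upward_trace("impl/queue-client"))
print(downward_drill("principle/kiss"))
```

A predefined reading route such as the onboarding path could then be expressed as a filter on the layers visited during the downward walk.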

Retrieval first selects the appropriate layer via keyword scoring, then expands through graph edges. Because only the selected layer and its graph neighbours are searched rather than the whole corpus, the search space shrinks and fewer tokens are sent to the model. All operations are local – no external API calls.
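The two‑step retrieval described above (pick a layer by keyword score, then expand along edges) might look roughly like this. It is a sketch under assumptions: the keyword lists, the tie‑breaking rule, the one‑hop expansion, and all document IDs are hypothetical, since the article does not publish its scoring details.

```python
# Hypothetical per-layer keyword lists used to score a query.
LAYER_KEYWORDS = {
    1: {"principle", "solid", "kiss", "yagni"},
    2: {"architecture", "adr", "decision"},
    3: {"standard", "lint", "eslint", "rule"},
    4: {"template", "sdk", "code", "implementation"},
    5: {"incident", "postmortem", "outage", "ops"},
}

# Hypothetical documents: id -> (layer, text), plus undirected neighbour links.
DOCS = {
    "adr/single-queue": (2, "architecture decision: one retry queue per service"),
    "std/queue-naming": (3, "eslint rule for queue naming"),
    "impl/queue-client": (4, "sdk code template for the queue client"),
}
EDGES = [
    ("adr/single-queue", "std/queue-naming"),
    ("std/queue-naming", "impl/queue-client"),
]


def pick_layer(query: str) -> int:
    """Score each layer by keyword overlap; ties go to the lower layer number."""
    tokens = set(query.lower().split())
    scores = {layer: len(tokens & kws) for layer, kws in LAYER_KEYWORDS.items()}
    return max(scores, key=lambda layer: (scores[layer], -layer))


def retrieve(query: str, hops: int = 1) -> tuple[int, list[str]]:
    """Match documents inside the chosen layer, then add graph neighbours."""
    layer = pick_layer(query)
    tokens = set(query.lower().split())
    hits = [
        doc_id
        for doc_id, (doc_layer, text) in DOCS.items()
        if doc_layer == layer and tokens & set(text.split())
    ]
    result, frontier = list(hits), set(hits)
    for _ in range(hops):
        neighbours = set()
        for a, b in EDGES:
            if a in frontier:
                neighbours.add(b)
            if b in frontier:
                neighbours.add(a)
        neighbours -= set(result)
        result.extend(sorted(neighbours))
        frontier = neighbours
    return layer, result


print(retrieve("which eslint rule covers queue naming"))
```

Only documents in the selected layer are scored directly; the other layers enter the result solely through edges, which is where the reduction in search space comes from in this sketch.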

Synchronization mechanism

Knowledge bases decay in three forms:

Silent expiration : documentation lags behind code changes.

Layer drift : architectural decisions become historical background but remain in the architecture layer.

Coverage blind spots : new services are missing from implementation references.

A metric‑driven freshness model refreshes each layer on its own schedule (L1 yearly, L2 quarterly, L3 monthly, L4 weekly to daily, L5 daily). Audits check coverage, freshness, graph connectivity, and layer balance.
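The per‑layer refresh schedule can be expressed as a maximum age per layer. The sketch below assumes concrete day counts for each period (for example 7 days for L4, which the text gives only as weekly to daily) and a hypothetical `last_reviewed` date per document; neither is specified in the article.

```python
from datetime import date, timedelta

# Assumed review windows per layer, derived from the schedule in the text.
# The exact day counts are assumptions made for this sketch.
MAX_AGE = {
    1: timedelta(days=365),  # Principles: yearly
    2: timedelta(days=90),   # Architecture: quarterly
    3: timedelta(days=30),   # Standards: monthly
    4: timedelta(days=7),    # Implementation: weekly (assumed upper bound)
    5: timedelta(days=1),    # Experience: daily
}


def is_stale(layer: int, last_reviewed: date, today: date) -> bool:
    """A document is stale once it is older than its layer's review window."""
    return today - last_reviewed > MAX_AGE[layer]


def stale_docs(docs: list[tuple[str, int, date]], today: date) -> list[str]:
    """Return the IDs of documents that are overdue for review."""
    return [
        doc_id
        for doc_id, layer, reviewed in docs
        if is_stale(layer, reviewed, today)
    ]


# Hypothetical documents: (id, layer, last review date).
docs = [
    ("principle/kiss", 1, date(2025, 1, 10)),
    ("adr/single-queue", 2, date(2025, 1, 10)),
    ("impl/queue-client", 4, date(2025, 5, 20)),
]
print(stale_docs(docs, today=date(2025, 6, 1)))
```

The same review date yields different verdicts in different layers: a principle reviewed in January is still fresh in June, while an architecture record of the same age is already overdue.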

Incremental sync proceeds in three phases:

Audit : scan coverage, detect expired docs, output gaps.

Ingest : load sources, chunk, classify, deduplicate (checksum + entry‑ID). Four actions handle content changes:

skip – unchanged content in the same layer.

update – modified content in the same layer (preserve creation time).

move – layer change (delete old, write new).

write – brand‑new items.

Post‑audit : compare before/after coverage to verify improvement.
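The four ingest actions can be sketched as a decision on entry ID, layer, and checksum. This is an illustration under assumptions: the `Entry` record, the in‑memory `store` dictionary, the SHA‑256 checksum, and resetting the creation time on `move` are all choices made for the example rather than details confirmed by the article.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Entry:
    entry_id: str
    layer: int
    checksum: str
    created_at: str  # ISO date string, kept simple for the sketch


def checksum(text: str) -> str:
    """Content fingerprint used for deduplication (SHA-256 is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def decide(store: dict[str, Entry], entry_id: str, layer: int, text: str) -> str:
    """Pick one of the four actions: write, move, skip, or update."""
    old = store.get(entry_id)
    if old is None:
        return "write"   # brand-new item
    if old.layer != layer:
        return "move"    # layer changed: delete old, write new
    if old.checksum == checksum(text):
        return "skip"    # unchanged content in the same layer
    return "update"      # modified content in the same layer


def apply(store: dict[str, Entry], entry_id: str, layer: int, text: str, now: str) -> str:
    """Apply the chosen action; update keeps the original creation time."""
    action = decide(store, entry_id, layer, text)
    if action == "skip":
        return action
    created = store[entry_id].created_at if action == "update" else now
    store[entry_id] = Entry(entry_id, layer, checksum(text), created)
    return action


store: dict[str, Entry] = {}
print(apply(store, "doc-1", 4, "v1", now="2025-01-01"))  # write
print(apply(store, "doc-1", 4, "v1", now="2025-02-01"))  # skip
print(apply(store, "doc-1", 4, "v2", now="2025-03-01"))  # update
print(apply(store, "doc-1", 5, "v2", now="2025-04-01"))  # move
```

Checking the layer before the checksum means a reclassified document is always moved, even when its text is unchanged.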

Evaluation

Experiment setup

Knowledge base: 831 source docs covering 14 services and 5 domains.

Test set: 200 QA pairs spanning service location, architecture concepts, code details, ops troubleshooting, cross‑service links, and navigation.

Metrics: rank‑based Hit@K and MRR, plus the RAGAS metrics Context Precision and Context Recall, with estimated Faithfulness and Answer Relevancy.

Retrieval modes:

A – Naive RAG (pure vector store).

B – Pipeline Skill (agentic pipeline with 7 stages).

C – Pyramid KB (hierarchical keyword + graph enhancement).

D – Pyramid + RAG hybrid (layer routing → vector retrieval).

E – LLM Wiki (compiled wiki with wikilink navigation).

F – Knowledge Graph (86 nodes / 214 edges with community clustering).
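For readers unfamiliar with the rank‑based metrics in the setup, here is a small sketch of how Hit@K and MRR are conventionally computed from ranked results. The function names and the three toy queries are hypothetical; the article does not show its evaluation code.

```python
def hit_at_k(ranked: list[str], relevant: set[str], k: int) -> bool:
    """True if any relevant document appears in the first k results."""
    return any(doc in relevant for doc in ranked[:k])


def reciprocal_rank(ranked: list[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(runs: list[tuple[list[str], set[str]]], k: int) -> tuple[float, float]:
    """Average Hit@K and MRR over (ranked results, relevant set) pairs."""
    hits = sum(hit_at_k(ranked, relevant, k) for ranked, relevant in runs)
    rr = sum(reciprocal_rank(ranked, relevant) for ranked, relevant in runs)
    return hits / len(runs), rr / len(runs)


# Toy queries: relevant doc at rank 1, at rank 3, and not retrieved at all.
runs = [
    (["a", "b", "c"], {"a"}),
    (["x", "y", "z"], {"z"}),
    (["p", "q", "r"], {"m"}),
]
hit1, mrr = evaluate(runs, k=1)
hit3, _ = evaluate(runs, k=3)
print(f"Hit@1={hit1:.3f} Hit@3={hit3:.3f} MRR={mrr:.3f}")
```

Because MRR rewards the position of the first relevant result, a mode can score a higher Hit@3 than another while still having a lower Hit@1 or MRR, if it tends to place the right document second or third rather than first.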

Key results (sorted by Hit@3)

| Mode | Hit@1 | Hit@3 | Hit@5 | MRR | Context Precision | Context Recall |
| --- | --- | --- | --- | --- | --- | --- |
| D (Pyramid+RAG) | 32.5 % | 89.0 % | 89.5 % | 53.7 % | 0.405 | 0.636 |
| A (Naive RAG) | 55.0 % | 75.0 % | 75.0 % | 61.6 % | 0.218 | 0.320 |
| F (Knowledge Graph) | 64.5 % | 71.0 % | 71.0 % | 67.5 % | 0.574 | 0.317 |
| C (Pyramid KB) | 32.5 % | 58.5 % | 64.5 % | 44.8 % | 0.272 | 0.480 |
| B (Pipeline Skill) | 44.5 % | 54.5 % | 54.5 % | 49.3 % | 0.419 | 0.457 |
| E (LLM Wiki) | 31.0 % | 40.0 % | 40.0 % | 35.4 % | 0.242 | 0.400 |

Per‑dimension analysis shows the hybrid Pyramid+RAG excels on code‑detail queries (≈99 % Hit@3) while maintaining strong performance on ops and architecture questions. Pure GraphRAG struggles with cross‑service association.

Limitations: single‑author evaluation, non‑blind, LLM‑generated test set, and a single‑team knowledge base. Future work includes larger multi‑team datasets, blind multi‑evaluator studies, and full API‑level retrieval.

Conclusion

The pyramid adds a structured routing and navigation layer on top of existing RAG or graph approaches. It combines hierarchical keyword scoring, role‑aware layer filtering, and graph‑based edge expansion; in this evaluation the hybrid raised Hit@3 from 75.0 % to 89.0 % and Context Precision from 0.218 to 0.405 relative to naive RAG, while keeping token usage low. The ultimate goal is for a programmer to ask a question and receive a correct, context‑appropriate answer within three seconds, rather than a handful of unrelated snippets.

References

https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f
https://github.com/safishamsi/graphify
https://microsoft.github.io/graphrag/