How RAG Evolved: From Naive to Agentic – A Complete Guide
This article systematically outlines the evolution of Retrieval‑Augmented Generation (RAG) from its naive three‑step pipeline to advanced, modular, and agentic architectures, highlighting each generation's motivations, core features, advantages, drawbacks, and practical implementation details for large language model applications.
Introduction
Retrieval‑augmented generation (RAG) equips large language models (LLMs) with external information retrieval, addressing knowledge limits, outdated data, and hallucinations.
RAG Evolution Process
The architecture has progressed through four generations: Naive RAG → Advanced RAG → Modular RAG → Agentic RAG.
Naive RAG
The basic pipeline consists of three stages: indexing, retrieval, and generation. Grounding the model's output in retrieved documents improves factuality, and in the original formulation the retriever and generator can be fine‑tuned end‑to‑end in two variants, RAG‑Sequence and RAG‑Token.
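The three stages can be sketched in a few lines of Python. This is a minimal illustration rather than a production pipeline: the bag‑of‑words `embed` function stands in for a real embedding model, and `build_index`, `retrieve`, and `build_prompt` are hypothetical helper names that do not come from any particular library. The final prompt is what would be handed to an LLM in the generation stage.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: lowercase bag-of-words counts (assumed stand-in for a real model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def build_index(docs):
    # Indexing: store each document next to its vector.
    return [(doc, embed(doc)) for doc in docs]

def retrieve(index, query, k=2):
    # Retrieval: rank documents by similarity to the query and keep the top k.
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_prompt(query, contexts):
    # Generation: retrieved passages are placed in the prompt sent to the LLM.
    context_block = "\n".join(f"- {c}" for c in contexts)
    return f"Answer using only the context below.\n{context_block}\nQuestion: {query}"

docs = [
    "RAG retrieves documents before generation",
    "Cosine similarity compares two vectors",
    "Agents can call tools",
]
index = build_index(docs)
top = retrieve(index, "how does RAG use documents", k=1)
prompt = build_prompt("how does RAG use documents", top)
```

In a real system the toy `embed` would be replaced by a dense embedding model and the list scan by a vector store, but the indexing, retrieval, and generation boundaries stay the same.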
Advanced RAG
Introduces pre‑retrieval (query rewriting/expansion) and post‑retrieval (re‑ranking, prompt compression) to better align user queries with knowledge bases, reduce noise, and enhance generation quality.
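A rough sketch of where the pre‑retrieval and post‑retrieval steps sit is shown below. Everything here is an assumption made for illustration: the synonym table, the term‑overlap scorer (a real system would more likely use a cross‑encoder or LLM re‑ranker), and the truncation‑based compression are simple stand‑ins, and the function names are hypothetical.

```python
def expand_query(query, synonyms):
    """Pre-retrieval: add known synonyms for each query term (hypothetical synonym table)."""
    terms = query.lower().split()
    extra = [s for t in terms for s in synonyms.get(t, [])]
    return terms + extra

def overlap_score(terms, doc):
    """Count how many distinct query terms appear in the document."""
    return len(set(doc.lower().split()) & set(terms))

def rerank(terms, candidates, top_n=2):
    """Post-retrieval: drop zero-score candidates and order the rest by score."""
    scored = [(overlap_score(terms, d), d) for d in candidates]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:top_n]]

def compress(docs, max_words=4):
    """Post-retrieval: crude prompt compression by truncating each passage."""
    return [" ".join(d.split()[:max_words]) for d in docs]

synonyms = {"car": ["automobile", "vehicle"]}
candidates = [
    "vehicle tax rates",
    "automobile maintenance schedule for new owners and fleets",
    "cooking pasta at home",
]
terms = expand_query("car maintenance", synonyms)
ranked = rerank(terms, candidates)
context = compress(ranked)
```

Without the expansion step, the query terms "car" and "maintenance" would match only one candidate; with it, the passage about vehicles is also recovered, and the off‑topic passage is filtered out before generation.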
Modular RAG
Decomposes the system into seven interchangeable modules: Indexing, Pre‑Retrieval, Retrieval, Post‑Retrieval, Memory, Generation, and Orchestration. This plug‑and‑play design supports linear, conditional, branching, and loop orchestration patterns, making it easy to integrate new components and iterate quickly.
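To make the orchestration patterns concrete, the sketch below treats each module as a function from a state dictionary to a new state dictionary and shows linear, conditional, and loop composition (branching is omitted for brevity). The module bodies are stubs and the names `run_linear`, `run_conditional`, and `run_loop` are hypothetical; they are not taken from the Modular RAG paper or from any framework.

```python
def pre_retrieval(state):
    # Stub module: normalise the query.
    return {**state, "query": state["query"].strip().lower(),
            "trace": state["trace"] + ["pre_retrieval"]}

def retrieval(state):
    # Stub module: append one placeholder passage per call.
    return {**state, "docs": state.get("docs", []) + ["stub passage"],
            "trace": state["trace"] + ["retrieval"]}

def generation(state):
    # Stub module: report how many passages the answer would be based on.
    count = len(state.get("docs", []))
    return {**state, "answer": f"answer from {count} passage(s)",
            "trace": state["trace"] + ["generation"]}

def run_linear(state, modules):
    """Linear pattern: run modules one after another."""
    for module in modules:
        state = module(state)
    return state

def run_conditional(state, predicate, if_true, if_false):
    """Conditional pattern: pick one of two module chains based on the state."""
    return run_linear(state, if_true if predicate(state) else if_false)

def run_loop(state, modules, done, max_rounds=3):
    """Loop pattern: repeat a chain until `done` holds or the round budget is spent."""
    for _ in range(max_rounds):
        state = run_linear(state, modules)
        if done(state):
            break
    return state

linear = run_linear({"query": "  What is RAG? ", "trace": []},
                    [pre_retrieval, retrieval, generation])
routed = run_conditional({"query": "hi", "trace": []},
                         lambda s: len(s["query"].split()) > 1,
                         [retrieval, generation], [generation])
looped = run_loop({"query": "q", "trace": []}, [retrieval],
                  done=lambda s: len(s["docs"]) >= 2)
```

Because every module shares the same state‑in, state‑out shape, swapping one implementation for another does not require changes to the orchestration code.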
Example operator for cosine similarity:
import numpy as np

def _cosine_similarity(query_vec, doc_vec):
    # Dot product divided by the product of the two vector norms.
    return np.dot(query_vec, doc_vec) / (np.linalg.norm(query_vec) * np.linalg.norm(doc_vec))
Agentic RAG
Incorporates autonomous agents composed of LLM + Memory + Planning + Tools that make dynamic decisions about when to retrieve, which tools to invoke, and how to iterate, enabling multi‑step reasoning and adaptive retrieval.
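The LLM + Memory + Planning + Tools decomposition can be illustrated with a small control loop. In this sketch a hand‑written rule in `plan` stands in for the LLM planner, and the two tools are stub functions; the `Agent` class, the tool names, and the stopping rule are all assumptions made for illustration, not the design of any specific agent framework.

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    tools: dict                                   # tool name -> callable(query) -> list of passages
    memory: list = field(default_factory=list)    # passages gathered so far
    max_steps: int = 4                            # hard cap on planning iterations

    def plan(self, query):
        """Stand-in for an LLM planner: choose the next action from the current memory."""
        if not self.memory:
            return "search_docs"
        if len(self.memory) < 2 and "latest" in query.lower():
            return "search_web"
        return "answer"

    def run(self, query):
        """Alternate planning and tool calls until the planner decides to answer."""
        actions = []
        for _ in range(self.max_steps):
            action = self.plan(query)
            actions.append(action)
            if action == "answer":
                break
            self.memory.extend(self.tools[action](query))
        return actions, list(self.memory)

tools = {
    "search_docs": lambda q: ["internal doc passage"],
    "search_web": lambda q: ["web result"],
}
actions, evidence = Agent(tools=tools).run("latest RAG survey")
simple_actions, _ = Agent(tools=tools).run("what is RAG")
```

The point of the sketch is that the number and kind of retrieval calls are decided at run time per query, rather than fixed by the pipeline; the step cap is one simple guard against an agent that never decides to answer.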
Key variants include:
Single‑Agent Agentic RAG – a central router agent handling multiple retrieval sources.
Multi‑Agent Agentic RAG – specialized agents for structured data, unstructured data, web APIs, etc.
Hierarchical Agentic RAG – layered agents where higher‑level agents coordinate sub‑agents.
Agentic Corrective RAG – agents evaluate and correct retrieved documents before generation.
Adaptive Agentic RAG – agents decide at each stage whether retrieval or other actions are needed.
Graph‑Based Agentic RAG – combines graph retrieval with other modalities.
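As one example from the list above, the corrective variant can be sketched as a grading step placed between retrieval and generation. The term‑overlap `grade` function is a hypothetical stand‑in for an LLM or classifier that judges relevance, and the fallback retriever (for instance a web search) is a stub; passages from the fallback are returned ungraded here purely to keep the sketch short.

```python
def grade(query, passage, threshold=0.5):
    """Hypothetical relevance grader: share of query terms found in the passage."""
    terms = set(query.lower().split())
    if not terms:
        return False
    words = set(passage.lower().split())
    return len(terms & words) / len(terms) >= threshold

def corrective_retrieve(query, primary, fallback):
    """Keep passages that pass grading; if none do, fall back to a second source."""
    kept = [p for p in primary(query) if grade(query, p)]
    if kept:
        return kept, "primary"
    return fallback(query), "fallback"

query = "vector index tuning"
good_primary = lambda q: ["tuning a vector index", "pasta recipes for dinner"]
bad_primary = lambda q: ["pasta recipes for dinner"]
web_fallback = lambda q: ["vector index tuning guide"]

kept_ok, source_ok = corrective_retrieve(query, good_primary, web_fallback)
kept_fb, source_fb = corrective_retrieve(query, bad_primary, web_fallback)
```

The returned source label makes it easy to log how often the primary knowledge base fails the grading step, which is useful when deciding whether the extra agent call is worth its cost.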
Summary
| Architecture | Motivation | Main Features |
| --- | --- | --- |
| Naive RAG | LLM hallucinations and static knowledge. | Three‑step pipeline; end‑to‑end fine‑tuning; improves factual QA. |
| Advanced RAG | Query‑knowledge mismatch and noisy retrieval. | Pre‑retrieval (query rewrite/expansion); Retrieval (semantic vectors); Post‑retrieval (re‑ranking, compression). |
| Modular RAG | Rapid component evolution and high maintenance cost. | Seven plug‑and‑play modules; routing, scheduling, fusion operators; supports linear, conditional, branching, loop patterns. |
| Agentic RAG | Complex multi‑step tasks and dynamic decision‑making. | Multi‑agent coordination; dynamic retrieval/generation decisions; iterative refinement; higher robustness for complex tasks. |
Some Reflections
Pre‑processing and post‑processing around the generation module are crucial for answer quality; consider query enhancement, document compression, and fine‑tuning.
Modular design simplifies experimentation, maintenance, and rapid integration of new retrieval or generation techniques.
Agentic approaches add flexibility but can be unstable; applying agents to limited sub‑tasks (e.g., answer refinement) may be more practical.
Tencent Cloud Developer