Mastering Advanced Retrieval: Fusion and Recursive Strategies for RAG

This article explores two advanced retrieval paradigms—Fusion Retrieval, which merges results from multiple retrievers using re‑ranking, and Recursive Retrieval, which builds hierarchical chunk‑to‑chunk or chunk‑to‑retriever links—to boost the quality and flexibility of Retrieval‑Augmented Generation pipelines.

AI Large Model Application Practice

Fusion Retrieval

Fusion Retrieval combines multiple retrieval methods to compensate for the limitations of a single vector index. It derives several query variants from the original question (e.g., via query rewriting) and runs each against one or more retrievers, which may use different indexes or retrieval algorithms. The collected chunks are then merged and re-ranked, commonly with Reciprocal Rank Fusion (RRF), to produce a final ordered list for downstream generation.
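As a rough illustration of the re-ranking step, RRF gives each chunk a score equal to the sum of 1 / (k + rank) over every ranked list in which it appears. The sketch below works on plain chunk IDs; the function name, the sample IDs, and the default k = 60 (a commonly used constant) are assumptions for illustration, not part of any framework API.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[Tuple[str, float]]:
    """Fuse several ranked lists of chunk IDs into one list sorted by RRF score."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks start at 1, so the top hit of each list contributes 1 / (k + 1).
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical results from two retrievers for the same question.
vector_hits = ["c2", "c1", "c3"]
keyword_hits = ["c1", "c4", "c2"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print([chunk_id for chunk_id, _ in fused])
```

Chunks that appear near the top of several lists rise to the front, while a chunk found by only one retriever still keeps a place in the fused list.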

Typical ways to realize fusion include:

Rewriting the original question into multiple formulations and retrieving each separately.

Using heterogeneous index types (vector + keyword, knowledge‑graph, summary indexes) in parallel.

Combining different scoring algorithms on the same index.

Mixing the above approaches into a composite pipeline.
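The options above can be combined in a few lines. The following is a minimal sketch, assuming a tiny in-memory corpus and two toy scoring functions (term overlap and Jaccard similarity) standing in for real scoring algorithms; all names and data are hypothetical. It produces one ranked list per (query variant, scorer) pair, which is the input a fusion step would then merge.

```python
from typing import Callable, Dict, List, Set, Tuple

# Toy corpus standing in for an index; IDs and texts are made up.
CORPUS: Dict[str, str] = {
    "c1": "rrf merges ranked lists from several retrievers",
    "c2": "bm25 is a keyword scoring function",
    "c3": "vector search ranks chunks by embedding similarity",
}


def tokens(text: str) -> Set[str]:
    return set(text.lower().split())


def overlap_score(query: str, doc: str) -> float:
    """Number of distinct terms shared by query and document."""
    return float(len(tokens(query) & tokens(doc)))


def jaccard_score(query: str, doc: str) -> float:
    """Shared terms divided by all distinct terms in query and document."""
    q, d = tokens(query), tokens(doc)
    return len(q & d) / len(q | d) if (q | d) else 0.0


def rank(query: str, scorer: Callable[[str, str], float]) -> List[str]:
    """Return all chunk IDs ordered by descending score."""
    return sorted(CORPUS, key=lambda cid: scorer(query, CORPUS[cid]), reverse=True)


# Query variants would normally come from an LLM-based rewriter.
query_variants = ["keyword scoring with bm25", "how does bm25 score keywords"]

ranked_lists: Dict[Tuple[str, str], List[str]] = {
    (query, scorer.__name__): rank(query, scorer)
    for query in query_variants
    for scorer in (overlap_score, jaccard_score)
}
```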

Key components in frameworks such as LangChain or LlamaIndex are a QueryTransform (or another query rewriter), a Retriever, and a Reranker. A custom fusion retriever can be built by wiring these pieces together.

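# Import paths assume llama-index >= 0.10.
import asyncio
from typing import List

from llama_index.core import QueryBundle
from llama_index.core.retrievers import BaseRetriever
from llama_index.core.schema import NodeWithScore


class FusionRetriever(BaseRetriever):
    """Fuse the results of several base retrievers (LlamaIndex-style sketch)."""

    def __init__(self, retrievers: List[BaseRetriever], similarity_top_k: int = 3):
        self._retrievers = retrievers
        self._similarity_top_k = similarity_top_k
        super().__init__()

    def _retrieve(self, query_bundle: QueryBundle) -> List[NodeWithScore]:
        # rewrite_query, run_queries, and rerank_results are helper functions
        # that are not shown here and must be supplied by the application.
        # 1. Rewrite the query into several variants
        queries = rewrite_query(query_bundle.query_str, num=3)
        # 2. Run each query against all retrievers concurrently.
        #    asyncio.run() raises if an event loop is already running (e.g., in a notebook).
        results_dict = asyncio.run(run_queries(queries, self._retrievers))
        # 3. Re-rank using RRF (or another reranker) and keep the top k results
        return rerank_results(results_dict, similarity_top_k=self._similarity_top_k)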
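The run_queries helper called in the class above is not defined in the article. One plausible shape for it, shown here purely as an assumption, fans every query out to every retriever with asyncio.gather and keys the results by (query, retriever name). To stay self-contained, the sketch uses plain async callables that return chunk IDs instead of framework retriever objects, so its signature differs from whatever the original helper takes.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Tuple

# Hypothetical retriever type: an async function from query text to chunk IDs.
AsyncRetriever = Callable[[str], Awaitable[List[str]]]


async def run_queries(
    queries: List[str], retrievers: Dict[str, AsyncRetriever]
) -> Dict[Tuple[str, str], List[str]]:
    """Run every query against every retriever concurrently."""
    keys = [(query, name) for query in queries for name in retrievers]
    coroutines = [retrievers[name](query) for query, name in keys]
    # gather() preserves input order, so results line up with keys.
    results = await asyncio.gather(*coroutines)
    return dict(zip(keys, results))


async def fake_vector_retriever(query: str) -> List[str]:
    await asyncio.sleep(0)  # stands in for network or index latency
    return ["c2", "c1"]


async def fake_keyword_retriever(query: str) -> List[str]:
    await asyncio.sleep(0)
    return ["c1", "c3"]


results = asyncio.run(
    run_queries(
        ["what is rrf", "explain reciprocal rank fusion"],
        {"vector": fake_vector_retriever, "keyword": fake_keyword_retriever},
    )
)
```

Keeping the (query, retriever) key on each result list makes it easy to inspect which variant or retriever contributed a chunk before the lists are fused.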

Recursive Retrieval

Recursive Retrieval constructs a hierarchy of chunks and retrievers (or RAG engines/Agents). A top‑level chunk links to a lower‑level chunk or a dedicated retriever, which is then invoked to fetch deeper knowledge. This process repeats until a termination condition (e.g., depth limit or sufficient relevance) is met.
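The loop described above can be sketched with plain data structures. This is a minimal illustration, assuming an in-memory store in which each chunk may link to one other chunk; the Chunk class, the store contents, and the max_depth parameter are all hypothetical. The depth limit plays the role of the termination condition, and a visited set guards against cyclic links.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Chunk:
    chunk_id: str
    text: str
    link_to: Optional[str] = None  # ID of a linked lower-level chunk, if any


STORE: Dict[str, Chunk] = {
    "summary": Chunk("summary", "Overview of RAG retrieval.", link_to="section"),
    "section": Chunk("section", "Fusion retrieval merges ranked lists.", link_to="detail"),
    "detail": Chunk("detail", "RRF sums 1 / (k + rank) over all lists."),
}


def resolve(chunk_id: str, max_depth: int = 2) -> List[Chunk]:
    """Follow links from a starting chunk, visiting at most max_depth linked chunks."""
    path: List[Chunk] = []
    seen: Set[str] = set()
    current: Optional[str] = chunk_id
    depth = 0
    while current is not None and current not in seen:
        chunk = STORE[current]
        path.append(chunk)
        seen.add(current)
        if depth == max_depth:  # termination condition: depth limit reached
            break
        current = chunk.link_to
        depth += 1
    return path


shallow = [c.chunk_id for c in resolve("summary", max_depth=1)]
deep = [c.chunk_id for c in resolve("summary", max_depth=5)]
```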

Typical link types:

Chunk → another chunk (parent‑child relationship).

Chunk → a second‑level retriever.

Chunk → a RAG engine that returns an answer.

Chunk → an Agent capable of tool use and complex reasoning.

Use cases include:

Providing richer context by linking a short chunk to a larger parent chunk.

Answering summary‑style queries via a summary chunk that points to detailed chunks.

Handling hypothetical questions by linking to explanatory chunks.

Multi‑document QA: first retrieve relevant summary chunks, then recursively retrieve the underlying source chunks.

Complex table queries: create a dedicated RAG engine for extracted tables (e.g., using Pandas or SQL) and link table summary chunks to it.
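The table use case can be illustrated with the standard-library sqlite3 module. In this sketch a table summary chunk carries the name of a linked engine, and the engine simply executes SQL against an in-memory table. The table, the engine registry, and the function names are assumptions, and the step that turns a natural-language question into SQL (typically done by an LLM) is deliberately left out.

```python
import sqlite3
from typing import Callable, Dict, List, Tuple

# In-memory table standing in for a table extracted from a document.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 30.0)],
)


def sales_engine(sql: str) -> List[Tuple]:
    """Hypothetical table engine: runs a SQL query and returns the rows."""
    return conn.execute(sql).fetchall()


ENGINES: Dict[str, Callable[[str], List[Tuple]]] = {"sales_engine": sales_engine}

# The summary chunk is what gets embedded and retrieved; it points to the engine.
table_summary_chunk = {
    "text": "Table 'sales': revenue per region.",
    "engine": "sales_engine",
}


def answer_from_summary(summary_chunk: Dict[str, str], sql: str) -> List[Tuple]:
    """Follow the chunk's link to its engine and run the query there."""
    engine = ENGINES[summary_chunk["engine"]]
    return engine(sql)


rows = answer_from_summary(
    table_summary_chunk,
    "SELECT region, SUM(revenue) FROM sales GROUP BY region ORDER BY region",
)
```

Retrieval only has to match the short summary text; the exact aggregation is delegated to the engine, which is better suited to it than similarity search over table rows.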

When a chunk links to a retriever, the system performs a second‑level search on the linked index (e.g., a summary index) and feeds those results into the final generation step. When linking to a RAG engine or Agent, the downstream answer from that component becomes part of the generation context, offering richer tool‑driven capabilities.
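The difference between the two link targets can be made concrete with a small dispatcher. The sketch below assumes each retrieved chunk records what kind of component it links to; the LinkedChunk class, the registries, and the stub retriever and engine are hypothetical. A linked retriever contributes second-level chunks to the context, while a linked engine contributes its answer.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class LinkedChunk:
    text: str
    link_kind: Optional[str] = None  # "retriever", "engine", or None
    link_name: Optional[str] = None  # key into the matching registry


def summary_retriever(query: str) -> List[str]:
    """Stub second-level retriever returning detail chunks."""
    return [f"Detail chunk A about {query}", f"Detail chunk B about {query}"]


def table_engine(query: str) -> str:
    """Stub RAG engine returning a finished answer."""
    return f"Answer computed for: {query}"


RETRIEVERS: Dict[str, Callable[[str], List[str]]] = {"summary_index": summary_retriever}
ENGINES: Dict[str, Callable[[str], str]] = {"table_engine": table_engine}


def build_context(query: str, hits: List[LinkedChunk]) -> List[str]:
    """Expand first-level hits into the context passed to generation."""
    context: List[str] = []
    for chunk in hits:
        if chunk.link_kind == "retriever":
            # Second-level search: the linked retriever's chunks join the context.
            context.extend(RETRIEVERS[chunk.link_name](query))
        elif chunk.link_kind == "engine":
            # The linked engine's answer becomes part of the context.
            context.append(ENGINES[chunk.link_name](query))
        else:
            context.append(chunk.text)
    return context


hits = [
    LinkedChunk("Plain chunk."),
    LinkedChunk("Summary of report.", "retriever", "summary_index"),
    LinkedChunk("Summary of sales table.", "engine", "table_engine"),
]
context = build_context("Q3 revenue", hits)
```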

Conclusion

Relying solely on a single vector‑based semantic search is insufficient for many production RAG scenarios. Understanding and applying advanced retrieval patterns such as Fusion Retrieval and Recursive Retrieval—supported by frameworks like LangChain and LlamaIndex—enables more accurate, flexible, and production‑ready AI applications.

Knowledge‑graph based retrieval

Keyword‑table retrieval

Vector‑BM25 hybrid

Multi‑retriever semantic routing

Automatic metadata filtering

Dynamic chunk‑size merging

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

AI Large Model Application Practice

Focused on deep research and development of large-model applications. Authors of "RAG Application Development and Optimization Based on Large Models" and "MCP Principles Unveiled and Development Guide". Primarily B2B, with B2C as a supplement.
