Mastering LangChain Retrievers: From Basics to Multi‑Query and Ensemble Techniques
This article explains the concept of LangChain retrievers, compares them with vector stores, and provides step‑by‑step code examples for MultiQueryRetriever and EnsembleRetriever, showing how to boost recall and combine semantic and keyword search for robust RAG pipelines.
What is a Retriever?
A retriever is a generic interface that receives a query string and returns a list of relevant documents. Its core method is invoke(query: str) -> List[Document]; the older get_relevant_documents method is deprecated.
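The interface can be sketched in plain Python. The `Document` dataclass and `KeywordRetriever` below are toy stand-ins for illustration, not LangChain classes; the point is only the shape of the contract: a query string in, a list of documents out.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Document:
    page_content: str

class KeywordRetriever:
    """A toy retriever: any object exposing invoke(query) -> List[Document]."""
    def __init__(self, corpus: List[str]):
        self.docs = [Document(t) for t in corpus]

    def invoke(self, query: str) -> List[Document]:
        # Return documents containing any query term (naive keyword match).
        terms = query.lower().split()
        return [d for d in self.docs
                if any(t in d.page_content.lower() for t in terms)]

retriever = KeywordRetriever(["USB-C replaces Lightning",
                              "Titanium frame cuts weight"])
print([d.page_content for d in retriever.invoke("titanium weight")])
# → ['Titanium frame cuts weight']
```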
Retriever vs. Vector Store
Separation of concerns: vector stores focus on storing and searching vectors, while retrievers focus on the broader task of fetching relevant documents for a query.
Universal interface: although most retrievers are built on vector similarity search, they can be backed by any logic, such as SQL SELECT statements, BM25 search APIs, or direct file reads.
LCEL integration: retrievers implement the Runnable interface, so they compose naturally in LangChain Expression Language (LCEL) chains, which vector stores alone cannot do.
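To make the "any logic" point concrete, here is a hypothetical retriever backed by a SQLite `LIKE` query instead of a vector index. `SQLRetriever` and the `notes` table are invented for this sketch; only the `invoke(query) -> List[Document]` contract matters.

```python
import sqlite3
from dataclasses import dataclass
from typing import List

@dataclass
class Document:
    page_content: str

class SQLRetriever:
    """Hypothetical retriever backed by a SQL LIKE query, not vectors."""
    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn

    def invoke(self, query: str) -> List[Document]:
        rows = self.conn.execute(
            "SELECT body FROM notes WHERE body LIKE ?", (f"%{query}%",)
        )
        return [Document(body) for (body,) in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (body TEXT)")
conn.executemany("INSERT INTO notes VALUES (?)",
                 [("iPhone 15 ships with USB-C",),
                  ("Pro models use titanium",)])
print([d.page_content for d in SQLRetriever(conn).invoke("USB-C")])
# → ['iPhone 15 ships with USB-C']
```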
Example 2: MultiQueryRetriever
This example shows how to boost recall by letting an LLM rewrite the original query into several perspectives and then merging the results.
import os
os.environ["TOKENIZERS_PARALLELISM"] = "false"
from typing import List
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langchain.retrievers.multi_query import MultiQueryRetriever
documents = [
    "苹果公司在2023年9月发布了iPhone 15系列",  # Apple released the iPhone 15 series in September 2023
    "iPhone 15 Pro采用了钛金属边框,大大减轻了重量",  # The iPhone 15 Pro uses a titanium frame, greatly reducing weight
    "新款iPhone支持USB-C接口,告别了Lightning接口",  # The new iPhone supports USB-C, retiring the Lightning port
    "iPhone 15系列的主摄像头升级到了4800万像素",  # The main camera is upgraded to 48 MP
    "Pro机型支持可编程的操作按钮,取代了静音开关",  # Pro models get a programmable Action button, replacing the mute switch
]
embeddings = HuggingFaceEmbeddings(
    model_name="shibing624/text2vec-base-chinese",  # a Chinese sentence-embedding model
    model_kwargs={"device": "cpu"},
    encode_kwargs={"normalize_embeddings": True},
)
vectorstore = FAISS.from_texts(documents, embeddings)
# Any OpenAI-compatible chat model works here; "deepseek-v3" assumes an
# OpenAI-compatible endpoint configured via environment variables.
llm = ChatOpenAI(model="deepseek-v3", temperature=0)
retriever = MultiQueryRetriever.from_llm(
    retriever=vectorstore.as_retriever(),
    llm=llm,
)
# The prompt is in Chinese: "Answer the question based on the following
# documents: {context} / Question: {question} / Answer:"
template = """根据以下文档回答问题:
{context}
问题: {question}
回答:"""
prompt = ChatPromptTemplate.from_template(template)
def format_docs(docs):
    # Join the retrieved documents into a single context string; without this,
    # the prompt would receive the stringified list of Document objects.
    return "\n".join(doc.page_content for doc in docs)

chain = (
    {"context": retriever | format_docs, "question": lambda x: x}
    | prompt
    | llm
)
question = "新iPhone有哪些硬件变化?"  # "What hardware changes does the new iPhone have?"
response = chain.invoke(question)

The MultiQueryRetriever generates multiple query variants such as:
"iPhone 15 的硬件规格和参数有哪些更新?" ("What updates are there to the iPhone 15's hardware specs and parameters?")
"新 iPhone 在外观设计和物理结构上有什么改变?" ("What has changed in the new iPhone's exterior design and physical construction?")
"iPhone 15 系列在接口和按键设计上做了哪些调整?" ("What adjustments were made to the iPhone 15 series' ports and button design?")
These variants are executed in parallel, merged, and de‑duplicated to improve coverage.
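The merge-and-deduplicate step can be sketched as follows. `merge_unique` is a hypothetical helper for illustration; MultiQueryRetriever performs the equivalent de-duplication internally.

```python
from dataclasses import dataclass

@dataclass
class Document:
    page_content: str

def merge_unique(result_lists):
    """Merge per-variant result lists, keeping the first occurrence of each document."""
    seen, merged = set(), []
    for results in result_lists:
        for doc in results:
            if doc.page_content not in seen:
                seen.add(doc.page_content)
                merged.append(doc)
    return merged

# Results from two query variants, overlapping on one document:
a = [Document("USB-C port"), Document("48 MP camera")]
b = [Document("48 MP camera"), Document("titanium frame")]
print([d.page_content for d in merge_unique([a, b])])
# → ['USB-C port', '48 MP camera', 'titanium frame']
```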
Example 3: EnsembleRetriever
This example combines a vector retriever and a BM25 keyword retriever to leverage both semantic similarity and exact term matching.
import os
os.environ["TOKENIZERS_PARALLELISM"] = "false"
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.retrievers import BM25Retriever
from langchain.retrievers import EnsembleRetriever
from langchain_core.documents import Document
texts = [  # the same Chinese sample sentences as in the previous example
    "苹果公司在2023年9月发布了iPhone 15系列",
    "iPhone 15 Pro采用了钛金属边框,大大减轻了重量",
    "新款iPhone支持USB-C接口,告别了Lightning接口",
    "iPhone 15系列的主摄像头升级到了4800万像素",
    "Pro机型支持可编程的操作按钮,取代了静音开关",
]
documents = [Document(page_content=t) for t in texts]
embeddings = HuggingFaceEmbeddings(
    model_name="shibing624/text2vec-base-chinese",
    model_kwargs={"device": "cpu"},
    encode_kwargs={"normalize_embeddings": True},
)
vectorstore = FAISS.from_documents(documents, embeddings)
vector_retriever = vectorstore.as_retriever(search_kwargs={"k": 2})
bm25_retriever = BM25Retriever.from_documents(documents)
bm25_retriever.k = 2
ensemble_retriever = EnsembleRetriever(
    retrievers=[vector_retriever, bm25_retriever],
    weights=[0.7, 0.3],  # favor semantic matches, but keep keyword hits in play
)
query = "iPhone硬件升级"  # "iPhone hardware upgrades"
results = ensemble_retriever.invoke(query)
for doc in results:
    print(doc.page_content)

Advantages of the ensemble approach:
Complementarity: semantic similarity from the vector retriever and precise term matching from BM25.
Flexibility: weights can be tuned for specific use‑cases.
Robustness: reduces bias introduced by a single retrieval method.
It is especially suitable for scenarios that require high retrieval quality, diverse document collections, or a balance between semantic relevance and keyword precision.
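Under the hood, EnsembleRetriever combines the ranked result lists with weighted Reciprocal Rank Fusion. A minimal sketch of that scoring, assuming 1-based ranks and the commonly used smoothing constant c = 60:

```python
from collections import defaultdict

def weighted_rrf(ranked_lists, weights, c=60):
    """Weighted Reciprocal Rank Fusion: score(d) = sum_i w_i / (rank_i(d) + c)."""
    scores = defaultdict(float)
    for results, w in zip(ranked_lists, weights):
        for rank, doc in enumerate(results, start=1):
            scores[doc] += w / (rank + c)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

# A document ranked by both retrievers outscores one ranked by only one:
vector_hits = ["camera upgrade", "titanium frame"]
bm25_hits = ["usb-c port", "camera upgrade"]
print(weighted_rrf([vector_hits, bm25_hits], weights=[0.7, 0.3]))
# → ['camera upgrade', 'titanium frame', 'usb-c port']
```

Because scores depend only on ranks, not on the retrievers' raw (and incomparable) scores, vector similarity and BM25 scores can be fused without normalization.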
BirdNest Tech Talk
Author of the rpcx microservice framework, original book author, and chair of Baidu's Go CMC committee.
