How to Unlock Milvus 2.5’s Full‑Text and Hybrid Search for RAG Applications
This guide walks you through Milvus 2.5’s new full‑text, keyword‑matching, and hybrid search capabilities, showing how to set up the schema, install dependencies, prepare data with LangChain, and combine dense and sparse vectors for RAG‑enabled retrieval using Python.
Milvus 2.5, a high‑performance vector search engine fully compatible with open‑source Milvus, adds native full‑text search by integrating the Tantivy engine and Sparse‑BM25 algorithm, complementing its existing semantic search.
Background
Built‑in analyzer: Milvus can accept raw text, automatically performing tokenization, stop‑word filtering, and sparse‑vector extraction.
Real‑time BM25 statistics: Term frequency (TF) and inverse document frequency (IDF) are updated dynamically on data insertion, ensuring up‑to‑date relevance scores.
Hybrid search performance: Sparse‑vector retrieval based on ANN far outperforms traditional keyword systems, supporting billion‑scale data with millisecond latency while remaining compatible with dense‑vector queries.
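To make the TF and IDF statistics above concrete, here is a minimal pure-Python sketch of Okapi BM25 scoring. It illustrates the general formula only and is not Milvus's internal implementation; the function name bm25_scores, the whitespace tokenization, and the k1 and b values are assumptions chosen for the example.

```python
import math
from collections import Counter

def bm25_scores(query_tokens, docs_tokens, k1=1.2, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for d in docs_tokens for term in set(d))
    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency is damped and normalized by document length.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Hypothetical toy corpus, tokenized by whitespace for simplicity.
docs = [
    "milvus is a fast vector database".split(),
    "bm25 ranks documents by term statistics".split(),
    "milvus supports bm25 full text search".split(),
]
scores = bm25_scores("milvus bm25".split(), docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(best, scores)
```

Because the document frequencies and the average length are recomputed from the corpus, adding a document changes the scores of existing ones, which is why the statistics have to be kept current as data is inserted.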
Prerequisites
A Milvus instance running kernel version 2.5 or later and the Python SDK pymilvus version 2.5 or later are required. Create the instance by following the quick-start guide, and obtain a DashScope API key for the embedding and text-generation calls used in this guide.
Usage Limits
Applicable to Milvus instances with kernel version ≥2.5.
Applicable to pymilvus version ≥2.5.
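The SDK requirement can also be checked from Python. The sketch below is one possible approach using the standard library's importlib.metadata; the helper names meets_minimum and installed_pymilvus_ok are hypothetical and not part of pymilvus.

```python
import re
from importlib.metadata import PackageNotFoundError, version

def meets_minimum(version_string, minimum=(2, 5)):
    """Return True if the leading major.minor of version_string is >= minimum."""
    parts = []
    for piece in version_string.split(".")[:2]:
        # Keep only the leading digits so suffixes such as "rc1" are ignored.
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts) >= minimum

def installed_pymilvus_ok():
    """Check the installed pymilvus distribution; False if it is missing."""
    try:
        return meets_minimum(version("pymilvus"))
    except PackageNotFoundError:
        return False

print("pymilvus >= 2.5:", installed_pymilvus_ok())
```

Comparing integer tuples rather than strings avoids the pitfall where "2.10" sorts before "2.5" lexicographically.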
Check the installed SDK version:
pip3 show pymilvus
If the version is lower than 2.5, upgrade it:
pip3 install --upgrade pymilvus
Step 1: Install Dependencies
pip3 install pymilvus langchain langchain-community dashscope beautifulsoup4
Step 2: Data Preparation
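Use LangChain's WebBaseLoader to load the documentation page, split it into chunks with RecursiveCharacterTextSplitter, and embed each chunk with DashScope's text-embedding-v2 model. The resulting dense vectors and the original text are then inserted into Milvus; the sparse BM25 vectors are generated by Milvus itself from the text field.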
from langchain_community.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import DashScopeEmbeddings
from pymilvus import MilvusClient, DataType, Function, FunctionType
dashscope_api_key = "<YOUR_DASHSCOPE_API_KEY>"
milvus_url = "<YOUR_MILVUS_URL>"
user_name = "root"
password = "<YOUR_PASSWORD>"
collection_name = "milvus_overview"
dense_dim = 1536
loader = WebBaseLoader([
'https://raw.githubusercontent.com/milvus-io/milvus-docs/refs/heads/v2.5.x/site/en/about/overview.md'
])
docs = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1024, chunk_overlap=256)
all_splits = text_splitter.split_documents(docs)
embeddings = DashScopeEmbeddings(model="text-embedding-v2", dashscope_api_key=dashscope_api_key)
text_contents = [doc.page_content for doc in all_splits]
vectors = embeddings.embed_documents(text_contents)
client = MilvusClient(uri=f"http://{milvus_url}:19530", token=f"{user_name}:{password}")
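# Schema: auto-generated primary key, analyzed text, BM25 sparse vector, dense vector
schema = MilvusClient.create_schema(enable_dynamic_field=True)
analyzer_params = {"type": "english"}
schema.add_field(field_name="id", datatype=DataType.INT64, is_primary=True, auto_id=True)
# enable_analyzer tokenizes the text for BM25; enable_match enables TEXT_MATCH keyword filters
schema.add_field(field_name="text", datatype=DataType.VARCHAR, max_length=65535, enable_analyzer=True, analyzer_params=analyzer_params, enable_match=True)
# Populated by Milvus through the BM25 function below; it is not supplied on insert
schema.add_field(field_name="sparse_bm25", datatype=DataType.SPARSE_FLOAT_VECTOR)
schema.add_field(field_name="dense", datatype=DataType.FLOAT_VECTOR, dim=dense_dim)
# The BM25 function converts the text field into sparse vectors at insert time
bm25_function = Function(name="bm25", function_type=FunctionType.BM25, input_field_names=["text"], output_field_names="sparse_bm25")
schema.add_function(bm25_function)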
index_params = client.prepare_index_params()
index_params.add_index(field_name="dense", index_name="dense_index", index_type="IVF_FLAT", metric_type="IP", params={"nlist": 128})
index_params.add_index(field_name="sparse_bm25", index_name="sparse_bm25_index", index_type="SPARSE_WAND", metric_type="BM25")
client.create_collection(collection_name=collection_name, schema=schema, index_params=index_params)
data = [{"dense": vectors[idx], "text": doc} for idx, doc in enumerate(text_contents)]
client.insert(collection_name=collection_name, data=data)
print(f"Generated {len(vectors)} vectors, dimension: {len(vectors[0])}")
Note: Enabling the analyzer in the schema makes the setting permanent for the collection; changing the analyzer later requires recreating the collection.
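Since dense_dim must match the embedding model's output size, it can help to validate the rows before inserting them. The following is a minimal sketch of that idea; build_rows and batched are hypothetical helpers written for this example, not pymilvus functions, and the batch size is an arbitrary assumption.

```python
def build_rows(texts, vectors, dense_dim):
    """Pair each chunk with its embedding, checking counts and dimensions."""
    if len(texts) != len(vectors):
        raise ValueError(f"{len(texts)} texts but {len(vectors)} vectors")
    rows = []
    for idx, (text, vector) in enumerate(zip(texts, vectors)):
        if len(vector) != dense_dim:
            raise ValueError(
                f"vector {idx} has dimension {len(vector)}, expected {dense_dim}"
            )
        # Same row shape as the insert above: only "dense" and "text" are supplied.
        rows.append({"dense": vector, "text": text})
    return rows

def batched(rows, batch_size):
    """Yield consecutive slices of rows with at most batch_size items each."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]

# Toy data standing in for text_contents and vectors.
rows = build_rows(["a", "b", "c"], [[0.0] * 4 for _ in range(3)], dense_dim=4)
batch_sizes = [len(batch) for batch in batched(rows, 2)]
print(batch_sizes)
```

With real data, each batch could be passed to client.insert(collection_name=collection_name, data=batch) in place of the single insert call, which keeps individual requests small for larger document sets.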
Step 3: Full‑Text Search
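Milvus 2.5 accepts the raw query text directly. When a search targets the sparse_bm25 field, Milvus tokenizes the query with the text field's analyzer and scores it against the stored BM25 sparse vectors, so no client-side embedding is needed.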
from pymilvus import MilvusClient
client = MilvusClient(uri="http://c-xxxx.milvus.aliyuncs.com:19530", token="<yourUsername>:<yourPassword>", db_name="default")
search_params = {'params': {'drop_ratio_search': 0.2}}
full_text_search_res = client.search(
    collection_name='milvus_overview',
    data=['what makes milvus so fast?'],
    anns_field='sparse_bm25',
    limit=3,
    search_params=search_params,
    output_fields=["text"]
)
for hits in full_text_search_res:
    for hit in hits:
        print(hit)
        print("\n")
Step 4: Keyword Matching
Enable both enable_analyzer and enable_match on the text field to create an inverted index for keyword matching.
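The idea behind an inverted index can be sketched in a few lines of pure Python. This is a conceptual illustration, not how Milvus stores its index: the tokenize function is a simplified stand-in for the english analyzer (no stemming or stop-word removal), and all names here are hypothetical.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (a stand-in analyzer)."""
    return [tok for tok in re.split(r"[^a-z0-9]+", text.lower()) if tok]

def build_inverted_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for tok in tokenize(text):
            index[tok].add(doc_id)
    return index

def text_match(index, terms):
    """Ids of documents containing at least one of the space-separated terms."""
    matched = set()
    for tok in tokenize(terms):
        matched |= index.get(tok, set())
    return matched

docs = [
    "Milvus is scalable and fast",
    "The query node handles search requests",
    "Each query is fast",
]
index = build_inverted_index(docs)
# Several terms in one expression behave like OR.
either = text_match(index, "scalable fast")
# Two expressions joined with "and" intersect their matches.
both = text_match(index, "query") & text_match(index, "node")
print(either, both)
```

The two calls mirror the two filter styles used in this step: one expression with several terms, and two expressions combined with and.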
# Example 1: Combine vector search with keyword filtering
# query_embeddings holds the dense embedding of the query text; compute it with
# embeddings.embed_documents([query]) as shown in Step 5 before running this example.
filter = "TEXT_MATCH(text, 'query') and TEXT_MATCH(text, 'node')"
text_match_res = client.search(
    collection_name="milvus_overview",
    anns_field="dense",
    data=query_embeddings,
    filter=filter,
    search_params={"params": {"nprobe": 10}},
    limit=2,
    output_fields=["text"]
)
# Example 2: Scalar filtering with keyword match
# Multiple terms in one TEXT_MATCH expression match documents containing any of them
filter = "TEXT_MATCH(text, 'scalable fast')"
text_match_res = client.query(
    collection_name="milvus_overview",
    filter=filter,
    output_fields=["text"]
)
Step 5: Hybrid Search & RAG
Combine dense vector search and BM25 text search using Reciprocal Rank Fusion (RRF) to improve recall and precision in Retrieval‑Augmented Generation (RAG) pipelines.
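Reciprocal Rank Fusion itself is simple enough to sketch in pure Python: each result list contributes 1 / (k + rank) to a document's score, and documents are re-sorted by the summed score. The function rrf_fuse below is a hypothetical illustration with 1-based ranks assumed; it is not the RRFRanker implementation, whose exact rank offset may differ.

```python
def rrf_fuse(ranked_lists, k=60, limit=None):
    """Fuse ranked id lists: score(d) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for a stable order.
    fused = sorted(scores.items(), key=lambda item: (-item[1], str(item[0])))
    return fused[:limit] if limit else fused

# Hypothetical result ids from the dense and BM25 searches, best first.
dense_hits = ["a", "b", "c"]
bm25_hits = ["c", "a", "d"]
fused = rrf_fuse([dense_hits, bm25_hits], k=100, limit=3)
fused_ids = [doc_id for doc_id, _ in fused]
print(fused_ids)
```

Because only ranks are used, the dense inner-product scores and the BM25 scores never need to be put on a common scale. A larger k flattens the difference between adjacent ranks, so documents that appear in both lists are favored over a single top hit.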
from pymilvus import MilvusClient, AnnSearchRequest, RRFRanker
from langchain_community.embeddings import DashScopeEmbeddings
from dashscope import Generation
client = MilvusClient(uri="http://c-xxxx.milvus.aliyuncs.com:19530", token="<yourUsername>:<yourPassword>", db_name="default")
collection_name = "milvus_overview"
dashscope_api_key = "<YOUR_DASHSCOPE_API_KEY>"
embeddings = DashScopeEmbeddings(model="text-embedding-v2", dashscope_api_key=dashscope_api_key)
query = "Why does Milvus run so scalable?"
query_embeddings = embeddings.embed_documents([query])
top_k = 5
search_params_dense = {"metric_type": "IP", "params": {"nprobe": 2}}
request_dense = AnnSearchRequest([query_embeddings[0]], "dense", search_params_dense, limit=top_k)
search_params_bm25 = {"metric_type": "BM25"}
request_bm25 = AnnSearchRequest([query], "sparse_bm25", search_params_bm25, limit=top_k)
reqs = [request_dense, request_bm25]
ranker = RRFRanker(100)
hybrid_search_res = client.hybrid_search(collection_name=collection_name, reqs=reqs, ranker=ranker, limit=top_k, output_fields=["text"])
context = []
print("Top K Results:")
for hits in hybrid_search_res:
    for hit in hits:
        context.append(hit['entity']['text'])
        print(hit['entity']['text'])
def getAnswer(query, context):
    prompt = f"""Please answer my question based on the content within:
{context}
My question is: {query}."""
    # The DashScope key must be configured for this call, for example through the DASHSCOPE_API_KEY environment variable
    rsp = Generation.call(model='qwen-turbo', prompt=prompt)
    return rsp.output.text
answer = getAnswer(query, context)
print(answer)
The above steps demonstrate how to configure Milvus 2.5 for full-text, keyword, and hybrid searches, and how to integrate the results into a RAG workflow using Python.
Alibaba Cloud Big Data AI Platform
The Alibaba Cloud Big Data AI Platform builds on Alibaba’s leading cloud infrastructure, big‑data and AI engineering capabilities, scenario algorithms, and extensive industry experience to offer enterprises and developers a one‑stop, cloud‑native big‑data and AI capability suite. It boosts AI development efficiency, enables large‑scale AI deployment across industries, and drives business value.
