How to Unlock Milvus 2.5’s Full‑Text and Hybrid Search for RAG Applications

This guide walks you through Milvus 2.5’s new full‑text, keyword‑matching, and hybrid search capabilities, showing how to set up the schema, install dependencies, prepare data with LangChain, and combine dense and sparse vectors for RAG‑enabled retrieval using Python.

Alibaba Cloud Big Data AI Platform

Alibaba Cloud Milvus is a high-performance vector search service that is fully compatible with open-source Milvus. Version 2.5 adds native full-text search by integrating the Tantivy search engine and the Sparse-BM25 algorithm, complementing the existing semantic (dense-vector) search.

Background

Built‑in analyzer: Milvus can accept raw text, automatically performing tokenization, stop‑word filtering, and sparse‑vector extraction.

Real‑time BM25 statistics: Term frequency (TF) and inverse document frequency (IDF) are updated dynamically on data insertion, ensuring up‑to‑date relevance scores.

Hybrid search performance: Sparse‑vector retrieval based on ANN far outperforms traditional keyword systems, supporting billion‑scale data with millisecond latency while remaining compatible with dense‑vector queries.
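
To make the analyzer and BM25 points above concrete, here is a minimal pure-Python sketch of the two stages: tokenization with stop-word filtering, then scoring with the textbook Okapi BM25 formula. It is an illustration only and not Milvus's internal implementation; the stop-word list, the helper names `tokenize` and `bm25_scores`, and the parameters `k1=1.2` and `b=0.75` are assumptions chosen for the example.

```python
import math
from collections import Counter

# Hypothetical stop-word list for illustration; not Milvus's built-in list.
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}

def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop stop words."""
    tokens = (t.strip(".,!?;:()\"'").lower() for t in text.split())
    return [t for t in tokens if t and t not in STOP_WORDS]

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every document against the query with textbook Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = [
    "Milvus is a vector database built for scale.",
    "BM25 ranks documents by term frequency and rarity.",
    "The analyzer splits text into tokens.",
]
scores = bm25_scores("vector database", docs)
print(scores)
```

Only the first document contains the query terms, so it is the only one with a non-zero score. Because IDF depends on the whole corpus, adding documents changes the scores, which is why the statistics have to be refreshed as data is inserted.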

Prerequisites

You need a Milvus instance running kernel version 2.5 or later and the Python SDK pymilvus 2.5 or later. Create the instance via the quick-start guide. The examples below also call DashScope models for embedding and generation, so obtain a DashScope API key as well.

Usage Limits

Applicable to Milvus instances with kernel version ≥2.5.

Applicable to pymilvus version ≥2.5.

Check the installed SDK version:

pip3 show pymilvus

If the version is lower than 2.5, upgrade it:

pip3 install --upgrade pymilvus
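
The same check can be done from Python, which is handy at the top of a script. The sketch below reads the installed version with the standard library's `importlib.metadata` and compares it against the 2.5 minimum. The helper names `parse_major_minor`, `meets_minimum`, and `check_pymilvus` are hypothetical, and the parsing assumes the version string starts with a numeric `major.minor` pair.

```python
import re
from importlib.metadata import PackageNotFoundError, version

def parse_major_minor(version_string):
    """Return (major, minor) from a version string such as '2.5.4'."""
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognized version: {version_string!r}")
    return int(match.group(1)), int(match.group(2))

def meets_minimum(version_string, minimum=(2, 5)):
    """True when the version is at least the required (major, minor)."""
    return parse_major_minor(version_string) >= minimum

def check_pymilvus():
    """Describe whether the installed pymilvus satisfies the 2.5 minimum."""
    try:
        installed = version("pymilvus")
    except PackageNotFoundError:
        return "pymilvus is not installed"
    status = "OK" if meets_minimum(installed) else "upgrade needed"
    return f"pymilvus {installed}: {status}"

print(check_pymilvus())
```

Comparing integer tuples rather than raw strings avoids the pitfall where "2.10" sorts before "2.5" lexicographically.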

Step 1: Install Dependencies

pip3 install pymilvus langchain langchain-community langchain-text-splitters dashscope beautifulsoup4

Step 2: Data Preparation

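Use LangChain's WebBaseLoader to load the Milvus documentation, split it into chunks with RecursiveCharacterTextSplitter, and embed each chunk with DashScope's text-embedding-v2 model. The resulting dense vectors and the original text are then inserted into Milvus, which generates the sparse BM25 vectors from the text itself.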

from langchain_community.document_loaders import WebBaseLoader
# RecursiveCharacterTextSplitter is provided by the langchain-text-splitters package.
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.embeddings import DashScopeEmbeddings
from pymilvus import MilvusClient, DataType, Function, FunctionType

dashscope_api_key = "<YOUR_DASHSCOPE_API_KEY>"
milvus_url = "<YOUR_MILVUS_URL>"
user_name = "root"
password = "<YOUR_PASSWORD>"
collection_name = "milvus_overview"
dense_dim = 1536

loader = WebBaseLoader([
    'https://raw.githubusercontent.com/milvus-io/milvus-docs/refs/heads/v2.5.x/site/en/about/overview.md'
])

docs = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1024, chunk_overlap=256)
all_splits = text_splitter.split_documents(docs)

embeddings = DashScopeEmbeddings(model="text-embedding-v2", dashscope_api_key=dashscope_api_key)
text_contents = [doc.page_content for doc in all_splits]
vectors = embeddings.embed_documents(text_contents)

# Connect to the Milvus instance.
client = MilvusClient(uri=f"http://{milvus_url}:19530", token=f"{user_name}:{password}")

schema = MilvusClient.create_schema(enable_dynamic_field=True)
analyzer_params = {"type": "english"}

schema.add_field(field_name="id", datatype=DataType.INT64, is_primary=True, auto_id=True)
# Raw text: the analyzer tokenizes it for BM25, and enable_match allows TEXT_MATCH filters.
schema.add_field(field_name="text", datatype=DataType.VARCHAR, max_length=65535, enable_analyzer=True, analyzer_params=analyzer_params, enable_match=True)
# Sparse vector that Milvus generates from "text"; it is never supplied at insert time.
schema.add_field(field_name="sparse_bm25", datatype=DataType.SPARSE_FLOAT_VECTOR)
# Dense embedding produced by DashScope.
schema.add_field(field_name="dense", datatype=DataType.FLOAT_VECTOR, dim=dense_dim)

# BM25 function: converts "text" into the "sparse_bm25" vector on insert.
bm25_function = Function(name="bm25", function_type=FunctionType.BM25, input_field_names=["text"], output_field_names=["sparse_bm25"])
schema.add_function(bm25_function)

# One index per vector field: inner product for dense, BM25 for sparse.
index_params = client.prepare_index_params()
index_params.add_index(field_name="dense", index_name="dense_index", index_type="IVF_FLAT", metric_type="IP", params={"nlist": 128})
index_params.add_index(field_name="sparse_bm25", index_name="sparse_bm25_index", index_type="SPARSE_WAND", metric_type="BM25")

client.create_collection(collection_name=collection_name, schema=schema, index_params=index_params)

# Insert only the dense vector and the text; Milvus fills sparse_bm25 itself.
data = [{"dense": vectors[idx], "text": doc} for idx, doc in enumerate(text_contents)]
client.insert(collection_name=collection_name, data=data)
print(f"Generated {len(vectors)} vectors, dimension: {len(vectors[0])}")

Note: Enabling the analyzer in the schema makes the setting permanent for the collection; changing the analyzer later requires recreating the collection.
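
The chunk_size and chunk_overlap settings used in Step 2 control how much text each chunk holds and how much neighboring chunks share. The sketch below shows the idea with a fixed-width sliding window over characters. It is a deliberately simplified stand-in: RecursiveCharacterTextSplitter also splits on separators such as paragraph and line breaks, so its actual chunk boundaries will differ. The function name `sliding_chunks` is hypothetical.

```python
def sliding_chunks(text, chunk_size, chunk_overlap):
    """Split text into fixed-size windows that overlap by chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks

chunks = sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)
```

With a window of 4 and an overlap of 2, each chunk repeats the last two characters of the previous one, so a sentence that straddles a boundary still appears intact in at least one chunk.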

Step 3: Full‑Text Search

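Milvus 2.5 allows direct full-text queries against the sparse_bm25 field: pass the raw query text as the search data, and Milvus converts it into a sparse vector with the same analyzer and BM25 function defined in the schema.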

from pymilvus import MilvusClient

client = MilvusClient(uri="http://c-xxxx.milvus.aliyuncs.com:19530", token="<yourUsername>:<yourPassword>", db_name="default")
search_params = {'params': {'drop_ratio_search': 0.2}}
full_text_search_res = client.search(
    collection_name='milvus_overview',
    data=['what makes milvus so fast?'],
    anns_field='sparse_bm25',
    limit=3,
    search_params=search_params,
    output_fields=["text"]
)
for hits in full_text_search_res:
    for hit in hits:
        print(hit)
        print("\n")

Step 4: Keyword Matching

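Keyword matching requires both enable_analyzer and enable_match on the text field (set in Step 2), which makes Milvus build an inverted index over that field. The TEXT_MATCH expression can then be used as a filter in a vector search or in a scalar query. Multiple terms inside one TEXT_MATCH match documents containing any of them; join separate TEXT_MATCH clauses with and to require all of them.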

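# Assumes `client` from Step 3 and `embeddings` from Step 2 are still in scope.
# Embed the query text so it can be searched against the dense field.
query_embeddings = embeddings.embed_documents(["what makes milvus so fast?"])

# Example 1: Combine vector search with keyword filtering (both keywords must appear)
filter_expr = "TEXT_MATCH(text, 'query') and TEXT_MATCH(text, 'node')"
text_match_res = client.search(
    collection_name="milvus_overview",
    anns_field="dense",
    data=query_embeddings,
    filter=filter_expr,
    search_params={"params": {"nprobe": 10}},
    limit=2,
    output_fields=["text"]
)
# Example 2: Scalar filtering with keyword match (either keyword may appear)
filter_expr = "TEXT_MATCH(text, 'scalable fast')"
text_match_res = client.query(
    collection_name="milvus_overview",
    filter=filter_expr,
    output_fields=["text"]
)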
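
When the keywords come from user input, it can help to build the filter string programmatically. The following is a small sketch with two hypothetical helpers, `text_match` and `all_of`, that reproduce the two expressions used above. The single-quote check is a simplistic safeguard added for the example, not a complete escaping scheme.

```python
def text_match(field, terms):
    """Build one TEXT_MATCH clause; several terms in a clause match any of them."""
    joined = " ".join(terms)
    if "'" in joined:
        raise ValueError("terms must not contain single quotes")
    return f"TEXT_MATCH({field}, '{joined}')"

def all_of(field, terms):
    """Require every term by joining one TEXT_MATCH clause per term with 'and'."""
    return " and ".join(text_match(field, [term]) for term in terms)

any_expr = text_match("text", ["scalable", "fast"])
all_expr = all_of("text", ["query", "node"])
print(any_expr)
print(all_expr)
```

Either string can be passed as the filter argument of client.search or client.query.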

Step 5: Hybrid Search & RAG

Combine dense vector search and BM25 text search using Reciprocal Rank Fusion (RRF) to improve recall and precision in Retrieval‑Augmented Generation (RAG) pipelines.
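
For intuition, Reciprocal Rank Fusion ignores the raw scores of each retriever and looks only at rank positions: every document earns 1 / (k + rank) from each result list it appears in, and the contributions are summed. The sketch below implements this textbook formula in plain Python with 1-based ranks. It illustrates the idea and is not Milvus's internal code; the function name `rrf_fuse` and the sample document IDs are made up for the example.

```python
def rrf_fuse(ranked_lists, k=100):
    """Fuse several ranked ID lists with Reciprocal Rank Fusion."""
    scores = {}
    for ranking in ranked_lists:
        # Ranks are 1-based in this sketch.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

dense_hits = ["a", "b", "c"]  # hypothetical dense-search ranking
bm25_hits = ["b", "d", "a"]   # hypothetical BM25 ranking
fused = rrf_fuse([dense_hits, bm25_hits], k=100)
print([doc_id for doc_id, _ in fused])
```

Documents that appear in both lists ("a" and "b") rise to the top, which is the behavior hybrid search relies on. A larger k flattens the difference between adjacent ranks, so agreement between retrievers matters more than any single top position.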

from pymilvus import MilvusClient, AnnSearchRequest, RRFRanker
from langchain_community.embeddings import DashScopeEmbeddings
from dashscope import Generation

client = MilvusClient(uri="http://c-xxxx.milvus.aliyuncs.com:19530", token="<yourUsername>:<yourPassword>", db_name="default")
collection_name = "milvus_overview"

dashscope_api_key = "<YOUR_DASHSCOPE_API_KEY>"
embeddings = DashScopeEmbeddings(model="text-embedding-v2", dashscope_api_key=dashscope_api_key)

query = "Why is Milvus so scalable?"
query_embeddings = embeddings.embed_documents([query])

top_k = 5
search_params_dense = {"metric_type": "IP", "params": {"nprobe": 2}}
request_dense = AnnSearchRequest([query_embeddings[0]], "dense", search_params_dense, limit=top_k)
search_params_bm25 = {"metric_type": "BM25"}
request_bm25 = AnnSearchRequest([query], "sparse_bm25", search_params_bm25, limit=top_k)

reqs = [request_dense, request_bm25]
ranker = RRFRanker(100)  # RRF smoothing constant k = 100

hybrid_search_res = client.hybrid_search(collection_name=collection_name, reqs=reqs, ranker=ranker, limit=top_k, output_fields=["text"])

context = []
print("Top K Results:")
for hits in hybrid_search_res:
    for hit in hits:
        context.append(hit['entity']['text'])
        print(hit['entity']['text'])

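def get_answer(query, context):
    # Join the retrieved chunks into one context string for the prompt.
    context_text = "\n\n".join(context)
    prompt = f"""Please answer my question based on the content within:
{context_text}
My question is: {query}."""
    # Pass the key explicitly; otherwise dashscope looks for the DASHSCOPE_API_KEY environment variable.
    rsp = Generation.call(model='qwen-turbo', prompt=prompt, api_key=dashscope_api_key)
    return rsp.output.text

answer = get_answer(query, context)
print(answer)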

The above steps demonstrate how to configure Milvus 2.5 for full‑text, keyword, and hybrid searches, and how to integrate the results into a RAG workflow using Python.
