Unlock Powerful Hybrid Search with Milvus 2.5: Full-Text, BM25, and RAG Guide

This tutorial explains how to use Milvus 2.5's new full‑text, BM25 keyword matching, and hybrid search capabilities—including step‑by‑step setup, schema design, code examples, and RAG integration—to achieve high recall and precision in large‑scale AI vector retrieval scenarios.

Alibaba Cloud Big Data AI Platform

Aug 7, 2025

As data volumes grow, effective information retrieval becomes essential. Alibaba Cloud Vector Search Service (Milvus version) is a high-performance vector engine that is fully compatible with open-source Milvus and suited to large-scale AI vector similarity search. Version 2.5 adds full-text search, keyword matching, and hybrid search, which improve recall and precision in multimodal and RAG scenarios.

Background

Milvus 2.5 integrates the high‑performance search library Tantivy and the Sparse‑BM25 algorithm, providing native full‑text search that complements semantic search.

Built‑in analyzer: text is tokenized, stop‑words filtered, and sparse vectors generated automatically.

Real‑time BM25 statistics: term frequency (TF) and inverse document frequency (IDF) are updated on data insertion.

Hybrid search performance: sparse-vector retrieval is served by ANN indexes and outperforms traditional keyword systems, supporting billion-scale datasets at millisecond latency. Sparse results can be combined with dense-vector results in a single hybrid query.
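
To make the analyzer and BM25 steps listed above more concrete, here is a minimal pure-Python sketch of the classic BM25 formula. It is not Milvus's implementation: the stop-word list, the `analyze` and `bm25_scores` helpers, the sample documents, and the `k1` and `b` defaults are all assumptions chosen for illustration.

```python
import math
from collections import Counter

# Hypothetical, tiny stop-word list for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}


def analyze(text):
    """Lowercase, split on whitespace, strip punctuation, drop stop words."""
    tokens = (tok.strip(".,!?;:()").lower() for tok in text.split())
    return [tok for tok in tokens if tok and tok not in STOP_WORDS]


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every document against the query with the classic BM25 formula."""
    tokenized = [analyze(doc) for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))

    scores = []
    for toks in tokenized:
        term_freq = Counter(toks)
        score = 0.0
        for term in analyze(query):
            if term not in term_freq:
                continue
            df = doc_freq[term]
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            tf = term_freq[term]
            # Length normalization: long documents are penalized via b.
            norm = tf + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "Milvus is a vector database built for scale.",
    "BM25 ranks documents by keyword relevance.",
    "The weather is nice today.",
]
scores = bm25_scores("vector database", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(docs[best])
```

In Milvus these statistics are maintained by the engine as data is inserted, so none of this bookkeeping is written by hand; the sketch only shows what TF, IDF, and length normalization contribute to a score.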

Prerequisites

Milvus instance with kernel version 2.5 (see quick‑start link).

Service enabled and API‑KEY obtained (see link).

Usage Limits

Applicable to kernel versions 2.5 and later.

Python SDK pymilvus version 2.5+ required.

Check installed version:

pip3 show pymilvus

If the version is lower than 2.5, upgrade:

pip3 install --upgrade pymilvus
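
The same check can be done from Python before any Milvus code runs. The sketch below uses the standard-library `importlib.metadata`; the helper names `parse_major_minor` and `meets_minimum` are hypothetical, and it assumes the version string starts with `major.minor`.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def parse_major_minor(version_string):
    """Return (major, minor) from a version string such as '2.5.4'."""
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognized version: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version_string, minimum=(2, 5)):
    """True if the version is at least the given (major, minor) pair."""
    return parse_major_minor(version_string) >= minimum


try:
    installed = version("pymilvus")
    status = "ok" if meets_minimum(installed) else "upgrade needed"
    print(f"pymilvus {installed}: {status}")
except PackageNotFoundError:
    print("pymilvus is not installed")
```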

Step 1: Install Dependencies

pip3 install pymilvus langchain langchain-community dashscope beautifulsoup4

Step 2: Prepare Data

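Load the documents with LangChain, split them into chunks, embed each chunk with the DashScope text-embedding-v2 model, and insert both the dense vectors and the original text into Milvus. The sparse_bm25 field is not inserted by the client; Milvus fills it from the text field through the BM25 function defined in the schema.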

from langchain_community.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import DashScopeEmbeddings
from pymilvus import MilvusClient, DataType, Function, FunctionType

dashscope_api_key = "<YOUR_DASHSCOPE_API_KEY>"
milvus_url = "<YOUR_MILVUS_URL>"
user_name = "root"
password = "<YOUR_PASSWORD>"
collection_name = "milvus_overview"
dense_dim = 1536

loader = WebBaseLoader(['https://raw.githubusercontent.com/milvus-io/milvus-docs/refs/heads/v2.5.x/site/en/about/overview.md'])
docs = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1024, chunk_overlap=256)
all_splits = text_splitter.split_documents(docs)
embeddings = DashScopeEmbeddings(model="text-embedding-v2", dashscope_api_key=dashscope_api_key)
text_contents = [doc.page_content for doc in all_splits]
vectors = embeddings.embed_documents(text_contents)

client = MilvusClient(uri=f"http://{milvus_url}:19530", token=f"{user_name}:{password}")
schema = MilvusClient.create_schema(enable_dynamic_field=True)
analyzer_params = {"type": "english"}
schema.add_field(field_name="id", datatype=DataType.INT64, is_primary=True, auto_id=True)
schema.add_field(field_name="text", datatype=DataType.VARCHAR, max_length=65535,
                 enable_analyzer=True, analyzer_params=analyzer_params, enable_match=True)
schema.add_field(field_name="sparse_bm25", datatype=DataType.SPARSE_FLOAT_VECTOR)
schema.add_field(field_name="dense", datatype=DataType.FLOAT_VECTOR, dim=dense_dim)

bm25_function = Function(name="bm25", function_type=FunctionType.BM25,
                        input_field_names=["text"], output_field_names="sparse_bm25")
schema.add_function(bm25_function)

index_params = client.prepare_index_params()
index_params.add_index(field_name="dense", index_name="dense_index",
                      index_type="IVF_FLAT", metric_type="IP", params={"nlist": 128})
index_params.add_index(field_name="sparse_bm25", index_name="sparse_bm25_index",
                      index_type="SPARSE_WAND", metric_type="BM25")

client.create_collection(collection_name=collection_name, schema=schema, index_params=index_params)

data = [{"dense": vectors[idx], "text": doc} for idx, doc in enumerate(text_contents)]
client.insert(collection_name=collection_name, data=data)
print(f"Generated {len(vectors)} vectors, dimension: {len(vectors[0])}")
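
The effect of chunk_size and chunk_overlap in the splitter above can be pictured with a fixed-size sliding window. This is a simplified stand-in, not how RecursiveCharacterTextSplitter works internally (it also prefers to break on separators such as paragraphs and sentences); the `sliding_chunks` helper and the sample string are hypothetical.

```python
def sliding_chunks(text, chunk_size, chunk_overlap):
    """Split text into fixed-size character windows that overlap."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_chunks(sample, chunk_size=10, chunk_overlap=4)
for chunk in chunks:
    print(chunk)
```

Each window repeats the last 4 characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk. The tutorial's values (1024 and 256) apply the same idea at a larger scale.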

Step 3: Full‑Text Search

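Search the sparse_bm25 field with the raw query text. Milvus converts the text into a sparse vector through the BM25 function and ranks the results by BM25 score.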

search_params = {'params': {'drop_ratio_search': 0.2}}
full_text_search_res = client.search(collection_name='milvus_overview',
                                     data=['what makes milvus so fast?'],
                                     anns_field='sparse_bm25',
                                     limit=3,
                                     search_params=search_params,
                                     output_fields=["text"])
for hits in full_text_search_res:
    for hit in hits:
        print(hit)
        print("\n")

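The drop_ratio_search parameter used above tells the search to ignore a proportion of the smallest values in the sparse query vector. The sketch below only illustrates that idea; how Milvus selects the entries internally is not shown in this article, and the `drop_smallest` helper and the term weights are made up for the example.

```python
def drop_smallest(sparse_query, drop_ratio):
    """Return a copy of the sparse vector without its smallest-weight entries."""
    if not 0 <= drop_ratio < 1:
        raise ValueError("drop_ratio must be in [0, 1)")
    n_drop = int(len(sparse_query) * drop_ratio)
    # Sort by weight ascending and skip the first n_drop entries.
    kept = sorted(sparse_query.items(), key=lambda item: item[1])[n_drop:]
    return dict(kept)


# Hypothetical term weights for the query 'what makes milvus so fast?'.
query_vector = {"what": 0.1, "makes": 0.4, "milvus": 2.3, "so": 0.05, "fast": 1.7}
pruned = drop_smallest(query_vector, drop_ratio=0.2)
print(sorted(pruned))
```

With five terms and a ratio of 0.2, one low-weight term is dropped, which trades a small amount of recall for less work per query.
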
Step 4: Keyword Matching

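Set enable_analyzer=True and enable_match=True on the text field (as in the Step 2 schema) so that Milvus builds an inverted index for keyword filtering. TEXT_MATCH can then be used as a filter in a vector search: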

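# Embed the query first; the dense search needs its vector (same query as Step 5).
query = "Why does Milvus run so scalable?"
query_embeddings = embeddings.embed_documents([query])

# Both conditions must hold: the text contains 'query' and also contains 'node'.
filter = "TEXT_MATCH(text, 'query') and TEXT_MATCH(text, 'node')"
text_match_res = client.search(collection_name="milvus_overview",
                              anns_field="dense",
                              data=query_embeddings,
                              filter=filter,
                              search_params={"params": {"nprobe": 10}},
                              limit=2,
                              output_fields=["text"])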

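Scalar filtering example (a plain query without vector search; a TEXT_MATCH with several space-separated terms matches text that contains any of them):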

filter = "TEXT_MATCH(text, 'scalable fast')"
text_match_res = client.query(collection_name="milvus_overview",
                             filter=filter,
                             output_fields=["text"])
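
The two filter styles above differ in how terms combine: space-separated terms inside one TEXT_MATCH act as OR, while separate TEXT_MATCH calls joined with `and` must all hold. The pure-Python emulation below illustrates this on a few made-up sentences. It is a sketch only: the `analyze` and `text_match` helpers are hypothetical, and the real english analyzer also applies steps such as stemming that are omitted here.

```python
def analyze(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = {tok.strip(".,!?;:()").lower() for tok in text.split()}
    return tokens - {""}


def text_match(document, terms):
    """Emulate TEXT_MATCH(field, 'a b'): true if ANY of the terms occurs."""
    return bool(analyze(document) & analyze(terms))


docs = [
    "The query node handles search requests.",
    "Milvus is scalable and fast.",
    "A data node persists segments.",
    "Fast ingestion keeps indexes fresh.",
]

# Equivalent of: TEXT_MATCH(text, 'query') and TEXT_MATCH(text, 'node')
both = [d for d in docs if text_match(d, "query") and text_match(d, "node")]

# Equivalent of: TEXT_MATCH(text, 'scalable fast')
either = [d for d in docs if text_match(d, "scalable fast")]

print(both)
print(either)
```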

Step 5: Hybrid Search & RAG

Combine dense vector search and BM25 text search using Reciprocal Rank Fusion (RRF) to improve recall and precision.

from pymilvus import AnnSearchRequest, RRFRanker
from langchain_community.embeddings import DashScopeEmbeddings
from dashscope import Generation

query = "Why does Milvus run so scalable?"
query_embeddings = embeddings.embed_documents([query])

top_k = 5
search_params_dense = {"metric_type": "IP", "params": {"nprobe": 2}}
request_dense = AnnSearchRequest([query_embeddings[0]], "dense", search_params_dense, limit=top_k)

search_params_bm25 = {"metric_type": "BM25"}
request_bm25 = AnnSearchRequest([query], "sparse_bm25", search_params_bm25, limit=top_k)

reqs = [request_dense, request_bm25]
ranker = RRFRanker(100)

hybrid_search_res = client.hybrid_search(collection_name=collection_name,
                                         reqs=reqs,
                                         ranker=ranker,
                                         limit=top_k,
                                         output_fields=["text"])

context = []
for hits in hybrid_search_res:
    for hit in hits:
        context.append(hit['entity']['text'])
        print(hit['entity']['text'])

def getAnswer(query, context):
    prompt = f'''Please answer my question based on the content within:
```
{context}
```
My question is: {query}.'''
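    # Pass the key explicitly; otherwise dashscope falls back to the DASHSCOPE_API_KEY environment variable.
    rsp = Generation.call(model='qwen-turbo', prompt=prompt, api_key=dashscope_api_key)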
    return rsp.output.text

answer = getAnswer(query, context)
print(answer)
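
Reciprocal Rank Fusion itself is simple enough to sketch in a few lines. The version below uses the standard formula, score(d) = sum of 1 / (k + rank) over all result lists, with ranks starting at 1; whether Milvus counts ranks from 0 or 1 is an assumption here, and the `rrf_fuse` helper and document IDs are hypothetical. The value k=100 mirrors RRFRanker(100) above.

```python
def rrf_fuse(rankings, k=60):
    """Fuse ranked ID lists with Reciprocal Rank Fusion."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical top-3 results from the dense and BM25 requests.
dense_ranking = ["doc_a", "doc_b", "doc_c"]
bm25_ranking = ["doc_b", "doc_d", "doc_a"]

fused = rrf_fuse([dense_ranking, bm25_ranking], k=100)
for doc_id, score in fused:
    print(doc_id, round(score, 5))
```

Documents that appear in both lists accumulate two terms and rise to the top, while a larger k flattens the difference between adjacent ranks. Because only ranks are used, the dense IP scores and BM25 scores never need to be put on a common scale.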

Note: Analyzer settings are permanent for a collection; changing the analyzer requires recreating the collection.
