How DiskANN + RaBitQ Supercharges Milvus: 5× Faster, 90% Cheaper Vector Search

This article explains how integrating the disk‑based DiskANN index with the ultra‑compact RaBitQ quantization dramatically boosts Milvus's vector search performance and cuts costs, delivering over five times higher QPS and more than 90% cost reduction for billion‑scale AI workloads.

Volcano Engine Developer Services

Overview

Vector retrieval now has to deliver high recall and low latency while scaling to billions of vectors at a controllable cost. In Volcano Cloud Search we introduced DiskANN, a disk-based vector index that keeps full-precision vectors on disk and only compressed vectors in memory, reducing vector search cost by over 90%.

DiskANN also supports a pure in-memory performance mode. Combined with the state-of-the-art RaBitQ quantization algorithm, it reduces memory usage by 85% and delivers strong cost-performance for high-throughput vector search. The combination is now deployed at scale.

Integration of DiskANN + RaBitQ into Volcano Milvus

We integrated DiskANN + RaBitQ into the Milvus edition of the Volcano Engine vector database ("Volcano Milvus"). It retains Milvus's rich open-source ecosystem while delivering higher QPS and lower cost on billion-scale data: the disk-based index achieves more than 5× the QPS of the community edition.

Volcano Milvus vs Open‑source Milvus

Conclusion: For both capacity‑oriented and performance‑oriented indexes, Volcano Milvus outperforms the open‑source version in performance and cost.

Capacity‑oriented Index

QPS is more than 5× that of the open-source version.

Per-QPS price is 80% lower, because specifications and monthly price are identical.

Performance‑oriented Index

QPS is 30% higher.

The query node needs only 48 GB of memory instead of 192 GB, which lowers the monthly price.

Per-QPS price is 30% lower.
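The per-QPS figures above follow from simple arithmetic, sketched below. The helper name `per_qps_price_change` is hypothetical, and the monthly price ratio of 0.91 for the performance-oriented index is not stated in the comparison; it is only what the two rounded 30% figures would imply if taken at face value.

```python
def per_qps_price_change(qps_ratio: float, price_ratio: float = 1.0) -> float:
    """Relative change in price per QPS (negative means cheaper).

    qps_ratio   -- new QPS divided by baseline QPS
    price_ratio -- new monthly price divided by baseline monthly price
    """
    return price_ratio / qps_ratio - 1.0


# Capacity-oriented index: 5x the QPS at the same monthly price.
capacity_change = per_qps_price_change(qps_ratio=5.0)
print(f"capacity-oriented: {capacity_change:+.0%} per QPS")

# Performance-oriented index: 1.3x the QPS. At an unchanged monthly price
# this alone would give roughly -23% per QPS.
perf_same_price = per_qps_price_change(qps_ratio=1.3)
print(f"performance-oriented, same price: {perf_same_price:+.0%} per QPS")

# Assumption: a monthly price about 9% lower (ratio 0.91) would be needed
# to reach the quoted 30% reduction.
perf_lower_price = per_qps_price_change(qps_ratio=1.3, price_ratio=0.91)
print(f"performance-oriented, price x0.91: {perf_lower_price:+.0%} per QPS")

# Query node memory: 48 GB instead of 192 GB.
memory_saving = 1 - 48 / 192
print(f"query node memory saving: {memory_saving:.0%}")
```

For the capacity-oriented index the 80% reduction is exactly what a 5× QPS gain gives at a fixed price. For the performance-oriented index, the smaller query node has to contribute part of the saving.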

What is DiskANN?

DiskANN, introduced in the paper “DiskANN: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node”, moves the heavy graph index and raw vectors to SSD, keeping only compressed vectors and hot nodes in memory. During search it issues several disk reads in parallel and fetches neighbors in batches, which sharply reduces the number of random-read round trips.

Key designs:

Vamana graph: two rounds of pruning yield a graph with a smaller diameter and more long-range edges, reducing hop count and latency.

Implicit re-ranking: each node's neighbor list and full-precision vector are stored in the same disk sector, so a single disk read fetches both and enables exact top-K re-ranking.
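The pruning step behind the Vamana graph can be sketched as follows. This is a minimal illustration of the alpha-controlled pruning rule from the DiskANN paper, not the production implementation; the function name, the toy data, and the parameter values are all assumptions chosen for readability.

```python
import numpy as np


def robust_prune(p, candidates, vectors, alpha=1.2, max_degree=4):
    """Select up to max_degree out-neighbors for node p (Vamana-style pruning)."""

    def dist(a, b):
        return float(np.linalg.norm(vectors[a] - vectors[b]))

    # Candidates sorted by distance to p, closest first; p itself is excluded.
    pool = sorted((c for c in set(candidates) if c != p), key=lambda c: dist(p, c))
    neighbors = []
    while pool and len(neighbors) < max_degree:
        closest = pool.pop(0)
        neighbors.append(closest)
        # Drop every candidate that `closest` already covers, i.e. one for
        # which alpha * dist(closest, c) <= dist(p, c).
        pool = [c for c in pool if alpha * dist(closest, c) > dist(p, c)]
    return neighbors


# Toy data: five points on a line at positions 0, 1, 2, 3 and 10.
vectors = np.array([[0.0], [1.0], [2.0], [3.0], [10.0]])

print(robust_prune(0, [1, 2, 3, 4], vectors, alpha=1.0))  # [1]
print(robust_prune(0, [1, 2, 3, 4], vectors, alpha=1.2))  # [1, 4]
```

With `alpha=1.0` node 0 keeps only its nearest neighbor. With `alpha=1.2` the far point at position 10 survives pruning, which is the kind of long-range edge that shortens search paths.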

What is RaBitQ?

RaBitQ (Random Bit Quantization) is a state-of-the-art vector quantization method that maps each dimension to 1 bit on the unit sphere, compressing full-precision (32-bit float) vectors by up to 32×.

In high dimensions (e.g., 1024-D) the quantization error becomes negligible, and distances can be computed with bit-wise operations using AVX-512: a 1024-bit code fills two 512-bit registers, so the population count for a 1024-D distance takes two VPOPCNT instructions.

Reduced memory usage: 32× compression.

Faster computation: bit-wise distance calculation.
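Both points can be illustrated with a simplified sign-bit quantizer. The sketch below is an assumption-laden stand-in, not the RaBitQ algorithm itself: it normalizes, applies a random rotation, and keeps one sign bit per dimension, but it omits the distance estimator and error bound of the published method. NumPy stands in for the AVX-512 instructions, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 1024

# Random orthogonal rotation (QR of a Gaussian matrix), shared by all vectors.
rotation, _ = np.linalg.qr(rng.standard_normal((dim, dim)))


def encode(vec):
    """Normalize, rotate, and keep one sign bit per dimension, packed 8 per byte."""
    unit = vec / np.linalg.norm(vec)
    return np.packbits(rotation @ unit > 0)


def hamming(code_a, code_b):
    """Number of differing bits: XOR followed by a population count."""
    return int(np.unpackbits(code_a ^ code_b).sum())


a = rng.standard_normal(dim).astype(np.float32)
b = (a + 0.5 * rng.standard_normal(dim)).astype(np.float32)  # noisy copy of a
c = rng.standard_normal(dim).astype(np.float32)              # unrelated vector

code_a, code_b, code_c = encode(a), encode(b), encode(c)

compression = a.nbytes / code_a.nbytes
print(f"{a.nbytes} bytes -> {code_a.nbytes} bytes ({compression:.0f}x)")
print("bits differing, similar pair:  ", hamming(code_a, code_b))
print("bits differing, unrelated pair:", hamming(code_a, code_c))
```

A 1024-D float32 vector shrinks from 4096 bytes to 128 bytes, and the similar pair differs in far fewer bits than the unrelated pair, which is why XOR plus popcount works as a cheap distance proxy.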

DiskANN + RaBitQ in Practice

Disk mode: RaBitQ-quantized vectors reside in memory, while raw vectors and neighbor lists stay on disk. Search first uses the quantized vectors for fast pruning, then refines the result with exact vectors read from disk.

Performance mode: for memory-rich, latency-critical scenarios, both neighbor lists and full-precision vectors are loaded into memory, eliminating repeated I/O.
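The prune-then-refine flow of disk mode can be sketched as a two-stage search. This is a simplified model under several assumptions: a brute-force scan over the in-memory codes replaces the graph traversal that DiskANN actually performs, plain sign bits replace RaBitQ codes, and an in-memory array named `full_vectors` plays the role of the disk. All names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
num_vectors, dim, top_k, num_candidates = 2000, 256, 5, 100

# "Disk": full-precision vectors, read only for the final candidates.
full_vectors = rng.standard_normal((num_vectors, dim)).astype(np.float32)
# "Memory": one sign bit per dimension, packed 8 per byte.
codes = np.packbits(full_vectors > 0, axis=1)

# Lookup table holding the number of set bits for every byte value.
POPCOUNT = np.unpackbits(np.arange(256, dtype=np.uint8)[:, None], axis=1).sum(axis=1)


def search(query):
    query_code = np.packbits(query > 0)
    # Stage 1: cheap Hamming distance on the in-memory codes.
    hamming = POPCOUNT[codes ^ query_code].sum(axis=1)
    candidates = np.argpartition(hamming, num_candidates)[:num_candidates]
    # Stage 2: exact L2 distance, touching only the candidates' full vectors.
    exact = np.linalg.norm(full_vectors[candidates] - query, axis=1)
    order = np.argsort(exact)[:top_k]
    return candidates[order], exact[order]


# Query: a slightly perturbed copy of vector 42.
query = full_vectors[42] + 0.1 * rng.standard_normal(dim).astype(np.float32)
ids, dists = search(query)
print("top ids:", ids)
print("exact distances:", np.round(dists, 2))
```

Only `num_candidates` full-precision vectors are read per query, which is the reason the expensive storage tier sees so little traffic in disk mode.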

How Volcano Milvus Uses DiskANN

Build: the Index Node loads raw data from object storage; DiskANN builds the index, writes the index files to local storage, and then uploads them back.

Load: the Query Node downloads the index files and loads into memory the data that the selected mode requires.

Search: Query Node executes the DiskANN search algorithm.

Volcano Milvus Public Beta Announcement

The Volcano Engine Milvus public beta launches on October 22, offering cloud-native, managed vector search for AI applications such as RAG, image retrieval, and recommendation, with multi-region coverage, built-in ops support, and dedicated technical onboarding.

Apply via the Volcano Engine website’s pre‑sales channel.
