Why No Single Algorithm Dominates Vector Search: A Deep Dive into Modern Vector DBs

The article surveys emerging vector databases, explains how various vector‑search algorithms such as FLAT, IVF, HNSW, DiskANN and ScaNN differ in accuracy, speed, memory use and build time, and provides practical guidance for choosing the right index based on data size, latency and resource constraints.

AI2ML AI to Machine Learning

New vector databases are proliferating. Their rapid growth is built on a variety of vector‑search algorithms, and the implementations differ in programming language and open‑source licensing.

Both on‑premises and cloud‑native deployments are now mainstream, and some solutions also offer fast, local search over small‑scale data.

Since the advent of vector search, enterprise retrieval has become a hybrid of keyword and vector queries, forming the core capability for Retrieval‑Augmented Generation (RAG).

Different data volumes and performance requirements force vector search to rely on diverse index designs and storage locations, constrained by the hardware speed hierarchy.

Core Algorithms

FLAT – Exhaustive linear scan of all vectors; 100% accurate, no index build cost, but query time grows linearly with data size, making it unsuitable for large datasets.
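A minimal NumPy sketch of a FLAT scan (the function name and toy data are illustrative, not any library's API):

```python
import numpy as np

def flat_search(xb, xq, k=5):
    """Exhaustive (FLAT) scan: compare the query against every base vector.

    Recall is 100% by construction; cost is O(N * d) per query, which is
    exactly why it stops scaling past small collections.
    """
    dists = np.linalg.norm(xb - xq, axis=1)  # L2 distance to all N vectors
    top = np.argsort(dists)[:k]              # indices of the k nearest
    return top, dists[top]

# toy usage: a slightly perturbed copy of vector 42 should find vector 42 first
rng = np.random.default_rng(0)
xb = rng.standard_normal((1000, 64)).astype(np.float32)
xq = xb[42] + 0.01 * rng.standard_normal(64).astype(np.float32)
top, top_d = flat_search(xb, xq)
```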

IVF FLAT – Uses K‑Means to partition the vector space into nlist Voronoi cells; at query time only the nearest nprobe cells are searched. This "partition pruning" speeds up search, while recall is controlled by nprobe.
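The two phases, training the cells and probing only a few of them, can be sketched in NumPy (function names are illustrative, not a real library's API):

```python
import numpy as np

def train_kmeans(x, nlist, iters=10, seed=0):
    """A few Lloyd iterations stand in for the k-means training step."""
    rng = np.random.default_rng(seed)
    cent = x[rng.choice(len(x), nlist, replace=False)].copy()
    for _ in range(iters):
        assign = np.argmin(((x[:, None] - cent[None]) ** 2).sum(-1), axis=1)
        for c in range(nlist):
            if (assign == c).any():
                cent[c] = x[assign == c].mean(0)
    return cent, assign

def ivf_search(x, cent, assign, xq, k=3, nprobe=2):
    """Partition pruning: scan only the nprobe cells nearest the query."""
    cells = np.argsort(((cent - xq) ** 2).sum(-1))[:nprobe]
    cand = np.where(np.isin(assign, cells))[0]   # ids in the probed cells
    d = ((x[cand] - xq) ** 2).sum(-1)
    return cand[np.argsort(d)[:k]]

# toy usage: probing all cells (nprobe == nlist) degenerates to a FLAT scan
rng = np.random.default_rng(1)
x = rng.standard_normal((500, 16)).astype(np.float32)
cent, assign = train_kmeans(x, nlist=8)
hits = ivf_search(x, cent, assign, x[7], k=3, nprobe=8)
```

Raising nprobe trades speed for recall; nprobe equal to nlist recovers exact search.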

IVF SQ8 – Builds on IVF by quantizing each dimension to an 8‑bit integer instead of a 32‑bit float, reducing memory by ~4× and accelerating computation, with a typical precision loss of <1%.
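The scalar‑quantization step can be sketched as follows; the IVF partitioning is unchanged, and all names here are illustrative:

```python
import numpy as np

def sq8_train(x):
    """Learn per-dimension min and range; that pair is the entire codebook."""
    lo = x.min(axis=0)
    span = np.maximum(x.max(axis=0) - lo, 1e-9)  # guard against zero range
    return lo, span

def sq8_encode(x, lo, span):
    # each float32 dimension becomes one uint8 in [0, 255]: 4x smaller
    return np.round((x - lo) / span * 255).astype(np.uint8)

def sq8_decode(codes, lo, span):
    return codes.astype(np.float32) / 255 * span + lo

rng = np.random.default_rng(0)
x = rng.standard_normal((1000, 64)).astype(np.float32)
lo, span = sq8_train(x)
codes = sq8_encode(x, lo, span)
max_err = np.abs(sq8_decode(codes, lo, span) - x).max()
```

The per‑dimension rounding error is bounded by half a quantization step, which is where the "<1% precision loss" figure comes from in practice.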

HNSW – Constructs a multi‑layer navigable small‑world graph: dense connections at the bottom layer, sparse "fast lanes" at upper layers. Queries start from the top entry point and greedily descend, achieving very high QPS and recall, but the graph consumes large memory and takes long to build.

Pros: extremely fast query speed and high recall; supports incremental updates.

Cons: high memory consumption (stores both raw vectors and graph); long index construction time.

Use case: latency‑critical real‑time search with sufficient memory budget.
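The full layered structure is involved, but the greedy descent at its core can be sketched on a single‑layer neighbor graph. This is illustrative only: real HNSW repeats the descent per layer and keeps a beam of ef candidates rather than a single point.

```python
import numpy as np

def greedy_search(x, neighbors, entry, xq):
    """Hop to whichever neighbor is closer to the query; stop at a local
    minimum. This is the navigation primitive HNSW runs on every layer."""
    cur = entry
    cur_d = np.sum((x[cur] - xq) ** 2)
    while True:
        nbr = neighbors[cur]
        d = np.sum((x[nbr] - xq) ** 2, axis=1)
        best = int(np.argmin(d))
        if d[best] >= cur_d:
            return cur                      # no neighbor is closer
        cur, cur_d = nbr[best], d[best]

# toy usage: 100 points on a line, each linked to its 4 nearest neighbors
x = np.linspace(0, 1, 100)[:, None]
order = np.argsort(np.abs(x[:, 0][None] - x[:, 0][:, None]), axis=1)
neighbors = order[:, 1:5]
found = greedy_search(x, neighbors, entry=0, xq=np.array([0.42]))
```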

DiskANN – Proposed by Microsoft Research. It stores a large graph index (Vamana) on SSD instead of RAM, keeps only compressed vectors (Product Quantization) in memory for candidate filtering, then re‑ranks with exact vectors read from disk. This breaks the memory bottleneck, enabling single‑machine handling of billions of vectors.

Pros: very high cost‑effectiveness; can process 10⁹‑scale data on a single machine with minimal RAM.

Cons: heavily dependent on SSD IOPS; search latency higher than HNSW.

Use case: massive (hundreds of millions to billions) datasets where low cost is a priority.
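The two‑stage flow can be sketched in NumPy. For brevity, simple 8‑bit scalar codes stand in for the Product Quantization codes, and the exact‑vector array plays the role of the SSD‑resident data; all names are illustrative.

```python
import numpy as np

def two_stage_search(x_exact, x_codes, q_codes, xq, shortlist=20, k=5):
    """DiskANN-style flow: filter candidates with the small in-memory codes,
    then re-rank the shortlist with exact vectors (SSD reads in real DiskANN)."""
    d_approx = np.sum((x_codes - q_codes) ** 2, axis=1)       # cheap pass
    cand = np.argsort(d_approx)[:shortlist]
    d_exact = np.sum((x_exact[cand] - xq) ** 2, axis=1)       # "disk" pass
    return cand[np.argsort(d_exact)[:k]]

rng = np.random.default_rng(0)
x = rng.standard_normal((2000, 32)).astype(np.float32)
lo, span = x.min(axis=0), x.max(axis=0) - x.min(axis=0)
codes = np.round((x - lo) / span * 255)   # 8-bit-style stand-in for PQ codes
xq = x[5]
q_codes = np.round((xq - lo) / span * 255)
hits = two_stage_search(x, codes, q_codes, xq)
```

The shortlist size controls how many "disk reads" the re‑ranking pass pays for, mirroring the IOPS dependence noted above.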

ScaNN – Developed by Google, it combines Anisotropic Vector Quantization (AVQ), IVF partitioning, and fine‑grained re‑ranking. Its key innovation is optimising the quantisation for inner‑product similarity rather than for uniform reconstruction error, which has earned it top rankings on ANN‑Benchmarks.

Pros: higher QPS than HNSW at the same recall level.

Cons: implementation complexity; primarily optimised for inner‑product searches.

Use case: workloads demanding extreme throughput, especially within the Google ecosystem.

Algorithm Selection Guidelines

Data size < 100 k, high precision → FLAT

Data size < 1 M, high precision → IVF FLAT

Data size < 1 M, memory‑sensitive → IVF SQ8 or ScaNN

Data size < 50 M, speed‑first → HNSW

Data size > 100 M, memory insufficient → DiskANN

Both QPS and recall must be extremely high → ScaNN
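The guidelines above can be folded into a small helper; the thresholds are the article's rules of thumb, not hard limits, and the function is purely illustrative:

```python
def pick_index(n_vectors, memory_tight=False, speed_first=False,
               extreme_qps=False):
    """Maps the guideline table to an index name."""
    if extreme_qps:
        return "ScaNN"                      # QPS and recall both extreme
    if n_vectors < 100_000:
        return "FLAT"                       # small data, exact search
    if n_vectors < 1_000_000:
        return "IVF_SQ8 or ScaNN" if memory_tight else "IVF_FLAT"
    if n_vectors < 50_000_000 and speed_first:
        return "HNSW"                       # latency-critical, RAM to spare
    return "DiskANN"                        # beyond RAM: go to SSD
```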

Performance Trade‑offs (relative ordering)

Recall (high to low): FLAT (100%) > HNSW ≈ IVF FLAT > DiskANN ≈ ScaNN > IVF SQ8

Query speed (QPS, fast to slow): HNSW > ScaNN > IVF SQ8 > IVF FLAT ≈ DiskANN > FLAT

Memory efficiency (high to low): DiskANN >> IVF SQ8 > ScaNN > IVF FLAT ≈ FLAT > HNSW (HNSW stores both the raw vectors and the graph, so it is the hungriest)

Build time (fast to slow): FLAT >> IVF FLAT > IVF SQ8 > HNSW > ScaNN ≈ DiskANN

Additional Notes

Milvus lists various improved algorithms; for example, AISAQ extends DiskANN with a disk‑only index, handling billion‑scale datasets without exceeding RAM limits. Unlike the original DiskANN that keeps compressed vectors in memory, AISAQ stores everything on disk and offers two modes to balance performance and storage cost.

The core algorithms discussed here are largely driven by research from Google and Microsoft; the article closes by encouraging more home‑grown contributions to advance vector‑search technology.


Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Vector Database · Vector Search · HNSW · approximate nearest neighbor · ScaNN · DiskANN
Written by

AI2ML AI to Machine Learning

Original articles on artificial intelligence and machine learning, deeply optimized. Less is more, life is simple! — Shi Chunqi
