Why No Single Algorithm Dominates Vector Search: A Deep Dive into Modern Vector DBs
The article surveys emerging vector databases, explains how various vector‑search algorithms such as FLAT, IVF, HNSW, DiskANN and ScaNN differ in accuracy, speed, memory use and build time, and provides practical guidance for choosing the right index based on data size, latency and resource constraints.
New vector databases are proliferating, and their rapid growth rests on a shared family of vector‑search algorithms; implementations differ mainly in programming language and open‑source licensing.
Both on‑premises and cloud‑native deployments are mainstream, with some solutions offering high‑speed local small‑scale vector search.
Since the advent of vector search, enterprise retrieval has become a hybrid of keyword and vector queries, forming the core capability for Retrieval‑Augmented Generation (RAG).
Different data volumes and performance requirements push vector search toward different index designs and storage tiers, all constrained by the hardware speed hierarchy: RAM is fast but scarce and expensive, while SSDs are cheap and plentiful but slower.
Core Algorithms
FLAT – Exhaustive linear scan of all vectors; 100% accurate, no index build cost, but query time grows linearly with data size, making it unsuitable for large datasets.
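To make the cost model concrete, here is a minimal brute‑force search in NumPy (data shapes are illustrative, not from the article):

```python
import numpy as np

# 10,000 database vectors and one query, 128 dimensions each.
rng = np.random.default_rng(0)
vectors = rng.standard_normal((10_000, 128), dtype=np.float32)
query = rng.standard_normal(128, dtype=np.float32)

# FLAT = compute the exact distance to every vector, then sort.
# Work is O(n * d) per query, which is why it breaks down at scale.
dists = ((vectors - query) ** 2).sum(axis=1)
top10 = np.argsort(dists)[:10]
print(top10, dists[top10])
```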
IVF FLAT – Uses K‑Means to partition the vector space into nlist Voronoi cells; at query time only the nearest nprobe cells are searched. This "partition pruning" speeds up search, while recall is controlled by nprobe.
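A minimal sketch of this index with the Faiss library (the nlist/nprobe values are illustrative):

```python
import faiss
import numpy as np

d, nlist = 128, 1024
xb = np.random.default_rng(0).standard_normal((100_000, d), dtype=np.float32)

quantizer = faiss.IndexFlatL2(d)                 # holds the cell centroids
index = faiss.IndexIVFFlat(quantizer, d, nlist)
index.train(xb)                                  # k-means partitions the space
index.add(xb)

index.nprobe = 16                                # probe 16 of the 1024 cells
D, I = index.search(xb[:5], 10)                  # distances and neighbor ids
```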
IVF SQ8 – Builds on IVF by quantizing each dimension to an 8‑bit integer instead of a 32‑bit float, reducing memory by ~4× and accelerating computation, with a typical precision loss of <1%.
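The corresponding Faiss sketch differs from IVF FLAT only in the storage codec, assuming the same illustrative parameters:

```python
import faiss
import numpy as np

d, nlist = 128, 1024
xb = np.random.default_rng(0).standard_normal((100_000, d), dtype=np.float32)

quantizer = faiss.IndexFlatL2(d)
index = faiss.IndexIVFScalarQuantizer(
    quantizer, d, nlist, faiss.ScalarQuantizer.QT_8bit, faiss.METRIC_L2)
index.train(xb)    # learns the cells plus per-dimension quantization ranges
index.add(xb)      # each dimension stored as 1 byte instead of 4

index.nprobe = 16
D, I = index.search(xb[:5], 10)
```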
HNSW – Constructs a multi‑layer navigable small‑world graph: dense connections at the bottom layer, sparse "fast lanes" at the upper layers. Queries start from a top‑layer entry point and greedily descend, achieving very high QPS and recall, but the graph consumes a lot of memory and takes a long time to build (a Faiss sketch follows the pros and cons below).
Pros: extremely fast query speed and high recall; supports incremental updates.
Cons: high memory consumption (stores both raw vectors and graph); long index construction time.
Use case: latency‑critical real‑time search with sufficient memory budget.
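The Faiss sketch mentioned above; M, efConstruction and efSearch are the usual tuning knobs, with illustrative values:

```python
import faiss
import numpy as np

d = 128
xb = np.random.default_rng(0).standard_normal((100_000, d), dtype=np.float32)

index = faiss.IndexHNSWFlat(d, 32)     # M = 32 links per node (memory cost)
index.hnsw.efConstruction = 200        # higher = better graph, slower build
index.add(xb)                          # no train() step; supports incremental adds

index.hnsw.efSearch = 64               # higher = better recall, lower QPS
D, I = index.search(xb[:5], 10)
```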
DiskANN – Proposed by Microsoft Research. It stores a large graph index (Vamana) on SSD instead of in RAM, keeps only compressed vectors (Product Quantization) in memory for candidate filtering, then re‑ranks with exact vectors read from disk. This breaks the memory bottleneck, enabling a single machine to handle billions of vectors (a simplified sketch of the two‑stage idea follows below).
Pros: very high cost‑effectiveness; can process 10⁹‑scale data on a single machine with minimal RAM.
Cons: heavily dependent on SSD IOPS; search latency higher than HNSW.
Use case: massive (hundreds of millions to billions) datasets where low cost is a priority.
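The following is not the real Vamana/DiskANN implementation, only a simplified NumPy sketch of its two‑stage memory layout: crude 8‑bit codes stand in for the PQ codes kept in RAM, and a memory‑mapped file stands in for the SSD‑resident exact vectors.

```python
import numpy as np

d, n = 128, 100_000
rng = np.random.default_rng(0)
xb = rng.standard_normal((n, d), dtype=np.float32)

np.save("vectors.npy", xb)                        # full vectors live on disk
on_disk = np.load("vectors.npy", mmap_mode="r")   # memory-mapped, not resident

# Stand-in for Product Quantization: 8-bit scalar codes kept in RAM.
lo, hi = float(xb.min()), float(xb.max())
scale = (hi - lo) / 255.0
codes = np.round((xb - lo) / scale).astype(np.uint8)

def search(query, k=10, refine=200):
    # Stage 1 (RAM): approximate distances on the compressed codes.
    # (Real DiskANN walks the Vamana graph instead of scanning all codes.)
    q_code = (query - lo) / scale
    approx = ((codes.astype(np.float32) - q_code) ** 2).sum(axis=1)
    cand = np.argpartition(approx, refine)[:refine]
    # Stage 2 (disk): exact distances for the few candidates only.
    exact = ((on_disk[cand] - query) ** 2).sum(axis=1)
    return cand[np.argsort(exact)[:k]]

print(search(rng.standard_normal(d, dtype=np.float32)))
```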
ScaNN – Developed by Google, combines Anisotropic Vector Quantization (AVQ), IVF‑style partitioning, and fine‑grained re‑ranking. Its key innovation is to optimise quantisation for inner‑product similarity rather than for uniform reconstruction error, yielding top rankings on ANN‑Benchmarks (a usage sketch follows below).
Pros: higher QPS than HNSW at the same recall level.
Cons: implementation complexity; primarily optimised for inner‑product searches.
Use case: workloads demanding extreme throughput, especially within the Google ecosystem.
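A usage sketch following the ScaNN README, where the tree/score_ah/reorder stages mirror the partition + AVQ + re‑rank pipeline described above (all parameter values are illustrative):

```python
import numpy as np
import scann

rng = np.random.default_rng(0)
dataset = rng.standard_normal((100_000, 128), dtype=np.float32)
dataset /= np.linalg.norm(dataset, axis=1, keepdims=True)  # normalize for dot product

searcher = (
    scann.scann_ops_pybind.builder(dataset, 10, "dot_product")
    .tree(num_leaves=1000, num_leaves_to_search=100, training_sample_size=50_000)
    .score_ah(2, anisotropic_quantization_threshold=0.2)  # anisotropic quantization
    .reorder(100)                                         # exact re-rank of top 100
    .build()
)
neighbors, distances = searcher.search(dataset[0])
```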
Algorithm Selection Guidelines
Data size < 100 k, high precision → FLAT
Data size < 1 M, high precision → IVF FLAT
Data size < 1 M, memory‑sensitive → IVF SQ8 or ScaNN
Data size < 50 M, speed‑first → HNSW
Data size > 100 M, memory insufficient → DiskANN
Both QPS and recall must be extremely high → ScaNN
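These rules can be transcribed into a simple helper; the thresholds below are the article's heuristics, not hard limits, and the fallback branch is an assumption for sizes the table does not cover:

```python
def choose_index(n_vectors: int, *, high_precision: bool = False,
                 memory_sensitive: bool = False, speed_first: bool = False,
                 extreme_qps: bool = False) -> str:
    """Transcribes the selection table above; thresholds are indicative."""
    if extreme_qps:                         # QPS and recall both extreme
        return "SCANN"
    if n_vectors > 100_000_000:             # index will not fit in RAM
        return "DISKANN"
    if n_vectors < 100_000 and high_precision:
        return "FLAT"                       # exhaustive search still affordable
    if n_vectors < 1_000_000:
        return "IVF_SQ8 or SCANN" if memory_sensitive else "IVF_FLAT"
    if n_vectors < 50_000_000 and speed_first:
        return "HNSW"
    return "HNSW"  # assumed fallback for the cases the table leaves open

print(choose_index(5_000_000, speed_first=True))  # -> HNSW
```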
Performance Trade‑offs (relative ordering)
Recall (high to low): FLAT (100%) > HNSW ≈ IVF FLAT > DiskANN ≈ ScaNN > IVF SQ8
Query speed (QPS, fast to slow): HNSW ≈ ScaNN > IVF SQ8 > IVF FLAT ≈ DiskANN > FLAT (as noted above, ScaNN can edge out HNSW at equal recall)
Memory efficiency (best to worst): DiskANN >> IVF SQ8 > ScaNN > FLAT ≈ IVF FLAT >> HNSW (which stores both the raw vectors and the graph)
Build time (fast to slow): FLAT >> IVF FLAT > IVF SQ8 > HNSW > ScaNN ≈ DiskANN
Additional Notes
The Milvus documentation lists various improved algorithms; for example, AiSAQ (All‑in‑Storage ANNS with Product Quantization) extends DiskANN with a fully disk‑resident index, handling billion‑scale datasets without exceeding RAM limits. Unlike the original DiskANN, which keeps the compressed vectors in memory, AiSAQ stores everything on disk and offers two modes to balance performance and storage cost.
The core algorithms discussed here were largely driven by research from Google and Microsoft; the author encourages more contributions from the domestic (Chinese) research community to advance vector‑search technology.
References
https://journals.sagepub.com/doi/abs/10.1177/18724981251388888
https://arxiv.org/pdf/2402.01763
https://arxiv.org/pdf/2507.21939v1
https://thedataquarry.com/blog/vector-db-3/
https://zilliz.com.cn/blog/How-to-choose-the-most-suitable-vector-index-in-RAG-construction
https://arxiv.org/pdf/2601.07048
https://milvus.io/docs/zh/scann.md
