Understanding Vector Similarity Search: Flat Index, IVF, and HNSW
This article explains why vector databases are needed for semantic search over unstructured data and gives a detailed, step-by-step comparison of three core vector index structures (Flat Index, IVF, and HNSW) together with the cosine similarity metric, highlighting their trade-offs in accuracy and speed.
To motivate vector databases, the article first contrasts relational databases, which handle structured data well, with unstructured data such as emails, social posts, and audio, which instead calls for semantic similarity search.
Embeddings
Machine‑learning models convert unstructured content into dense numeric vectors (embeddings) that capture semantics.
Vector Similarity Search
The article covers the cosine similarity metric and three common similarity-search index structures: Flat Index, Inverted File Index (IVF), and Hierarchical Navigable Small World (HNSW). It explains that cosine similarity ranges from -1 to 1 and works through a one-hot example with two sentences, showing the dot product, the magnitudes, and the resulting similarity score.
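As a concrete sketch of the metric, here is a minimal pure-Python cosine similarity applied to two hypothetical one-hot sentence vectors (the vocabulary and vector values are illustrative, not the original article's example):

```python
import math

def cosine_similarity(a, b):
    # cos(a, b) = dot(a, b) / (|a| * |b|); ranges from -1 to 1
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical one-hot vectors for two short sentences over a shared vocabulary
v1 = [1, 1, 1, 0, 0]
v2 = [1, 1, 0, 1, 1]
print(round(cosine_similarity(v1, v2), 4))  # dot = 2, |v1| = sqrt(3), |v2| = 2
```

The two sentences share two vocabulary terms, so the score lands between 0 (no overlap) and 1 (identical direction).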
Flat Index (Brute‑Force Search)
Flat Index compares the query vector with every vector in the dataset, computes cosine similarity, sorts results, and returns the top‑K. The process is enumerated in a step‑by‑step list:
Take the query vector.
Initialize an empty result list.
Iterate over each vector in the dataset.
Compute cosine similarity between the query and the current vector.
Store (vector, similarity score) in the result list.
Sort results by similarity descending.
Return the top‑K most similar vectors.
While it guarantees 100 % recall, it is computationally expensive for large collections (e.g., 1 million vectors).
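The enumerated procedure above can be sketched in a few lines of Python; the dataset and query below are made-up toy values:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def flat_search(query, dataset, k):
    # Brute force: score every vector, sort descending, return the top-K
    results = [(i, cosine(query, v)) for i, v in enumerate(dataset)]
    results.sort(key=lambda pair: pair[1], reverse=True)
    return results[:k]

dataset = [[0.9, 0.1], [0.5, 0.5], [0.1, 0.9], [1.0, 0.0]]
print(flat_search([1.0, 0.0], dataset, k=2))
```

Because every vector is scored, recall is exact; the cost is one similarity computation per stored vector per query, which is what makes this approach prohibitive at the million-vector scale mentioned above.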
Inverted File Index (IVF)
IVF partitions the vector space into clusters using k‑means (e.g., 1 M vectors → 100 clusters). The workflow has four stages:
Clustering: apply k‑means to create 100 centroids.
Build inverted lists: each centroid C_i stores a list of vectors belonging to that cluster (e.g., C1 → [v1, v2, …]).
Locate nearest clusters: compute distance from the query Q to all centroids and select the closest nprobe clusters (e.g., nprobe = 5).
Search within selected clusters: only the vectors in those clusters (≈5 × 10 000 = 50 000 vectors) are examined, reducing the search space by a factor of 20.
This yields a controllable trade-off: a small loss in recall in exchange for an order-of-magnitude speed improvement.
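The four-stage workflow can be sketched as follows, assuming the centroids have already been produced by k-means (here they are hard-coded stand-ins) and using Euclidean distance for brevity:

```python
import math
import random

def build_ivf(vectors, centroids):
    # Stage 2: assign every vector to its nearest centroid -> inverted lists
    lists = {c: [] for c in range(len(centroids))}
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: math.dist(v, centroids[c]))
        lists[nearest].append(i)
    return lists

def ivf_search(query, vectors, centroids, lists, k, nprobe):
    # Stage 3: rank centroids by distance to the query, keep the nprobe closest
    order = sorted(range(len(centroids)), key=lambda c: math.dist(query, centroids[c]))
    # Stage 4: exhaustively score only the vectors inside those clusters
    candidates = [i for c in order[:nprobe] for i in lists[c]]
    candidates.sort(key=lambda i: math.dist(query, vectors[i]))
    return candidates[:k]

random.seed(1)
vectors = [[random.random(), random.random()] for _ in range(200)]
# Stand-in for k-means output (stage 1); a real index would learn these centroids
centroids = [[0.25, 0.25], [0.75, 0.25], [0.25, 0.75], [0.75, 0.75]]
lists = build_ivf(vectors, centroids)
print(ivf_search([0.2, 0.2], vectors, centroids, lists, k=3, nprobe=1))
```

With `nprobe=1` only about a quarter of this toy dataset is scored; raising `nprobe` trades speed back for recall, which is exactly the knob the article describes.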
Skip List
A skip list is a probabilistic data structure offering O(log n) search, insertion, and deletion by adding multiple shortcut layers on top of an ordered linked list.
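A minimal skip-list sketch illustrating the layered-shortcut idea (level assignment is probabilistic with p = 0.5; the cap of 8 layers is an arbitrary choice for this example):

```python
import random

class SkipNode:
    def __init__(self, value, level):
        self.value = value
        self.forward = [None] * level  # one forward pointer per layer

class SkipList:
    MAX_LEVEL = 8

    def __init__(self):
        self.head = SkipNode(float('-inf'), self.MAX_LEVEL)
        self.level = 1

    def _random_level(self):
        # Coin flips: each node reaches layer i+1 with probability 0.5
        lvl = 1
        while random.random() < 0.5 and lvl < self.MAX_LEVEL:
            lvl += 1
        return lvl

    def insert(self, value):
        update = [self.head] * self.MAX_LEVEL
        node = self.head
        for i in range(self.level - 1, -1, -1):   # descend layer by layer
            while node.forward[i] and node.forward[i].value < value:
                node = node.forward[i]
            update[i] = node
        lvl = self._random_level()
        self.level = max(self.level, lvl)
        new = SkipNode(value, lvl)
        for i in range(lvl):                      # splice into each of its layers
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def contains(self, value):
        node = self.head
        for i in range(self.level - 1, -1, -1):   # upper-layer shortcuts skip runs of nodes
            while node.forward[i] and node.forward[i].value < value:
                node = node.forward[i]
        node = node.forward[0]
        return node is not None and node.value == value

sl = SkipList()
for v in [3, 7, 1, 9, 5]:
    sl.insert(v)
print(sl.contains(7), sl.contains(4))  # True False
```

The sparse upper layers let a search jump past long runs of the bottom list before dropping down, which is the same coarse-to-fine intuition HNSW borrows.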
Navigable Small World (NSW)
NSW is a graph‑based search method: starting from an entry node, each step moves to a neighbor that is closer to the query, using local shortcuts to quickly approach the target. However, it can suffer from the “early‑stop” problem where the search gets trapped in a local optimum.
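The greedy traversal might be sketched as follows on a hand-built toy graph (the graph, points, and entry node are all illustrative):

```python
import math

def greedy_search(graph, vectors, entry, query):
    # Greedy NSW traversal: hop to whichever neighbor is closer to the query.
    # Terminates at a local optimum, which is the "early stop" problem noted above.
    current = entry
    while True:
        best = min(graph[current], key=lambda n: math.dist(vectors[n], query))
        if math.dist(vectors[best], query) >= math.dist(vectors[current], query):
            return current  # no neighbor improves: local optimum reached
        current = best

# Toy graph over 1-D points; edges stand in for local-neighborhood shortcuts
vectors = {0: [0.0], 1: [2.0], 2: [4.0], 3: [6.0], 4: [8.0]}
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3]}
print(greedy_search(graph, vectors, entry=0, query=[7.5]))  # hops 0 -> 2 -> 3 -> 4
```

On this graph the walk happens to reach the true nearest neighbor; with a less favorable edge set the same loop could stop at a node none of whose neighbors improve, even though a better answer exists elsewhere.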
Hierarchical Navigable Small World (HNSW)
HNSW combines the layered shortcut idea of a skip list with NSW’s greedy traversal. The top layer is sparse for coarse positioning; lower layers are dense for fine‑grained search.
Start from the top‑layer entry point.
Examine all neighbors of the current node.
Move to the neighbor closest to the query.
Repeat until no neighbor is closer (local minimum).
Descend one layer, using the current node as the new start.
Repeat steps 2-5 until reaching layer 0.
At layer 0, expand the search with the ef (search width) parameter to collect candidates.
Return the top‑K results from the candidate set.
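The layered procedure above can be sketched in Python; the two-layer toy index, the beam-search form of the `ef` expansion, and the 1-D points are illustrative assumptions, not the original's data:

```python
import heapq
import math

def search_layer(graph, vectors, entry, query, ef):
    # Beam search within one layer, keeping at most ef candidates (steps 7-8)
    visited = {entry}
    d0 = math.dist(vectors[entry], query)
    candidates = [(d0, entry)]   # min-heap: closest unexplored node first
    best = [(-d0, entry)]        # max-heap of the ef best nodes found so far
    while candidates:
        d, node = heapq.heappop(candidates)
        if d > -best[0][0]:
            break                # nothing left that could improve the result set
        for nb in graph.get(node, []):
            if nb in visited:
                continue
            visited.add(nb)
            dn = math.dist(vectors[nb], query)
            if len(best) < ef or dn < -best[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(best, (-dn, nb))
                if len(best) > ef:
                    heapq.heappop(best)
    return sorted((-d, n) for d, n in best)

def hnsw_search(layers, vectors, entry, query, ef, k):
    # Greedy (ef = 1) descent through the sparse upper layers (steps 1-6) ...
    for graph in layers[:-1]:
        entry = search_layer(graph, vectors, entry, query, ef=1)[0][1]
    # ... then a wide ef search at the dense bottom layer (steps 7-8)
    hits = search_layer(layers[-1], vectors, entry, query, ef=ef)
    return [n for _, n in hits[:k]]

# Hand-built 2-layer toy index over 1-D points; the last graph is dense layer 0
vectors = {i: [float(i)] for i in range(8)}
layers = [
    {0: [4], 4: [0, 7], 7: [4]},                                        # sparse top
    {i: [j for j in (i - 1, i + 1) if 0 <= j <= 7] for i in range(8)},  # dense layer 0
]
print(hnsw_search(layers, vectors, entry=0, query=[5.2], ef=4, k=2))
```

The top layer teleports the search from node 0 to the neighborhood of the query in one hop, and the bottom-layer beam search then ranks the nearby candidates, mirroring the coarse-then-fine description above.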
This hierarchical process first performs a coarse search with few nodes, then refines the region layer by layer, and finally conducts a dense search at the bottom. The high‑level jumps avoid NSW’s local‑optimum trap, while the dense bottom layer ensures result quality.
Summary
Flat Index provides exact recall but scales linearly with data size, making it suitable only for small datasets or scenarios demanding 100 % recall. IVF offers a configurable balance: clustering reduces the search space dramatically, trading a modest recall loss for orders‑of‑magnitude speed gains. HNSW follows a different path by building a multi‑layer graph; its sparse upper layers enable rapid coarse localization, and its dense lower layers deliver high‑quality results, satisfying both low‑latency and high‑recall requirements.
In practice, Flat Index is appropriate for collections under ~10 k vectors; IVF is cost‑effective for million‑scale data; and HNSW is the preferred choice for latency‑critical online services. Major vector databases such as Milvus, Qdrant, and Weaviate support all three index types, allowing users to switch based on workload needs.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
DeepHub IMBA