Understanding Vector Similarity Search: Flat Index, IVF, and HNSW

This article explains why vector databases are needed for semantic search over unstructured data and provides a detailed, step‑by‑step look at the cosine similarity metric and three core vector index structures—Flat Index, IVF, and HNSW—highlighting their trade‑offs in accuracy and speed.


To explain why vector databases are needed, the article first contrasts relational databases, which handle structured data well, with unstructured data such as emails, social posts, and audio, which calls for semantic similarity search.

Embeddings

Machine‑learning models convert unstructured content into dense numeric vectors (embeddings) that capture semantics.
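As a concrete illustration (not from the original article), the sketch below assumes the open‑source sentence-transformers package and the all-MiniLM-L6-v2 model to turn sentences into dense vectors:

```python
# Minimal sketch: turning text into embeddings.
# Assumes the sentence-transformers package and the all-MiniLM-L6-v2 model;
# the article itself does not name a specific library or model.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "The invoice is attached to this email.",
    "Please find the bill in the attachment.",
]

# Each sentence becomes a 384-dimensional dense vector capturing its meaning.
embeddings = model.encode(sentences, normalize_embeddings=True)
print(embeddings.shape)  # (2, 384)
```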

Vector Similarity Search

The article covers the cosine similarity metric and three common similarity‑search index structures: Flat Index, Inverted File Index (IVF), and Hierarchical Navigable Small World (HNSW). It explains the range of cosine similarity and walks through a one‑hot example with two sentences, showing the dot product, vector magnitudes, and resulting similarity score.
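A minimal sketch of that computation (the sentences and one‑hot vocabulary here are illustrative, not the article's exact example):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity = dot(a, b) / (|a| * |b|); ranges from -1 to 1."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# One-hot style bag-of-words vectors over the vocabulary
# ["the", "cat", "sat", "dog", "ran"]
sentence_a = np.array([1, 1, 1, 0, 0])  # "the cat sat"
sentence_b = np.array([1, 0, 0, 1, 1])  # "the dog ran"

print(cosine_similarity(sentence_a, sentence_b))  # dot = 1, |a| = |b| = sqrt(3) -> about 0.33
```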

Flat Index (Brute‑Force Search)

Flat Index compares the query vector with every vector in the dataset, computes cosine similarity, sorts the results, and returns the top‑K. The process is enumerated in a step‑by‑step list (a code sketch follows at the end of this section):

Take the query vector.

Initialize an empty result list.

Iterate over each vector in the dataset.

Compute cosine similarity between the query and the current vector.

Store (vector, similarity score) in the result list.

Sort results by similarity descending.

Return the top‑K most similar vectors.

While it guarantees 100 % recall, it is computationally expensive for large collections (e.g., 1 million vectors).
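A minimal NumPy sketch of this brute‑force loop (function and variable names are mine, not the article's):

```python
import numpy as np

def flat_search(query: np.ndarray, dataset: np.ndarray, k: int = 5):
    """Brute-force top-k search: score every vector, sort, return the best k."""
    results = []
    for idx, vec in enumerate(dataset):
        score = np.dot(query, vec) / (np.linalg.norm(query) * np.linalg.norm(vec))
        results.append((idx, score))
    results.sort(key=lambda pair: pair[1], reverse=True)  # similarity descending
    return results[:k]

# Usage: 10,000 random 128-dimensional vectors stand in for real embeddings.
rng = np.random.default_rng(0)
dataset = rng.normal(size=(10_000, 128))
query = rng.normal(size=128)
print(flat_search(query, dataset, k=3))
```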

Inverted File Index (IVF)

IVF partitions the vector space into clusters using k‑means (e.g., 1 M vectors → 100 clusters). The workflow has four stages (a code sketch follows at the end of this section):

Clustering: apply k‑means to create 100 centroids.

Build inverted lists: each centroid C_i stores a list of vectors belonging to that cluster (e.g., C1 → [v1, v2, …]).

Locate nearest clusters: compute distance from the query Q to all centroids and select the closest nprobe clusters (e.g., nprobe = 5).

Search within selected clusters: only the vectors in those clusters (≈5 × 10 000 = 50 000 vectors) are examined, reducing the search space by a factor of 20.

This yields a controllable trade‑off: a small loss in recall in exchange for an order‑of‑magnitude speed improvement.
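A self‑contained NumPy sketch of the four stages, using toy sizes, a simplified k‑means, and names I have invented; a production system would typically rely on a library such as FAISS instead:

```python
import numpy as np

rng = np.random.default_rng(0)
dataset = rng.normal(size=(50_000, 64)).astype(np.float32)
n_clusters, nprobe, k = 100, 5, 10

def pairwise_sq_dists(x, c):
    """Squared Euclidean distances between rows of x and rows of c."""
    return (x ** 2).sum(axis=1, keepdims=True) - 2.0 * x @ c.T + (c ** 2).sum(axis=1)

# 1) Clustering: a few Lloyd iterations of k-means to obtain centroids.
centroids = dataset[rng.choice(len(dataset), n_clusters, replace=False)]
for _ in range(10):
    assignments = pairwise_sq_dists(dataset, centroids).argmin(axis=1)
    for c in range(n_clusters):
        members = dataset[assignments == c]
        if len(members):
            centroids[c] = members.mean(axis=0)

# 2) Build inverted lists: centroid id -> indices of the vectors in that cluster.
inverted_lists = {c: np.where(assignments == c)[0] for c in range(n_clusters)}

# 3) Locate the nprobe centroids closest to the query.
query = rng.normal(size=64).astype(np.float32)
nearest_clusters = pairwise_sq_dists(query[None, :], centroids)[0].argsort()[:nprobe]

# 4) Brute-force search only inside those clusters (~nprobe/n_clusters of the data).
candidate_ids = np.concatenate([inverted_lists[c] for c in nearest_clusters])
candidates = dataset[candidate_ids]
sims = candidates @ query / (np.linalg.norm(candidates, axis=1) * np.linalg.norm(query))
top_k = candidate_ids[np.argsort(-sims)[:k]]
print(top_k)
```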

Skip List

A skip list is a probabilistic data structure offering O(log n) search, insertion, and deletion by adding multiple shortcut layers on top of an ordered linked list.
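A minimal skip‑list sketch (class names and the small fixed MAX_LEVEL are my own choices) showing how the extra shortcut layers let a search skip ahead before dropping to the base list:

```python
import random

class Node:
    def __init__(self, value, level):
        self.value = value
        self.forward = [None] * (level + 1)   # one "next" pointer per layer

class SkipList:
    MAX_LEVEL = 4                             # small fixed height for illustration

    def __init__(self):
        self.head = Node(float("-inf"), self.MAX_LEVEL)
        self.level = 0                        # highest layer currently in use

    def _random_level(self):
        # Coin flips: each additional shortcut layer is used with probability 1/2.
        lvl = 0
        while random.random() < 0.5 and lvl < self.MAX_LEVEL:
            lvl += 1
        return lvl

    def insert(self, value):
        update = [self.head] * (self.MAX_LEVEL + 1)
        node = self.head
        for i in range(self.level, -1, -1):
            while node.forward[i] and node.forward[i].value < value:
                node = node.forward[i]
            update[i] = node                  # last node before the new value on layer i
        lvl = self._random_level()
        self.level = max(self.level, lvl)
        new = Node(value, lvl)
        for i in range(lvl + 1):
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def search(self, value):
        node = self.head
        for i in range(self.level, -1, -1):   # start in the sparsest shortcut layer
            while node.forward[i] and node.forward[i].value < value:
                node = node.forward[i]
        node = node.forward[0]                # drop to the base linked list
        return node is not None and node.value == value

sl = SkipList()
for v in [3, 7, 9, 12, 19, 25]:
    sl.insert(v)
print(sl.search(12), sl.search(13))  # True False
```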

Navigable Small World (NSW)

NSW is a graph‑based search method: starting from an entry node, each step moves to a neighbor that is closer to the query, using local shortcuts to quickly approach the target. However, it can suffer from the “early‑stop” problem where the search gets trapped in a local optimum.
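The greedy traversal can be sketched in a few lines; the adjacency dictionary below is a toy graph of my own, whereas a real NSW builds its long‑range and short‑range links during insertion:

```python
import numpy as np

def greedy_search(graph: dict, vectors: np.ndarray, entry: int, query: np.ndarray) -> int:
    """Move to whichever neighbor is closer to the query; stop at a local minimum."""
    current = entry
    current_dist = np.linalg.norm(vectors[current] - query)
    while True:
        best, best_dist = current, current_dist
        for neighbor in graph[current]:
            d = np.linalg.norm(vectors[neighbor] - query)
            if d < best_dist:
                best, best_dist = neighbor, d
        if best == current:      # no neighbor is closer: local minimum
            return current       # this is where the "early-stop" trap can occur
        current, current_dist = best, best_dist

# Toy example: 5 points on a line, each linked only to its immediate neighbors.
vectors = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(greedy_search(graph, vectors, entry=0, query=np.array([3.2])))  # -> 3
```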

Hierarchical Navigable Small World (HNSW)

HNSW combines the layered shortcut idea of a skip list with NSW’s greedy traversal. The top layer is sparse for coarse positioning; lower layers are dense for fine‑grained search.

1. Start from the top‑layer entry point.

2. Examine all neighbors of the current node.

3. Move to the neighbor closest to the query.

4. Repeat until no neighbor is closer (local minimum).

5. Descend one layer, using the current node as the new start.

6. Repeat steps 2–5 until reaching layer 0.

7. At layer 0, expand the search with the ef (search‑width) parameter to collect candidates.

8. Return the top‑K results from the candidate set.

This hierarchical process first performs a coarse search with few nodes, then refines the region layer by layer, and finally conducts a dense search at the bottom. The high‑level jumps avoid NSW’s local‑optimum trap, while the dense bottom layer ensures result quality.
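In practice most people use a ready‑made implementation. The sketch below assumes the hnswlib package (the article does not name a library) and shows where the M, ef_construction, and query‑time ef parameters fit in:

```python
import numpy as np
import hnswlib

dim, num_elements = 128, 100_000
rng = np.random.default_rng(0)
data = rng.normal(size=(num_elements, dim)).astype(np.float32)

# Build the multi-layer graph index.
# M: number of links per node; ef_construction: candidate-list size while building.
index = hnswlib.Index(space="cosine", dim=dim)
index.init_index(max_elements=num_elements, M=16, ef_construction=200)
index.add_items(data, np.arange(num_elements))

# ef controls the search width at layer 0: larger ef = higher recall, more latency.
index.set_ef(64)

query = rng.normal(size=dim).astype(np.float32)
labels, distances = index.knn_query(query, k=10)
print(labels, distances)
```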

Summary

Flat Index provides exact recall but scales linearly with data size, making it suitable only for small datasets or scenarios demanding 100 % recall. IVF offers a configurable balance: clustering reduces the search space dramatically, trading a modest recall loss for orders‑of‑magnitude speed gains. HNSW follows a different path by building a multi‑layer graph; its sparse upper layers enable rapid coarse localization, and its dense lower layers deliver high‑quality results, satisfying both low‑latency and high‑recall requirements.

In practice, Flat Index is appropriate for collections under ~10 k vectors; IVF is cost‑effective for million‑scale data; and HNSW is the preferred choice for latency‑critical online services. Major vector databases such as Milvus, Qdrant, and Weaviate implement these index types to varying degrees, allowing users to choose based on workload needs.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Vector Search, HNSW, nearest-neighbor, embeddings, IVF, flat index
Written by

DeepHub IMBA

A must-follow public account sharing practical AI insights: internet + machine learning + big data + architecture = IMBA.
