
Why Vector Databases Matter: Deploying PgVector on PostgreSQL for Scalable AI Retrieval

This article explains why vector databases are needed in the AI era, reviews PostgreSQL's extension ecosystem, compares vector‑database options, walks through PgVector installation and usage step by step, and shares operational practices, performance‑tuning tips, and production case studies from Qunar & Tujia.

ITPUB

Why Vector Databases Are Needed

With the rise of large language models (LLMs), large volumes of unstructured data must be stored and retrieved efficiently. Vector databases store embedding vectors and retrieve them by similarity search, which lets an application supply relevant context to the model and so mitigate LLM limitations such as token limits, hallucinations, and stale knowledge.

PostgreSQL Ecosystem

PostgreSQL’s popularity is driven by its extensibility. Numerous extensions, including PgVector, add vector‑storage and retrieval capabilities while retaining all native PostgreSQL features.

Vector‑Database Landscape and Selection

Vector databases fall into two categories: dedicated vector stores (mostly NoSQL) and traditional DBMSs with vector extensions (SQL or NoSQL). PgVector was chosen for Qunar & Tujia because it combines PostgreSQL’s robustness with native vector support.

PgVector Installation and Basic Usage

Version Compatibility Matrix

PgVector 0.7.x – PostgreSQL 12‑17 – adds halfvec, sparsevec, bit types and speeds up HNSW index creation.

PgVector 0.6.x – PostgreSQL 12‑17 – parallel HNSW indexing, reduced memory and WAL usage.

PgVector 0.5.x – PostgreSQL 11‑16 – introduces HNSW index and l1_distance function.

PgVector 0.4.x – PostgreSQL 11‑15 – upgrades vector storage and adds avg function.

PgVector 0.3.x – PostgreSQL 10‑15 – bug fixes for IVFFlat.

Installation Steps (Linux/Mac)

```shell
cd /tmp
git clone --branch v0.7.2 https://github.com/pgvector/pgvector.git
cd pgvector
make
make install # may need sudo
```

Create Extension

```
postgres=# CREATE EXTENSION vector;
postgres=# \dx vector
```
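To confirm which PgVector version a database is actually running, the extension catalogs can be queried directly. The following is a minimal sketch, assuming the extension was created as shown above; the `ALTER EXTENSION` step only applies after newer binaries have been installed on the server.

```sql
-- Version of the extension installed in the current database.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'vector';

-- Newest version shipped with the binaries on disk, next to the installed one.
SELECT name, default_version, installed_version
FROM pg_available_extensions
WHERE name = 'vector';

-- After upgrading the binaries, move this database to the new version.
ALTER EXTENSION vector UPDATE;
```

The extension version is tracked per database, so the check needs to be repeated in each database that uses the `vector` type.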

Create Table with Vector Column

```sql
CREATE TABLE my_img_emb (
  id BIGSERIAL PRIMARY KEY,
  img_uuid VARCHAR(64),
  img_name VARCHAR(256),
  img_type VARCHAR(64),
  img_embedding VECTOR(512),  -- 512-dimensional embedding
  create_time TIMESTAMPTZ DEFAULT now() NOT NULL,
  update_time TIMESTAMPTZ DEFAULT now() NOT NULL
);
COMMENT ON TABLE my_img_emb IS 'Image embedding table';
-- CONCURRENTLY avoids blocking writes; it cannot run inside a transaction block.
CREATE UNIQUE INDEX CONCURRENTLY ON my_img_emb(img_uuid);
CREATE INDEX CONCURRENTLY ON my_img_emb(img_type);
-- HNSW index for cosine-distance search (the <=> operator).
CREATE INDEX CONCURRENTLY ON my_img_emb USING hnsw (img_embedding vector_cosine_ops);
```

Similarity Search Example

```sql
SELECT img_uuid, img_name, img_type
FROM my_img_emb
ORDER BY img_embedding <=> '[0.12,0.34,...]' LIMIT 5;
```
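The `<=>` operator returns cosine distance, so smaller values mean more similar vectors. As a sketch of how the distance itself can be returned, the query below looks up the neighbours of a row that is already stored, which avoids writing out a 512-element literal. The value `'example-uuid'` is a hypothetical placeholder, not something from the original deployment.

```sql
-- Five nearest neighbours of an existing image, with their cosine distance.
-- 'example-uuid' is a hypothetical placeholder value.
SELECT img_uuid,
       img_name,
       img_embedding <=> (SELECT img_embedding
                          FROM my_img_emb
                          WHERE img_uuid = 'example-uuid') AS cosine_distance
FROM my_img_emb
WHERE img_uuid <> 'example-uuid'   -- exclude the query image itself
ORDER BY cosine_distance
LIMIT 5;
```

Cosine similarity, if needed, is `1 - cosine_distance`.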

Operational Practices

PgVector was first deployed in production in April 2023. Key observations include the need for regular index rebuilding for IVFFlat, the superior recall and QPS of HNSW, and memory considerations for large indexes.

Vector Types and Indexes

Supported vector types are vector, halfvec, bit, and sparsevec. HNSW indexes provide higher recall and speed but consume more memory; IVFFlat indexes are faster to build but require periodic rebuilding.
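For comparison with the HNSW index created earlier, an IVFFlat index on the same column might look like the sketch below. The index name and the `lists` and `probes` values are illustrative assumptions, not settings reported in the article; in practice one of the two index types would be chosen rather than keeping both.

```sql
-- IVFFlat computes its list centroids from the rows present at build time,
-- so build it after the table holds representative data.
CREATE INDEX CONCURRENTLY my_img_emb_ivfflat_idx
    ON my_img_emb USING ivfflat (img_embedding vector_cosine_ops)
    WITH (lists = 100);            -- illustrative value

-- Probe more lists per query to trade speed for recall (default is 1).
SET ivfflat.probes = 10;           -- illustrative value

-- Periodic rebuild as the data distribution drifts, without blocking writes
-- (REINDEX ... CONCURRENTLY requires PostgreSQL 12 or later).
REINDEX INDEX CONCURRENTLY my_img_emb_ivfflat_idx;
```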

Performance Comparison

Benchmarks show that PgVector’s HNSW index outperforms IVFFlat in both QPS and recall across various workloads.

Version‑Upgrade Impact on HNSW

From version 0.5.0 to 0.7.2, index‑creation time dropped from ~19 minutes to ~2 minutes, while memory usage varied. Version 0.6.0 introduced parallel index creation and reduced WAL usage, but requires maintenance_work_mem to fit within /dev/shm space.
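A parallel HNSW build on PgVector 0.6.0 or later can be sketched as follows. The memory size, worker count, and index name are assumptions chosen for illustration; the memory setting should be sized against the shared memory actually available on the host.

```sql
-- Session-level settings for the build; both values are illustrative.
SET maintenance_work_mem = '4GB';           -- keep within available /dev/shm
SET max_parallel_maintenance_workers = 4;   -- workers in addition to the leader

CREATE INDEX CONCURRENTLY my_img_emb_hnsw_idx
    ON my_img_emb USING hnsw (img_embedding vector_cosine_ops);

-- From a second session: watch build progress (PostgreSQL 12 or later).
SELECT phase,
       round(100.0 * blocks_done / nullif(blocks_total, 0), 1) AS pct_done
FROM pg_stat_progress_create_index;
```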

Query Optimization and Troubleshooting

When using vector indexes together with non‑vector filters, two strategies exist:

Pre‑filtering: Apply non‑vector conditions first, then run the similarity search over the rows that remain. Guarantees the true top‑k among matching rows but costs more CPU.

Post‑filtering: Run the similarity search first, then filter the results. Faster, but may return fewer rows than expected because the approximate search only yields a limited candidate list.

Example analysis of two SQL queries demonstrated that post‑filtering with an HNSW index returned fewer rows than expected, while pre‑filtering using a btree index on img_type returned the full set.
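The two strategies can be sketched against the `my_img_emb` table as shown below. These are not the queries analyzed in the original article; the filter value `'bedroom'` and the `'example-uuid'` lookup are hypothetical, and which plan the first query gets depends on the planner's cost estimates.

```sql
-- Post-filtering (when the planner picks the HNSW index): candidates come
-- from the index in distance order and rows with another img_type are
-- discarded, so fewer than 10 rows may be returned.
EXPLAIN
SELECT img_uuid, img_name
FROM my_img_emb
WHERE img_type = 'bedroom'
ORDER BY img_embedding <=> (SELECT img_embedding
                            FROM my_img_emb
                            WHERE img_uuid = 'example-uuid')
LIMIT 10;

-- Pre-filtering, forced: the materialized CTE is evaluated first (it can use
-- the btree index on img_type), then distances are computed exactly over
-- the filtered rows. AS MATERIALIZED requires PostgreSQL 12 or later.
WITH filtered AS MATERIALIZED (
    SELECT img_uuid, img_name, img_embedding
    FROM my_img_emb
    WHERE img_type = 'bedroom'
)
SELECT img_uuid, img_name
FROM filtered
ORDER BY img_embedding <=> (SELECT img_embedding
                            FROM my_img_emb
                            WHERE img_uuid = 'example-uuid')
LIMIT 10;
```

Running `EXPLAIN` on the first query shows whether an HNSW index scan or a btree-based plan was chosen, which is usually the first thing to check when a filtered query returns too few rows.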

Improving Recall

Increase hnsw.ef_search (default 40, max 1000) to enlarge the candidate list.

Adjust index creation parameters m and ef_construction for better recall at the cost of build time and memory.

Rewrite queries to call a distance function (e.g., cosine_distance(vector, vector)) instead of the <=> operator. The planner cannot use the HNSW index for the function form, so the query falls back to an exact search, trading speed for full recall.
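The three options above might look like this in practice. It is a sketch only: the specific values, the index name, and the `'example-uuid'` lookup are assumptions for illustration, and suitable settings depend on the data set.

```sql
-- 1. Enlarge the candidate list for a single transaction.
BEGIN;
SET LOCAL hnsw.ef_search = 200;    -- illustrative; default 40, max 1000
SELECT img_uuid, img_name
FROM my_img_emb
ORDER BY img_embedding <=> (SELECT img_embedding
                            FROM my_img_emb
                            WHERE img_uuid = 'example-uuid')
LIMIT 5;
COMMIT;

-- 2. Build a denser graph (pgvector defaults: m = 16, ef_construction = 64).
--    Higher values cost more build time and memory.
CREATE INDEX CONCURRENTLY my_img_emb_hnsw_m32_idx
    ON my_img_emb USING hnsw (img_embedding vector_cosine_ops)
    WITH (m = 32, ef_construction = 128);   -- illustrative values

-- 3. Exact search: the function form is not matched to the HNSW index.
SELECT img_uuid, img_name
FROM my_img_emb
ORDER BY cosine_distance(img_embedding,
                         (SELECT img_embedding
                          FROM my_img_emb
                          WHERE img_uuid = 'example-uuid'))
LIMIT 5;
```

`SET LOCAL` keeps the larger `hnsw.ef_search` from leaking into other queries on the same connection, which matters behind a connection pool.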

Cluster‑Level Tuning

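Vector workloads generate heavy WAL traffic. Setting full_page_writes = off and wal_compression = on reduces WAL size during bulk loads and backups. Monitoring shows a clear drop in WAL generation when these parameters are applied. Note that full_page_writes = off removes PostgreSQL's protection against torn pages after a crash, so it is only safe where the storage layer guarantees atomic page writes.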
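Applying the two parameters and measuring their effect could be sketched as follows in psql. Both settings take effect on a configuration reload; the `\gset` lines are psql-specific, and the bulk load itself is left as a placeholder.

```sql
-- Cluster-wide settings; written to postgresql.auto.conf.
ALTER SYSTEM SET wal_compression = on;
ALTER SYSTEM SET full_page_writes = off;   -- see the torn-page risk before using
SELECT pg_reload_conf();

-- Confirm the active values.
SHOW wal_compression;
SHOW full_page_writes;

-- Measure WAL produced by a workload (psql): remember the start position...
SELECT pg_current_wal_lsn() AS start_lsn \gset
-- ... run the bulk load or index build here ...
-- ...then compute how much WAL was written since.
SELECT pg_size_pretty(
         pg_wal_lsn_diff(pg_current_wal_lsn(), :'start_lsn')
       ) AS wal_generated;
```

The measurement counts all WAL written on the cluster in that window, so it is most meaningful on an otherwise idle system.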

Real‑World Deployments at Qunar & Tujia

Four production use cases illustrate PgVector’s impact:

Image‑Search for Rentals: Users upload a photo and retrieve similar listings via vector similarity.

60‑Minute Customer Service Assistant: Combines LLMs with PgVector to handle routine queries, freeing agents for complex issues.

Travel Route Recommendation: Offline mining of popular itineraries stored as embeddings; online LLM‑driven chat recommends personalized routes.

Pre‑Sale Flight AI Assistant: Uses a static knowledge base plus LLM to answer booking questions.

These deployments have driven the creation of nearly ten PgVector clusters, with more projects planned for late 2024.

Conclusion and Outlook

The article covered the motivation for vector databases, PostgreSQL’s role, selection criteria, installation, operational best practices, performance tuning, query optimization, and concrete production case studies. It serves as a practical guide for DBAs and developers adopting PgVector to power AI‑enhanced applications.

[Figure: RAG flow diagram]
