Boost RAG Accuracy with LangChain4j 1.11.0 Hybrid Search on PgVector
This guide explains why pure vector retrieval often fails for version‑specific queries, introduces hybrid search that combines semantic and keyword matching, and provides step‑by‑step code and SQL examples for enabling PgVector hybrid search in LangChain4j 1.11.0.
Retrieval-Augmented Generation (RAG) systems often return irrelevant answers because the retrieval stage surfaces the wrong documents. Pure vector search, which ranks by semantic similarity, struggles with exact identifiers (e.g., version numbers, product codes, CVE IDs) and with short queries, so a question about one version can return documentation for another.
Why Pure Vector Search Fails
Proper‑noun failure: Embeddings cannot reliably distinguish "Spring Boot 3.5" from "Spring Boot 2.7".
Over‑generalization: Queries like "Apple nutrition" may retrieve corporate financial reports about Apple Inc.
Short‑query issue: Exact identifiers such as "CVE-2024-38819" are usually missed.
Vector search excels at semantic similarity but not literal matching.
Hybrid Search: Combining Vector and Keyword Retrieval
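Hybrid (mixed) search runs both vector similarity search and traditional full-text keyword search in parallel and merges the ranked results using a rank-fusion algorithm such as Reciprocal Rank Fusion (RRF). The two legs complement each other: the vector leg matches by meaning, handles synonyms and paraphrases, is more forgiving of minor typos, and requires an embedding model; the keyword leg matches literal tokens, so it pins down proper nouns and identifiers exactly, but it offers little synonym or typo tolerance and needs no model.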
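To make the fusion step concrete, here is a minimal sketch of RRF over two ranked ID lists. It is an illustration only, not LangChain4j's internal code: the class name, method names, and document IDs are hypothetical, and it assumes the common formulation in which each list contributes 1 / (k + rank) with ranks starting at 1.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative RRF fusion of two ranked ID lists; not LangChain4j's internal code. */
final class RrfFusionSketch {

    /** Adds 1 / (k + rank) for each ID in a ranked list, with ranks starting at 1. */
    static void accumulate(Map<String, Double> scores, List<String> rankedIds, int k) {
        for (int i = 0; i < rankedIds.size(); i++) {
            int rank = i + 1;
            scores.merge(rankedIds.get(i), 1.0 / (k + rank), Double::sum);
        }
    }

    /** Fuses both legs and returns the IDs ordered by descending fused score. */
    static List<String> fuse(List<String> vectorRanked, List<String> keywordRanked, int k) {
        Map<String, Double> scores = new HashMap<>();
        accumulate(scores, vectorRanked, k);
        accumulate(scores, keywordRanked, k);
        List<String> ids = new ArrayList<>(scores.keySet());
        ids.sort((a, b) -> Double.compare(scores.get(b), scores.get(a)));
        return ids;
    }

    public static void main(String[] args) {
        // Hypothetical IDs: the vector leg ranks the wrong version first,
        // while the keyword leg matches the exact version string.
        List<String> vectorLeg = List.of("boot-2.7-notes", "boot-3.5-notes", "boot-overview");
        List<String> keywordLeg = List.of("boot-3.5-notes", "boot-3.5-migration");
        System.out.println(fuse(vectorLeg, keywordLeg, 60));
    }
}
```

In this made-up example, "boot-3.5-notes" appears in both lists, so its two contributions add up and it moves ahead of "boot-2.7-notes", which only the vector leg returned.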
LangChain4j 1.11.0 Adds PgVector Hybrid Search
Version 1.11.0 introduces a SearchMode enum with values VECTOR (default) and HYBRID. Enabling hybrid search requires only configuration changes; no extra search engine is needed.
Step‑by‑Step Setup
Enable hybrid mode when building the store:
PgVectorEmbeddingStore store = PgVectorEmbeddingStore.builder()
        .host("localhost")
        .port(5432)
        .database("mydb")
        .table("embeddings")
        .dimension(384)
        .searchMode(SearchMode.HYBRID) // enable hybrid
        .rrfK(60) // optional, default 60
        .textSearchConfig("simple") // optional, default simple
        .build();
Provide both embedding and raw text query:
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
        .queryEmbedding(questionEmbedding) // vector part
        .query(question) // keyword part (mandatory in HYBRID)
        .maxResults(5)
        .build();
EmbeddingSearchResult<TextSegment> result = store.search(request);
No further changes: Existing data and GIN indexes remain unchanged; only the new searchMode setting and the query parameter are required.
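Note: In VECTOR mode scores are cosine-similarity values in [0,1]; in HYBRID mode scores are RRF-fused values. Each leg contributes at most 1/(k+1), so with k=60 a document ranked first in both legs scores 2/61 ≈ 0.033 and one ranked first in a single leg scores 1/61 ≈ 0.016 (typical values are ~0.02–0.03). Any score-threshold logic tuned for VECTOR mode must therefore be adjusted.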
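One way to keep thresholds readable after switching modes is to rescale fused scores by their theoretical maximum. The sketch below is a suggestion, not a LangChain4j feature: the class and method names are hypothetical, and it assumes the fused score is the sum of 1 / (k + rank) over two legs, with ranks starting at 1.

```java
/** Illustrative helper for reasoning about RRF score ranges; not part of LangChain4j. */
final class RrfScoreScale {

    /** Highest possible fused score: a document ranked first in both legs. */
    static double maxScore(int k) {
        return 2.0 / (k + 1);
    }

    /** Rescales a fused score to [0, 1] so that thresholds do not depend on k. */
    static double normalize(double rrfScore, int k) {
        return rrfScore / maxScore(k);
    }

    public static void main(String[] args) {
        int k = 60;
        double firstInBothLegs = 1.0 / (k + 1) + 1.0 / (k + 1);
        double firstInOneLeg = 1.0 / (k + 1);
        System.out.println("max raw score:     " + maxScore(k));
        System.out.println("first in both:     " + normalize(firstInBothLegs, k));
        System.out.println("first in one leg:  " + normalize(firstInOneLeg, k));
    }
}
```

With k = 60 the maximum raw score is 2/61, and a document ranked first in only one leg normalizes to 0.5. A threshold such as "at least 0.5 after normalization" then reads as "at least as good as a top hit in one leg", whatever value of k is configured.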
Underlying SQL Implementation
The search() method routes to either embeddingOnlySearch or hybridSearch based on SearchMode. Hybrid search uses a Common Table Expression (CTE) with three parts:
Vector search ranking by cosine distance.
Keyword search using PostgreSQL full-text search (plainto_tsquery) with the configured text-search configuration.
FULL OUTER JOIN of both result sets and RRF calculation:
WITH vector_search AS (
    SELECT embedding_id, text, metadata,
           RANK() OVER (ORDER BY embedding <=> :referenceVector) AS rnk
    FROM embeddings
    ORDER BY embedding <=> :referenceVector
    LIMIT :candidateCount
),
keyword_search AS (
    SELECT embedding_id, text, metadata,
           RANK() OVER (ORDER BY ts_rank(to_tsvector(:config, coalesce(text,'')), plainto_tsquery(:config, :query)) DESC) AS rnk
    FROM embeddings
    WHERE to_tsvector(:config, coalesce(text,'')) @@ plainto_tsquery(:config, :query)
    ORDER BY ts_rank(...) DESC
    LIMIT :candidateCount
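),
fused AS (
    -- Compute the score in its own CTE: PostgreSQL cannot reference a SELECT alias in the WHERE clause of the same query level.
    SELECT COALESCE(v.embedding_id, k.embedding_id) AS embedding_id,
           COALESCE(1.0/(:rrfK + v.rnk), 0.0) + COALESCE(1.0/(:rrfK + k.rnk), 0.0) AS score
    FROM vector_search v
    FULL OUTER JOIN keyword_search k ON v.embedding_id = k.embedding_id
)
SELECT embedding_id, score
FROM fused
WHERE score >= :minScore
ORDER BY score DESC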
LIMIT :maxResults;
Key details:
Keyword search uses plainto_tsquery to avoid manual boolean operators.
FULL OUTER JOIN ensures documents appearing in only one leg still participate with a zero contribution from the other.
Each sub-query's LIMIT is Math.max(maxResults, rrfK) to provide enough candidates for fusion; with maxResults = 5 and rrfK = 60, each leg returns up to 60 candidates.
When HYBRID mode is active, a GIN index for full‑text search is created automatically via initTable().
if (searchMode == SearchMode.HYBRID) {
    String ftsIndexName = table + "_text_fts_gin_index";
    String query = String.format(
            "CREATE INDEX IF NOT EXISTS %s ON %s USING gin (to_tsvector('%s', coalesce(text, '')))",
            ftsIndexName, table, textSearchConfig);
    statement.executeUpdate(query);
}
Conclusion
When RAG systems return wrong versions, error codes, or product numbers, the bottleneck is often the retrieval stage. Enabling PgVector hybrid search in LangChain4j 1.11.0 provides a minimal‑change, high‑impact improvement that boosts recall of exact matches while preserving semantic recall. For higher precision, a reranker can be added after hybrid retrieval.
Java Architecture Diary
Committed to sharing original, high‑quality technical articles; no fluff or promotional content.
