Boost RAG Accuracy with LangChain4j 1.11.0 Hybrid Search on PgVector

This guide explains why pure vector retrieval often fails for version‑specific queries, introduces hybrid search that combines semantic and keyword matching, and provides step‑by‑step code and SQL examples for enabling PgVector hybrid search in LangChain4j 1.11.0.

Java Architecture Diary

Retrieval‑Augmented Generation (RAG) systems often return irrelevant answers because the retrieval stage fetches the wrong documents. Pure vector search ranks by semantic similarity, so it is unreliable for exact identifiers (e.g., version numbers, product codes, CVE IDs) and for short queries, and it can return content about the wrong version.

Why Pure Vector Search Fails

Proper‑noun failure: Embeddings cannot reliably distinguish "Spring Boot 3.5" from "Spring Boot 2.7".

Over‑generalization: Queries like "Apple nutrition" may retrieve corporate financial reports about Apple Inc.

Short‑query issue: Exact identifiers such as "CVE-2024-38819" are usually missed.

Vector search excels at semantic similarity but not literal matching.

Hybrid Search: Combining Vector and Keyword Retrieval

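Hybrid (mixed) search runs both vector similarity search and traditional full‑text keyword search, then merges the two ranked result lists with a rank‑fusion algorithm such as Reciprocal Rank Fusion (RRF). RRF scores each document by summing 1/(k + rank) over the lists in which it appears. The two legs complement each other: the vector leg handles synonyms and paraphrases and is fairly tolerant of typos but requires an embedding model, while the keyword leg matches proper nouns and identifiers literally and needs no model.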
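As a rough illustration of the fusion step, the sketch below applies RRF to two ranked ID lists in plain Java. The class name RrfSketch, the method fuse, and the document IDs are invented for this example; it is not LangChain4j code, and it assumes each list is already ordered best first.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RrfSketch {

    /** Fuses ranked ID lists (best first) into RRF scores, highest score first. */
    public static List<Map.Entry<String, Double>> fuse(int k, List<List<String>> rankings) {
        Map<String, Double> scores = new LinkedHashMap<>();
        for (List<String> ranking : rankings) {
            for (int i = 0; i < ranking.size(); i++) {
                int rank = i + 1; // ranks are 1-based
                // A document missing from a list simply gets no contribution from it.
                scores.merge(ranking.get(i), 1.0 / (k + rank), Double::sum);
            }
        }
        List<Map.Entry<String, Double>> fused = new ArrayList<>(scores.entrySet());
        fused.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        return fused;
    }

    public static void main(String[] args) {
        // Hypothetical results: the vector leg prefers the wrong version,
        // the keyword leg only finds documents containing the literal "3.5".
        List<String> vectorLeg = List.of("doc-2.7", "doc-3.5", "doc-3.4");
        List<String> keywordLeg = List.of("doc-3.5", "doc-3.5-notes");

        for (Map.Entry<String, Double> e : fuse(60, List.of(vectorLeg, keywordLeg))) {
            System.out.printf("%s %.4f%n", e.getKey(), e.getValue());
        }
    }
}
```

In this made-up input, doc-3.5 is the only document present in both lists, so it collects two contributions and moves ahead of doc-2.7, even though the vector leg alone ranked doc-2.7 first.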

LangChain4j 1.11.0 Adds PgVector Hybrid Search

Version 1.11.0 introduces a SearchMode enum with values VECTOR (default) and HYBRID. Enabling hybrid search requires only configuration changes; no extra search engine is needed.

Step‑by‑Step Setup

Enable hybrid mode when building the store:

PgVectorEmbeddingStore store = PgVectorEmbeddingStore.builder()
    .host("localhost")
    .port(5432)
    .database("mydb")
    .user("postgres") // placeholder credentials; replace with your own
    .password("secret")
    .table("embeddings")
    .dimension(384) // must match the embedding model's output size
    .searchMode(SearchMode.HYBRID) // enable hybrid
    .rrfK(60) // optional, default 60
    .textSearchConfig("simple") // optional, default simple
    .build();

Provide both embedding and raw text query:

EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(questionEmbedding) // vector part
    .query(question) // keyword part (mandatory in HYBRID)
    .maxResults(5)
    .build();
// Parameterize the result type instead of using the raw type
EmbeddingSearchResult<TextSegment> result = store.search(request);

No further changes: Existing data and GIN indexes remain unchanged; only the new searchMode and query parameters are required.

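Note: In VECTOR mode, scores are cosine‑similarity‑based values in [0,1]; in HYBRID mode, scores are RRF‑fused values on a much smaller scale. With k=60, a document ranked first in both legs scores at most 2/61 ≈ 0.033, and a document ranked first in only one leg scores 1/61 ≈ 0.016. Any minScore or other score‑threshold logic tuned for VECTOR mode must therefore be recalibrated.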
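To make the scale difference concrete, here is a small arithmetic sketch. The class RrfScoreScale and its methods are hypothetical helpers rather than part of LangChain4j, and they assume the fused score is the plain sum of 1/(k + rank) over two legs. Dividing by the theoretical maximum gives a relative value in [0,1], which is a rescaled rank score and not a similarity.

```java
public class RrfScoreScale {

    /** Highest possible fused score: rank 1 in each of the `legs` result lists. */
    public static double maxScore(int k, int legs) {
        return legs / (double) (k + 1);
    }

    /** Rescales a fused score to [0,1] relative to the theoretical maximum. */
    public static double normalize(double rrfScore, int k, int legs) {
        return rrfScore / maxScore(k, legs);
    }

    public static void main(String[] args) {
        int k = 60;
        double bothLegsTop = 2.0 / (k + 1); // rank 1 in the vector and keyword legs
        double oneLegTop = 1.0 / (k + 1);   // rank 1 in a single leg only

        System.out.printf("max fused score:      %.4f%n", maxScore(k, 2));
        System.out.printf("top of both legs:     %.2f (normalized)%n", normalize(bothLegsTop, k, 2));
        System.out.printf("top of one leg only:  %.2f (normalized)%n", normalize(oneLegTop, k, 2));
    }
}
```

A threshold such as 0.7, which is reasonable for VECTOR mode, would reject every HYBRID result if applied to the raw fused score. Either lower the threshold to the RRF scale or compare against a normalized value like the one above.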

Underlying SQL Implementation

The search() method routes to either embeddingOnlySearch or hybridSearch based on SearchMode. Hybrid search uses a Common Table Expression (CTE) with three parts:

Vector search ranking by cosine distance.

Keyword search using PostgreSQL full‑text search (plainto_tsquery) with the configured text search configuration.

FULL OUTER JOIN of both result sets and RRF calculation:

WITH vector_search AS (
  SELECT embedding_id, text, metadata,
         RANK() OVER (ORDER BY embedding <=> :referenceVector) AS rnk
  FROM embeddings
  ORDER BY embedding <=> :referenceVector
  LIMIT :candidateCount
),
keyword_search AS (
  SELECT embedding_id, text, metadata,
         RANK() OVER (ORDER BY ts_rank(to_tsvector(:config, coalesce(text,'')), plainto_tsquery(:config, :query)) DESC) AS rnk
  FROM embeddings
  WHERE to_tsvector(:config, coalesce(text,'')) @@ plainto_tsquery(:config, :query)
  ORDER BY rnk -- same order as ts_rank descending
  LIMIT :candidateCount
)
SELECT embedding_id, score
FROM (
  SELECT COALESCE(v.embedding_id, k.embedding_id) AS embedding_id,
         COALESCE(1.0/(:rrfK + v.rnk), 0.0) + COALESCE(1.0/(:rrfK + k.rnk), 0.0) AS score
  FROM vector_search v
  FULL OUTER JOIN keyword_search k ON v.embedding_id = k.embedding_id
) fused -- PostgreSQL cannot filter on a SELECT alias in the same query level
WHERE score >= :minScore
ORDER BY score DESC
LIMIT :maxResults;

Key details:

Keyword search uses plainto_tsquery to avoid manual boolean operators.

FULL OUTER JOIN ensures documents appearing in only one leg still participate with a zero contribution from the other.

Each sub‑query’s LIMIT is Math.max(maxResults, rrfK) to provide enough candidates for fusion.

When HYBRID mode is active, a GIN index for full‑text search is created automatically via initTable().

if (searchMode == SearchMode.HYBRID) {
    String ftsIndexName = table + "_text_fts_gin_index";
    String query = String.format(
        "CREATE INDEX IF NOT EXISTS %s ON %s USING gin (to_tsvector('%s', coalesce(text, '')))",
        ftsIndexName, table, textSearchConfig);
    statement.executeUpdate(query);
}

Conclusion

When RAG systems return wrong versions, error codes, or product numbers, the bottleneck is often the retrieval stage. Enabling PgVector hybrid search in LangChain4j 1.11.0 provides a minimal‑change, high‑impact improvement that boosts recall of exact matches while preserving semantic recall. For higher precision, a reranker can be added after hybrid retrieval.
