Enterprise Semantic Search: Key Q&A on Scoring, Recall, LSH, Chunking, and Embedding Dimensions

This article answers practical questions about enterprise semantic search, explaining how Reciprocal Rank Fusion normalizes mixed scoring, how to control vector result size, the trade‑offs of LSH parameters, word‑ and sentence‑based chunking strategies with version‑specific defaults, and flexible embedding dimensionality.

Mingyi World Elasticsearch

Scoring Normalization with RRF

Vector similarity scores typically fall in the 0‑1 range, while keyword scores (e.g., TF‑IDF, BM25) are unbounded, so a direct weighted sum of the two is not meaningful. The core issue in hybrid search is that the two scores live on different scales. Reciprocal Rank Fusion (RRF), introduced as a paid feature in Elasticsearch 8.9, sidesteps this with a "ranking democracy" mechanism: each result list votes with a document's rank position rather than its raw score, so it needs little or no tuning and works across unrelated relevance indicators.
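To make the rank-based idea concrete, here is a minimal sketch of the standard RRF formula, in which each document receives the sum of 1 / (rank_constant + rank) over every list it appears in. The function name rrf_fuse, the sample document IDs, and the constant value of 60 are assumptions chosen for illustration; this is not Elasticsearch's internal implementation.

```python
from collections import defaultdict


def rrf_fuse(rankings, rank_constant=60):
    """Fuse ranked lists of document IDs using only rank positions.

    rank_constant=60 is an assumed, commonly used value; raw scores
    from the underlying retrievers are never consulted.
    """
    fused = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (rank_constant + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical rankings from a keyword retriever and a vector retriever.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
vector_ranking = ["doc_b", "doc_d", "doc_a"]

fused = rrf_fuse([bm25_ranking, vector_ranking])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Because only positions enter the sum, an unbounded BM25 score and a 0‑1 vector score contribute on equal terms, and a document ranked well by both lists rises above one ranked well by only one.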

Recall and Result Size Controls

A vector search response reports the total number of hits, but the results actually returned can be capped with size: 10, which limits the final output to ten documents. The semantic (vector) stage may additionally specify candidates: 50, so that it gathers fifty candidates before they are merged with keyword results. The intermediate pool is therefore larger than ten, although the final response is still truncated to size. Keeping candidates close to size minimizes the work done per query, while increasing candidates with size unchanged widens the pool and can raise result quality at the cost of extra latency.
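The interaction between the two settings can be illustrated with a small simulation. This is a sketch under stated assumptions: vector_stage and hybrid_search are hypothetical helpers, the similarity values are synthetic, and the merge step uses a simple reciprocal-rank sum with an assumed constant of 60. It does not call any Elasticsearch API.

```python
def vector_stage(similarities, candidates):
    """Return the IDs of the `candidates` most similar documents."""
    ranked = sorted(similarities, key=similarities.get, reverse=True)
    return ranked[:candidates]


def hybrid_search(similarities, keyword_hits, candidates, size):
    """Gather a candidate pool, merge it with keyword hits, then cut to `size`."""
    vector_hits = vector_stage(similarities, candidates)
    fused = {}
    for hits in (vector_hits, keyword_hits):
        for rank, doc_id in enumerate(hits, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (60 + rank)
    ranked = sorted(fused, key=lambda doc_id: (-fused[doc_id], doc_id))
    return vector_hits, ranked[:size]


# Synthetic data: 100 documents with decreasing similarity to the query.
similarities = {f"doc_{i}": 1.0 - i / 100 for i in range(100)}
keyword_hits = ["doc_7", "doc_3", "doc_42"]

pool, results = hybrid_search(similarities, keyword_hits, candidates=50, size=10)
print(len(pool), len(results))
```

The candidate pool holds fifty documents, but the response never holds more than ten: candidates controls how much is considered, size controls how much is returned.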

LSH Parameter Guidance

LSH (Locality‑Sensitive Hashing) uses two key parameters:

L : the number of hash tables; increasing L improves recall but adds storage cost and query latency.

k : the number of hash functions per table; increasing k improves precision but reduces recall and raises computation cost.
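The two trade-offs above follow from the standard LSH banding analysis. If a single hash function collides for a given pair with probability p, the pair becomes a candidate with probability 1 - (1 - p^k)^L. The sketch below evaluates that formula; the function name and the sample probabilities are assumptions for illustration, and the formula assumes independent hash functions.

```python
def candidate_probability(p, hashes_per_table, num_tables):
    """Probability that a pair collides in at least one of L tables.

    p                -- per-hash collision probability for the pair
    hashes_per_table -- k, all k hashes in a table must match
    num_tables       -- L, a match in any one table is enough
    """
    all_match_in_one_table = p ** hashes_per_table
    return 1.0 - (1.0 - all_match_in_one_table) ** num_tables


similar_pair = 0.8      # assumed per-hash collision probability
dissimilar_pair = 0.3   # assumed per-hash collision probability

for k, num_tables in [(4, 2), (4, 8), (8, 8)]:
    hit = candidate_probability(similar_pair, k, num_tables)
    noise = candidate_probability(dissimilar_pair, k, num_tables)
    print(f"k={k} L={num_tables} similar={hit:.3f} dissimilar={noise:.5f}")
```

Raising L with k fixed pushes both probabilities up (more recall, more tables to store and probe), while raising k with L fixed pushes both down (fewer false candidates, but more true neighbors missed).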

Chunking Large Documents for Vectorization

Long source fields can degrade embedding accuracy, because a single vector has to summarize too much text, and they may exceed the model's token limit. The solution is to split documents into smaller chunks and embed each chunk separately.

Word‑based Chunking

max_chunk_size : maximum number of words per chunk (required).

overlap : number of overlapping words between consecutive chunks (required, ≤ ½ max_chunk_size).

Mechanism: fill a chunk to the maximum size, then start the next chunk, overlapping the specified word count to preserve context.
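The word-based mechanism can be sketched in a few lines. This is an illustrative approximation only: chunk_by_words is a hypothetical helper, and splitting on whitespace is an assumption that will differ from the word-boundary detection a real implementation uses.

```python
def chunk_by_words(text, max_chunk_size, overlap):
    """Split text into chunks of at most max_chunk_size words.

    Consecutive chunks share `overlap` words. Whitespace tokenization
    is an assumption made for this sketch.
    """
    if max_chunk_size < 1:
        raise ValueError("max_chunk_size must be at least 1")
    if not 0 <= overlap <= max_chunk_size // 2:
        raise ValueError("overlap must be between 0 and half of max_chunk_size")

    words = text.split()
    step = max_chunk_size - overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_chunk_size]))
        if start + max_chunk_size >= len(words):
            break  # the last word has been covered
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
word_chunks = chunk_by_words(sample, max_chunk_size=4, overlap=1)
for chunk in word_chunks:
    print(chunk)
```

Each chunk after the first starts with the final word of the previous one, which is the context-preserving overlap described above.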

Sentence‑based Chunking

max_chunk_size : maximum number of words per chunk (required).

sentence_overlap : number of overlapping sentences between chunks (required, 0 or 1).

Mechanism: split input into blocks that contain complete sentences; each block (except the first) shares the overlapping sentences with the previous block, prioritizing sentence integrity over full block fill.
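A comparable sketch for the sentence-based strategy follows. The helper name chunk_by_sentences and the regular expression used to detect sentence ends are assumptions; real sentence segmentation is more involved, and this sketch emits a sentence longer than max_chunk_size as its own oversized chunk rather than splitting it.

```python
import re


def chunk_by_sentences(text, max_chunk_size, sentence_overlap=1):
    """Group whole sentences into chunks of at most max_chunk_size words."""
    if sentence_overlap not in (0, 1):
        raise ValueError("sentence_overlap must be 0 or 1")

    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    chunks, current, current_words = [], [], 0
    for sentence in sentences:
        n_words = len(sentence.split())
        if current and current_words + n_words > max_chunk_size:
            chunks.append(" ".join(current))
            # Carry the last sentence forward only if it still fits.
            carried = current[-1:] if sentence_overlap else []
            if carried and len(carried[0].split()) + n_words > max_chunk_size:
                carried = []
            current = carried
            current_words = sum(len(s.split()) for s in current)
        current.append(sentence)
        current_words += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks


sample = "One two three. Four five six. Seven eight nine. Ten eleven twelve."
sentence_chunks = chunk_by_sentences(sample, max_chunk_size=6, sentence_overlap=1)
for chunk in sentence_chunks:
    print(chunk)
```

A chunk is closed as soon as the next whole sentence would not fit, even if word budget remains, which is the sense in which sentence integrity takes priority over filling the block.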

Default settings changed in Elasticsearch 8.16:

8.16 and later: strategy = sentence chunking, max_chunk_size = 250, sentence_overlap = 1.

Before 8.16: strategy = word chunking, max_chunk_size = 250, overlap = 100.

Embedding Dimensionality

The models nomic‑embed‑text‑v1 and nomic‑embed‑text‑v1.5 both default to 768 dimensions. The v1.5 model was trained with Matryoshka Representation Learning, so its embeddings can be shortened to any size from 64 to 768; users can choose 256 or 512 dimensions to reduce storage and compute costs with minimal performance loss.
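With a Matryoshka-trained model, a shorter embedding is obtained by keeping the leading components of the full vector and re-normalizing. The sketch below shows only that generic step: truncate_embedding is a hypothetical helper, the input vector is synthetic, and any model-specific preprocessing recommended by the model's authors is not shown.

```python
import math


def truncate_embedding(embedding, dimensions):
    """Keep the first `dimensions` components and rescale to unit length."""
    if not 1 <= dimensions <= len(embedding):
        raise ValueError("dimensions must be between 1 and len(embedding)")
    head = embedding[:dimensions]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # nothing to rescale
    return [x / norm for x in head]


# Synthetic stand-in for a 768-dimensional model output.
full_embedding = [math.sin(i) for i in range(768)]
short_embedding = truncate_embedding(full_embedding, 256)

bytes_full = len(full_embedding) * 4    # float32 storage
bytes_short = len(short_embedding) * 4
print(len(short_embedding), bytes_full, bytes_short)
```

Stored as 32-bit floats, a 768-dimensional vector takes 3072 bytes and a 256-dimensional one takes 1024 bytes, so the index field must be declared with the shortened dimension count to realize the saving.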

References:

Elasticsearch RRF documentation: https://www.elastic.co/docs/reference/elasticsearch/rest-apis/reciprocal-rank-fusion

Elastic chunking blog: https://www.elastic.co/search-labs/blog/elasticsearch-chunking-inference-api-endpoints

LSH overview: https://medium.com/@sarthakjoshi_9398/understanding-locality-sensitive-hashing-lsh-a-powerful-technique-for-similarity-search-a95b090bdc4a

Elasticsearch 8.16 release notes: https://discuss.elastic.co/t/what-s-new-in-elastic-8-16/370418
