How to Optimize Elasticsearch Queries for Precise Enterprise Search Results

This article walks through the practical steps of improving Elasticsearch relevance for an enterprise search platform, covering user requirements, index creation, analysis, scoring models, boost and filter techniques, function_score customizations, and post‑query interventions to deliver more accurate and business‑aligned results.

Baidu Geek Talk

Sep 28, 2022

Introduction

This article is a technical case study of how the Aifanfan Enterprise Query platform (爱番番企业查询) uses Elasticsearch (ES) to return precise enterprise search results, and it details the optimisation steps taken to meet user expectations.

User Requirements

Results should closely match the query keywords against the company name or legal representative.

Ability to retrieve enterprises that satisfy user‑specified conditions.

After matching, rank results by company health indicators such as operating status, registered capital and risk level.

Elasticsearch Fundamentals

ES is built on the Apache Lucene™ library, and its work divides into two main phases: building the index and querying it.

Index Creation

Documents are tokenised, analysed and stored in an inverted index. The index contains:

Term dictionary – maps each term to its posting list.

Posting list – records which documents contain each term.
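To make the two structures concrete, here is a minimal toy sketch in plain Python. It is not how Lucene stores its index; the sample company names and the whitespace-only analysis are assumptions chosen for illustration.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the sorted list of document IDs that contain it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        # Toy analysis: lower-case and split on whitespace only.
        for term in text.lower().split():
            postings[term].add(doc_id)
    # The sorted keys play the role of the term dictionary;
    # each value is that term's posting list.
    return {term: sorted(ids) for term, ids in sorted(postings.items())}

# Hypothetical documents: doc ID -> company name.
docs = {
    1: "Example Network Technology",
    2: "Example Science Technology",
    3: "Sample Trading Company",
}
index = build_inverted_index(docs)
print(index["example"])     # [1, 2]
print(index["technology"])  # [1, 2]
print(index["trading"])     # [3]
```

A lookup for a query term is then a single dictionary access followed by a walk over that term's posting list.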

Analysis Process

Analysis transforms raw text into a stream of tokens using three components:

Character filter – pre‑processes the raw character stream, for example by stripping HTML tags.

Tokenizer – splits text (e.g., whitespace for English).

Token filter – lower‑cases tokens, removes stop‑words, adds synonyms, etc.

The _analyze API can be used to inspect tokenisation results.
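The three stages can be imitated in a few lines of Python. This is only a sketch of the pipeline's shape, not an ES analyzer; the stop-word list and synonym map are invented for the example.

```python
import re

STOP_WORDS = {"the", "of", "and", "a"}   # illustrative stop-word list
SYNONYMS = {"co": ["company"]}           # illustrative synonym map

def char_filter(text):
    """Character filter: strip HTML tags before tokenising."""
    return re.sub(r"<[^>]+>", " ", text)

def tokenizer(text):
    """Tokenizer: split the cleaned text on whitespace."""
    return text.split()

def token_filter(tokens):
    """Token filter: lower-case, drop stop words, append synonyms."""
    out = []
    for tok in tokens:
        tok = tok.lower()
        if tok in STOP_WORDS:
            continue
        out.append(tok)
        out.extend(SYNONYMS.get(tok, []))
    return out

def analyze(text):
    """Run the three stages in order, as an analyzer would."""
    return token_filter(tokenizer(char_filter(text)))

print(analyze("<b>The</b> Sample Trading Co"))
# ['sample', 'trading', 'co', 'company']
```

Against a real cluster, the equivalent check is a request to the _analyze endpoint with a body such as {"analyzer": "standard", "text": "The Sample Trading Co"}, which returns the tokens the chosen analyzer produces.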

Scoring Model

ES computes a relevance score (_score) for each hit. The classic practical scoring function combines TF/IDF with vector‑space concepts, a coordination factor and field‑length normalisation; since version 5.0 the default similarity is BM25, a refinement of TF/IDF.

By default results are sorted by descending _score.
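As a rough illustration of how term frequency, inverse document frequency and field-length normalisation interact, the sketch below uses the classic TF/IDF formulation. Real _score values will differ (recent versions default to BM25), and all the numbers are made up.

```python
import math

def tf(freq):
    """Term frequency: square root of the occurrences in the field."""
    return math.sqrt(freq)

def idf(num_docs, doc_freq):
    """Inverse document frequency: rarer terms weigh more."""
    return 1 + math.log(num_docs / (doc_freq + 1))

def field_norm(num_terms):
    """Field-length norm: a match in a shorter field weighs more."""
    return 1 / math.sqrt(num_terms)

def term_weight(freq, num_docs, doc_freq, num_terms):
    """Weight of one term in one field under the classic formulation."""
    return tf(freq) * idf(num_docs, doc_freq) * field_norm(num_terms)

# Same term, same corpus statistics, but different field lengths.
short_name = term_weight(freq=1, num_docs=1000, doc_freq=9, num_terms=4)
long_name = term_weight(freq=1, num_docs=1000, doc_freq=9, num_terms=16)
print(short_name > long_name)  # True
```

The comparison shows why a keyword that matches a short company name tends to outrank the same keyword buried in a long one.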

Optimisation Techniques

1. Keyword Matching

Use match_phrase to require the query tokens to appear in order, with a slop value that tolerates a limited number of position gaps between them. Combining match (for recall) with match_phrase (for precision) balances the two.
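One way to combine the two clauses is a bool query in which match decides what is returned and match_phrase adds score to ordered matches. The sketch below only builds the request body; the field name company_name and the default slop of 2 are assumptions, not values from the original article.

```python
import json

def build_keyword_query(keyword, field="company_name", slop=2):
    """bool query: `match` for recall, `match_phrase` with slop for precision."""
    return {
        "query": {
            "bool": {
                # Any document sharing a token with the keyword is a candidate.
                "must": [
                    {"match": {field: {"query": keyword}}}
                ],
                # Documents containing the tokens in order score higher.
                "should": [
                    {"match_phrase": {field: {"query": keyword, "slop": slop}}}
                ],
            }
        }
    }

body = build_keyword_query("Sample Trading", slop=1)
print(json.dumps(body, indent=2))
```

Putting match_phrase under should rather than must keeps partial matches in the result set while still ranking exact phrases first.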

2. Boosting Important Fields

Apply boost to the company name and legal representative fields, and use phrase matching for the legal representative to increase relevance.
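A sketch of such a query body follows. The field names company_name and legal_person and the boost values 3.0 and 2.0 are placeholders; suitable weights would have to be tuned against real queries.

```python
import json

def build_boosted_query(keyword, name_boost=3.0, legal_boost=2.0):
    """Boost the company name; phrase-match the legal representative."""
    return {
        "query": {
            "bool": {
                "should": [
                    # Token match on the company name, weighted highest.
                    {"match": {
                        "company_name": {"query": keyword, "boost": name_boost}
                    }},
                    # A person's name should match as a whole phrase.
                    {"match_phrase": {
                        "legal_person": {"query": keyword, "boost": legal_boost}
                    }},
                ],
                # At least one of the two clauses must match.
                "minimum_should_match": 1,
            }
        }
    }

print(json.dumps(build_boosted_query("Sample Trading"), indent=2))
```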

3. Adding Filters

Introduce filter clauses (bool → filter) for user‑specified criteria such as operating status, registered capital, or risk flags. Filters narrow the candidate set without affecting scoring and benefit from ES caching.
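The following sketch shows how optional user criteria might be translated into filter clauses. The field names (status, registered_capital, risk_level) and their value conventions are assumptions for illustration.

```python
import json

def build_filtered_query(keyword, status=None, min_capital=None, max_risk_level=None):
    """Scored `match` in `must`; unscored user criteria in `filter`."""
    filters = []
    if status is not None:
        # Exact match on a keyword-type field.
        filters.append({"term": {"status": status}})
    if min_capital is not None:
        filters.append({"range": {"registered_capital": {"gte": min_capital}}})
    if max_risk_level is not None:
        filters.append({"range": {"risk_level": {"lte": max_risk_level}}})
    return {
        "query": {
            "bool": {
                "must": [{"match": {"company_name": keyword}}],
                "filter": filters,
            }
        }
    }

body = build_filtered_query("Sample Trading", status="active", min_capital=1_000_000)
print(json.dumps(body, indent=2))
```

Only criteria the user actually supplied become clauses, so an unfiltered search produces an empty filter list and the same ranking as the plain query.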

4. Custom Scoring with function_score

Combine the original query score with additional factors using function_score. The article selects script_score for maximum flexibility, allowing custom scripts to read document fields (e.g., doc['field'].value) and compute a result_score as:

Original query score (query_score).

Custom function score (func_score).

Final score = query_score × func_score (default boost_mode).
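A function_score request with script_score could be assembled as below. The script itself is an invented example, not the article's formula: it assumes a non-negative numeric field named health_score and a weight parameter, and uses a logarithm so that very large values do not swamp the text relevance.

```python
import json

def build_function_score_query(base_query, weight=1.0, boost_mode="multiply"):
    """Wrap a query so its score is combined with a script-computed score."""
    return {
        "query": {
            "function_score": {
                "query": base_query,
                "script_score": {
                    "script": {
                        # Hypothetical formula; health_score is assumed >= 0,
                        # which keeps the script result positive.
                        "source": "params.weight * Math.log(2 + doc['health_score'].value)",
                        "params": {"weight": weight},
                    }
                },
                # "multiply" is the default: query_score * func_score.
                "boost_mode": boost_mode,
            }
        }
    }

def final_score(query_score, func_score):
    """What boost_mode=multiply does with the two partial scores."""
    return query_score * func_score

base = {"match": {"company_name": "Sample Trading"}}
print(json.dumps(build_function_score_query(base, weight=1.5), indent=2))
print(final_score(2.0, 1.5))  # 3.0
```

Because the scores are multiplied, a func_score near zero can bury an otherwise strong text match, so the script's output range deserves as much attention as its inputs.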

5. Active Interventions

Keyword Extraction: Extract key terms (e.g., company name) into a dedicated mapping and apply phrase matching to boost relevance.

Secondary Sorting: After the primary relevance sort, re‑rank a subset of results based on business metrics (e.g., health score) to push the most valuable enterprises to the top.
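Secondary sorting can be done in application code once the hits come back. The sketch below re-orders only the first few hits by a business metric and leaves the tail in relevance order; the window size, the health_score field and the sample hits are all hypothetical.

```python
def rerank_top(hits, window=3, key="health_score"):
    """Re-order the first `window` hits by a business metric, highest first.

    `hits` must already be sorted by relevance. Python's sort is stable,
    so hits with equal metric values keep their relevance order.
    """
    head = sorted(hits[:window], key=lambda h: h[key], reverse=True)
    return head + hits[window:]

# Hypothetical hits, already sorted by descending _score.
hits = [
    {"name": "A", "_score": 9.1, "health_score": 40},
    {"name": "B", "_score": 8.7, "health_score": 95},
    {"name": "C", "_score": 8.5, "health_score": 70},
    {"name": "D", "_score": 3.2, "health_score": 99},
]
reranked = rerank_top(hits, window=3)
print([h["name"] for h in reranked])  # ['B', 'C', 'A', 'D']
```

Restricting the re-rank to a window keeps weakly relevant documents such as D from jumping to the top purely on their business metric.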

Final Result

The combined approach (precise phrase matching, field boosting, filtered queries, custom script scoring and post‑query re‑ranking) produces search results that better satisfy the three user requirements outlined earlier.

Conclusion

Search relevance optimisation is an iterative process that requires continuous monitoring, user‑feedback collection, and handling of edge cases. The described methods provide a practical roadmap for tailoring Elasticsearch to enterprise‑search scenarios.
