
Elasticsearch Performance Optimization: General Recommendations, Indexing Speed, Search Speed, and Disk Usage

This guide summarizes Elasticsearch tuning strategies in four areas: general recommendations (limit result sets, avoid large documents), indexing speed (bulk requests, refresh intervals), search speed (field modeling, caching), and disk usage (compression, shard management).

System Architect Go

Feb 9, 2019

General recommendations

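Do not return large result sets. Elasticsearch is built to return the top matching documents, so use the scroll API when you need to walk through many hits, and raise index.max_result_window (default 10,000) only as a last resort.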

Avoid large documents. A document cannot exceed http.max_content_length (default 100 MB), and even below that limit large documents put more stress on network, memory, and disk. Split content into smaller units such as chapters or paragraphs.
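As a rough sketch of the splitting idea, the function below turns one large text into one small document per paragraph. The field names (book_id, paragraph_no, content) and the blank-line paragraph delimiter are assumptions made for illustration only.

```python
def split_into_paragraph_docs(book_id, text):
    """Return one small document per non-empty paragraph of `text`."""
    # Paragraphs are assumed to be separated by blank lines.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return [
        {"book_id": book_id, "paragraph_no": number, "content": paragraph}
        for number, paragraph in enumerate(paragraphs, start=1)
    ]
```

Each returned dictionary can then be indexed as its own document, with book_id tying the pieces back together.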

Recipes

Mix exact search with stemming by indexing the same content as multi‑fields, or use quote_field_suffix with query_string / simple_query_string.
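A minimal sketch of the multi-field approach, written as Python dictionaries that serialize to the JSON request bodies. The field name body and the analyzer name english_exact are hypothetical, and the mapping layout assumes a version of Elasticsearch with typeless mappings (7.x or later).

```python
import json

# Index body: "body" is stemmed by the built-in english analyzer, while the
# "body.exact" sub-field keeps unstemmed, lowercased tokens.
index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                "english_exact": {  # hypothetical analyzer name
                    "tokenizer": "standard",
                    "filter": ["lowercase"],
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "body": {
                "type": "text",
                "analyzer": "english",
                "fields": {
                    "exact": {"type": "text", "analyzer": "english_exact"}
                },
            }
        }
    },
}


def build_query(text):
    """Quoted parts of `text` are matched against body.exact, the rest against body."""
    return {
        "query": {
            "simple_query_string": {
                "fields": ["body"],
                "quote_field_suffix": ".exact",
                "query": text,
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_query('"ski" boots'), indent=2))
```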

Ensure consistent scoring by routing queries with a fixed preference (e.g., user ID or session ID) to the same shard.

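Scores can still look wrong when routing is uneven, when a query spans multiple indices, or when the data set is small, because each shard computes term statistics locally. Use a single shard (index.number_of_shards: 1) for small data sets, or dfs_query_then_fetch when appropriate.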
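The two scoring recipes above both come down to query-string parameters on the search request. The helper below is a sketch that only builds the request path; the index name and session ID in the usage line are made up, and sending the request is left to whatever HTTP client you use.

```python
from urllib.parse import urlencode


def search_path(index, session_id, use_dfs=False):
    """Build a _search path that pins a user or session to the same shard copies."""
    params = {"preference": session_id}
    if use_dfs:
        # Gather term statistics from all shards before scoring.
        params["search_type"] = "dfs_query_then_fetch"
    return f"/{index}/_search?{urlencode(params)}"
```

For example, search_path("articles", "session-42") yields a path whose preference value stays the same for every request in that session.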

Tune for indexing speed

Use bulk requests, increase concurrency with multiple workers/threads, and raise index.refresh_interval (default 1 s) during heavy loads.
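To illustrate what a bulk request looks like on the wire, here is a sketch that builds the newline-delimited JSON body for the _bulk endpoint. It assumes the typeless API (7.x or later), where the action line needs only _index, and it leaves out _id so that Elasticsearch generates IDs itself. The index name and documents are placeholders.

```python
import json


def bulk_body(index, docs):
    """Return an NDJSON payload for POST /_bulk that indexes every doc."""
    lines = []
    for doc in docs:
        # Action line first, then the document source on the next line.
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"
```

The payload is sent with the Content-Type header application/x-ndjson. The best batch size depends on the documents and the cluster, so it is worth benchmarking a few sizes rather than assuming one.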

Temporarily disable refresh and replicas for initial loads (index.refresh_interval: -1, index.number_of_replicas: 0), restore both settings once the load finishes, and disable OS swapping.
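A small sketch of the settings bodies involved, one applied before an initial load and one applied afterwards via PUT /<index>/_settings. The values restored after the load (1s refresh, one replica) are assumptions; use whatever the index normally runs with.

```python
def index_settings(bulk_loading, normal_replicas=1, normal_refresh="1s"):
    """Settings body to apply before (True) or after (False) an initial load."""
    if bulk_loading:
        # No periodic refresh and no replica copies while loading.
        return {"index": {"refresh_interval": "-1", "number_of_replicas": 0}}
    return {
        "index": {
            "refresh_interval": normal_refresh,
            "number_of_replicas": normal_replicas,
        }
    }
```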

Leave at least half of the host memory to the filesystem cache.

Prefer auto‑generated document IDs to avoid conflict checks.

Use faster hardware when possible.

Tune for search speed

Give more memory to the filesystem cache and use better hardware.

Model documents to avoid joins; nested and parent‑child queries are slower.

Search as few fields as possible; combine multiple fields with copy_to to a single searchable field.

Pre‑index data for range aggregations by adding a dedicated field (e.g., price_range).
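A sketch of the pre-indexing idea: compute the bucket at index time and store it in a keyword field, so that a terms aggregation on price_range can replace a range aggregation on price. The bucket boundaries and the price field name are illustrative assumptions.

```python
def price_range(price):
    """Map a price to the label of the bucket it falls into."""
    if price < 10:
        return "0-10"
    if price < 100:
        return "10-100"
    return "100+"


def enrich(doc):
    """Return a copy of `doc` with the precomputed price_range field added."""
    return {**doc, "price_range": price_range(doc["price"])}
```

The trade-off is that the buckets are fixed at index time, so changing them means reindexing.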

Map identifiers as keyword when term queries are more common than range queries.

Avoid scripts; if needed, prefer the painless or expressions engines.

Search rounded dates (e.g., now-1h/m rather than now-1h) so that repeated queries produce identical ranges and can be cached.
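As an illustration, the helper below builds a range filter whose bounds are rounded to the minute with date math (now-1h/m and now/m). Every request issued within the same minute then carries the same range and can reuse a cached result. The field name passed in is whatever date field your mapping defines; my_date in the usage line is hypothetical.

```python
def last_hour_filter(field):
    """Range filter over the last hour, with both bounds rounded to the minute."""
    return {
        "query": {
            "constant_score": {
                "filter": {
                    "range": {field: {"gte": "now-1h/m", "lte": "now/m"}}
                }
            }
        }
    }
```

Calling last_hour_filter("my_date") returns the search body. Coarser rounding improves cache reuse but makes the time window less precise.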

Force‑merge read‑only indices

Merge read‑only time‑based indices into a single segment for faster searches.

Warm up global ordinals with eager_global_ordinals: true and preload files using index.store.preload.

Use index sorting to speed up conjunctions, at the cost of slower indexing.

Apply preference to route repeated queries to the same shard, improving cache utilization.

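Replicas can increase throughput but are not always beneficial. A reasonable replica count is max(max_failures, ceil(num_nodes / num_primaries) - 1), where max_failures is the number of node failures to tolerate, num_nodes is the number of nodes, and num_primaries is the total number of primary shards.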
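The replica formula is easy to check with a few numbers. The function below is a direct transcription of it; the parameter names follow the formula, and the sample values in the usage note are invented.

```python
import math


def recommended_replicas(max_failures, num_nodes, num_primaries):
    """max(max_failures, ceil(num_nodes / num_primaries) - 1)"""
    return max(max_failures, math.ceil(num_nodes / num_primaries) - 1)
```

With 5 nodes, 2 primary shards in total, and tolerance for 1 node failure, ceil(5 / 2) - 1 = 2, so the result is 2 replicas. With 3 nodes, 3 primaries, and tolerance for 2 failures, the failure term dominates and the result is again 2.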

Enable adaptive replica selection for dynamic replica choice.

Tune for disk usage

Disable unnecessary features: set index: false for fields that do not need an inverted index, disable norms and limit index_options when scoring is not required.
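A sketch of a mapping that applies these switches, written as a Python dictionary that serializes to the JSON mapping body. The field names response_bytes and message are hypothetical, and the layout assumes typeless mappings (7.x or later).

```python
import json

lean_mapping = {
    "properties": {
        # Never filtered on, so it is not indexed for search.
        "response_bytes": {"type": "integer", "index": False},
        # Matched but never scored: no norms, and only document IDs and term
        # frequencies are indexed. Positions are dropped, so phrase queries
        # will not work on this field.
        "message": {"type": "text", "norms": False, "index_options": "freqs"},
    }
}

if __name__ == "__main__":
    print(json.dumps(lean_mapping, indent=2))
```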

Avoid the default dynamic string mapping that creates both text and keyword fields.
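One way to override the default is a dynamic template that maps every newly seen string field as keyword only. The sketch below shows the index body as a Python dictionary; the template name strings is arbitrary, and the layout assumes typeless mappings (7.x or later). Use text instead of keyword if the fields need full-text search.

```python
import json

index_body = {
    "mappings": {
        "dynamic_templates": [
            {
                "strings": {  # arbitrary template name
                    "match_mapping_type": "string",
                    "mapping": {"type": "keyword"},
                }
            }
        ]
    }
}

if __name__ == "__main__":
    print(json.dumps(index_body, indent=2))
```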

Keep shard sizes reasonable; larger shards store data more efficiently but take longer to recover.

Disable the deprecated _all field and consider disabling _source only with caution.

Use best_compression via index.codec for a smaller disk footprint, at the cost of slower stored-field access.

Force‑merge segments with _forcemerge (e.g., max_num_segments=1) and optionally shrink the index.

Select the smallest sufficient numeric type to reduce storage.
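For integer fields, the choice can be made mechanically from the expected value range. The helper below is an illustrative sketch that picks among the byte, short, integer, and long field types by their signed bit widths; the function name is made up for this example.

```python
# Elasticsearch integer field types and their signed bit widths.
INTEGER_TYPES = [("byte", 8), ("short", 16), ("integer", 32), ("long", 64)]


def smallest_integer_type(min_value, max_value):
    """Return the narrowest integer field type that can hold the whole range."""
    for name, bits in INTEGER_TYPES:
        low = -(2 ** (bits - 1))
        high = 2 ** (bits - 1) - 1
        if low <= min_value and max_value <= high:
            return name
    raise ValueError("range does not fit in a 64-bit signed integer")
```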

Enable index sorting to colocate similar documents, improving compression.

Maintain a consistent field order across documents to boost compression efficiency.
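If documents are assembled in application code, one simple way to get a stable field order is to serialize with sorted keys before sending. This is only a sketch of that idea; any fixed order works equally well.

```python
import json


def canonical_json(doc):
    """Serialize `doc` with keys sorted at every nesting level."""
    return json.dumps(doc, sort_keys=True)
```

For instance, {"b": 1, "a": 2} and {"a": 2, "b": 1} both serialize to the same string, so documents with the same fields always present them in the same order.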

Overall, understanding and applying these optimizations helps achieve better Elasticsearch indexing and query performance while managing disk usage effectively.
