
Master Elasticsearch: From Basics to Advanced Performance Tuning

This article walks through Elasticsearch’s licensing history, version selection, installation, cluster health monitoring, shard routing, storage mechanisms, refresh and translog processes, segment merging, and practical performance optimizations such as disk choices, index settings, and JVM tuning.

Efficient Ops

Elasticsearch Overview

On January 15, 2021, Elastic CEO Shay Banon announced that the license for Elasticsearch and Kibana would change from Apache 2.0 to a dual license under SSPL and the Elastic License. About three years later, in August 2024, Elastic announced that Elasticsearch and Kibana would become open source again by adding AGPL as an option alongside ELv2 and SSPL.

1. Basic Usage

When choosing a version, the commonly used stable major releases are 2.x, 5.x, 6.x, and 7.x (current). Versions 3.x and 4.x were skipped to keep the ELK stack (Elasticsearch, Logstash, Kibana) versioned consistently.

Elasticsearch is built with Java, so the JDK version must also match the Elasticsearch version; for example, 7.2 supports JDK 11.

Installation

Download and unzip Elasticsearch, then start it with bin/elasticsearch. It listens on port 9200 by default; accessing http://localhost:9200 returns a JSON object with node, cluster, and version information.

<code>{
  "name" : "U7fp3O9",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "-Rj8jGQvRIelGd9ckicUOA",
  "version" : {
    "number" : "6.8.1",
    "build_flavor" : "default",
    "build_type" : "zip",
    "build_hash" : "1fad4e1",
    "build_date" : "2019-06-18T13:16:52.517138Z",
    "build_snapshot" : false,
    "lucene_version" : "7.7.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}</code>
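As a small illustration of reading that response in a script, the sketch below parses an abbreviated copy of it and extracts the version as a comparable tuple. The helper names parse_version and describe_node are hypothetical, and the sketch works on a pasted string rather than calling a live node.

```python
import json

# Abbreviated copy of the response shown above (not fetched from a live node).
SAMPLE_RESPONSE = """
{
  "name": "U7fp3O9",
  "cluster_name": "elasticsearch",
  "version": {
    "number": "6.8.1",
    "lucene_version": "7.7.0"
  },
  "tagline": "You Know, for Search"
}
"""


def parse_version(number: str) -> tuple:
    """Turn a dotted version such as '6.8.1' into (6, 8, 1)."""
    return tuple(int(part) for part in number.split("."))


def describe_node(info: dict) -> str:
    """Build a one-line summary from the root endpoint's JSON."""
    version = info["version"]
    return (
        f"node {info['name']} in cluster {info['cluster_name']} "
        f"runs Elasticsearch {version['number']} on Lucene {version['lucene_version']}"
    )


info = json.loads(SAMPLE_RESPONSE)
major_version = parse_version(info["version"]["number"])[0]
summary = describe_node(info)
```

Comparing tuples rather than strings avoids ordering mistakes such as "6.10.0" sorting before "6.8.1".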

Cluster Health

Cluster health can be checked via Kibana or through APIs such as GET /_cluster/health, which returns a JSON document with the cluster status (green, yellow, or red) and shard and node statistics.

<code>{
  "cluster_name" : "lzj",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 9,
  "active_shards" : 9,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 5,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 64.28571428571429
}</code>

Health colors:

Green: all primary and replica shards are active.

Yellow: all primary shards are active but at least one replica is not.

Red: at least one primary shard is not active, so some data is unavailable and may be lost.
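To connect the sample response with these colors, here is a minimal sketch that summarizes a health document. It assumes the active percentage is active shards divided by active plus initializing plus unassigned shards, which reproduces the value in the sample; the function name shard_summary is hypothetical.

```python
# Fields copied from the sample /_cluster/health response above.
SAMPLE_HEALTH = {
    "status": "yellow",
    "active_primary_shards": 9,
    "active_shards": 9,
    "initializing_shards": 0,
    "unassigned_shards": 5,
}


def shard_summary(health: dict) -> dict:
    """Summarize shard counts from a cluster health document."""
    active = health["active_shards"]
    inactive = health["initializing_shards"] + health["unassigned_shards"]
    total = active + inactive
    # Assumption: percentage = active / (active + initializing + unassigned).
    percent = 100.0 * active / total if total else 100.0
    return {
        "status": health["status"],
        "total_shards": total,
        "inactive_shards": inactive,
        "active_percent": percent,
        "needs_attention": health["status"] != "green",
    }


summary = shard_summary(SAMPLE_HEALTH)
```

In the sample, a single-node cluster cannot place replicas on a second node, so the 5 replica shards stay unassigned and the status is yellow.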

2. Elasticsearch Mechanisms

When creating an index you must decide the number of primary shards. This number cannot be changed afterwards without reindexing, because shard routing depends on it.

Shard Routing

Routing determines the target primary shard using the formula:

<code>shard = hash(routing) % number_of_primary_shards</code>

The default routing value is the document _id, but it can be customized. The coordinating node calculates the target shard and forwards the request to the node holding that primary shard, which then replicates the operation to its replicas.
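The formula can be sketched in a few lines. Note the assumption: zlib.crc32 is only a stand-in hash here, since Elasticsearch uses its own Murmur3-based function, so the shard numbers below will not match a real cluster. The sketch does show why the primary shard count is fixed: changing it sends many existing documents to a different shard.

```python
import zlib


def route(routing: str, number_of_primary_shards: int) -> int:
    """Return the primary shard for a routing value (default: the document _id)."""
    # Stand-in hash (assumption): Elasticsearch itself uses a Murmur3-based hash.
    hashed = zlib.crc32(routing.encode("utf-8"))
    return hashed % number_of_primary_shards


doc_ids = [f"doc-{i}" for i in range(1000)]

# Shard assignment with 5 primaries, then with 6.
with_five = {doc_id: route(doc_id, 5) for doc_id in doc_ids}
with_six = {doc_id: route(doc_id, 6) for doc_id in doc_ids}

# Documents that would be looked up on the wrong shard if the count changed.
moved = [doc_id for doc_id in doc_ids if with_five[doc_id] != with_six[doc_id]]
```

Passing the same custom routing value (for example a user ID) for related documents places them on the same shard, because the hash input is identical.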

Storage Model

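Index data is stored in immutable segments on disk. Each segment is a self-contained inverted index. New operations are first recorded in the translog and an in-memory buffer, then written out as new segments. The on-disk locations for index data and logs are configured in elasticsearch.yml: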

<code>path.data: /path/to/data  # index data
path.logs: /path/to/logs  # log files</code>

Segments are never modified in place; deletions are recorded in a .del file, and updates are performed as delete-plus-add.

Refresh and Translog

Elasticsearch refreshes each shard roughly every second, making newly indexed documents searchable within a second. The refresh creates a new segment in the file‑system cache without writing to disk immediately.

Manual refresh can be triggered:

<code>POST /_refresh       # refresh all indices
POST /nba/_refresh   # refresh a specific index</code>
Tip: Frequent manual refreshes can impact performance; use them sparingly in production.

To avoid data loss, Elasticsearch records every operation in a transaction log (translog) before it is persisted to a segment on disk. When the translog reaches 512 MB or 30 minutes have passed, a flush occurs: the in-memory data is written to a new segment, the segments are synced to disk, and the translog is cleared.
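The interplay of buffer, refresh, translog, and flush can be illustrated with a toy model. This is a deliberately simplified sketch, not how Elasticsearch is implemented: the class ToyShard and all of its methods are hypothetical, and documents are plain strings.

```python
class ToyShard:
    """Toy model of the write path: buffer -> refresh -> flush."""

    def __init__(self):
        self.buffer = []              # indexed, not yet searchable
        self.translog = []            # operations since the last flush
        self.cached_segments = []     # searchable, file-system cache only
        self.committed_segments = []  # searchable and synced to disk

    def index(self, doc):
        # Every operation is recorded in the translog and the buffer.
        self.translog.append(doc)
        self.buffer.append(doc)

    def refresh(self):
        # Turn the buffer into a new searchable segment (no disk sync).
        if self.buffer:
            self.cached_segments.append(tuple(self.buffer))
            self.buffer.clear()

    def flush(self):
        # Commit all segments to disk and clear the translog.
        self.refresh()
        self.committed_segments.extend(self.cached_segments)
        self.cached_segments.clear()
        self.translog.clear()

    def searchable(self):
        segments = self.committed_segments + self.cached_segments
        return [doc for segment in segments for doc in segment]

    def crash_and_recover(self):
        # Unsynced state is lost; the translog is replayed to rebuild it.
        replay = list(self.translog)
        self.buffer.clear()
        self.cached_segments.clear()
        self.translog.clear()
        for doc in replay:
            self.index(doc)


shard = ToyShard()
shard.index("a")
visible_before_refresh = shard.searchable()   # nothing searchable yet
shard.refresh()
visible_after_refresh = shard.searchable()    # "a" is now searchable
shard.index("b")
shard.crash_and_recover()                     # "a" and "b" come back from the translog
shard.refresh()
visible_after_recovery = shard.searchable()
shard.flush()
translog_after_flush = list(shard.translog)   # empty once segments are committed
```

The model shows the two separate guarantees: refresh controls when documents become searchable, while the translog and flush control whether they survive a crash.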

Segment Merging

Because each refresh creates a new segment, the number of segments can grow quickly. Background merge processes combine small segments into larger ones, discarding deleted documents and reducing file‑handle and CPU overhead.
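A toy model can make the effect of merging concrete. The sketch below is an illustration under simplifying assumptions, not Lucene's merge policy: Segment and merge are hypothetical names, and the deleted set plays the role of the .del file.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    docs: dict                                 # doc_id -> source, never modified
    deleted: set = field(default_factory=set)  # stands in for the .del file

    def live_docs(self) -> dict:
        """Documents that have not been marked as deleted."""
        return {k: v for k, v in self.docs.items() if k not in self.deleted}


def merge(segments: list) -> Segment:
    """Combine segments into one, dropping documents marked as deleted."""
    merged = {}
    for segment in segments:
        merged.update(segment.live_docs())
    return Segment(docs=merged)


# Segment 1 holds the original documents.
seg1 = Segment(docs={"1": "v1", "2": "v1", "3": "v1"})

# Updating doc "1" is delete-plus-add: mark the old copy, write a new segment.
seg1.deleted.add("1")
seg2 = Segment(docs={"1": "v2"})

# Deleting doc "3" only marks it; the segment data itself is untouched.
seg1.deleted.add("3")

merged = merge([seg1, seg2])
docs_before = len(seg1.docs) + len(seg2.docs)  # stored copies before the merge
docs_after = len(merged.docs)                  # stored copies after the merge
```

Before the merge, four document copies are stored across two segments; afterwards one segment holds only the two live documents.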

3. Performance Optimization

Storage Devices

Use SSDs for lower latency.

Prefer RAID10/RAID5 for better I/O and reliability.

Avoid remote mounts like NFS or SMB.

On cloud instances, be cautious with network-attached volumes such as EBS, whose performance can fall short of local disks.

Index Settings

Use sequential, compressible IDs instead of random UUIDs.

Disable doc values on fields that don’t need sorting or aggregations.

Prefer keyword over text for exact-match fields.

Increase index.refresh_interval (e.g., to 30s) if near-real-time visibility isn’t required.

During bulk loads, set index.refresh_interval=-1 and index.number_of_replicas=0, then restore them after the load.

Use scroll for deep pagination instead of large from+size queries.

Specify routing values to target specific shards when possible.
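The bulk-load advice above can be sketched as a pair of settings updates. The sketch only builds the request text for the index settings API and sends nothing. The index name nba is reused from the refresh example, and the restore values of 1s and 1 replica are assumptions; use whatever the index had before the load.

```python
import json

# Settings to apply before a large bulk load.
BULK_LOAD_SETTINGS = {
    "index": {"refresh_interval": "-1", "number_of_replicas": 0}
}


def restore_settings(refresh_interval: str = "1s", number_of_replicas: int = 1) -> dict:
    """Settings to put back after the load (defaults here are assumptions)."""
    return {
        "index": {
            "refresh_interval": refresh_interval,
            "number_of_replicas": number_of_replicas,
        }
    }


def settings_request(index: str, settings: dict) -> str:
    """Render a settings update in the console style used in this article."""
    return f"PUT /{index}/_settings\n{json.dumps(settings, indent=2)}"


before_load = settings_request("nba", BULK_LOAD_SETTINGS)
after_load = settings_request("nba", restore_settings(refresh_interval="30s"))
```

Keeping the restore step next to the disable step makes it harder to forget; an index left with refresh disabled shows no new documents in searches until a refresh is triggered.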

JVM Tuning

Set the minimum (-Xms) and maximum (-Xmx) heap size to the same value, no more than 50% of physical RAM and no more than 32 GB.

Consider using the G1 garbage collector instead of CMS.

Allocate sufficient memory for the filesystem cache to speed up searches.
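The heap rule can be expressed as a small calculation. This is a rough sketch: the function names are hypothetical, and the default cap of 31 GB is an assumption chosen to stay just under the 32 GB limit mentioned above.

```python
def heap_size_gb(physical_ram_gb: float, cap_gb: int = 31) -> int:
    """Half of physical RAM, capped (cap_gb=31 is an assumption), at least 1 GB."""
    half = int(physical_ram_gb // 2)
    return max(1, min(half, cap_gb))


def jvm_options(physical_ram_gb: float) -> list:
    """Return matching -Xms/-Xmx lines so the heap never resizes at runtime."""
    heap = heap_size_gb(physical_ram_gb)
    return [f"-Xms{heap}g", f"-Xmx{heap}g"]


small_node = jvm_options(16)    # half of 16 GB
large_node = jvm_options(128)   # capped well below half of 128 GB
```

Whatever is not given to the heap remains available to the operating system's file-system cache, which is what the last recommendation relies on.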
