
ESrally Guide: Install, Configure, and Benchmark Elasticsearch Performance

ESrally is the official Elasticsearch benchmarking tool; this guide walks through its installation prerequisites, step‑by‑step setup of Python, JDK, and Git, configuration of tracks, cars, pipelines, and challenges, and demonstrates real‑world performance comparisons across Elasticsearch versions and hardware platforms.

Ops Development Stories

ESrally Introduction

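ESrally (the esrally command, officially named Rally) is Elastic's official tool for benchmarking and stress‑testing Elasticsearch clusters. It can provision clusters of different versions, apply configurable settings and data sets, run benchmarks, and compare the results. The name alludes to rally racing, which is also where terms such as track, car, and race come from.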

ESrally Installation

Environment Requirements

Python 3.8 with pip3
JDK 8
Git 1.9+
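
A quick way to check a host against these requirements is sketched below. The helper names version_ge and check_tool are made up for this example, and the comparison assumes GNU sort with the -V (version sort) option, which is available on CentOS/RHEL.

```shell
# version_ge A B: succeeds when version A >= version B (hypothetical helper).
# Relies on GNU sort's -V (version sort) option.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_tool NAME FOUND_VERSION MIN_VERSION: report whether a tool is
# present and new enough (hypothetical helper).
check_tool() {
  if [ -z "$2" ]; then
    echo "$1: not found (need >= $3)"
  elif version_ge "$2" "$3"; then
    echo "$1: $2 OK"
  else
    echo "$1: $2 is too old (need >= $3)"
  fi
}

# Extract the version numbers; the variables stay empty when a tool is missing.
py_ver=$(python3 --version 2>/dev/null | awk '{print $2}')
git_ver=$(git --version 2>/dev/null | awk '{print $3}')

check_tool python3 "$py_ver" 3.8
check_tool git "$git_ver" 1.9
# JDK: only check that a java binary is reachable on PATH.
command -v java >/dev/null 2>&1 && echo "java: found" || echo "java: not found"
```

On a stock CentOS 7 host this would be expected to flag git as too old, which is why the sections below build newer versions from source.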

Python 3.8 Installation

yum install zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel readline-devel tk-devel gcc make libffi-devel
wget https://www.python.org/ftp/python/3.8.2/Python-3.8.2.tar.xz
tar -xvJf Python-3.8.2.tar.xz
mkdir /usr/local/python3
cd Python-3.8.2/
./configure --prefix=/usr/local/python3
make && make install
ln -s /usr/local/python3/bin/python3 /usr/local/bin/python3
ln -s /usr/local/python3/bin/pip3 /usr/local/bin/pip3

Git 2.22 Installation

The git package in the default yum repositories is version 1.8, which is older than the required 1.9, so compile a newer version from source:

yum install curl-devel expat-devel gettext-devel openssl-devel zlib-devel gcc perl-ExtUtils-MakeMaker
cd /tmp
wget https://mirrors.edge.kernel.org/pub/software/scm/git/git-2.22.0.tar.gz
tar xzf git-2.22.0.tar.gz
cd git-2.22.0
make prefix=/usr/local/git all
make prefix=/usr/local/git install
# Single quotes keep $PATH unexpanded, so it is evaluated at login rather than frozen now
echo 'export PATH=$PATH:/usr/local/git/bin' >> /etc/bashrc
source /etc/bashrc

JDK Installation

rpm -ivh jdk-8u221-linux-x64.rpm

ESrally Installation

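python3 -m pip install esrally
# Append the next two lines to /etc/profile (for example with: vim /etc/profile)
JAVA_HOME=/usr/java/jdk1.8.0_221-amd64/jre
export PATH=$PATH:/usr/local/python3/bin/:/usr/local/git/bin JAVA_HOME
# Then reload the profile in the current shell
source /etc/profile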

Configure ESrally

Run esrally configure to define the data locations. The configuration is written to ~/.rally/rally.ini, that is, /root/.rally/rally.ini when run as root.

ESrally Terminology

Tracks

A track is a data set plus a workload definition, for example geonames/track.json:

{% import "rally.helpers" as rally with context %}
{
  "version": 2,
  "description": "POIs from Geonames",
  "data-url": "http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/geonames",
  "indices": [{"name": "geonames", "body": "index.json"}],
  "corpora": [{"name": "geonames", "base-url": "http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/geonames", "documents": [{"source-file": "documents-2.json.bz2", "document-count": 11396503, "compressed-bytes": 265208777, "uncompressed-bytes": 3547613828}]}],
  "operations": [{{ rally.collect(parts="operations/*.json") }}],
  "challenges": [{{ rally.collect(parts="challenges/*.json") }}]
}

The track defines index settings, mappings, and data sources. When a race starts, ESrally downloads data from the specified URLs (or uses pre‑downloaded offline data).
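
Because a race downloads and unpacks the corpus, it is worth checking disk space first. The sketch below converts the compressed-bytes and uncompressed-bytes values from the geonames track above into MiB and GiB. The helper name bytes_to is made up for this example.

```shell
# bytes_to BYTES DIVISOR: print BYTES / DIVISOR with one decimal place
# (hypothetical helper).
bytes_to() {
  awk -v b="$1" -v d="$2" 'BEGIN { printf "%.1f\n", b / d }'
}

compressed=265208777      # "compressed-bytes" from the geonames track
uncompressed=3547613828   # "uncompressed-bytes" from the geonames track

echo "download: $(bytes_to "$compressed" 1048576) MiB"      # download: 252.9 MiB
echo "unpacked: $(bytes_to "$uncompressed" 1073741824) GiB"  # unpacked: 3.3 GiB
```

Compare these figures with the free space that df -h reports for the filesystem holding the Rally data directory; the index built from the corpus needs additional space on top.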

Operations

Operations describe the actions performed during a benchmark, such as bulk indexing, updates, searches, aggregations, and custom scripts.

{
  "name": "index-append",
  "operation-type": "bulk",
  "bulk-size": {{bulk_size | default(5000)}},
  "ingest-percentage": {{ingest_percentage | default(100)}}
},
{
  "name": "term",
  "operation-type": "search",
  "body": {"query": {"term": {"country_code.raw": "AT"}}}
}
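
The bulk_size and ingest_percentage placeholders above are template variables with defaults, and Rally's --track-params option can override them as comma-separated key:value pairs. A minimal sketch follows, assuming the geonames track and version 7.8.1 used elsewhere in this guide. The helper track_params is hypothetical and not part of esrally, and the command is only printed, not run.

```shell
# track_params KEY=VALUE...: join the arguments into the "key:value,key:value"
# form that --track-params accepts (hypothetical helper, not part of esrally).
track_params() {
  out=""
  for kv in "$@"; do
    # ${kv%%=*} is the key, ${kv#*=} the value; a comma goes between pairs.
    out="${out:+$out,}${kv%%=*}:${kv#*=}"
  done
  printf '%s\n' "$out"
}

params=$(track_params bulk_size=2000 ingest_percentage=50)

# Dry run: print the command. Drop "echo" to actually start the race.
echo esrally race --distribution-version=7.8.1 --track=geonames \
  --track-params="$params"
```

Smaller bulk sizes or a lower ingest percentage shorten a trial run, which helps when validating a setup before a full benchmark.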

Challenges

A challenge is a sequence of tasks executed in a race. Example:

{
  "name": "append-no-conflicts",
  "description": "Indexes the whole document corpus using default settings...",
  "default": true,
  "schedule": [
    {"operation": "delete-index"},
    {"operation": {"operation-type": "create-index", "settings": {{index_settings | default({}) | tojson}}}},
    {"name": "check-cluster-health", "operation": {"operation-type": "cluster-health", "index": "geonames", "request-params": {"wait_for_status": "green", "wait_for_no_relocating_shards": "true"}}},
    {"operation": "index-append", "warmup-time-period": 120, "clients": {{bulk_indexing_clients | default(8)}}}
  ]
}

Cars

Cars define the configuration of the Elasticsearch instances under test (heap size, GC settings, and so on). List them with esrally list cars. Example configuration files reside in /home/elk/.rally/benchmarks/teams/default/cars/v1.

Races

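A race runs a specific track with a chosen car and pipeline. Results are stored in /home/elk/.rally/benchmarks/races and can be compared using esrally compare. The following races run the geonames track against three Elasticsearch versions, tag each race with its version, and execute only the bulk‑indexing tasks: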

esrally race --distribution-version=5.4.3 --track=geonames --user-tag "version:5.4.3" --include-tasks "type:bulk"
esrally race --distribution-version=6.4.3 --track=geonames --user-tag "version:6.4.3" --include-tasks "type:bulk"
esrally race --distribution-version=7.8.1 --track=geonames --user-tag "version:7.8.1" --include-tasks "type:bulk"
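
The three commands differ only in the version number, so they can be generated in a loop. This is a sketch that prints the commands rather than running them. The helper race_cmd is made up for this example.

```shell
# race_cmd VERSION: print the race command for one Elasticsearch version
# (hypothetical helper; mirrors the three commands above).
race_cmd() {
  printf 'esrally race --distribution-version=%s --track=geonames --user-tag "version:%s" --include-tasks "type:bulk"\n' "$1" "$1"
}

for v in 5.4.3 6.4.3 7.8.1; do
  # Dry run: print the command. To execute it, use: eval "$(race_cmd "$v")"
  race_cmd "$v"
done
```

Running the races one after another, rather than in parallel, keeps them from competing for the same CPU, memory, and disk.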

Comparing Results

esrally compare --baseline=27265e6e-566a-4a47-a0d9-1fd2f8830041 --contender=66086ef0-5834-4743-a870-fd9c0bb41688

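The baseline and contender values are race IDs, which esrally list races prints. In the example run (output not reproduced here), the comparison showed significant write‑performance differences between Elasticsearch 5.4 and 7.8.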

Pipelines

Pipelines define how a cluster is provisioned for a race (e.g., from source, from distribution, benchmark‑only). List them with esrally list pipelines.

from-sources-complete: build ES from source.

from-sources-skip-build: reuse previously built binaries.

from-distribution: download official ES distribution.

benchmark-only: test an existing ES cluster.

Cross‑Platform Benchmark

Example comparing x86_64 and ARM platforms using the benchmark-only pipeline:

esrally race --pipeline=benchmark-only --target-hosts=172.16.0.95:9200 --track=http_logs --offline
esrally race --pipeline=benchmark-only --target-hosts=172.26.214.32:9200 --track=http_logs --offline

A common problem during testing is a race aborting because of insufficient memory, typically while an Elasticsearch instance started earlier is still running; stopping that instance resolves it.
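
A pre‑flight check along these lines can catch the problem before a race starts. It is a Linux‑only sketch that reads MemAvailable from /proc/meminfo. The helper mem_available_mb is made up for this example.

```shell
# mem_available_mb [FILE]: print MemAvailable in MiB, read from FILE
# (defaults to /proc/meminfo, so this is Linux-only; hypothetical helper).
mem_available_mb() {
  awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}

# List processes whose command line mentions elasticsearch, for example a
# JVM left over from an earlier race.
pgrep -fl elasticsearch 2>/dev/null || echo "no running Elasticsearch process found"

if [ -r /proc/meminfo ]; then
  echo "available memory: $(mem_available_mb) MiB"
fi
```

If a leftover process shows up and the available memory is below the heap size configured for the car, stop that process before starting the next race.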

Written by

Ops Development Stories

Maintained by a like‑minded team, covering both operations and development. Topics span Linux ops, DevOps toolchain, Kubernetes containerization, monitoring, log collection, network security, and Python or Go development. Team members: Qiao Ke, wanger, Dong Ge, Su Xin, Hua Zai, Zheng Ge, Teacher Xia.
