How to Benchmark Elasticsearch Clusters with Rally: A Step‑by‑Step Guide
This article explains why large‑scale Elasticsearch deployments need rigorous performance testing, compares available testing tools, walks through installing and configuring the official Rally benchmark suite, details hardware recommendations, shows how to run tests against multiple cloud providers, and teaches you how to interpret the resulting metrics to make informed cluster‑selection decisions.
Why benchmark Elasticsearch at scale?
When an Elasticsearch deployment processes hundreds of gigabytes of data daily (e.g., website logs, user‑behavior records, e‑commerce search), a high‑performance cluster is essential. Benchmarking helps select appropriate hardware and cloud providers.
Tools for Elasticsearch performance testing
Rally – the official Elasticsearch benchmarking tool.
ESPerf – a Golang‑based tester.
Elasticsearch Stress Test – provided by Logz.io.
Rally is preferred because it supplies ready‑made data sets (tracks) and allows custom tracks.
Installing Rally
After installing a JDK and Python, install Rally with:

pip3 install esrally

Rally can also be run in a Docker container; see the official installation guide for details.
Key Rally concepts
race: a single benchmark execution.
car: a specific configuration of the Elasticsearch cluster under test (e.g., heap size); different configurations are represented by different cars.
track: the data set, and the operations run against it, used for a race.
challenge: a specific test scenario within a track that defines the operations executed against Elasticsearch.
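These concepts map directly onto command-line flags. A hypothetical invocation, shown only to illustrate the mapping (the track, challenge, and car names here are illustrative defaults from the public rally-tracks repository, not values from this experiment):

```shell
# Illustrative only: --track picks the data set, --challenge the scenario,
# --car the cluster configuration. Omit --challenge/--car to use the
# track's defaults.
esrally --track=geonames \
        --challenge=append-no-conflicts \
        --car=defaults \
        --target-hosts=127.0.0.1:9200
```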
Hardware recommendations for the test
64 GB RAM is ideal; 32 GB or 16 GB is common, but less than 8 GB degrades performance.
More CPU cores are preferable to higher clock speed.
SSD storage dramatically improves indexing and query latency.
Mid‑range to high‑end machines are recommended; low‑end machines struggle with large clusters.
The experiment used four 8‑core, 64 GB RAM, 100 GB SSD instances from three cloud providers (Alibaba Cloud, Tencent Cloud, UCloud).
Running the benchmark
1. Build the cluster
Deploy three nodes, configure them as a healthy Elasticsearch cluster, and verify connectivity (see the official “Add and remove nodes” guide).
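Before starting a race, it is worth confirming that the cluster actually reports healthy. A minimal check against one of the nodes (the IP matches the test command in the next step; the expected values assume a three-node cluster):

```shell
# Ask any node for cluster health; a correctly formed three-node cluster
# should report "status" : "green" and "number_of_nodes" : 3.
curl -s 'http://10.5.5.10:9200/_cluster/health?pretty'
```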
2. Execute the test command
esrally --track=pmc --target-hosts=10.5.5.10:9200,10.5.5.11:9200,10.5.5.12:9200 --pipeline=benchmark-only

To export results as CSV for easier analysis, add the reporting options:
esrally --track=pmc --target-hosts=10.5.5.10:9200,10.5.5.11:9200,10.5.5.12:9200 --pipeline=benchmark-only --report-format=csv --report-file=~/benchmarks/result.csv

3. Analyze the results
Rally outputs four columns: Metric, Task, Unit, and Result. Grouping by Task (e.g., index-append, node-stats, country_agg_uncached) allows comparison of specific workloads.
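With the CSV export, grouping by Task is a one-liner. A minimal sketch, assuming the file has the four columns described above (the sample rows below are made up, not results from this experiment):

```shell
# Build a tiny sample in the same Metric,Task,Unit,Result shape Rally emits.
cat > /tmp/sample-result.csv <<'EOF'
Metric,Task,Unit,Result
Min Throughput,index-append,docs/s,1000
Max Throughput,index-append,docs/s,1200
50th percentile latency,node-stats,ms,2.1
EOF
# Keep the header plus only the rows for one task.
awk -F',' 'NR == 1 || $2 == "index-append"' /tmp/sample-result.csv
```

The same filter with a different task name pulls out node-stats or aggregation rows for side-by-side comparison across providers.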
Interpreting key metrics
Indexing throughput and latency
Metrics such as Min/Median/Max throughput (ops/s) and latency percentiles (50th, 90th, 99th) indicate overall indexing speed. Higher throughput and lower latency are better. In the experiment UCloud achieved the shortest cumulative indexing time, roughly 24 % faster than Tencent Cloud and 16 % faster than Alibaba Cloud.
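For reference, "X % faster" here is computed from the cumulative indexing times. A sketch with made-up numbers (the real times come from Rally's report; these are not the experiment's values):

```shell
# Hypothetical cumulative indexing times in minutes.
fast=38   # e.g., the quickest provider
slow=50   # e.g., the slowest provider
# Relative speedup: (slow - fast) / slow.
awk -v a="$fast" -v b="$slow" 'BEGIN { printf "%.0f%% faster\n", (b - a) / b * 100 }'
```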
Merge time
Merge‑time metrics (min, median, max) should be as low as possible, while a higher merge count is desirable. UCloud again showed the lowest merge time and the highest merge count, indicating efficient background processing.
Node‑stats throughput & latency
Throughput is measured in ops/s (higher is better). Latency percentiles are measured in milliseconds (lower is better). All three providers had comparable throughput, but UCloud consistently delivered lower latency.
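If it is unclear how a latency percentile is read: sort all the samples and take the value at the given rank (the nearest-rank method, a simplification of what Rally reports). A toy example with made-up latencies:

```shell
# Ten hypothetical latency samples in milliseconds; 90% of samples fall
# at or below the printed value.
printf '%s\n' 12 3 7 45 9 5 21 8 14 6 |
  sort -n |
  awk '{ v[NR] = $1 } END { print "90th percentile:", v[int(NR * 0.9)] }'
```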
Aggregations (country_agg_uncached)
Throughput differences were minimal; latency varied, with UCloud showing the best numbers, followed by Alibaba Cloud, then Tencent Cloud.
Tips for smooth Rally execution
Pre‑download track data
Rally downloads track data from AWS S3, which can be slow or fail in some regions. Pre‑download the data to avoid network issues:
curl -O https://raw.githubusercontent.com/elastic/rally-tracks/master/download.sh
chmod u+x download.sh
./download.sh geonames
cd ~
tar -xf rally-track-data-geonames.tar

After extraction, run Rally with the local track files.
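Once the data is in place locally, Rally can be told not to reach out to the network at all. A sketch, assuming a Rally version that supports the --offline flag and the default data directory ~/.rally/benchmarks/data (verify both against your installation):

```shell
# Run against the pre-downloaded geonames data without any network access.
esrally --track=geonames --target-hosts=10.5.5.10:9200 --pipeline=benchmark-only --offline
```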
Conclusion
Before deploying Elasticsearch in production, use Rally to benchmark candidate clusters. In this experiment UCloud delivered the best overall performance‑to‑cost ratio, though real‑world results will depend on specific workloads. The methodology and scripts can be reused for further evaluations.
Appendix
Test data referenced in the article can be downloaded from: http://suo.im/5wq8CW