How to Benchmark Elasticsearch Clusters with Rally: A Step‑by‑Step Guide
This article explains why large‑scale Elasticsearch deployments need rigorous performance testing, compares available testing tools, walks through installing and configuring the official Rally benchmark suite, details hardware recommendations, shows how to run tests against multiple cloud providers, and teaches you how to interpret the resulting metrics to make informed cluster‑selection decisions.
Why benchmark Elasticsearch at scale?
When an Elasticsearch deployment processes hundreds of gigabytes of data daily (e.g., website logs, user‑behavior records, e‑commerce search), a high‑performance cluster is essential. Benchmarking helps select appropriate hardware and cloud providers.
Tools for Elasticsearch performance testing
Rally – the official Elasticsearch benchmarking tool.
ESPerf – a Golang‑based tester.
Elasticsearch Stress Test – provided by Logz.io.
Rally is preferred because it supplies ready‑made data sets (tracks) and allows custom tracks.
Installing Rally
After installing a JDK and Python, install Rally with:

pip3 install esrally

Rally can also be run in a Docker container; see the official installation guide for details.
Key Rally concepts
race: a single benchmark execution.
car: a specific configuration of the Elasticsearch cluster under test (e.g., heap size); different configurations are represented by different cars.
track: the data set, and the operations run against it, used for a race.
challenge: a specific test scenario within a track that defines the operations executed against Elasticsearch.
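These concepts map directly onto command-line flags. A hypothetical invocation, shown only to illustrate the mapping (the track, challenge, and car names here are illustrative defaults from the public rally-tracks repository, not values from this experiment):

```shell
# Illustrative only: --track picks the data set, --challenge the scenario,
# --car the cluster configuration. Omit --challenge/--car to use the
# track's defaults.
esrally --track=geonames \
        --challenge=append-no-conflicts \
        --car=defaults \
        --target-hosts=127.0.0.1:9200
```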
Hardware recommendations for the test
64 GB RAM is ideal; 32 GB or 16 GB is common, but less than 8 GB degrades performance.
More CPU cores are preferable to higher clock speed.
SSD storage dramatically improves indexing and query latency.
Mid‑range to high‑end machines are recommended; low‑end machines struggle with large clusters.
The experiment used four 8‑core, 64 GB RAM, 100 GB SSD instances from three cloud providers (Alibaba Cloud, Tencent Cloud, UCloud).
Running the benchmark
1. Build the cluster
Deploy three nodes, configure them as a healthy Elasticsearch cluster, and verify connectivity (see the official “Add and remove nodes” guide).
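Before starting a race, it is worth confirming that the cluster actually reports healthy. A minimal check against one of the nodes (the IP matches the test command in the next step; the expected values assume a three-node cluster):

```shell
# Ask any node for cluster health; a correctly formed three-node cluster
# should report "status" : "green" and "number_of_nodes" : 3.
curl -s 'http://10.5.5.10:9200/_cluster/health?pretty'
```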
2. Execute the test command
esrally --track=pmc --target-hosts=10.5.5.10:9200,10.5.5.11:9200,10.5.5.12:9200 --pipeline=benchmark-only

To export results as CSV for easier analysis, add the reporting options:
esrally --track=pmc --target-hosts=10.5.5.10:9200,10.5.5.11:9200,10.5.5.12:9200 --pipeline=benchmark-only --report-format=csv --report-file=~/benchmarks/result.csv

3. Analyze the results
Rally outputs four columns: Metric, Task, Unit, and Result. Grouping by Task (e.g., index-append, node-stats, country_agg_uncached) allows comparison of specific workloads.
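With the CSV export, grouping by Task is a one-liner. A minimal sketch, assuming the file has the four columns described above (the sample rows below are made up, not results from this experiment):

```shell
# Build a tiny sample in the same Metric,Task,Unit,Result shape Rally emits.
cat > /tmp/sample-result.csv <<'EOF'
Metric,Task,Unit,Result
Min Throughput,index-append,docs/s,1000
Max Throughput,index-append,docs/s,1200
50th percentile latency,node-stats,ms,2.1
EOF
# Keep the header plus only the rows for one task.
awk -F',' 'NR == 1 || $2 == "index-append"' /tmp/sample-result.csv
```

The same filter with a different task name pulls out node-stats or aggregation rows for side-by-side comparison across providers.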
Interpreting key metrics
Indexing throughput and latency
Metrics such as Min/Median/Max throughput (ops/s) and latency percentiles (50th, 90th, 99th) indicate overall indexing speed. Higher throughput and lower latency are better. In the experiment UCloud achieved the shortest cumulative indexing time, roughly 24 % faster than Tencent Cloud and 16 % faster than Alibaba Cloud.
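For reference, "X % faster" here is computed from the cumulative indexing times. A sketch with made-up numbers (the real times come from Rally's report; these are not the experiment's values):

```shell
# Hypothetical cumulative indexing times in minutes.
fast=38   # e.g., the quickest provider
slow=50   # e.g., the slowest provider
# Relative speedup: (slow - fast) / slow.
awk -v a="$fast" -v b="$slow" 'BEGIN { printf "%.0f%% faster\n", (b - a) / b * 100 }'
```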
Merge time
Merge‑time metrics (min, median, max) should be as low as possible, while a higher merge count is desirable. UCloud again showed the lowest merge time and the highest merge count, indicating efficient background processing.
Node‑stats throughput & latency
Throughput is measured in ops/s (higher is better). Latency percentiles are measured in milliseconds (lower is better). All three providers had comparable throughput, but UCloud consistently delivered lower latency.
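If it is unclear how a latency percentile is read: sort all the samples and take the value at the given rank (the nearest-rank method, a simplification of what Rally reports). A toy example with made-up latencies:

```shell
# Ten hypothetical latency samples in milliseconds; 90% of samples fall
# at or below the printed value.
printf '%s\n' 12 3 7 45 9 5 21 8 14 6 |
  sort -n |
  awk '{ v[NR] = $1 } END { print "90th percentile:", v[int(NR * 0.9)] }'
```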
Aggregations (country_agg_uncached)
Throughput differences were minimal; latency varied, with UCloud showing the best numbers, followed by Alibaba Cloud, then Tencent Cloud.
Tips for smooth Rally execution
Pre‑download track data
Rally downloads track data from AWS S3, which can be slow or fail in some regions. Pre‑download the data to avoid network issues:
curl -O https://raw.githubusercontent.com/elastic/rally-tracks/master/download.sh
chmod u+x download.sh
./download.sh geonames
cd ~
tar -xf rally-track-data-geonames.tar

After extraction, run Rally with the local track files.
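Once the data is in place locally, Rally can be told not to reach out to the network at all. A sketch, assuming a Rally version that supports the --offline flag and the default data directory ~/.rally/benchmarks/data (verify both against your installation):

```shell
# Run against the pre-downloaded geonames data without any network access.
esrally --track=geonames --target-hosts=10.5.5.10:9200 --pipeline=benchmark-only --offline
```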
Conclusion
Before deploying Elasticsearch in production, use Rally to benchmark candidate clusters. In this experiment UCloud delivered the best overall performance‑to‑cost ratio, though real‑world results will depend on specific workloads. The methodology and scripts can be reused for further evaluations.
Appendix
Test data referenced in the article can be downloaded from: http://suo.im/5wq8CW