
How to Size and Benchmark Your Elasticsearch Cluster for Logs and Metrics

This guide explains how to allocate hardware resources, calculate Elasticsearch cluster size based on data volume, and conduct indexing and search benchmarks using Rally to ensure production‑grade performance and capacity planning.


1. Hardware Resource Allocation

Performance depends on the tasks Elasticsearch performs and the platform it runs on. Key resources are:

Disk : Prefer SSDs, especially for nodes that handle both indexing and search; use a hot/warm tier architecture to control costs. Elasticsearch does not require RAID; at least one replica shard provides fault tolerance.

Memory : The JVM heap holds metadata about the cluster, indices, shards, segments, and fielddata; allocate about 50% of available RAM to the heap, and no more than roughly 30 GB so compressed object pointers stay enabled. The remaining RAM serves as OS file-system cache, which reduces disk reads during full-text search, aggregations, and sorting.

CPU : The number and speed of CPU cores determine average operation speed and peak throughput.

Network : Bandwidth and latency affect inter‑node communication and cross‑cluster features such as CCR.

2. Determining Cluster Size from Data Volume

Key questions include daily raw data volume, retention period, hot-tier and warm-tier storage durations, and required replica count. Reserve storage headroom for disk watermark thresholds (15%) plus a margin of error (10%), and add one spare node for fault tolerance.

Formulas used:

DataTotal(GB) = DailyRaw(GB) * RetentionDays * (Replicas + 1)
StorageTotal(GB) = DataTotal * (1 + 0.15 + 0.1)
DataNodeCount = ROUNDUP(StorageTotal / (NodeMemory * MemoryDataRatio))
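The three formulas can be sketched in Python (a minimal sketch; the function and variable names are mine, not from the article, and the spare failover node is not included here):

```python
import math

def sizing(daily_raw_gb, retention_days, replicas, node_ram_gb, mem_data_ratio):
    """Estimate total data, total storage, and data-node count.

    mem_data_ratio is the memory-to-data ratio: 30 means 1:30,
    i.e. each GB of node RAM is expected to serve ~30 GB of data.
    """
    data_total = daily_raw_gb * retention_days * (replicas + 1)
    # 15% headroom for disk watermark thresholds, 10% margin of error
    storage_total = data_total * (1 + 0.15 + 0.1)
    data_nodes = math.ceil(storage_total / (node_ram_gb * mem_data_ratio))
    return data_total, storage_total, data_nodes

# Small-scale example: 1 GB/day, 270 days retention, 1 replica,
# 8 GB RAM per node, 1:30 memory-to-data ratio.
print(sizing(1, 270, 1, 8, 30))  # → (540, 675.0, 3)
```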

2.1 Small‑scale Cluster Example

Assume 1 GB of data per day, retained for 9 months (270 days) with one replica; each node provides 8 GB of RAM, and the memory-to-data ratio is 1:30.

DataTotal = 1 GB/day × 270 days (9 months) × 2 (1 replica + original) = 540 GB

StorageTotal = 540 GB × (1 + 0.15 + 0.1) = 675 GB

DataNodeCount = ROUNDUP(675 / (8 × 30)) + 1 spare = 4 nodes

2.2 Large‑scale Cluster Example

Assume 100 GB per day, hot tier stored for 30 days, warm tier for 12 months, each node has 64 GB RAM (30 GB JVM, rest OS cache). Hot‑tier memory‑to‑data ratio is 1:30, warm‑tier ratio is 1:160.

Hot tier data = 100 GB × 30 days × 2 = 6000 GB

Hot tier storage = 6000 GB × (1 + 0.15 + 0.1) = 7500 GB

Hot tier nodes = ROUNDUP(7500 / 64 / 30) + 1 = 5 nodes

Warm tier data = 100 GB × 365 days × 2 = 73000 GB

Warm tier storage = 73000 GB × (1 + 0.15 + 0.1) = 91250 GB

Warm tier nodes = ROUNDUP(91250 / 64 / 160) + 1 = 10 nodes
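The hot- and warm-tier calculations above differ only in retention days and memory-to-data ratio, so they can share one helper (a sketch; the function name and signature are mine, with the spare node included as in the formulas above):

```python
import math

def tier_nodes(daily_gb, days, replicas, node_ram_gb, mem_data_ratio, spare=1):
    """Node count for one storage tier, including a spare node for failover."""
    data = daily_gb * days * (replicas + 1)
    storage = data * (1 + 0.15 + 0.1)  # watermark headroom + margin of error
    return math.ceil(storage / (node_ram_gb * mem_data_ratio)) + spare

hot = tier_nodes(100, 30, 1, 64, 30)     # hot tier, 1:30 ratio
warm = tier_nodes(100, 365, 1, 64, 160)  # warm tier, 1:160 ratio
print(hot, warm)  # → 5 10
```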

3. Benchmarking

Use the open‑source tool Rally to benchmark indexing and search performance separately. Running both benchmarks helps verify that the cluster meets the expected SLAs before production deployment.

3.1 Indexing Benchmark

Goal: determine maximum indexing throughput, daily ingest capacity, and whether the cluster is over‑ or under‑sized. Test on a 3‑node cluster (8 vCPU, HDD, 32 GB RAM).

Dataset: Metricbeat – 1,079,600 documents, 1.2 GB total, average document size 1.17 KB.

Optimal bulk size ≈ 12,000 documents (≈13.7 MB) → ~13,000 documents/s.

Optimal client count = 16 → ~62,000 documents/s.
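The quoted bulk payload size follows directly from the dataset's average document size, as a quick sanity check (variable names are mine):

```python
# 12,000 documents per bulk request at an average size of 1.17 KB each
bulk_docs = 12_000
avg_doc_kb = 1.17
payload_mb = bulk_docs * avg_doc_kb / 1024  # KB → MB
print(round(payload_mb, 1))  # → 13.7
```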

Results by node/shard configuration:

1 node, 1 shard → 22,000 documents/s

2 nodes, 2 shards → 43,000 documents/s

3 nodes, 3 shards → 62,000 documents/s
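To translate the peak throughput above into the daily ingest capacity the benchmark goal mentions, a back-of-the-envelope estimate (assuming the peak rate could be sustained all day, which real clusters rarely do; names are mine):

```python
throughput_docs_s = 62_000  # peak from the 3-node, 3-shard run
avg_doc_kb = 1.17           # average Metricbeat document size

docs_per_day = throughput_docs_s * 86_400           # seconds per day
tb_per_day = docs_per_day * avg_doc_kb / 1024**3    # KB → TB
print(f"{docs_per_day:,} docs/day, ~{tb_per_day:.1f} TB/day")
```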

3.2 Second Indexing Benchmark (HTTP Server Logs)

Dataset: 31.1 GB, 247,249,096 documents, average size 0.8 KB.

Optimal bulk size = 16,000 documents.

Optimal client count = 32.

Maximum indexing throughput = 220,000 documents/s.

4. Search Benchmark

Target: 20 clients, 1,000 operations per second (OPS). Three test groups:

Query latency (90th percentile) on Metricbeat and HTTP log datasets.

Parallel query latency when queries run concurrently.

Parallel indexing impact on query latency.

Findings:

Certain queries (e.g., auto-date-histogram-with-tz, desc_sort_timestamp) exhibit noticeably higher latency.

Parallel execution increases 90th‑percentile latency.

When indexing runs in parallel with searching, query latency also rises.

Achieved search throughput ≈ 1,000 requests/s with 20 clients.

5. Conclusion

Applying the sizing formulas and running realistic Rally benchmarks provides a systematic way to estimate the number of nodes required for a given data volume and workload. For reliable capacity planning, always benchmark with data and queries that closely resemble production use cases.

Tags: Elasticsearch, search performance, benchmarking, cluster sizing, Rally, indexing throughput
Written by

dbaplus Community

Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
