How to Accurately Estimate Ceph Cluster IOPS Performance: Formulas and Factors
This article explains how to estimate Ceph storage cluster IOPS by analyzing replication, erasure coding, FileStore vs BlueStore, write amplification, and the impact of network and CPU resources, providing practical formulas for realistic performance planning.
Ceph, a leading software‑defined storage solution, offers highly flexible, modular components that can be assembled like a custom PC, making it popular in cloud computing.
Many wonder what read/write performance a Ceph cluster can achieve and how to predict it without costly trial‑and‑error testing.
Initially one might assume the cluster’s performance equals the sum of all disks, e.g., 100 disks each delivering 500 IOPS would give 50 K IOPS, but this ignores replication, write‑amplification factor (WAF), metadata, and other overheads.
Ceph allows configurable replica counts (e.g., three‑copy, two‑copy, or single‑copy) and erasure‑coding schemes (K+M). Adjusting these parameters changes usable capacity and write amplification, influencing overall performance.
Ceph stores data as objects; these objects are grouped into placement groups (PGs) that distribute them evenly across OSDs, so while total write speed is not a simple sum of disks, multiple disks still contribute cumulatively to perceived performance.
To estimate performance based on hardware, the following formulas can be used (see diagram):
FileStore + Multiple Replicas
Assuming each disk is one OSD with its journal on the same disk (write amplification 2) and a replica count of M, the IOPS formulas (ignoring network/CPU limits) are:

4K random read IOPS = R × N × 0.7
4K random write IOPS = W × N × 0.7 / (2 × M)
When using a dedicated NVMe journal (write amplification 1), the write formula becomes:

4K random write IOPS = W × N × 0.7 / M
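The FileStore-with-replication formulas above can be wrapped in a small helper for quick what-if estimates. This is a sketch: the function name, the sample disk figures, and the 0.7 efficiency factor (taken directly from the article's formulas) are illustrative, not measured values.

```python
def filestore_replica_iops(r, w, n, m, journal_waf=2, efficiency=0.7):
    """Estimate cluster-wide 4K random IOPS for FileStore with M replicas.

    r, w        -- per-disk 4K random read / write IOPS
    n           -- number of OSD disks
    m           -- replica count
    journal_waf -- 2 with the journal on the same disk, 1 with a dedicated NVMe journal
    efficiency  -- empirical derating factor (0.7 in the article's formulas)
    """
    read_iops = r * n * efficiency
    write_iops = w * n * efficiency / (journal_waf * m)
    return read_iops, write_iops

# 100 disks at 500 read / 300 write IOPS each, 3 replicas, colocated journal
reads, writes = filestore_replica_iops(r=500, w=300, n=100, m=3)
print(reads, writes)  # 35000.0 3500.0
```

Passing journal_waf=1 models the dedicated NVMe journal variant.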
BlueStore + Multiple Replicas
With BlueStore eliminating the separate journal, the write amplification drops to 1, yielding:

4K random read IOPS = R × N × 0.7
4K random write IOPS = W × N × 0.7 / M
FileStore + Erasure Coding
For K+M erasure coding (e.g., K=3, M=2), each client write produces K+M shard writes, and the colocated journal doubles each of them, so the write amplification is (K+M)/K × 2. Dividing the raw write rate by that factor gives:

4K random read IOPS = R × N × 0.7
4K random write IOPS = W × N × 0.7 × K / (2 × (K + M))
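Dividing the aggregate write rate by the (K+M)/K × 2 write amplification stated above can be expressed as a one-liner (a sketch; the function name and sample figures are illustrative):

```python
def filestore_ec_write_iops(w, n, k, m, efficiency=0.7):
    """FileStore + erasure coding 4K random write IOPS.

    Write amplification is (k+m)/k * 2: k+m shard writes per client write,
    each doubled by the colocated journal. Dividing by it gives
    W * N * efficiency * k / (2 * (k + m)).
    """
    return w * n * efficiency * k / (2 * (k + m))

# 100 disks at 300 write IOPS each, K=3, M=2
print(filestore_ec_write_iops(w=300, n=100, k=3, m=2))  # 6300.0
```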
BlueStore + Erasure Coding
Without journaling, the write amplification simplifies to (K+M)/K, giving:

4K random read IOPS = R × N × 0.7
4K random write IOPS = W × N × 0.7 × K / (K + M)
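The BlueStore erasure-coding case drops the journal's factor of 2 from the denominator. A minimal sketch under the same assumptions as the previous examples:

```python
def bluestore_ec_write_iops(w, n, k, m, efficiency=0.7):
    """BlueStore + erasure coding 4K random write IOPS.

    Write amplification is (k+m)/k (k+m shard writes per client write,
    no journal doubling), so write IOPS = W * N * efficiency * k / (k + m).
    """
    return w * n * efficiency * k / (k + m)

# Same 100-disk, K=3/M=2 cluster: twice the FileStore EC write rate
print(bluestore_ec_write_iops(w=300, n=100, k=3, m=2))  # 12600.0
```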
These calculations consider only disk performance; network bandwidth and CPU can also cap cluster throughput. For example, a node with ten 100 MB/s SATA disks supplies about 1 GB/s (roughly 8 Gbps) of aggregate read bandwidth, but a 2 Gbps network link caps effective throughput at roughly 250 MB/s. Similarly, too few CPU cores per OSD can restrict write IOPS.
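The bandwidth-bottleneck reasoning is simply a min() over the disk and network limits. A sketch, assuming 1 Gbps ≈ 125 MB/s and mirroring the example numbers above:

```python
def effective_read_bandwidth_mb_s(disk_mb_s, n_disks, network_gbps):
    """Effective per-node read bandwidth: the slower of aggregate disk
    bandwidth and network line rate (1 Gbps taken as 125 MB/s)."""
    disk_bw = disk_mb_s * n_disks   # aggregate disk bandwidth, MB/s
    net_bw = network_gbps * 125     # network line rate converted to MB/s
    return min(disk_bw, net_bw)

# Ten 100 MB/s SATA disks behind a 2 Gbps link: network-bound at 250 MB/s
print(effective_read_bandwidth_mb_s(disk_mb_s=100, n_disks=10, network_gbps=2))
```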
Understanding and balancing these factors is essential for realistic Ceph performance planning.
UCloud Tech
UCloud is a leading neutral cloud provider in China, developing its own IaaS, PaaS, AI service platform, and big data exchange platform, and delivering comprehensive industry solutions for public, private, hybrid, and dedicated clouds.
