Mastering Prometheus Histograms: How Cumulative Buckets Simplify Metrics

This article explains the fundamentals of Prometheus histogram metrics, illustrates why they are cumulative, shows how to drop unwanted buckets with relabeling, and demonstrates quantile calculations using the histogram_quantile function, providing practical examples and code snippets for effective monitoring.

Programmer DD

Aug 13, 2019

1. What Is a Histogram?

Prometheus defines a histogram as a metric that samples observations (such as request latencies or response sizes) and counts them in configurable buckets. For example, to monitor an application's response time over a range of 0 s to 10 s, we can divide the range into buckets 0.2 s wide. In a regular histogram, the first bucket counts requests that took ≤ 0.2 s, the second counts requests that took > 0.2 s and ≤ 0.4 s, and so on.

2. Why a Cumulative Histogram?

Unlike a regular histogram, a Prometheus histogram is cumulative: each bucket counts every observation less than or equal to its upper bound (exposed in the le label), so it includes the counts of all lower buckets. Because each bucket is meaningful on its own, users can drop buckets they do not need at scrape time, lowering storage and computation costs, and still estimate quantiles from the buckets that remain, at a coarser resolution. This matters most when a histogram carries many labels or buckets.
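The difference between the two layouts can be sketched in a few lines of Go. This is only an illustration, not the Prometheus client implementation: the function names, the sample latencies, and the bucket bounds are all made up for the example.

```go
package main

import (
	"fmt"
	"math"
)

// cumulativeCounts returns, for each upper bound, how many observations are
// less than or equal to that bound. The bounds must be sorted ascending.
func cumulativeCounts(observations, upperBounds []float64) []int {
	counts := make([]int, len(upperBounds))
	for _, v := range observations {
		for i, le := range upperBounds {
			if v <= le {
				counts[i]++ // every bucket whose bound covers v is incremented
			}
		}
	}
	return counts
}

// perBucketCounts converts cumulative counts back into the counts of a
// regular (non-cumulative) histogram by subtracting the previous bucket.
func perBucketCounts(cumulative []int) []int {
	out := make([]int, len(cumulative))
	prev := 0
	for i, c := range cumulative {
		out[i] = c - prev
		prev = c
	}
	return out
}

func main() {
	// Hypothetical latencies (seconds) and bucket bounds, for illustration only.
	observations := []float64{0.05, 0.15, 0.3, 0.35, 0.7}
	upperBounds := []float64{0.2, 0.4, 0.6, math.Inf(1)}

	cumulative := cumulativeCounts(observations, upperBounds)
	fmt.Println(cumulative)                  // [2 4 4 5]
	fmt.Println(perBucketCounts(cumulative)) // [2 2 0 1]
}
```

Note that the last cumulative bucket, the one with an infinite upper bound, always equals the total number of observations.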

For instance, to discard the buckets whose upper bounds are below 100 ms, you can add a metric relabeling rule (metric_relabel_configs) that matches the metric name example_latency_seconds_bucket together with le label values of the form 0.0x, and drops the matching series. The le="+Inf" bucket must be kept, because the histogram_quantile function requires it.
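To make the effect of such a rule concrete, here is a small Go simulation of a drop rule. It is a sketch under assumptions, not Prometheus code: the sample type and applyDrop function are hypothetical, and the pattern `0\.0.*` is one reading of the "0.0x" form mentioned above. The simulation joins the two label values with ";" and anchors the pattern at both ends, which is how Prometheus relabeling matches.

```go
package main

import (
	"fmt"
	"regexp"
)

// sample is one scraped series, reduced to the two labels the rule inspects.
type sample struct {
	name  string  // metric name (the __name__ label)
	le    string  // value of the le label, empty for _sum and _count
	value float64 // cumulative bucket count
}

// dropRule mimics a relabel rule with action "drop": the source label values
// are joined with ";" and the series is dropped when the anchored regex matches.
// The pattern itself is an assumption made for this example.
var dropRule = regexp.MustCompile(`^(?:example_latency_seconds_bucket;(0\.0.*))$`)

// applyDrop returns the samples that survive the drop rule.
func applyDrop(samples []sample) []sample {
	var kept []sample
	for _, s := range samples {
		if dropRule.MatchString(s.name + ";" + s.le) {
			continue // bucket bound below 0.1 s: discard
		}
		kept = append(kept, s)
	}
	return kept
}

func main() {
	// Hypothetical scrape result, for illustration only.
	scraped := []sample{
		{"example_latency_seconds_bucket", "0.005", 10},
		{"example_latency_seconds_bucket", "0.05", 40},
		{"example_latency_seconds_bucket", "0.1", 70},
		{"example_latency_seconds_bucket", "0.5", 95},
		{"example_latency_seconds_bucket", "+Inf", 100},
		{"example_latency_seconds_count", "", 100},
	}
	// Keeps the le="0.1", "0.5" and "+Inf" buckets plus the _count series.
	for _, s := range applyDrop(scraped) {
		fmt.Println(s.name, s.le, s.value)
	}
}
```

Because the remaining buckets are cumulative, their values are unchanged by the drop, which is why quantiles above the dropped range can still be estimated.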

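Even if all buckets are dropped, the _sum and _count series remain, so the average response time can still be calculated as the sum divided by the count, for example rate(example_latency_seconds_sum[5m]) / rate(example_latency_seconds_count[5m]).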
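The arithmetic behind that average is simple enough to sketch in Go. The scrape type, the function name, and the numbers below are hypothetical, and the sketch ignores counter resets, which a real query engine has to handle.

```go
package main

import "fmt"

// scrape holds the _sum and _count values of one histogram at one scrape.
type scrape struct {
	sum   float64 // value of example_latency_seconds_sum
	count float64 // value of example_latency_seconds_count
}

// averageBetween returns the mean observation recorded between two scrapes.
// ok is false when no observations arrived in the interval.
func averageBetween(earlier, later scrape) (avg float64, ok bool) {
	dCount := later.count - earlier.count
	if dCount <= 0 {
		return 0, false
	}
	return (later.sum - earlier.sum) / dCount, true
}

func main() {
	// Hypothetical values from two consecutive scrapes.
	earlier := scrape{sum: 120, count: 400}
	later := scrape{sum: 150, count: 500}

	if avg, ok := averageBetween(earlier, later); ok {
		fmt.Printf("%.2f\n", avg) // 0.30
	}
}
```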

3. Quantile Calculation

Prometheus uses the histogram_quantile function to estimate quantiles. The function assumes that samples are evenly distributed within each bucket and interpolates linearly between the bucket's bounds, so the accuracy depends on bucket granularity: the finer the buckets, the more precise the estimate.
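The idea can be sketched as a simplified Go function over cumulative buckets. This is an approximation written for this article, not the Prometheus source: the names bucket, estimateQuantile, and posInf are hypothetical, the bucket counts are invented, and special cases such as negative bounds are left out.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// posInf is the upper bound of the mandatory le="+Inf" bucket.
var posInf = math.Inf(1)

// bucket is one cumulative bucket: count is the number of observations
// less than or equal to upperBound.
type bucket struct {
	upperBound float64
	count      float64
}

// estimateQuantile finds the bucket containing the q-quantile and
// interpolates linearly between that bucket's lower and upper bounds.
func estimateQuantile(q float64, buckets []bucket) float64 {
	if len(buckets) < 2 || q < 0 || q > 1 {
		return math.NaN()
	}
	sorted := append([]bucket(nil), buckets...) // do not reorder the caller's slice
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].upperBound < sorted[j].upperBound })

	last := sorted[len(sorted)-1]
	if !math.IsInf(last.upperBound, 1) || last.count == 0 {
		return math.NaN() // the +Inf bucket is required and holds the total
	}
	rank := q * last.count

	// First bucket whose cumulative count reaches the rank.
	b := sort.Search(len(sorted)-1, func(i int) bool { return sorted[i].count >= rank })
	if b == len(sorted)-1 {
		// The quantile lies in the +Inf bucket: report the highest finite bound.
		return sorted[len(sorted)-2].upperBound
	}

	bucketStart, bucketEnd := 0.0, sorted[b].upperBound
	count := sorted[b].count
	if b > 0 {
		bucketStart = sorted[b-1].upperBound
		count -= sorted[b-1].count // samples inside this bucket only
		rank -= sorted[b-1].count  // position within this bucket
	}
	if count == 0 {
		return bucketStart
	}
	return bucketStart + (bucketEnd-bucketStart)*(rank/count)
}

func main() {
	// Hypothetical cumulative bucket counts, for illustration only.
	buckets := []bucket{{0.1, 20}, {0.5, 70}, {1, 100}, {posInf, 100}}
	fmt.Printf("%.3f\n", estimateQuantile(0.5, buckets)) // 0.340
	fmt.Printf("%.3f\n", estimateQuantile(0.9, buckets)) // 0.833
}
```

With wider buckets the interpolation spans a larger interval, so the same function returns a coarser estimate; that is the granularity trade-off described above.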

For example, take 10,000 samples where the 9,501st sample falls into the 8th bucket. That bucket contains 368 samples, and the 9,501st sample is the 93rd of them. The estimate then comes from the linear interpolation in promql/quantile.go, where rank is the sample's position within the bucket and count is the number of samples in the bucket:

// rank and count are float64 values, so the division is not truncated
return bucketStart + (bucketEnd-bucketStart)*(rank/count)

Applying this formula for the 0.95 quantile yields a value very close to the exact quantile.
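Plugging the numbers from the example into the formula gives a concrete value. The article does not state the bounds of the 8th bucket, so the sketch below assumes the 0.2 s wide buckets from section 1, which would make the 8th bucket span 1.4 s to 1.6 s. The interpolate function is a hypothetical helper that mirrors the formula.

```go
package main

import "fmt"

// interpolate applies linear interpolation inside one bucket. rank is the
// position of the target sample within the bucket and count is the number of
// samples in the bucket. Both are float64 so that rank/count is a fraction
// rather than a truncated integer division.
func interpolate(bucketStart, bucketEnd, rank, count float64) float64 {
	return bucketStart + (bucketEnd-bucketStart)*(rank/count)
}

func main() {
	// Assumed bounds: the 8th of the 0.2 s wide buckets spans (1.4 s, 1.6 s].
	// rank 93 and count 368 are the figures from the example.
	estimate := interpolate(1.4, 1.6, 93, 368)
	fmt.Printf("%.4f\n", estimate) // 1.4505
}
```

Under these assumptions the 0.95 quantile is estimated at roughly 1.45 s, about a quarter of the way into the bucket, since 93/368 ≈ 0.25.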

4. Summary

The article covered how Prometheus histograms work, why they are cumulative, how to drop unwanted buckets via relabeling, and how to compute quantiles using histogram_quantile. The next article will explore the Summary metric type.
