How to Build a High‑Availability Prometheus Setup Using Federation and Multi‑Remote‑Read
This article examines common misuse of Prometheus federation, explains its limitations, and presents a pure‑Prometheus solution using multi_remote_read to achieve high‑availability monitoring, including configuration examples, code analysis, and best‑practice recommendations for proper data aggregation and query merging.
Introduction
Many users misuse Prometheus federation, using it to collect data from multiple scrapers without understanding its purpose. This article analyzes federation problems and proposes a solution based entirely on Prometheus multi_remote_read.
Architecture Diagram
Federation Problems
Federation documentation: https://prometheus.io/docs/prometheus/latest/federation/
Federation Usage Example
Federation is essentially a scrape cascade: server A scrapes the /federate endpoint of servers B, C, and D.
The match[] parameter selects which series are pulled.
Official example configuration:
scrape_configs:
  - job_name: 'federate'
    scrape_interval: 15s
    honor_labels: true
    metrics_path: '/federate'
    params:
      'match[]':
        - '{job="prometheus"}'
        - '{__name__=~"job:.*"}'
    static_configs:
      - targets:
        - 'source-prometheus-1:9090'
        - 'source-prometheus-2:9090'
        - 'source-prometheus-3:9090'
Analysis of the Example
The scrape path is /federate. The handler is registered in web.go:
// web.go's federate Handler
router.Get("/federate", readyf(httputil.CompressionHandler{Handler: http.HandlerFunc(h.federation)}.ServeHTTP))
Federation reads data from local storage and processes it.
The core federation function merges series from local storage and encodes them:
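// Abridged excerpt; "..." marks code that has been elided.
func (h *Handler) federation(w http.ResponseWriter, req *http.Request) {
	// ... parse the match[] selectors and compute the time window [mint, maxt] ...
	q, err := h.localStorage.Querier(req.Context(), mint, maxt)
	// ... handle err ...
	defer q.Close()
	vec := make(promql.Vector, 0, 8000)
	hints := &storage.SelectHints{Start: mint, End: maxt}
	var sets []storage.SeriesSet
	// ... one Select call per match[] selector, each result appended to sets ...
	set := storage.NewMergeSeriesSet(sets, storage.ChainedSeriesMerge)
	for set.Next() {
		s := set.At()
		// ... pick the sample (t, v) of s to expose ...
		vec = append(vec, promql.Sample{Metric: s.Labels(), Point: promql.Point{T: t, V: v}})
	}
	// encode and write response ...
}
If no meaningful filtering is applied, federation merely copies every series from every shard into one server, which defeats the purpose of sharding and does not scale when data volume is large.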
Correct Federation Practices
Use match[] to filter metrics, separating them into two categories:
Data that needs further aggregation – collected via federation.
Data that can stay on the local scraper.
Perform pre‑aggregation and alerting on the federated side to improve query speed.
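The match[] filtering described above comes down to the query string sent to /federate. The following is a minimal Go sketch of that request URL, assuming the host name from the official example; federateURL is a hypothetical helper written for illustration, not part of Prometheus.

```go
package main

import (
	"fmt"
	"net/url"
)

// federateURL builds the URL a federating server would scrape on a source
// Prometheus: the /federate path plus one match[] parameter per selector.
// Hypothetical helper for illustration only.
func federateURL(host string, selectors []string) string {
	q := url.Values{}
	for _, s := range selectors {
		q.Add("match[]", s)
	}
	u := url.URL{Scheme: "http", Host: host, Path: "/federate", RawQuery: q.Encode()}
	return u.String()
}

func main() {
	// Pull only series whose names start with "job:" (the naming pattern
	// used in the official example); everything else stays on the source.
	fmt.Println(federateURL("source-prometheus-1:9090", []string{`{__name__=~"job:.*"}`}))
}
```

Keeping the selector list short and specific is what separates the data worth federating from the data that should stay on the local scraper.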
Default Prometheus Does Not Support Down‑sampling
Increasing scrape_interval in federation can simulate down‑sampling.
True down‑sampling requires aggregation algorithms (e.g., 5‑minute average, max, min) rather than merely reducing scrape frequency.
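To make the difference concrete, here is a minimal Go sketch of window-based down-sampling that keeps the average, minimum, and maximum per window. It is an illustration only: Sample, Bucket, and Downsample are hypothetical names, and the sketch assumes samples are sorted by time with non-negative timestamps in seconds.

```go
package main

import (
	"fmt"
	"math"
)

// Sample is one (timestamp, value) pair; T is in seconds.
// Hypothetical type for this sketch.
type Sample struct {
	T int64
	V float64
}

// Bucket holds the aggregates for one fixed-size time window.
type Bucket struct {
	Start         int64
	Avg, Min, Max float64
}

// Downsample groups time-ordered samples into windows of step seconds and
// keeps avg/min/max for each window that contains at least one sample.
func Downsample(samples []Sample, step int64) []Bucket {
	var out []Bucket
	var sum float64
	var count int
	for _, s := range samples {
		start := s.T - s.T%step // window that this sample falls into
		if len(out) == 0 || out[len(out)-1].Start != start {
			out = append(out, Bucket{Start: start, Min: s.V, Max: s.V})
			sum, count = 0, 0
		}
		b := &out[len(out)-1]
		sum += s.V
		count++
		b.Avg = sum / float64(count)
		b.Min = math.Min(b.Min, s.V)
		b.Max = math.Max(b.Max, s.V)
	}
	return out
}

func main() {
	raw := []Sample{{0, 1}, {60, 3}, {300, 10}, {360, 20}}
	for _, b := range Downsample(raw, 300) { // 5-minute windows
		fmt.Printf("%+v\n", b)
	}
}
```

Raising scrape_interval would instead keep one raw sample per interval and lose the minimum and maximum that occurred in between.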
Unified Query Implementation
What is remote_read?
remote_read lets a Prometheus server fetch series from a remote endpoint at query time. It exists because Prometheus's local storage lives on a single node and is not highly available on its own.
Configuration documentation: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_read
Supported Read/Write Storages
AWS Timestream
Azure Data Explorer
Cortex
CrateDB
Google BigQuery
Google Cloud Spanner
InfluxDB
IRONdb
M3DB
PostgreSQL/TimescaleDB
QuasarDB
Splunk
Thanos
TiKV
multi_remote_read
Configuring multiple remote_read endpoints enables concurrent reads from several back‑ends and merges the results:
remote_read:
  - url: "http://172.20.70.205:9090/api/v1/read"
    read_recent: true
  - url: "http://172.20.70.215:9090/api/v1/read"
    read_recent: true
The merge allows PromQL queries and alert rules to ignore the physical location of data.
Prometheus Can Remote‑Read Itself
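Every Prometheus server exposes its local data through a built‑in remote read endpoint at /api/v1/read, so one Prometheus instance can serve as the remote_read target of another, eliminating the need for an external storage for remote reads. The --enable-feature=remote-write-receiver flag is only needed if the instance should also accept remote writes; it is not required for remote reads.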
High‑Availability Solution
Combine multiple Prometheus scrapers with stateless Prometheus query nodes to achieve HA:
Monitoring data resides on several local Prometheus instances (bare‑metal or Kubernetes StatefulSets).
Query nodes list each scraper's /api/v1/read endpoint under remote_read.
Handling Duplicate Data
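Query merging deduplicates overlapping data: series with identical label sets are merged into one, and when several back‑ends return a sample with the same timestamp only one is kept. This allows the same job to be scraped by multiple Prometheus instances for redundancy.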
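The merge step can be pictured as a two-way merge over time-ordered samples of the same series. The Go sketch below is a simplified illustration, not Prometheus's actual merge code; Sample and mergeSeries are hypothetical names, and both inputs are assumed to be sorted by timestamp and to share the same label set.

```go
package main

import "fmt"

// Sample is one (timestamp, value) pair. Hypothetical type for this sketch.
type Sample struct {
	T int64
	V float64
}

// mergeSeries merges two time-ordered sample slices that belong to the same
// series. When both sides hold a sample at the same timestamp, one is kept
// and the other is dropped.
func mergeSeries(a, b []Sample) []Sample {
	out := make([]Sample, 0, len(a)+len(b))
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i].T < b[j].T:
			out = append(out, a[i])
			i++
		case a[i].T > b[j].T:
			out = append(out, b[j])
			j++
		default: // same timestamp on both sides: keep one copy
			out = append(out, a[i])
			i++
			j++
		}
	}
	out = append(out, a[i:]...)
	out = append(out, b[j:]...)
	return out
}

func main() {
	// Replica A missed the scrape at T=30; replica B started later.
	a := []Sample{{10, 1}, {20, 2}, {40, 4}}
	b := []Sample{{20, 2}, {30, 3}, {40, 4}, {50, 5}}
	fmt.Println(mergeSeries(a, b))
}
```

Note that only samples with equal timestamps collapse. Two replicas that scrape the same target at different offsets will return different timestamps, so their samples interleave rather than deduplicate.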
Drawbacks
Concurrent queries must wait for the slowest backend, increasing latency.
Uncontrolled heavy queries can overload scrapers.
All queries are sent to every scraper, causing unnecessary load on nodes that do not hold the requested data.
Bloom filters in back‑ends like M3DB can mitigate unnecessary lookups.
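The first and third drawbacks follow directly from the fan-out pattern. The Go sketch below simulates it with sleeping goroutines instead of real remote read calls; backend and fanOut are hypothetical names and the latencies are made up for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// backend stands in for one remote_read endpoint. Hypothetical type:
// latencyMs simulates network plus query time, series is what it returns.
type backend struct {
	name      string
	latencyMs int
	series    []string
}

// fanOut queries every backend concurrently, waits for all of them, and
// returns the combined series plus the total elapsed time in milliseconds.
func fanOut(backends []backend) ([]string, int64) {
	start := time.Now()
	results := make([][]string, len(backends))
	var wg sync.WaitGroup
	for i, b := range backends {
		wg.Add(1)
		go func(i int, b backend) {
			defer wg.Done()
			time.Sleep(time.Duration(b.latencyMs) * time.Millisecond)
			results[i] = b.series // each goroutine writes its own slot
		}(i, b)
	}
	wg.Wait() // total latency is bounded below by the slowest backend
	var merged []string
	for _, r := range results {
		merged = append(merged, r...)
	}
	return merged, time.Since(start).Milliseconds()
}

func main() {
	backends := []backend{
		{"scraper-1", 5, []string{"up{job=\"a\"}"}},
		{"scraper-2", 40, nil}, // holds no matching data but is still queried
	}
	merged, elapsed := fanOut(backends)
	fmt.Println(len(merged), "series in", elapsed, "ms")
}
```

In the example, scraper-2 contributes nothing to the result yet still determines the overall response time, which is the case that query routing aims to avoid.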
Routing Optimization (Optional)
For precise query routing, refer to the open‑source project prome‑route, which uses reverse proxy rules based on feature labels to shard Prometheus data.
Footnotes
[1] m3db resource overhead, aggregation, down‑sampling, query limits: https://zhuanlan.zhihu.com/p/359551116
[2] m3db‑node OOM tracing and memory allocator code: https://zhuanlan.zhihu.com/p/183815841
[3] Federation documentation: https://prometheus.io/docs/prometheus/latest/federation/
[4] Remote read configuration: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_read
[5] InfluxDB Prometheus support: https://docs.influxdata.com/influxdb/v1.8/supported_protocols/prometheus/
[6] M3DB integration: https://m3db.io/docs/integrations/prometheus/
[7] prome‑route project: https://zhuanlan.zhihu.com/p/231914857
Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"