Building Highly Available Prometheus Monitoring with Thanos: A Practical Guide
This article explains why native Prometheus HA solutions fall short for large, multi‑region clusters and shows how to use Thanos components—including sidecar, query, store gateway, and compactor—to achieve long‑term storage, unlimited scaling, a global view, and non‑intrusive integration with existing Prometheus deployments.
Background
In the "High‑availability Prometheus: FAQ" article we briefly mentioned HA solutions for Prometheus. After trying federation and Remote Write, we chose Thanos as the monitoring suite to manage a global view of more than 300 clusters across multiple regions.
Official Prometheus HA Options
HA: two identical Prometheus instances behind a load balancer.
HA + remote storage: write to a remote store for persistence.
Federation: shard data by function and aggregate it in a global node.
Even with the official multi‑replica and federation approaches, problems remain: Prometheus local storage has no data synchronization, so consistency is hard to guarantee.
Replica A may lose data during a crash, causing gaps when the load balancer routes requests to it.
Different start times or clocks produce mismatched timestamps across replicas.
Federation still has single‑point‑of‑failure nodes; edge and global nodes may become bottlenecks.
Sensitive alerts should avoid triggering from the global node due to potential latency.
Most Prometheus clustering solutions ensure data consistency from storage and query perspectives:
Storage side: use an adapter with Remote Write so only one replica pushes data to the TSDB.
Storage side: write to two TSDBs and sync them.
Query side: solutions like Thanos or VictoriaMetrics keep two copies of data but deduplicate and join them at query time. Thanos stores data in object storage via a sidecar, while VictoriaMetrics uses its own server.
Actual Requirements
Long‑term storage of ~1 month of data, adding tens of gigabytes per day, with low maintenance cost, disaster recovery, and migration capability.
Unlimited scaling: >300 clusters, thousands of nodes, and many services, requiring sharding by function or tenant.
Global view: a single Grafana dashboard showing metrics from all regions, clusters, and pods.
Non‑intrusive: no modifications to Prometheus code; the solution must stay compatible with rapid Prometheus releases.
After evaluating open‑source (Cortex, Thanos, VictoriaMetrics, StackDriver) and commercial products, we selected Thanos because it satisfies long‑term storage, unlimited scaling, global view, and non‑intrusiveness.
Thanos Architecture
Thanos's default deployment mode is the sidecar approach.
Besides sidecar, Thanos also offers a less common receive mode.
Thanos consists of the following components:
Bucket
Check
Compactor
Query
Rule
Sidecar
Store
Receive (optional)
Downsample (optional)
All components are built from a single binary; different subcommands enable different functionality (e.g., <code>./thanos query</code>, <code>./thanos sidecar</code>).
Components and Configuration
Step 1: Prepare Prometheus
Deploy a Prometheus instance (pod or host). Example launch command:
<code>./prometheus \
--config.file=prometheus.yml \
--log.level=info \
--storage.tsdb.path=data/prometheus \
--web.listen-address='0.0.0.0:9090' \
--storage.tsdb.max-block-duration=2h \
--storage.tsdb.min-block-duration=2h \
--storage.tsdb.wal-compression \
--storage.tsdb.retention.time=2h \
--web.enable-lifecycle</code>
Key points:
Enable <code>web.enable-lifecycle</code> for hot‑reloading the configuration.
With 2 h blocks, Thanos uploads each completed block to object storage, so only a short local retention is needed.
Prometheus <code>prometheus.yml</code> (excerpt):
<code>global:
  scrape_interval: 60s
  evaluation_interval: 60s
  external_labels:
    region: 'A'
    replica: 0
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['0.0.0.0:9090']
  - job_name: 'demo-scrape'
    metrics_path: '/metrics'
    params:
      ...</code>
Declare <code>external_labels</code> to identify the region and replica of each Prometheus instance.
Step 2: Deploy Sidecar
The sidecar runs in the same pod as Prometheus and provides two functions:
Exposes Prometheus data via Thanos Store API using Remote Read.
Optionally uploads each TSDB block to an object store.
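The upload side of the sidecar can be pictured as a loop that watches the TSDB directory and ships any completed block it has not yet seen. A minimal sketch, with a plain dict standing in for the object‑store client (all names here are hypothetical, not the sidecar's real internals):

```python
import os

def sync_blocks(tsdb_path, objstore, uploaded):
    """Upload completed TSDB block directories not yet in the object store.

    tsdb_path: local Prometheus data directory
    objstore:  dict acting as a stand-in for a bucket client
    uploaded:  set of block names already shipped
    """
    for block in sorted(os.listdir(tsdb_path)):
        block_dir = os.path.join(tsdb_path, block)
        # A completed block is a directory containing meta.json;
        # the WAL and in-progress data are skipped.
        if block in uploaded or not os.path.isfile(os.path.join(block_dir, "meta.json")):
            continue
        objstore[block] = open(os.path.join(block_dir, "meta.json")).read()
        uploaded.add(block)
    return uploaded
```

This is why the 2 h block settings above matter: the sidecar only ships whole, finished blocks.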
Sidecar command example:
<code>./thanos sidecar \
--prometheus.url="http://localhost:9090" \
--objstore.config-file=./conf/bos.yaml \
--tsdb.path=/home/work/opdir/monitor/Prometheus/data/Prometheus/</code>
Configure the object store (e.g., GCS):
<code>type: GCS
config:
  bucket: ""
  service_account: ""</code>
Step 3: Deploy Query
The query component implements the Prometheus HTTP v1 API and aggregates data from multiple Store APIs (sidecars and store‑gateway).
<code>./thanos query \
--http-address="0.0.0.0:8090" \
--query.replica-label=replica \
--store=replica0:10901 \
--store=replica1:10901 \
--store=replica2:10901 \
--store=127.0.0.1:19914</code>
Two important UI checkboxes:
deduplication: removes duplicate series across replicas.
partial response: returns data from the available replicas when some are down, instead of failing the whole query.
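The partial‑response behaviour can be illustrated with a small sketch: the querier fans out to every store, and when partial responses are allowed it returns whatever data is available plus a warning instead of failing the whole query (the store interface here is a hypothetical simplification):

```python
def fan_out(stores, query, partial_response=True):
    """Query every store; tolerate failures only when partial responses are allowed."""
    results, warnings = [], []
    for name, store in stores.items():
        try:
            results.extend(store(query))
        except ConnectionError as err:
            if not partial_response:
                raise  # strict mode: one dead store fails the whole query
            warnings.append(f"store {name} unavailable: {err}")
    return results, warnings
```

Strict mode trades availability for completeness; the checkbox lets dashboards keep rendering while a replica is down.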
Step 4: Deploy Store‑Gateway
Store‑gateway reads persisted blocks from object storage and serves them via the Store API for historical queries.
<code>./thanos store \
--data-dir=./thanos-store-gateway/tmp/store \
--objstore.config-file=./thanos-store-gateway/conf/bos.yaml \
--http-address=0.0.0.0:19904 \
--grpc-address=0.0.0.0:19914 \
--index-cache-size=250MB \
--sync-block-duration=5m \
--min-time=-2w \
--max-time=-1h</code>
Store‑gateway can be horizontally scaled, but each instance may consume significant CPU and memory.
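The <code>--min-time</code>/<code>--max-time</code> flags above accept relative durations (they also accept absolute RFC3339 timestamps). A rough sketch of how a value like <code>-2w</code> maps to an absolute bound, with the duration grammar simplified to single‑unit values:

```python
from datetime import datetime, timedelta

# Simplified unit table: minutes, hours, days, weeks.
UNITS = {"m": "minutes", "h": "hours", "d": "days", "w": "weeks"}

def relative_time(spec, now=None):
    """Turn a single-unit relative duration such as '-2w' or '-1h'
    into an absolute timestamp relative to `now`."""
    now = now or datetime.utcnow()
    sign = -1 if spec.startswith("-") else 1
    value, unit = int(spec.strip("-+")[:-1]), spec[-1]
    return now + sign * timedelta(**{UNITS[unit]: value})
```

With the flags shown above, this instance would serve blocks between roughly two weeks ago and one hour ago, so recent data is left to the sidecars.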
Step 5: Visualize Data
With all components running, add the Thanos query endpoint to Grafana to obtain a unified view of metrics across regions, clusters, and pods.
Receive Mode
In receive mode, each Prometheus remote‑writes its data directly to the Thanos receive component, which holds the most recent ~2 h of data instead of the sidecar relying on local Prometheus storage. It is useful when network policies prevent the query component from reaching in‑cluster Prometheus, or when a clear separation between tenant and control planes is required.
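On the Prometheus side, receive mode only needs a standard <code>remote_write</code> section pointing at the receive component (the endpoint address below is a placeholder, not a value from this deployment):

```yaml
remote_write:
  - url: "http://thanos-receive.example.com:19291/api/v1/receive"
```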
Some Issues
Prometheus Compression
When using the sidecar, set <code>--storage.tsdb.min-block-duration</code> and <code>--storage.tsdb.max-block-duration</code> to the same value (2 h) to disable Prometheus's internal compaction; otherwise a block could be rewritten by compaction while the sidecar is uploading it, causing upload failures.
Store‑Gateway Resource Usage
Store‑gateway’s index cache and block sync can consume large memory; parameters like
--index-cache-size,
--sync-block-duration,
--min-time, and
--max-timecan be tuned to control cache size and query windows.
Compactor Component
Compactor merges old blocks and performs down‑sampling. While down‑sampling reduces query latency for long‑range queries, it does not reduce disk usage; in fact it increases it, because the down‑sampled aggregates are stored alongside the raw data.
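Why down‑sampling grows rather than shrinks storage becomes clear from what it writes: for every coarse window it keeps several aggregates alongside the raw samples. A rough sketch of the idea (window handling simplified, not the compactor's actual code):

```python
def downsample(samples, resolution):
    """Aggregate (timestamp, value) samples into windows of `resolution`
    seconds, keeping count/sum/min/max per window. The raw samples are
    retained separately, so total stored data grows."""
    windows = {}
    for ts, value in samples:
        bucket = ts - ts % resolution  # start of the window this sample falls in
        w = windows.setdefault(bucket, {"count": 0, "sum": 0.0, "min": value, "max": value})
        w["count"] += 1
        w["sum"] += value
        w["min"] = min(w["min"], value)
        w["max"] = max(w["max"], value)
    return windows
```

Keeping several aggregates per window is what lets long‑range queries read far fewer points while still answering avg/min/max correctly.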
Query Deduplication
The query component deduplicates series based on the <code>query.replica-label</code> flag. When multiple replicas return differing values, Thanos selects samples from the most stable replica using an internal penalty‑based algorithm.
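The replica‑label deduplication described above can be sketched as: strip the replica label so series from different replicas collapse into one identity, then emit a single sample per timestamp. In this simplified version the first replica seen wins; the real Thanos choice is penalty‑based:

```python
def deduplicate(series, replica_label="replica"):
    """Merge series that differ only in the replica label.

    series: list of (labels_dict, [(timestamp, value), ...]) pairs.
    Returns {series_identity: {timestamp: value}}.
    """
    merged = {}
    for labels, samples in series:
        # Identity of a series with the replica label removed.
        identity = tuple(sorted((k, v) for k, v in labels.items() if k != replica_label))
        out = merged.setdefault(identity, {})
        for ts, value in samples:
            out.setdefault(ts, value)  # keep one sample per timestamp
    return merged
```

This is also why <code>external_labels</code> must carry a distinct <code>replica</code> value per instance: without it, the querier cannot tell which series are duplicates.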
Efficient Ops
This public account is maintained by Xiaotianguo and friends and regularly publishes original technical articles. We focus on operations transformation and hope to accompany you throughout your operations career.