
Thanos vs VictoriaMetrics: Which Prometheus Long‑Term Storage Wins?

This article compares Thanos and VictoriaMetrics as Prometheus long‑term storage solutions, evaluating their architectures, write and read paths, reliability, data consistency, performance, scalability, high‑availability, and cost to help you choose the best fit for your monitoring stack.

Efficient Ops

Aug 5, 2024

1. Architecture

Thanos

Thanos consists of several core components:

Sidecar : runs alongside each Prometheus instance, uploads data older than two hours to object storage (e.g., S3 or GCS) and serves recent data to the Query component.

Store Gateway : serves data stored in object storage to the Query component.

Query : implements the Prometheus query API, aggregates results from Sidecars and Store Gateways, and serves them to clients such as Grafana.

Compact : merges uploaded blocks into larger ones to improve query efficiency and reduce storage size.

Ruler : evaluates recording and alerting rules on global data, can generate new metrics, and optionally uploads results to object storage.

Receiver (introduced as experimental): implements the Prometheus remote‑write API so Prometheus instances can push data to Thanos directly.
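To make the Query component's role concrete, here is a minimal sketch of how results for one series from a Store Gateway (historical data) and a Sidecar (recent data) could be merged. The function name `merge_series` and the sample layout are hypothetical and chosen only for illustration; the merge and deduplication logic in Thanos itself is more involved.

```python
from typing import Dict, Iterable, List, Tuple

Sample = Tuple[int, float]  # (timestamp in ms, value)


def merge_series(sources: Iterable[List[Sample]]) -> List[Sample]:
    """Merge samples for one series from several sources.

    When two sources report the same timestamp, the first source listed wins.
    """
    merged: Dict[int, float] = {}
    for samples in sources:
        for ts, value in samples:
            merged.setdefault(ts, value)
    return sorted(merged.items())


store_gateway = [(1000, 1.0), (2000, 2.0)]  # older data from object storage
sidecar = [(2000, 2.0), (3000, 3.0)]        # recent data from Prometheus
print(merge_series([store_gateway, sidecar]))
```

The overlapping sample at timestamp 2000 appears once in the merged result, which is the behavior a client such as Grafana expects from a single query endpoint.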

VictoriaMetrics

VictoriaMetrics cluster edition includes three core components:

vmstorage : stores time‑series data.

vminsert : receives data from Prometheus via the remote‑write API and distributes it across vmstorage nodes.

vmselect : queries vmstorage nodes, aggregates results, and returns them to clients such as Grafana.

Each component can be scaled independently on suitable hardware.
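The distribution step performed by vminsert can be pictured as mapping each series to a storage node based on its labels. The sketch below uses a plain hash‑modulo scheme as an assumption for illustration; it is not VictoriaMetrics' actual algorithm, and `pick_storage_node` and the node names are hypothetical.

```python
import hashlib
from typing import Dict, List


def pick_storage_node(labels: Dict[str, str], nodes: List[str]) -> str:
    """Map a series (identified by its labels) to one storage node."""
    # Build a canonical key so that label order does not matter.
    key = ",".join(f"{k}={v}" for k, v in sorted(labels.items()))
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


nodes = ["vmstorage-0", "vmstorage-1", "vmstorage-2"]
series = {"__name__": "http_requests_total", "job": "api", "instance": "10.0.0.1:9090"}
print(pick_storage_node(series, nodes))
```

Because the mapping depends only on the labels, every sample of a given series lands on the same node, while different series spread across the cluster. Note that a plain modulo scheme remaps many series when the node count changes, which is one reason real systems prefer consistent hashing.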

2. Write Path Comparison

Configuration and Operational Complexity

Thanos requires disabling local TSDB block compaction in Prometheus, deploying and configuring a Sidecar for each Prometheus instance, and running a Compactor for each object‑storage bucket.

VictoriaMetrics only needs a remote‑write entry in the Prometheus configuration; no Sidecar or compaction changes are required.
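The difference in setup effort can be sketched by generating both configurations. For Thanos, local compaction is disabled by pinning the Prometheus minimum and maximum block durations to the same value; for VictoriaMetrics, Prometheus only needs a `remote_write` entry. The helper names and the vminsert host below are placeholders assumed for illustration; adjust the URL to your own cluster.

```python
from typing import Dict, List


def prometheus_flags_for_thanos(block_duration: str = "2h") -> List[str]:
    """Prometheus flags that disable local compaction for a Sidecar setup."""
    return [
        f"--storage.tsdb.min-block-duration={block_duration}",
        f"--storage.tsdb.max-block-duration={block_duration}",
    ]


def remote_write_for_victoriametrics(vminsert_url: str) -> Dict[str, list]:
    """The remote_write section of prometheus.yml, expressed as a plain dict."""
    return {"remote_write": [{"url": vminsert_url}]}


print(prometheus_flags_for_thanos())
# Placeholder host; tenant 0 is assumed in the URL path.
print(remote_write_for_victoriametrics(
    "http://vminsert.example:8480/insert/0/prometheus/api/v1/write"
))
```

The Thanos side still needs the Sidecar and Compactor deployed on top of these flags, whereas the dict above is the entire Prometheus‑side change for VictoriaMetrics.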

Reliability and Availability

Thanos uploads data in two‑hour blocks, so a local disk failure on a Prometheus host can lose up to two hours of not‑yet‑uploaded data per instance. Sidecar uploads also share resources with query handling, which can affect performance.

VictoriaMetrics writes each sample via remote‑write in near real‑time; only a few seconds of data could be lost on a disk failure.
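A rough calculation shows how far apart these two loss windows are. The series count, scrape interval, and the five‑second stand‑in for "a few seconds" are all assumed figures, not measurements.

```python
def samples_at_risk(active_series: int, scrape_interval_s: float, window_s: float) -> int:
    """Estimate how many samples fall inside an unprotected time window."""
    return int(active_series * (window_s / scrape_interval_s))


active_series = 100_000   # assumed number of active series on one Prometheus
scrape_interval_s = 15    # assumed scrape interval

two_hour_block = samples_at_risk(active_series, scrape_interval_s, 2 * 3600)
few_seconds = samples_at_risk(active_series, scrape_interval_s, 5)

print(two_hour_block)  # 48000000
print(few_seconds)     # 33333
```

Under these assumptions the worst case differs by about three orders of magnitude, which is the practical meaning of the two‑hour upload granularity.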

Data Consistency

Because the Compactor rewrites and deletes blocks in object storage while the Store Gateway reads them, Thanos can exhibit eventual‑consistency issues.

VictoriaMetrics provides strong consistency for stored data.

Performance

Thanos’ write performance is good, but heavy queries can slow Sidecar uploads. Compactor load can affect object‑storage bandwidth.

VictoriaMetrics adds minimal CPU overhead to Prometheus and can allocate sufficient CPU on the storage side to maintain performance.

Scalability

Thanos relies on object‑storage scalability; Sidecar upload speed depends on the throughput of the storage service.

VictoriaMetrics scales by adding more vminsert and vmstorage nodes.

3. Read Path Comparison

Configuration and Operational Complexity

Thanos requires the Store API to be exposed by every Sidecar, plus Store Gateway and Query components that are deployed and connected to one another.

VictoriaMetrics offers a ready‑to‑use Prometheus query API; only the data source in Grafana needs to point to vmselect.

Reliability and Availability

Thanos Query must connect to all Sidecars and Store Gateways, which can be problematic across data centers.

VictoriaMetrics queries stay within the cluster, offering higher reliability and faster startup.

Data Consistency

Both systems can return partial results when some nodes are unavailable, but in VictoriaMetrics partial responses are a configurable choice and are less likely to be needed.
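The trade‑off behind partial responses can be sketched as a fan‑out query that either tolerates or rejects missing backends. This is a generic model, not code from either project; `fan_out_query`, `PartialResultError`, and the backend names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


class PartialResultError(Exception):
    """Raised when a backend is unavailable and partial results are not allowed."""


def fan_out_query(
    backends: Dict[str, Optional[List[float]]], allow_partial: bool
) -> Tuple[List[float], bool]:
    """Collect results from all backends; None marks an unavailable backend.

    Returns (values, is_partial).
    """
    values: List[float] = []
    missing: List[str] = []
    for name, result in backends.items():
        if result is None:
            missing.append(name)
        else:
            values.extend(result)
    if missing and not allow_partial:
        raise PartialResultError("unavailable backends: " + ", ".join(missing))
    return values, bool(missing)


backends = {"node-a": [1.0, 2.0], "node-b": None, "node-c": [3.0]}
print(fan_out_query(backends, allow_partial=True))
```

Allowing partial results favors availability: the client gets an answer, flagged as incomplete. Rejecting them favors consistency: the client gets an error instead of a silently incomplete graph.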

Performance

Thanos Query performance is limited by the slowest Sidecar or Store Gateway.

VictoriaMetrics query performance scales with the number of vmselect and vmstorage instances and is generally faster.
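Why the slowest backend dominates follows from how a fan‑out query completes: the query layer cannot return until every backend has answered. A minimal model, with assumed latency figures and a hypothetical function name:

```python
from typing import Sequence


def fan_out_latency_ms(backend_latencies_ms: Sequence[float], merge_overhead_ms: float = 0.0) -> float:
    """Latency of a parallel fan-out: wait for the slowest backend, then merge."""
    return max(backend_latencies_ms) + merge_overhead_ms


# Three fast backends and one slow one (assumed figures).
print(fan_out_latency_ms([20.0, 25.0, 30.0, 900.0], merge_overhead_ms=5.0))  # 905.0
```

Adding more fast backends does not help in this model; only speeding up or removing the slow one does.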

Scalability

Thanos Query is stateless and can be horizontally scaled, but the underlying Prometheus + Sidecar pair can become a bottleneck.

VictoriaMetrics allows independent scaling of vmselect and vmstorage, with optimizations for low‑bandwidth environments.

4. High‑Availability Comparison

Thanos achieves HA by running multiple Query instances in different zones; if a zone fails, only partial results may be returned.

VictoriaMetrics can replicate data across clusters in multiple zones, continuing to receive data and return full query results even when a zone is down.
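The replicated setup can be sketched as writing every sample to each reachable cluster and reading from any cluster that is still up. This is a simplified model with hypothetical names; it does not cover how a recovered zone catches up on the data it missed.

```python
from typing import Dict, List, Set


def replicate_write(clusters: Dict[str, List[float]], down: Set[str], sample: float) -> None:
    """Append the sample to every cluster whose zone is reachable."""
    for zone, data in clusters.items():
        if zone not in down:
            data.append(sample)


def read_from_any(clusters: Dict[str, List[float]], down: Set[str]) -> List[float]:
    """Answer the query from the first reachable cluster."""
    for zone, data in clusters.items():
        if zone not in down:
            return list(data)
    raise RuntimeError("no zone is reachable")


clusters: Dict[str, List[float]] = {"zone-a": [], "zone-b": []}
replicate_write(clusters, down=set(), sample=1.0)       # both zones healthy
replicate_write(clusters, down={"zone-a"}, sample=2.0)  # zone-a is down
print(read_from_any(clusters, down={"zone-a"}))         # [1.0, 2.0]
```

Because each cluster holds a full copy, the surviving zone returns complete results, in contrast to a design where each zone holds only its own share of the data.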

5. Managed Cost Comparison

Thanos

GCS: $4‑$36 per TB per month depending on storage class; network egress $10/TB internal, $80‑$? external.

S3: $4‑$23 per TB per month depending on storage class; network egress $2‑$10/TB internal, $50‑$90/TB external; $0.10 per million API calls.

Costs depend on data volume, egress traffic, and API usage.

VictoriaMetrics

GCE HDD: $40/TB per month, SSD: $240/TB per month.

AWS EBS HDD: $45/TB per month, SSD: $125/TB per month.

VictoriaMetrics compresses data up to 10× more efficiently than Thanos, reducing storage cost.
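To see how compression interacts with the per‑TB prices, here is a small worked calculation. It uses the S3 and EBS HDD figures quoted above and treats the 10× figure as the best case claimed in this article, not a guaranteed ratio; the 10 TB data volume is an assumption, and egress and API charges are ignored.

```python
def storage_cost(data_tb: float, price_per_tb: float, compression_gain: float = 1.0) -> float:
    """Storage cost after dividing the data volume by a relative compression gain."""
    return data_tb / compression_gain * price_per_tb


data_tb = 10.0  # assumed volume as stored by Thanos

thanos_on_s3 = storage_cost(data_tb, price_per_tb=23.0)                      # top S3 price
vm_on_ebs_hdd_best = storage_cost(data_tb, price_per_tb=45.0, compression_gain=10.0)
vm_on_ebs_hdd_no_gain = storage_cost(data_tb, price_per_tb=45.0)

print(thanos_on_s3)           # 230.0
print(vm_on_ebs_hdd_best)     # 45.0
print(vm_on_ebs_hdd_no_gain)  # 450.0
```

The comparison is sensitive to the compression ratio actually achieved: block storage is cheaper overall only if the gain exceeds the price ratio between the two storage types (roughly 2× for these figures).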

Summary

VictoriaMetrics ingests data through standard remote_write and stores it on block storage, while Thanos requires disabling local compaction and using a Sidecar to upload blocks to object storage.

VictoriaMetrics provides a built‑in global query API without extra components, whereas Thanos needs Sidecar, Store Gateway, and Query, making large deployments more complex.

Deploying VictoriaMetrics on Kubernetes is straightforward; Thanos deployment and configuration are considerably more involved.
