Comparative Analysis of VictoriaMetrics and Thanos for Large‑Scale Metric Storage
This article examines the migration from Thanos to VictoriaMetrics for large‑scale metric storage, detailing background challenges, VictoriaMetrics architecture and storage engine, data write and read processes, and a comparative analysis of performance, scalability, and operational costs between the two systems.
Background
Since 2022, the company has promoted metric‑based observability across many systems, and the number of metrics has grown from a few million to nearly 100 million. Thanos, used as a Prometheus extension for long‑term storage, began to show bottlenecks: high resource consumption, long query latency, and occasional service unavailability.
After extensive optimization attempts, the team evaluated open‑source alternatives and decided to migrate the monitoring storage from Thanos to VictoriaMetrics.
VictoriaMetrics Overview
VictoriaMetrics is a fast, efficient, and horizontally scalable time‑series database that can serve as Prometheus long‑term storage. It offers low CPU, memory, and storage usage while maintaining high query speed. Key advantages include full Prometheus compatibility, high performance, high compression ratio, easy operation, high availability, global query capability, and a strong community.
VictoriaMetrics Technical Process
Architecture
Storage Engine Features
VictoriaMetrics' storage engine is built on three core concepts: the LSM tree, the SSTable, and the TSID (Time Series ID). Together they provide high‑throughput writes, efficient indexing, and compact storage.
LSM Tree (Log‑Structured Merge‑Tree)
The LSM tree is optimized for high‑write workloads by converting random writes into sequential writes. New data is first written to an in‑memory Memtable, which is flushed to an immutable SSTable file once it reaches a size threshold.
Write flow: Data arrives, is stored in the Memtable, then flushed to disk as an SSTable.
Compaction: Periodic background merges combine multiple SSTable files into larger ones, reducing fragmentation and improving query performance.
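The write flow and compaction described above can be sketched in a few lines. This is an illustration of the LSM idea only, not VictoriaMetrics' actual code; the class name, the tiny flush threshold, and the in‑memory "SSTables" are all stand‑ins for real on‑disk structures.

```python
MEMTABLE_LIMIT = 4  # flush threshold (deliberately tiny, for illustration)

class TinyLSM:
    def __init__(self):
        self.memtable = {}   # mutable in-memory write buffer
        self.sstables = []   # immutable sorted (key, value) lists, oldest first

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= MEMTABLE_LIMIT:
            self.flush()

    def flush(self):
        # Freeze the memtable into a sorted, immutable "SSTable".
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}

    def compact(self):
        # Background merge: combine all SSTables into one, newer values win.
        merged = {}
        for table in self.sstables:
            merged.update(dict(table))
        self.sstables = [sorted(merged.items())]

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Search newest SSTable first, mirroring LSM read order.
        for table in reversed(self.sstables):
            for k, v in table:
                if k == key:
                    return v
        return None
```

Note how every `put` is a cheap in‑memory operation; the expensive sorted write happens only at flush time, which is the core trade‑off that makes LSM trees write‑friendly.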
SSTable (Sorted String Table)
An immutable file containing ordered key‑value pairs. It enables fast point and range queries through multi‑level indexes, Bloom filters, and sparse indexes.
Write and storage: When the Memtable is flushed, a new SSTable is created; updates generate new files rather than modifying existing ones.
Index and lookup: Built‑in indexes allow rapid key location.
Merge and compression: Background merges deduplicate data and apply compression algorithms to reduce space.
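The sparse‑index lookup mentioned above can be sketched as follows. This is an illustration of the technique, not VictoriaMetrics' file format: real SSTables layer Bloom filters and block compression on top of this idea, and the block size here is deliberately tiny.

```python
import bisect

BLOCK_SIZE = 4  # entries per "block" covered by one sparse-index entry

def build_sstable(items):
    """items: dict -> (sorted entries, sparse index of every BLOCK_SIZE-th key)."""
    entries = sorted(items.items())
    index = [(entries[i][0], i) for i in range(0, len(entries), BLOCK_SIZE)]
    return entries, index

def lookup(entries, index, key):
    # Binary-search the sparse index to find the block that may hold the key...
    keys = [k for k, _ in index]
    pos = bisect.bisect_right(keys, key) - 1
    if pos < 0:
        return None  # key sorts before the first entry
    start = index[pos][1]
    # ...then scan only that block instead of the whole table.
    for k, v in entries[start:start + BLOCK_SIZE]:
        if k == key:
            return v
    return None
```

Because the index holds only one key per block, it stays small enough to keep in memory even for very large files, while each lookup touches a single block on disk.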
TSID (Time Series ID)
TSID is a unique hash generated from a series' label set. It ensures distinct identification, efficient indexing, and balanced distribution across storage nodes.
Uniqueness: Guarantees each series is uniquely identifiable.
Efficient indexing: Enables fast lookups based on TSID.
Distributed storage optimization: Helps evenly spread data and avoid hotspots.
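The "unique hash of the label set" idea can be shown in a short sketch. This is illustrative only: VictoriaMetrics' real TSID has additional structure (metric‑group and instance fields), and the use of SHA‑256 here is an assumption made for brevity.

```python
import hashlib

def tsid(labels: dict) -> int:
    # Canonical order so {"a": "1", "b": "2"} and {"b": "2", "a": "1"}
    # produce the same ID for the same series.
    canon = ",".join(f"{k}={v}" for k, v in sorted(labels.items()))
    digest = hashlib.sha256(canon.encode()).digest()
    return int.from_bytes(digest[:8], "big")  # 64-bit series ID
```

The canonical ordering step is what guarantees uniqueness per label set: any ingestion path that sees the same labels, in any order, derives the same TSID.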
Data Ingestion Process
Data Reception (vminsert)
All incoming metrics are sent to the vminsert component, which parses the data and converts it to an internal format.
The TSID is calculated by hashing the series' label set.
Consistent hashing selects an appropriate vmstorage node for storage, ensuring uniform data distribution.
Data Storage (vmstorage)
Writes are first recorded in a Write‑Ahead Log (WAL) and stored in an in‑memory Memtable organized by TSID.
When the Memtable reaches a threshold, it is flushed to disk as an immutable SSTable file.
Background processes periodically merge multiple SSTable files to reduce file count and improve query speed.
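The WAL‑then‑Memtable ordering above is what makes unflushed writes crash‑safe, and it can be sketched in a few lines. This is an illustration only: a real WAL is an append‑only on‑disk file with fsync, not the in‑memory list used here.

```python
import json

class WalStore:
    def __init__(self):
        self.wal = []       # stands in for the append-only log file
        self.memtable = {}  # samples grouped by TSID

    def write(self, tsid, timestamp, value):
        record = json.dumps({"tsid": tsid, "ts": timestamp, "v": value})
        self.wal.append(record)  # 1. log first, so the write survives a crash
        self.memtable.setdefault(tsid, []).append((timestamp, value))  # 2. memtable

    def recover(self):
        # Rebuild the memtable purely from the WAL, as after a restart.
        rebuilt = {}
        for line in self.wal:
            r = json.loads(line)
            rebuilt.setdefault(r["tsid"], []).append((r["ts"], r["v"]))
        return rebuilt
```

Once a Memtable is flushed to an SSTable, the corresponding WAL segment can be discarded, which keeps recovery time bounded.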
Data Retrieval Process
Query Request (vmselect)
Users (e.g., via Grafana) send PromQL queries to vmselect, which parses the query and determines the required TSIDs.
Based on TSIDs, vmselect contacts the relevant vmstorage nodes.
Data Lookup and Processing (vmstorage)
In‑memory lookup: each vmstorage node first checks its Memtable for matching data.
Disk lookup: If needed, it searches SSTable files using multi‑level indexes and Bloom filters.
Parallel processing: Queries are executed in parallel across multiple storage nodes.
Data Merge and Aggregation (vmselect)
Results from multiple vmstorage nodes are merged, de‑duplicated by TSID, and aggregated (e.g., sum, average) in real time.
The final result set is returned to the user.
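The merge‑and‑aggregate step can be sketched as below. The shape of the per‑node payloads is a hypothetical simplification, but the logic mirrors the description: union partial results, de‑duplicate samples per TSID (replicas may return the same sample), and then aggregate.

```python
def merge_results(node_results):
    """node_results: list of {tsid: [(timestamp, value), ...]}, one per node."""
    merged = {}
    for result in node_results:
        for tsid, samples in result.items():
            # A set de-duplicates samples returned by more than one replica.
            merged.setdefault(tsid, set()).update(samples)
    # Return each series' samples in time order.
    return {tsid: sorted(s) for tsid, s in merged.items()}

def aggregate_sum(merged, timestamp):
    """Sum every series' value at one timestamp (a PromQL-style sum())."""
    return sum(v for samples in merged.values()
               for t, v in samples if t == timestamp)
```

Because each node returns only its own shard of series, the merge is mostly a disjoint union; de‑duplication matters only where replication makes two nodes answer for the same TSID.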
VictoriaMetrics vs Thanos
Architectural Comparison
Thanos consists of Sidecar, Store Gateway, Compactor, Querier, and Receive components, providing high availability but a complex architecture.
VictoriaMetrics has a simpler design with three components: vminsert, vmstorage, and vmselect, offering easier management and higher performance.
Data Write
Thanos: Prometheus writes locally, Sidecar uploads data to object storage, and optional remote write to Receive adds latency and complexity.
VictoriaMetrics: vminsert writes directly to vmstorage using TSID, LSM‑based Memtable, WAL, and periodic SSTable flushes, resulting in low‑latency, high‑throughput writes.
Data Read
Thanos: Querier aggregates data from multiple Store Gateways and object storage, incurring higher latency due to remote accesses.
VictoriaMetrics: vmselect queries vmstorage nodes directly, leveraging efficient indexes and parallelism for fast responses.
Operational Cost
Thanos: Requires multiple components and external object storage, leading to higher hardware, storage, and maintenance costs.
VictoriaMetrics: Fewer components and efficient compression lower hardware and storage requirements, simplifying operations and reducing cost.
Summary
Architecture: Thanos is complex; VictoriaMetrics is streamlined.
Write Path: VictoriaMetrics offers lower latency and higher throughput.
Read Path: VictoriaMetrics provides superior query performance, especially under high concurrency.
Cost: VictoriaMetrics has lower resource and operational expenses.
Soul Technical Team
Technical practice sharing from Soul