Why Prometheus Uses TSDB: Mastering Scalable Monitoring and Queries

This article explains how Prometheus, a data‑driven monitoring system, leverages a time‑series database (TSDB) to handle massive metric volumes, perform efficient queries, and enable powerful calculations such as recording rules for pre‑computed results.

MaGe Linux Operations

Apr 2, 2022

Background

People tend to avoid what feels unknown and uncontrollable. The author felt the same on first encountering Prometheus, which can seem daunting because of its many concepts and steep learning curve.

Concepts to learn include: Instance, Job, Metric, Metric Name, Metric Label, Metric Value, Metric Type (Counter, Gauge, Histogram, Summary), Data Type (Instant Vector, Range Vector, Scalar, String), Operator, and Function.

As Jack Ma said, "Although Alibaba is the world’s largest retail platform, it is a data company, not a retail company." Similarly, Prometheus is fundamentally a data‑based monitoring system.

Daily Monitoring

Assume we need to monitor the request volume of each API on WebServerA, with dimensions such as service name (job), instance IP (instance), API name (handler), method, response code (code), and request count (value).

Example SQL‑like queries:

Query request count where method="put" and code="200":

SELECT * FROM http_requests_total WHERE code="200" AND method="put" AND created_at BETWEEN 1495435700 AND 1495435710;

Query request count where handler="prometheus" and method="post":

SELECT * FROM http_requests_total WHERE handler="prometheus" AND method="post" AND created_at BETWEEN 1495435700 AND 1495435710;

Query request count where instance="10.59.8.110" and handler starts with "query":

SELECT * FROM http_requests_total WHERE handler LIKE "query%" AND instance="10.59.8.110" AND created_at BETWEEN 1495435700 AND 1495435710;

These examples show that daily monitoring comes down to dimension‑based queries combined with time ranges. Now consider the scale: 100 services, each with 10 instances, 20 APIs, and 4 methods, give 80,000 time series. Scraped every 30 seconds and retained for 60 days, each series accumulates 172,800 samples, for a total of about 13.8 billion data points. That volume is impractical for a general‑purpose relational database such as MySQL, which is why Prometheus uses a TSDB storage engine.
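The 13.8 billion figure can be checked with a short calculation. The sketch below only restates the numbers from the scenario above; the variable names are illustrative.

```python
# Back-of-the-envelope count of data points for the scenario in the text.
services = 100
instances_per_service = 10
apis_per_instance = 20
methods_per_api = 4
scrape_interval_s = 30
retention_days = 60

# One time series per (service, instance, API, method) combination.
series = services * instances_per_service * apis_per_instance * methods_per_api

# Samples collected per series over the retention period.
samples_per_series = retention_days * 24 * 60 * 60 // scrape_interval_s

total_points = series * samples_per_series

print(series)              # 80000
print(samples_per_series)  # 172800
print(total_points)        # 13824000000
```

Adding one more dimension, such as response code, multiplies the series count again, so the total grows quickly with label cardinality.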

Storage Engine

A TSDB fits the characteristics of monitoring data well:

Enormous data volume

Predominantly write operations

Writes are mostly sequential, ordered by time

Rarely writes old data or updates existing data

Deletes are block‑based, removing whole time ranges

Data size typically exceeds memory; caching has little effect

Read operations are ordered (ascending or descending)

High‑concurrency reads are common

How does TSDB achieve this?

{"labels":[{"latency":"500"}],"samples":[{"timestamp":1473305798,"value":0.9}]}

Raw data consists of two parts: labels (the monitoring dimensions) and samples (timestamp and value pairs). The label set uniquely identifies a time series (its series_id); the samples hold the actual metric values.
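As a rough illustration of how a label set can identify a series, the sketch below derives an identifier by hashing the sorted labels and appends samples under it. The helper `series_id`, the choice of SHA-1, and the in-memory `store` dictionary are assumptions made for this example; they are not Prometheus's actual implementation.

```python
import hashlib
import json


def series_id(labels):
    """Return a stable id for a label set (hypothetical helper)."""
    # Sort the pairs so the same labels always hash to the same id,
    # regardless of the order in which they were supplied.
    canonical = json.dumps(sorted(labels.items()), separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()[:16]


# The raw record from the text.
raw = {
    "labels": [{"latency": "500"}],
    "samples": [{"timestamp": 1473305798, "value": 0.9}],
}

# Flatten the list of single-pair dicts into one label set.
labels = {k: v for pair in raw["labels"] for k, v in pair.items()}
sid = series_id(labels)

# Samples are appended to the series that the labels identify.
store = {}
store.setdefault(sid, []).extend(
    (s["timestamp"], s["value"]) for s in raw["samples"]
)
```

Because the identifier depends only on the labels, every later sample with the same label set lands in the same series.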

TSDB stores series using timeseries:doc:: as the key and builds three indexes to accelerate queries: Series, Label Index, and Time Index.

Example with label latency:

Series

Stores each series' label key‑value pairs in lexical order, together with an index of time windows that point to data blocks, so queries can quickly skip irrelevant records.

Label Index

Each label is stored under an index:label: key, which holds a list of all values of that label and references to the starting position of the corresponding series.

Time Index

Time ranges are indexed under index:timeseries:: keys, which point to the files holding data for specific time intervals.
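The interplay between a label index and a time index can be sketched with a toy in-memory structure. This is a simplified illustration, not the on-disk format: the class `ToyIndex`, the one-hour `WINDOW_S` block width, and the series ids `s1` and `s2` are all hypothetical.

```python
from collections import defaultdict

WINDOW_S = 3600  # hypothetical width of one time block, in seconds


class ToyIndex:
    def __init__(self):
        # Label index: (label name, label value) -> ids of matching series.
        self.label_index = defaultdict(set)
        # Time index: window start -> series id -> list of (ts, value).
        self.time_index = defaultdict(lambda: defaultdict(list))

    def append(self, sid, labels, ts, value):
        for pair in labels.items():
            self.label_index[pair].add(sid)
        window = ts - ts % WINDOW_S
        self.time_index[window][sid].append((ts, value))

    def select(self, matchers, start, end):
        # Step 1: intersect the series sets of every label matcher.
        ids = None
        for pair in matchers.items():
            postings = self.label_index.get(pair, set())
            ids = postings if ids is None else ids & postings
        ids = ids or set()

        # Step 2: visit only the time windows that overlap [start, end].
        result = {}
        first = start - start % WINDOW_S
        for window in range(first, end + 1, WINDOW_S):
            block = self.time_index.get(window)
            if block is None:
                continue
            for sid in ids:
                for ts, value in block.get(sid, []):
                    if start <= ts <= end:
                        result.setdefault(sid, []).append((ts, value))
        return result


idx = ToyIndex()
idx.append("s1", {"method": "put", "code": "200"}, 1495435700, 3.0)
idx.append("s2", {"method": "post", "code": "200"}, 1495435705, 5.0)
idx.append("s1", {"method": "put", "code": "200"}, 1495435710, 4.0)

hits = idx.select({"method": "put", "code": "200"}, 1495435700, 1495435710)
```

The label lookup narrows the candidate series before any sample is read, and the window lookup narrows the blocks, which is the same "skip irrelevant records" idea described above.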

Data Computation

The robust storage engine enables powerful data computation, distinguishing Prometheus from other monitoring services. Users can query different metric series, apply basic operators and advanced functions, and perform matrix operations on metric series.
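To make "applying functions and operators to metric series" concrete, the sketch below computes a per-second rate for several counter series and sums the results by one label. It is loosely analogous to a PromQL expression of the form sum by (method) (rate(http_requests_total[1m])), but `simple_rate` and `sum_by` are hypothetical helpers that ignore details such as counter resets, and the sample data is made up for illustration.

```python
def simple_rate(samples):
    """Per-second increase between the first and last sample.

    Simplification: assumes the counter never resets within the window.
    """
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)


def sum_by(series, label):
    """Sum the rate of every series, grouped by the value of one label."""
    totals = {}
    for labels, samples in series:
        key = labels[label]
        totals[key] = totals.get(key, 0.0) + simple_rate(samples)
    return totals


# Each series is (labels, [(timestamp, counter value), ...]).
series = [
    ({"method": "put", "instance": "a"}, [(0, 100.0), (60, 160.0)]),
    ({"method": "put", "instance": "b"}, [(0, 50.0), (60, 80.0)]),
    ({"method": "post", "instance": "a"}, [(0, 10.0), (60, 40.0)]),
]

print(sum_by(series, "method"))  # {'put': 1.5, 'post': 0.5}
```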

This capability makes Prometheus comparable to a “data warehouse + compute platform” for monitoring, which may point to where monitoring systems are heading.

One Computation, Many Queries

Such powerful computation consumes significant resources, so pre‑computing results is often faster than evaluating raw expressions each time. Prometheus provides Recording Rules to pre‑compute frequently used or expensive expressions and store them as new time series, achieving one computation, many queries.
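The idea behind "one computation, many queries" can be sketched as follows. In Prometheus itself, recording rules are declared in rule files and evaluated periodically by the server; the `ToyRecorder` class below is only a hypothetical in-memory model of that behavior, with made-up input values.

```python
class ToyRecorder:
    """Evaluates an expression on a schedule and stores the results."""

    def __init__(self, expression):
        self.expression = expression  # callable standing in for an expensive query
        self.recorded = []            # the new, pre-computed series: (ts, value)
        self.evaluations = 0          # how many times the expression actually ran

    def evaluate(self, now, raw):
        # Called once per evaluation interval, not once per reader.
        self.evaluations += 1
        self.recorded.append((now, self.expression(raw)))

    def latest(self):
        # Readers fetch the stored result instead of re-running the expression.
        return self.recorded[-1][1]


# Hypothetical raw data: request counts per instance.
raw = {"10.59.8.110": 120, "10.59.8.111": 80}

recorder = ToyRecorder(lambda data: sum(data.values()))
recorder.evaluate(now=1495435700, raw=raw)

# Many dashboard queries, one computation.
answers = [recorder.latest() for _ in range(1000)]
print(answers[0], recorder.evaluations)  # 200 1
```

The trade-off is that readers see a value that is at most one evaluation interval old, in exchange for not repeating the expensive expression on every query.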
