
Unlocking Prometheus: How TSDB Powers Scalable Monitoring and Real-Time Analytics

This article explains how Prometheus uses a time‑series database (TSDB) to handle massive monitoring data, detailing its concepts, query examples, storage engine design, indexing mechanisms, and the benefits of pre‑computing expressions for efficient real‑time analysis.

Efficient Ops

Background

Many beginners find Prometheus overwhelming because it introduces a large number of concepts at once, which makes for a steep learning curve. Core concepts include Instance, Job, Metric, Metric Name, Metric Label, Metric Value, Metric Type (Counter, Gauge, Histogram, Summary), Data Types (Instant Vector, Range Vector, Scalar, String), Operators, and Functions. Underneath all of these, Prometheus is fundamentally a data‑centric monitoring system, much like Alibaba’s data‑driven approach.

Daily Monitoring

To monitor each API of a web server (e.g., WebServerA), you track the request count along dimensions such as service name (job), instance IP (instance), API name (handler), method, and response code (code).

Example SQL‑like queries:

<code>SELECT * FROM http_requests_total WHERE code="200" AND method="put" AND created_at BETWEEN 1495435700 AND 1495435710;</code>
<code>SELECT * FROM http_requests_total WHERE handler="prometheus" AND method="post" AND created_at BETWEEN 1495435700 AND 1495435710;</code>
<code>SELECT * FROM http_requests_total WHERE handler="query" AND instance="10.59.8.110" AND created_at BETWEEN 1495435700 AND 1495435710;</code>
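The queries above all have the same shape: an equality match on a few dimensions plus a time range. As a rough illustration only, the same selection can be sketched in Python over in-memory rows. The `Row` type, the `select` helper, and the sample data are assumptions made for this sketch, not part of Prometheus.

```python
from typing import Dict, List, NamedTuple


class Row(NamedTuple):
    labels: Dict[str, str]  # dimension columns such as code, method, handler
    created_at: int         # Unix timestamp in seconds
    value: float            # request count observed at that moment


def select(rows: List[Row], start: int, end: int, **matchers: str) -> List[Row]:
    """Return rows whose labels equal every matcher and whose
    timestamp lies in the inclusive range [start, end]."""
    return [
        r for r in rows
        if start <= r.created_at <= end
        and all(r.labels.get(k) == v for k, v in matchers.items())
    ]


# Hypothetical sample data, invented for illustration only.
rows = [
    Row({"code": "200", "method": "put", "handler": "query"}, 1495435701, 12.0),
    Row({"code": "200", "method": "post", "handler": "prometheus"}, 1495435705, 7.0),
    Row({"code": "500", "method": "put", "handler": "query"}, 1495435709, 1.0),
    Row({"code": "200", "method": "put", "handler": "query"}, 1495435720, 13.0),
]

# Mirrors the first query: code="200" AND method="put" within the time window.
hits = select(rows, 1495435700, 1495435710, code="200", method="put")
```

Every call scans all rows, which is exactly the cost that becomes prohibitive once the table grows large.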

When monitoring hundreds of services with many instances, APIs, and methods, the data volume quickly reaches billions of rows, making traditional relational databases impractical. Therefore, Prometheus adopts a Time‑Series Database (TSDB) as its storage engine.

Storage Engine

A TSDB fits the monitoring workload well because that workload has the following characteristics:

Massive data volume.

Predominantly write‑heavy operations.

Writes are mostly sequential, ordered by time.

Rare updates; data is written shortly after collection.

Deletion occurs in block ranges, not individual points.

Data size exceeds memory, limiting cache effectiveness.

Reads are typically ordered scans (ascending or descending).

High‑concurrency reads are common.
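Two of these properties, append-only writes ordered by time and deletion in block ranges, can be illustrated with a small sketch. It assumes samples are grouped into fixed-width time blocks; the `BlockStore` class and the `BLOCK_SPAN` value are hypothetical choices for this example rather than a description of Prometheus internals.

```python
from typing import Dict, List, Tuple

BLOCK_SPAN = 7200  # assumed block width in seconds, chosen for illustration


class BlockStore:
    """Groups samples into time blocks so old data can be dropped wholesale."""

    def __init__(self) -> None:
        # Maps a block's start time to the samples that fall inside it.
        self.blocks: Dict[int, List[Tuple[int, float]]] = {}

    def append(self, timestamp: int, value: float) -> None:
        # Align the timestamp down to the start of its block.
        start = timestamp - timestamp % BLOCK_SPAN
        self.blocks.setdefault(start, []).append((timestamp, value))

    def drop_before(self, cutoff: int) -> int:
        """Delete every block that ends at or before cutoff; return how many."""
        expired = [s for s in self.blocks if s + BLOCK_SPAN <= cutoff]
        for s in expired:
            del self.blocks[s]
        return len(expired)


store = BlockStore()
for ts, v in [(0, 1.0), (100, 2.0), (7200, 3.0), (14405, 4.0)]:
    store.append(ts, v)

dropped = store.drop_before(14400)  # removes the blocks starting at 0 and 7200
```

Retention then costs one dictionary deletion per block instead of one delete per sample.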

TSDB stores data as two parts: labels (dimension tags) and samples (timestamp‑value pairs). Labels uniquely identify a time series, while samples hold the actual metric values.

<code>{"labels": [{"latency": "500"}], "samples": [{"timestamp": 1473305798, "value": 0.9}]}</code>
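To make the labels/samples split concrete, here is a minimal Python sketch. It assumes a series is identified by its sorted label pairs; `Series`, `Store`, and `series_key` are names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

LabelSet = Tuple[Tuple[str, str], ...]


def series_key(labels: Dict[str, str]) -> LabelSet:
    """Sort label pairs so the same labels always produce the same key."""
    return tuple(sorted(labels.items()))


@dataclass
class Series:
    labels: Dict[str, str]
    # Each sample is a (timestamp, value) pair, appended in time order.
    samples: List[Tuple[int, float]] = field(default_factory=list)


class Store:
    def __init__(self) -> None:
        self._series: Dict[LabelSet, Series] = {}

    def append(self, labels: Dict[str, str], timestamp: int, value: float) -> Series:
        # Identical label sets land in the same series; a new set creates one.
        key = series_key(labels)
        if key not in self._series:
            self._series[key] = Series(dict(labels))
        series = self._series[key]
        series.samples.append((timestamp, value))
        return series


store = Store()
store.append({"latency": "500"}, 1473305798, 0.9)
store.append({"latency": "500"}, 1473305808, 1.1)  # hypothetical second sample
store.append({"latency": "300"}, 1473305798, 0.4)  # hypothetical second series
```

The first two appends share a label set and therefore extend one series, while the third creates a new one.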

The internal structure can be visualized as:

<code>series
│ ... server{latency="500"}
│ ... server{latency="300"}
│ ... server{}
│
<-------- time --------></code>

TSDB uses timeseries:doc:: as the key for values and builds three indexes to accelerate queries:

Series Index: stores ordered label‑key pairs.

Label Index: maps each label to a list of its values and references to the corresponding series.

Time Index: maps time ranges to data blocks, allowing fast skipping of irrelevant segments.
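The label and time indexes can be sketched as follows. This is a simplified illustration rather than the actual on-disk format: it assumes the label index is an inverted index from (label, value) pairs to series IDs, and that blocks are non-overlapping inclusive time ranges. `LabelIndex`, `TimeIndex`, the series IDs, and the second IP address are hypothetical.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from typing import Dict, List, Set, Tuple


class LabelIndex:
    """Inverted index: (label name, label value) -> IDs of matching series."""

    def __init__(self) -> None:
        self._postings: Dict[Tuple[str, str], Set[int]] = defaultdict(set)

    def add(self, series_id: int, labels: Dict[str, str]) -> None:
        for pair in labels.items():
            self._postings[pair].add(series_id)

    def lookup(self, **matchers: str) -> Set[int]:
        # A series must appear in the posting set of every matcher.
        sets = [self._postings.get(pair, set()) for pair in matchers.items()]
        return set.intersection(*sets) if sets else set()


class TimeIndex:
    """Finds the blocks that overlap a queried time range."""

    def __init__(self, blocks: List[Tuple[int, int]]) -> None:
        # Blocks are (start, end) pairs, inclusive and non-overlapping.
        self._blocks = sorted(blocks)
        self._starts = [s for s, _ in self._blocks]
        self._ends = [e for _, e in self._blocks]

    def overlapping(self, start: int, end: int) -> List[Tuple[int, int]]:
        lo = bisect_left(self._ends, start)   # first block ending at or after start
        hi = bisect_right(self._starts, end)  # one past last block starting by end
        return self._blocks[lo:hi]


labels = LabelIndex()
labels.add(1, {"handler": "query", "instance": "10.59.8.110"})
labels.add(2, {"handler": "query", "instance": "10.59.8.111"})
labels.add(3, {"handler": "prometheus", "instance": "10.59.8.110"})
matched = labels.lookup(handler="query", instance="10.59.8.110")

times = TimeIndex([(0, 99), (100, 199), (200, 299)])
blocks = times.overlapping(150, 210)
```

A query first intersects posting sets to find candidate series, then uses the time index to skip every block outside the requested range.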

Data Computation

The powerful storage engine enables complex calculations. Prometheus can select multiple metric series, apply arithmetic operators, and use built‑in functions to perform matrix operations, effectively providing both a data warehouse and a computation platform for monitoring.
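As a rough sketch of what such a calculation involves, the following Python example computes a per-second rate from a range of samples and divides two vectors whose label sets match. It deliberately simplifies: unlike the real Prometheus functions, `simple_rate` ignores counter resets and extrapolation, and `divide` skips zero denominators. All names and values here are assumptions for illustration.

```python
from typing import Dict, List, Tuple

Labels = Tuple[Tuple[str, str], ...]
Vector = Dict[Labels, float]  # one value per series at a single instant


def labels(**kv: str) -> Labels:
    """Build a hashable, order-independent label set."""
    return tuple(sorted(kv.items()))


def simple_rate(samples: List[Tuple[int, float]]) -> float:
    """Per-second increase between the first and last sample of a range.
    Requires at least two samples with different timestamps."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)


def divide(lhs: Vector, rhs: Vector) -> Vector:
    """Divide element-wise, keeping only label sets present on both sides."""
    return {k: lhs[k] / rhs[k] for k in lhs.keys() & rhs.keys() if rhs[k] != 0}


# Hypothetical counter samples over a 10-second window.
rate = simple_rate([(1495435700, 100.0), (1495435710, 150.0)])

# Hypothetical error and total request rates per handler.
errors = {labels(handler="query"): 2.0, labels(handler="prometheus"): 1.0}
totals = {
    labels(handler="query"): 8.0,
    labels(handler="prometheus"): 4.0,
    labels(handler="static"): 10.0,
}
error_ratio = divide(errors, totals)
```

The `static` handler has no entry in `errors`, so it drops out of the result, which is the essence of matching series by their labels.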

One Calculation, Multiple Queries

Because such calculations are resource‑intensive, pre‑computing results is advantageous. Prometheus offers Recording Rules to evaluate expensive expressions in advance and store the results as new time series, enabling a single computation to serve many queries and alerts.
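The idea behind a recording rule can be sketched in a few lines of Python. This is a conceptual model under stated assumptions, not how Prometheus implements rule evaluation: the `RecordingRule` class, the series name `job:http_requests:rate5m`, and the stand-in expression are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Sample = Tuple[int, float]


class RecordingRule:
    """Evaluates an expensive expression and stores the result as a new series."""

    def __init__(self, record: str, expr: Callable[[int], float]) -> None:
        self.record = record  # name of the series that receives the results
        self.expr = expr      # the expensive expression, evaluated at a timestamp
        self.evaluations = 0  # how many times the expression actually ran

    def evaluate(self, store: Dict[str, List[Sample]], now: int) -> None:
        self.evaluations += 1
        store.setdefault(self.record, []).append((now, self.expr(now)))


def expensive_expr(now: int) -> float:
    # Stand-in for a costly aggregation over many series.
    return float(sum(range(1000)))


store: Dict[str, List[Sample]] = {}
rule = RecordingRule("job:http_requests:rate5m", expensive_expr)

# The rule runs once per evaluation interval (two intervals shown here).
for ts in (1495435700, 1495435760):
    rule.evaluate(store, ts)

# Three dashboards or alerts read the stored result without recomputing it.
readers = [store["job:http_requests:rate5m"][-1] for _ in range(3)]
```

The expression runs twice, once per interval, no matter how many consumers read the stored series afterwards.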
