
Master PromQL: From Basics to Advanced Query Techniques for Monitoring

This comprehensive guide walks you through PromQL fundamentals, data types, query expressions, selectors, operators, aggregation, and essential functions, illustrating each concept with real‑world monitoring scenarios and code examples to help you effectively query and analyze time‑series data in Prometheus.

MaGe Linux Operations

PromQL From Beginner to Expert

Table of Contents

Data Types

Gauge Type

Counter Type

Time Series Data

Understanding Time Series Data

Query Types

Query Selectors

Operators

Arithmetic Operators

Comparison Operators

Logical/Set Operators

Vector Matching

Aggregation Operations

Functions

absent_over_time

increase

rate

irate

histogram_quantile

_over_time

count_gt_over_time

Conclusion

PromQL is an essential skill for anyone working in the Prometheus ecosystem. This article focuses on the query language itself and uses production scenarios to illustrate each concept.

Data Types

Prometheus has four metric types: Gauge, Counter, Histogram, and Summary. Gauge and Counter are the fundamental ones; Histogram and Summary are conveniences for client‑side metric collection, and the series they expose can be viewed as combinations of Gauges and Counters.

Gauge Type

A Gauge represents a current state; its value can go up or down and may be positive or negative. Examples include a VM instance status (0 for down, 1 for up), memory usage percentage, recent load, or the number of running processes. Gauges are useful when you care about the current value.

Counter Type

Counters are monotonically increasing values, such as the total number of packets received on a network interface; they only drop back to zero when the process or device restarts. What matters is the increment or rate rather than the absolute value. The ifconfig output below shows cumulative packet counts. Such counters are typically sampled periodically (e.g., every 10 seconds), and the rate is calculated on the server side.

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.206.0.16  netmask 255.255.240.0  broadcast 10.206.15.255
        inet6 fe80::5054:ff:fed2:a180  prefixlen 64  scopeid 0x20<link>
        ether 52:54:00:d2:a1:80  txqueuelen 1000  (Ethernet)
        RX packets 457952401  bytes 125894899868 (117.2 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 518040495  bytes 276312546157 (257.3 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
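The server-side calculation can be sketched in Python. This is only an illustration of the idea, not Prometheus code: the function name per_second and the second sample value are assumptions made up for the example, while the first value is the RX packet count from the output above.

```python
def per_second(prev, cur):
    """Per-second rate between two (timestamp_seconds, counter_value) samples.

    A drop in value is treated as a counter reset: the counter is assumed
    to have restarted from zero, so the new value itself is the increase.
    """
    prev_ts, prev_value = prev
    cur_ts, cur_value = cur
    delta = cur_value - prev_value
    if delta < 0:  # counter reset between the two samples
        delta = cur_value
    return delta / (cur_ts - prev_ts)


# First sample taken from the ifconfig output; the second one is hypothetical.
first = (0, 457952401)
second = (10, 457953401)
print(per_second(first, second))  # 100.0 packets per second
```

The absolute value 457952401 says little by itself; the 100 packets per second derived from two samples is what usually ends up on a dashboard or in an alert rule.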

Time Series Data

PromQL queries time‑series data. Understanding the nature of time‑series data is a prerequisite for using PromQL effectively.

Understanding Time Series Data

Example: memory availability of five machines displayed as a line chart. Each point on the line is a data point (timestamp + value). Switching to Table view shows the latest values for each machine at a specific moment.

Memory availability chart
Table view of memory data

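Samples rarely land exactly on the timestamp being queried, so Prometheus returns the most recent sample inside a lookback window. The window is set with the --query.lookback-delta flag (default 5m); with --query.lookback-delta=2m, as in this example, Prometheus uses the latest sample from the preceding 2 minutes.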
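The lookback rule can be sketched in Python as follows. This is a simplified illustration with hypothetical names and data; it ignores staleness markers and the exact boundary handling of a real Prometheus server.

```python
def sample_at(samples, eval_time, lookback=120):
    """Return the newest (timestamp, value) no older than `lookback` seconds.

    `samples` is a list of (timestamp_seconds, value) sorted by timestamp.
    Returns None when the series has no sample inside the window, which is
    why a series that stopped reporting eventually drops out of the results.
    """
    for ts, value in reversed(samples):
        if ts <= eval_time:
            return (ts, value) if eval_time - ts <= lookback else None
    return None


# Hypothetical memory-availability samples scraped every 15 seconds.
series = [(0, 71.2), (15, 70.8), (30, 70.5)]
print(sample_at(series, 100))  # (30, 70.5): 70 seconds old, inside the window
print(sample_at(series, 200))  # None: the newest sample is 170 seconds old
```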

Query Types

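Expressions like mem_available_percent{app="clickhouse"} are query expressions. An expression evaluates to one of four types: instant vector, range vector, scalar, or string. A plain selector returns an instant vector (one sample per series at a single point in time); appending a range such as [1m] to the selector returns a range vector (all samples of each series within that window).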

Query Selectors

Selectors filter series by label matchers. Supported operators are = (exact match), != (not equal), =~ (regex match), and !~ (regex not match).

{__name__="mem_available_percent", app="clickhouse"}

The metric name is itself stored in the reserved __name__ label, so it can be matched with a regex too, e.g., {__name__=~"mem_.*"}.
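How the four matcher operators behave can be sketched in Python. The helper below is hypothetical and only mirrors two rules of PromQL matching: regex matchers are fully anchored, and a missing label behaves like an empty string. Python's re module stands in for the RE2 syntax Prometheus uses, so the two can differ on advanced patterns.

```python
import re


def matches(labels, matchers):
    """Check one series' labels against a list of (name, op, value) matchers.

    Regex matchers are fully anchored, so "click" does not match "clickhouse".
    A label that is absent is treated as an empty string.
    """
    for name, op, value in matchers:
        actual = labels.get(name, "")
        if op == "=":
            ok = actual == value
        elif op == "!=":
            ok = actual != value
        elif op == "=~":
            ok = re.fullmatch(value, actual) is not None
        elif op == "!~":
            ok = re.fullmatch(value, actual) is None
        else:
            raise ValueError(f"unknown operator: {op}")
        if not ok:
            return False
    return True


series = {"__name__": "mem_available_percent", "app": "clickhouse"}
print(matches(series, [("__name__", "=~", "mem_.*"), ("app", "=", "clickhouse")]))  # True
print(matches(series, [("app", "=~", "click")]))  # False: the regex is anchored
```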

Offset

The offset modifier shifts a selector back in time. For example, sum(http_requests_total{method="GET"} offset 1d) returns the value from one day ago, which can then be compared with the current value.

Operators

PromQL supports arithmetic (+, -, *, /, %, ^) and comparison (==, !=, >, <, >=, <=) operators, enabling server‑side calculations and alert logic.

Arithmetic Operators

+

-

*

/

%

^

Example: compute memory availability from raw metrics:

mem_available{app="clickhouse"} / mem_total{app="clickhouse"} * 100

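By default, a binary operation only pairs series whose label sets are identical (the metric name is ignored). If the label sets differ (e.g., net_bytes_recv includes an interface label while mem_total does not), the operation yields no result: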

net_bytes_recv{app="clickhouse"} / mem_total{app="clickhouse"}
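A small Python sketch shows why the second expression comes back empty. It models an instant vector as a dict keyed by label set; the host label and all sample values are hypothetical.

```python
def divide(lhs, rhs):
    """One-to-one division of two instant vectors.

    Each vector maps a frozenset of (label, value) pairs to a sample value;
    the metric name is left out because it does not take part in matching.
    Only label sets present on both sides produce a result.
    """
    return {labels: value / rhs[labels]
            for labels, value in lhs.items()
            if labels in rhs}


def series(**labels):
    """Build a hashable label set from keyword arguments."""
    return frozenset(labels.items())


# Hypothetical samples for a single host.
mem_available = {series(app="clickhouse", host="a"): 8.0}
mem_total = {series(app="clickhouse", host="a"): 32.0}
net_bytes_recv = {series(app="clickhouse", host="a", interface="eth0"): 1e6}

print(len(divide(mem_available, mem_total)))   # 1: label sets are identical
print(len(divide(net_bytes_recv, mem_total)))  # 0: the interface label has no partner
```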

Comparison Operators

Typical use: alert when memory availability drops below a threshold.

mem_available_percent{app="clickhouse"} < 60

This expression can be used directly in alert rules; if it returns results, an alert is triggered.

Logical/Set Operators

and

or

unless

Example using and to alert on high disk usage only for small disks (total capacity under 500 GiB):

disk_used_percent{app="clickhouse"} > 70
and
disk_total{app="clickhouse"} / 1024 / 1024 / 1024 < 500

Example using or for load alerts:

system_load1{app="clickhouse"} > 8
or
system_load5{app="clickhouse"} > 8

Example using unless to alert on disks with less than 300 GiB free while excluding small disks (total capacity under 1 TiB):

disk_free{app="clickhouse"} / 1024 / 1024 / 1024 < 300
unless
disk_total{app="clickhouse"} / 1024 / 1024 / 1024 < 1024
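The three set operators can be modeled in Python as filters on label sets. This is a sketch under simplifying assumptions: each vector is a dict keyed by a hypothetical (host, mount point) tuple standing in for the full label set, and the values are made up.

```python
def vector_and(lhs, rhs):
    """Keep left-hand series whose label set also appears on the right."""
    return {k: v for k, v in lhs.items() if k in rhs}


def vector_or(lhs, rhs):
    """All left-hand series plus right-hand series with no match on the left."""
    result = dict(lhs)
    for k, v in rhs.items():
        result.setdefault(k, v)
    return result


def vector_unless(lhs, rhs):
    """Keep left-hand series whose label set does not appear on the right."""
    return {k: v for k, v in lhs.items() if k not in rhs}


# Hypothetical results of the two comparisons in the unless example (GiB).
low_free = {("a", "/data1"): 120.0, ("b", "/data2"): 250.0}  # free < 300
small_total = {("a", "/data1"): 500.0}                       # total < 1024

print(vector_unless(low_free, small_total))  # {('b', '/data2'): 250.0}
```

Note that the values always come from the left-hand side; the right-hand side only decides which series survive.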

Vector Matching

Vector matching pairs up series from the two sides of a binary operator. By default all labels must be identical; the on and ignoring keywords restrict the set of labels used for matching.

mysql_slave_status_slave_sql_running == 0
and on (instance)
mysql_slave_status_master_server_id > 0

Example with ignoring from Prometheus documentation:

method_code:http_errors:rate5m{code="500"}
/ ignoring(code)
method:http_requests:rate5m
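One way to picture on and ignoring is as a function that reduces each series to a matching key. The Python helper below is a hypothetical sketch of that idea, using the two series from the ignoring example with an assumed method="get" label.

```python
def match_key(labels, on=None, ignoring=None):
    """Reduce a label dict to the labels that take part in matching."""
    labels = {k: v for k, v in labels.items() if k != "__name__"}
    if on is not None:
        labels = {k: v for k, v in labels.items() if k in on}
    elif ignoring is not None:
        labels = {k: v for k, v in labels.items() if k not in ignoring}
    return frozenset(labels.items())


errors = {"__name__": "method_code:http_errors:rate5m", "method": "get", "code": "500"}
requests = {"__name__": "method:http_requests:rate5m", "method": "get"}

# Default matching: the extra code label keeps the two series apart.
print(match_key(errors) == match_key(requests))  # False

# ignoring(code): both series reduce to {method="get"} and pair up.
print(match_key(errors, ignoring={"code"}) == match_key(requests, ignoring={"code"}))  # True
```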

group_left and group_right

These modifiers handle one‑to‑many or many‑to‑one matches. Example using group_left to attach label_version from kube_pod_labels to request rate vectors:

sum(rate(http_request_count{code=~"^(?:5..)$"}[5m])) by (pod)
*
on (pod) group_left(label_version) kube_pod_labels
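The join performed by group_left can be sketched in Python. The function and the pod names are hypothetical; the sketch assumes the "one" side carries the value 1, as kube_pod_labels does, so the multiplication leaves the request rate unchanged and only the label is added.

```python
def join_group_left(many, one, on, extra):
    """Many-to-one multiply: every series in `many` finds its single partner
    in `one` via the `on` labels and copies the `extra` labels from it.

    Both inputs are lists of (labels_dict, value) pairs.
    """
    index = {}
    for labels, value in one:
        key = tuple(labels.get(name, "") for name in on)
        if key in index:
            raise ValueError(f"duplicate series on the 'one' side for {key}")
        index[key] = (labels, value)

    result = []
    for labels, value in many:
        key = tuple(labels.get(name, "") for name in on)
        if key not in index:
            continue  # no partner, so the series is dropped
        partner_labels, partner_value = index[key]
        merged = dict(labels)
        for name in extra:
            if name in partner_labels:
                merged[name] = partner_labels[name]
        result.append((merged, value * partner_value))
    return result


# Hypothetical 5xx request rates per pod and the matching pod-label series.
request_rate = [({"pod": "web-1"}, 0.5), ({"pod": "web-2"}, 0.25)]
pod_labels = [({"pod": "web-1", "label_version": "v2"}, 1.0)]

print(join_group_left(request_rate, pod_labels, on=["pod"], extra=["label_version"]))
# [({'pod': 'web-1', 'label_version': 'v2'}, 0.5)]
```

web-2 has no entry in pod_labels, so it is missing from the result, which is also how the PromQL expression behaves for pods without a matching kube_pod_labels series.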

Aggregation Operations

PromQL provides aggregation operators such as sum, min, max, avg, count, bottomk, topk, and quantile to compute statistics across series. A by or without clause controls which labels are kept for grouping.

avg(mem_available_percent{app="clickhouse"})
bottomk(2, mem_available_percent{app="clickhouse"})
avg(mem_available_percent{app=~"clickhouse|canal"}) by (app)
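What avg ... by (app) and bottomk do to an instant vector can be sketched in Python. The helpers and the sample values are hypothetical; each series is a (labels, value) pair.

```python
from collections import defaultdict


def avg_by(vector, by):
    """avg(...) by (<labels>): group series by the given labels, then average."""
    groups = defaultdict(list)
    for labels, value in vector:
        key = tuple((name, labels.get(name, "")) for name in by)
        groups[key].append(value)
    return {key: sum(values) / len(values) for key, values in groups.items()}


def bottomk(k, vector):
    """bottomk(k, ...): the k series with the smallest values, labels intact."""
    return sorted(vector, key=lambda item: item[1])[:k]


# Hypothetical mem_available_percent samples.
vector = [
    ({"app": "clickhouse", "host": "a"}, 40.0),
    ({"app": "clickhouse", "host": "b"}, 60.0),
    ({"app": "canal", "host": "c"}, 80.0),
]

print(avg_by(vector, by=["app"]))
# {(('app', 'clickhouse'),): 50.0, (('app', 'canal'),): 80.0}
print(bottomk(2, vector))  # the two series with values 40.0 and 60.0
```

The difference worth noticing is that avg collapses each group into a single series carrying only the grouping labels, whereas bottomk returns the original series unchanged.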

Functions

Prometheus offers many functions; this article highlights a few.

absent_over_time

Returns a single series with the value 1 when the range vector contains no samples, and nothing otherwise. This makes it useful for no‑data alerts.

absent_over_time(system_load_norm_1{ident="tt-fc-dev02.nj"}[5m])

increase

Calculates the increase over a range, applying extrapolation when necessary.

increase(net_bytes_recv{interface="eth0"}[1m])
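📌 The increase function takes the difference between the first and last samples in the window (adjusting for counter resets) and extrapolates it to cover the full requested interval, so the result can be non‑integer even for an integer counter.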

rate

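Computes the per‑second average rate of increase over the window, which is essentially increase divided by the window length in seconds. The comparison below expresses that relationship for a 1‑minute window: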

rate(net_bytes_recv{interface="eth0"}[1m]) == bool increase(net_bytes_recv{interface="eth0"}[1m]) / 60.0
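The relationship between increase and rate, including the extrapolation step, can be sketched in Python. This is a deliberately simplified model with hypothetical sample values: Prometheus additionally limits how far it extrapolates toward the window edges, and that refinement is left out here.

```python
def increase(samples, window_seconds):
    """Simplified increase(): raw counter delta scaled up to the full window.

    `samples` holds the (timestamp_seconds, value) pairs that fall inside
    the window. Counter resets (a drop in value) are compensated for.
    """
    if len(samples) < 2:
        return None  # not enough data in the window
    delta = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        delta += cur - prev if cur >= prev else cur
    covered = samples[-1][0] - samples[0][0]
    return delta * window_seconds / covered


def rate(samples, window_seconds):
    """rate() is increase() divided by the window length in seconds."""
    result = increase(samples, window_seconds)
    return None if result is None else result / window_seconds


# Hypothetical net_bytes_recv samples inside a 60-second window.
samples = [(5, 10000), (20, 10300), (35, 10600), (50, 10900)]
print(increase(samples, 60))  # 1200.0: 900 bytes over 45 s, scaled to 60 s
print(rate(samples, 60))      # 20.0 bytes per second
```

The samples only span 45 of the 60 seconds, which is why the reported increase (1200) is larger than the raw difference between the last and first values (900).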

irate

Uses only the last two samples in the window, which makes it more sensitive to short spikes than rate, which averages over the whole window.

irate vs rate
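The difference in sensitivity can be shown with a small Python sketch. Both helpers are hypothetical simplifications; average_rate stands in for rate without extrapolation or reset handling, and the samples are made up to contain a burst in the final interval.

```python
def irate(samples):
    """Instant rate: per-second slope between the last two samples only."""
    if len(samples) < 2:
        return None
    (t1, v1), (t2, v2) = samples[-2], samples[-1]
    delta = v2 - v1 if v2 >= v1 else v2  # treat a drop as a counter reset
    return delta / (t2 - t1)


def average_rate(samples):
    """Average per-second slope across the whole window, for comparison."""
    (t1, v1), (t2, v2) = samples[0], samples[-1]
    return (v2 - v1) / (t2 - t1)


# Hypothetical counter: steady 10/s, then a burst in the last 15 seconds.
samples = [(0, 0), (15, 150), (30, 300), (45, 450), (60, 1950)]
print(irate(samples))         # 100.0: reflects only the burst
print(average_rate(samples))  # 32.5: the burst is smoothed over 60 seconds
```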

histogram_quantile

Estimates quantiles from histogram buckets. Example calculating the 90th percentile latency:

histogram_quantile(0.9, rate(http_request_duration_seconds_bucket[10m]))

For per‑job quantiles:

histogram_quantile(0.9, sum by (job, le) (rate(http_request_duration_seconds_bucket[10m])))
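The estimate comes from linear interpolation inside the bucket that contains the requested rank. The Python sketch below illustrates that technique on hypothetical bucket counts; it covers only classic cumulative buckets with 0 < q <= 1 and is not the Prometheus implementation.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs sorted by
    upper bound, ending with the +Inf bucket. Observations are assumed to be
    spread evenly inside each bucket, hence linear interpolation.
    """
    if not 0 < q <= 1:
        raise ValueError("this sketch only handles 0 < q <= 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                return lower_bound  # fall back to the highest finite bound
            in_bucket = count - lower_count
            return lower_bound + (upper_bound - lower_bound) * (rank - lower_count) / in_bucket
        lower_bound, lower_count = upper_bound, count
    return math.nan


# Hypothetical latency buckets (le in seconds, cumulative request counts).
buckets = [(0.1, 50), (0.5, 80), (1.0, 100), (math.inf, 100)]
print(histogram_quantile(0.9, buckets))  # about 0.75 seconds
```

With 100 requests, the 90th one falls in the (0.5, 1.0] bucket, halfway through its 20 observations, so the estimate lands halfway between the two bounds. The accuracy of the result therefore depends on how well the bucket boundaries fit the real latency distribution.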

_over_time Functions

Functions ending with _over_time operate on range vectors, e.g., avg_over_time computes the average over the specified window.

avg_over_time(mem_available_percent{ident="10.3.4.5"}[1m])

count_gt_over_time

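Counts how many samples in a range exceed a threshold, which is useful for alerting only after repeated violations. Note that count_gt_over_time is not part of standard PromQL; it is a MetricsQL extension provided by VictoriaMetrics. In stock Prometheus, a subquery such as count_over_time((interface_status > 10)[5m:]) >= 3 gives a comparable result.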

count_gt_over_time(interface_status[5m], 10) >= 3
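Both avg_over_time and count_gt_over_time reduce the samples of one series within a window to a single number. A minimal Python sketch of that behavior, using hypothetical samples:

```python
def avg_over_time(samples):
    """Average of all sample values in the window."""
    values = [value for _, value in samples]
    return sum(values) / len(values) if values else None


def count_gt_over_time(samples, threshold):
    """Count the raw samples in the window whose value exceeds `threshold`."""
    return sum(1 for _, value in samples if value > threshold)


# Hypothetical samples from a 5-minute window, one per minute.
window = [(0, 4.0), (60, 12.0), (120, 16.0), (180, 8.0), (240, 20.0)]
print(avg_over_time(window))           # 12.0
print(count_gt_over_time(window, 10))  # 3, so a ">= 3" alert condition holds
```

Requiring several violations within the window, rather than a single one, is a simple way to keep a brief spike from firing an alert.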

Conclusion

This article covered core PromQL concepts, illustrated with production examples. For deeper exploration, refer to the official Prometheus documentation.
