An Introduction to Prometheus: Metrics Collection, Storage, Querying, Visualization and Alerting
Prometheus is an open-source monitoring system that scrapes metrics from services or exporters, stores them in a time-series database, and lets users query them with PromQL. Data can be visualized in its web UI or in Grafana, and alerts are sent through Alertmanager. This introduction also covers custom Go metrics, service-discovery methods, and the four metric types.
Prometheus is an open‑source, end‑to‑end monitoring solution that provides metric exposition, scraping, storage, visualization and alerting.
The system is built around several components: metric exporters (or the Pushgateway), a time‑series database, a functional query language (PromQL), a web UI, and Alertmanager for notifications.
Metric exposition – Each monitored service is represented as a job with one or more targets. Metrics can be exposed directly by the service, via community exporters (e.g., MySQL, Kafka), or pushed through a Pushgateway. Registration can be static (hard-coded in scrape_configs) or dynamic, using service-discovery mechanisms such as Consul, DNS, or Kubernetes.
scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
Scraping – Prometheus uses a pull model. The global scrape_interval (default 1m; the sample configuration here sets 15s) controls how often metrics are collected, and each job can override it with its own scrape_interval.
global:
  scrape_interval: 15s
Storage and query – Collected samples are stored in Prometheus's built-in TSDB. Users query the data with PromQL via the built-in web UI or external tools such as Grafana.
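Tools such as Grafana reach the stored data through Prometheus's HTTP API. The following is a minimal sketch of an instant query against the /api/v1/query endpoint using only the Go standard library. The server address localhost:9090 is taken from the scrape configuration above and is an assumption about your setup; the type and function names (Sample, parseInstantVector, queryInstant) are hypothetical, and only the vector result type is handled.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"strconv"
)

// Sample is one element of an instant-vector result: a label set and a value.
type Sample struct {
	Labels map[string]string
	Value  float64
}

// apiResponse mirrors only the parts of the /api/v1/query JSON envelope used here.
type apiResponse struct {
	Status string `json:"status"`
	Data   struct {
		ResultType string `json:"resultType"`
		Result     []struct {
			Metric map[string]string `json:"metric"`
			Value  [2]interface{}    `json:"value"` // [unix timestamp, "value as string"]
		} `json:"result"`
	} `json:"data"`
}

// parseInstantVector decodes the body of an instant query that returned a vector.
func parseInstantVector(body []byte) ([]Sample, error) {
	var resp apiResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	if resp.Status != "success" || resp.Data.ResultType != "vector" {
		return nil, fmt.Errorf("unexpected response: status=%q type=%q",
			resp.Status, resp.Data.ResultType)
	}
	samples := make([]Sample, 0, len(resp.Data.Result))
	for _, r := range resp.Data.Result {
		s, ok := r.Value[1].(string)
		if !ok {
			return nil, fmt.Errorf("sample value is not a string")
		}
		v, err := strconv.ParseFloat(s, 64)
		if err != nil {
			return nil, err
		}
		samples = append(samples, Sample{Labels: r.Metric, Value: v})
	}
	return samples, nil
}

// queryInstant sends a PromQL expression to the server at baseURL.
func queryInstant(baseURL, expr string) ([]Sample, error) {
	resp, err := http.Get(baseURL + "/api/v1/query?query=" + url.QueryEscape(expr))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return parseInstantVector(body)
}

func main() {
	// Assumes a Prometheus server is listening on localhost:9090.
	samples, err := queryInstant("http://localhost:9090", "up")
	if err != nil {
		fmt.Println("query failed:", err)
		return
	}
	for _, s := range samples {
		fmt.Println(s.Labels, s.Value)
	}
}
```

Sample values arrive as JSON strings rather than numbers, which is why the sketch parses them with strconv.ParseFloat.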
Metric types – Prometheus defines four metric types:
Counter – monotonically increasing values (e.g., request counts).
Gauge – values that can go up and down (e.g., memory usage).
Histogram – observations counted into configurable buckets, from which quantiles can be estimated at query time.
Summary – quantiles pre-computed on the client side.
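To make the Histogram type more concrete, here is a small standard-library sketch of how bucketed observations turn into the cumulative counts exposed under the le label, together with a sum and a count. The histogram type and its methods are hypothetical and only illustrate the idea; this is not how client_golang is implemented.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// histogram is a hypothetical, simplified stand-in for a Prometheus histogram.
type histogram struct {
	upperBounds []float64 // sorted bucket upper bounds, ending with +Inf
	counts      []uint64  // non-cumulative count per bucket
	sum         float64   // sum of all observed values
	count       uint64    // number of observations
}

// newHistogram copies and sorts the bounds and appends the implicit +Inf bucket.
func newHistogram(bounds []float64) *histogram {
	b := append([]float64(nil), bounds...)
	sort.Float64s(b)
	b = append(b, math.Inf(1))
	return &histogram{upperBounds: b, counts: make([]uint64, len(b))}
}

// observe records v in the first bucket whose upper bound is >= v.
func (h *histogram) observe(v float64) {
	if math.IsNaN(v) {
		return // NaN is simply ignored in this sketch
	}
	i := sort.SearchFloat64s(h.upperBounds, v)
	h.counts[i]++
	h.sum += v
	h.count++
}

// cumulative returns, for each bucket, the number of observations <= its bound.
func (h *histogram) cumulative() []uint64 {
	out := make([]uint64, len(h.counts))
	var running uint64
	for i, c := range h.counts {
		running += c
		out[i] = running
	}
	return out
}

func main() {
	h := newHistogram([]float64{0.1, 0.2, 0.3, 0.4, 0.5})
	for _, v := range []float64{0.05, 0.3, 0.3, 0.7} {
		h.observe(v)
	}
	for i, c := range h.cumulative() {
		fmt.Printf("le=%v count=%d\n", h.upperBounds[i], c)
	}
	fmt.Printf("count=%d\n", h.count)
}
```

With these four observations the le=0.3 bucket reports 3 and the +Inf bucket reports 4, because every bucket includes all observations from the buckets below it.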
Exporting custom metrics (Go) – The client_golang library makes it easy to expose metrics. Example of a simple HTTP handler:
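package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	// promhttp.Handler serves every metric registered with the default registry.
	http.Handle("/metrics", promhttp.Handler())
	// ListenAndServe only returns on failure, so log the error and exit.
	log.Fatal(http.ListenAndServe(":8080", nil))
}
Defining and registering a custom counter: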
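// Requires: import "github.com/prometheus/client_golang/prometheus"
myCounter := prometheus.NewCounter(prometheus.CounterOpts{
	Name: "my_counter_total",
	Help: "custom counter",
})
prometheus.MustRegister(myCounter) // panics if registration fails, e.g. on a duplicate name
myCounter.Add(23)                  // counters only go up; Add panics on a negative value
Defining a histogram with explicit buckets: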
myHistogram := prometheus.NewHistogram(prometheus.HistogramOpts{
	// The exposed series get the _bucket, _sum and _count suffixes automatically,
	// so the base name should not end in _bucket.
	Name:    "my_histogram",
	Help:    "custom histogram",
	Buckets: []float64{0.1, 0.2, 0.3, 0.4, 0.5}, // upper bounds; +Inf is added automatically
})
prometheus.MustRegister(myHistogram)
myHistogram.Observe(0.3)
PromQL basics – A PromQL expression evaluates to one of four types: scalar, string, instant vector, or range vector. Common functions include rate(), irate() and histogram_quantile(); aggregation operators such as sum() can be combined with the by and without clauses.
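As a rough illustration of what rate() computes over the samples of a range vector, the sketch below averages the per-second increase of a counter and compensates for counter resets. It is deliberately simplified: the real rate() also extrapolates towards the edges of the window, which this version does not attempt. The sample type and simpleRate function are hypothetical names.

```go
package main

import "fmt"

// sample is one (timestamp, value) point of a counter series.
type sample struct {
	ts    float64 // Unix time in seconds
	value float64
}

// simpleRate returns the average per-second increase across the samples.
// It reports false when fewer than two samples are available.
func simpleRate(samples []sample) (float64, bool) {
	if len(samples) < 2 {
		return 0, false
	}
	increase := 0.0
	for i := 1; i < len(samples); i++ {
		delta := samples[i].value - samples[i-1].value
		if delta < 0 {
			// A drop means the counter restarted from zero,
			// so the whole new value counts as increase.
			delta = samples[i].value
		}
		increase += delta
	}
	elapsed := samples[len(samples)-1].ts - samples[0].ts
	if elapsed <= 0 {
		return 0, false
	}
	return increase / elapsed, true
}

func main() {
	// One sample every 15s; the counter resets between t=30 and t=45.
	series := []sample{{0, 100}, {15, 130}, {30, 160}, {45, 30}, {60, 60}}
	if r, ok := simpleRate(series); ok {
		fmt.Printf("%.2f requests/s\n", r) // prints "2.00 requests/s"
	}
}
```

The reset handling is the reason rate() should be applied to counters only: a gauge that legitimately goes down would be misread as a restart.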
Instant query example:
go_gc_duration_seconds_count
Range query example (last 5 minutes):
go_gc_duration_seconds_count[5m]
Calculate QPS per endpoint:
sum(rate(http_requests_total{job="demo",method="GET",status="200"}[5m])) by (path)
Grafana integration – Add Prometheus as a data source, create dashboards, and use PromQL queries in panels to visualise metrics.
Alerting with Alertmanager – Define alert rules in a YAML rule file loaded by Prometheus, e.g. trigger when every target of a job has been down for 1 minute:
groups:
  - name: simulator-alert-rule
    rules:
      - alert: HttpSimulatorDown
        expr: sum(up{job="http_srv"}) == 0
        for: 1m
        labels:
          severity: critical
Configure Alertmanager to route alerts to receivers such as email or Slack, and optionally silence alerts via its UI.
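The for: 1m clause in the rule above means the alert does not fire on the first failing evaluation. A rough model of that behaviour, assuming the rule is evaluated at a fixed interval, is sketched below: the alert is pending while the condition has held for less than the for duration, firing once it has held long enough, and inactive again as soon as one evaluation is false. The rule type and the simulate helper are hypothetical and much simpler than Prometheus's rule manager.

```go
package main

import (
	"fmt"
	"time"
)

type alertState string

const (
	inactive alertState = "inactive"
	pending  alertState = "pending"
	firing   alertState = "firing"
)

// rule is a hypothetical, simplified model of a single alerting rule.
type rule struct {
	holdFor     time.Duration // the rule's `for` value
	activeSince time.Time     // when the condition first became true; zero while inactive
}

// evaluate runs one evaluation cycle at time now and returns the new state.
func (r *rule) evaluate(conditionTrue bool, now time.Time) alertState {
	if !conditionTrue {
		r.activeSince = time.Time{} // any false evaluation resets the timer
		return inactive
	}
	if r.activeSince.IsZero() {
		r.activeSince = now
	}
	if now.Sub(r.activeSince) >= r.holdFor {
		return firing
	}
	return pending
}

// simulate evaluates the rule once per interval and records each state.
func simulate(holdForSeconds, intervalSeconds int, conditions []bool) []alertState {
	r := &rule{holdFor: time.Duration(holdForSeconds) * time.Second}
	now := time.Unix(0, 0)
	states := make([]alertState, 0, len(conditions))
	for _, c := range conditions {
		states = append(states, r.evaluate(c, now))
		now = now.Add(time.Duration(intervalSeconds) * time.Second)
	}
	return states
}

func main() {
	// for: 1m with a 15s evaluation interval; the target recovers on the last cycle.
	down := []bool{true, true, true, true, true, false}
	fmt.Println(simulate(60, 15, down))
	// prints "[pending pending pending pending firing inactive]"
}
```

Only firing alerts are handed to Alertmanager for routing, so a short blip that clears within the for window never produces a notification.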
Tencent Cloud Developer
Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.