How Prometheus Works: A Visual Deep‑Dive into Architecture, Metrics, and Alerting
This article visually dissects Prometheus, explaining its architecture, core features, data collection methods, exporter role, PromQL query language, and alerting workflow, while contrasting it with ELK and highlighting practical configuration examples for real‑world monitoring.
1. What is Prometheus?
Prometheus is an open‑source monitoring system that collects time‑series data directly from applications, evaluates it with a powerful rule engine, and provides alerting capabilities. Unlike the ELK stack, which is designed for log collection and long‑term storage, Prometheus focuses on recent trend data and typically retains metrics for 15 days.
The official Prometheus architecture diagram shows a pull‑based model where the server scrapes metrics from targets, stores them locally in a TSDB, and optionally forwards them to external storage.
Prometheus Features
A fully open‑source monitoring tool.
Built on a time‑series database (TSDB) implemented in Go.
Originated at SoundCloud and was inspired by Google's internal Borgmon monitoring system.
Supports multi‑dimensional labels for flexible querying.
Uses a pull‑based data collection model.
Provides both white‑box and black‑box monitoring, making it DevOps‑friendly.
Offers a dedicated Metrics & Alerting model, distinct from logging or tracing.
Rich ecosystem with exporters in many languages.
High single‑node performance: can ingest millions of time‑series and handle thousands of targets.
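To make the multi-dimensional label model in the list above concrete, here is a minimal sketch in Python. It is not how the Prometheus TSDB is implemented; it only illustrates the idea that a series is identified by a metric name plus a set of labels, and that a query selects series by matching labels. The metric name, label names, and values are made up for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

# Illustrative only: a series is identified by its metric name plus label set.
SeriesKey = Tuple[str, FrozenSet[Tuple[str, str]]]


def series_key(name: str, labels: Dict[str, str]) -> SeriesKey:
    """Build a hashable identity for one time series."""
    return (name, frozenset(labels.items()))


def select(store: Dict[SeriesKey, float], name: str,
           **matchers: str) -> List[Tuple[Dict[str, str], float]]:
    """Return (labels, value) pairs whose labels equal every given matcher."""
    out = []
    for (metric, labelset), value in store.items():
        labels = dict(labelset)
        if metric == name and all(labels.get(k) == v for k, v in matchers.items()):
            out.append((labels, value))
    return out


# Hypothetical sample data: three series that share one metric name.
store = {
    series_key("http_requests_total", {"method": "GET", "status": "200"}): 1027.0,
    series_key("http_requests_total", {"method": "GET", "status": "500"}): 3.0,
    series_key("http_requests_total", {"method": "POST", "status": "200"}): 214.0,
}
```

Calling select(store, "http_requests_total", method="GET") returns the two GET series, while matching on status="200" cuts across both methods. That is the flexibility the label model is meant to provide.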
Prometheus Limitations
Designed primarily for performance and availability metrics; not suitable for logs, events, or tracing.
Retention defaults to 15 days, focusing on recent data.
2. Metric Collection in Prometheus
The Web UI shows the list of Targets and their Endpoints, indicating which services are currently being scraped.
Endpoint: the URL exposing metrics.
Target: the host/port combination that Prometheus will scrape.
Example scrape configuration (YAML) for a MySQL exporter:
- job_name: mysqld
  static_configs:
    - targets: ['192.168.0.100:9104']
      labels:
        instance: mysql-exporter
Job: groups targets with the same role.
Instance: identifies a specific exporter instance.
Scraped metrics are stored as time‑series on the Prometheus server and can be optionally forwarded to external storage solutions.
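The following Python sketch shows how a scrape configuration like the one above could be turned into a scrape URL and a set of target labels. It assumes the Prometheus defaults of the http scheme and the /metrics path, and it assumes that an explicitly configured instance label takes precedence over the default (the target address). The function names are hypothetical and are not part of Prometheus.

```python
from typing import Dict, Optional


def scrape_url(address: str, scheme: str = "http",
               metrics_path: str = "/metrics") -> str:
    """Build the URL that would be scraped for one target address."""
    return f"{scheme}://{address}{metrics_path}"


def target_labels(job_name: str, address: str,
                  static_labels: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Labels attached to every sample scraped from the target.

    Assumption: `job` comes from job_name, `instance` defaults to the
    target address, and labels set in the configuration override defaults.
    """
    labels = {"job": job_name, "instance": address}
    labels.update(static_labels or {})
    return labels
```

For the MySQL exporter example, scrape_url("192.168.0.100:9104") yields http://192.168.0.100:9104/metrics, and the configured instance label replaces the address-based default.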
3. Data Collection Methods
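Prometheus supports two collection approaches: direct (the application exposes its own metrics) and indirect (via an Exporter). In both cases the Prometheus server pulls the metrics over HTTP.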
Direct collection involves instrumenting your own application with a Prometheus client library that exposes a /metrics endpoint. Projects like etcd, Kubernetes, and Docker already provide such endpoints.
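As a rough illustration of direct collection, the sketch below exposes a /metrics endpoint using only the Python standard library. A real application would normally use an official Prometheus client library instead of formatting the text by hand. The metric name app_requests_total, the record_request helper, and port 8000 are assumptions made for this example.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

_lock = threading.Lock()
_requests_total = 0  # hypothetical application counter


def record_request() -> None:
    """Called by application code each time it handles a request."""
    global _requests_total
    with _lock:
        _requests_total += 1


def render_metrics() -> str:
    """Render the counter in the Prometheus text exposition format."""
    with _lock:
        value = _requests_total
    return (
        "# HELP app_requests_total Requests handled by the application.\n"
        "# TYPE app_requests_total counter\n"
        f"app_requests_total {value}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Block forever, serving /metrics on the given port (assumed value)."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Running serve() in a background thread of the application and adding host:8000 as a target would let Prometheus scrape the counter directly, with no exporter in between.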
Indirect collection is used for black‑box systems (e.g., OS, Redis, MySQL) where you cannot modify the source. An Exporter runs alongside the target, translates its internal metrics into the Prometheus text format, and exposes them for scraping.
4. Exporter Role
An Exporter acts as a sidecar or agent that converts internal metrics of a black‑box system into a format Prometheus understands, then serves them over HTTP.
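The translation step an exporter performs can be sketched as follows. This is a simplified illustration, not the code of any real exporter: the stats dictionary stands in for whatever the black-box system reports through its own interface, and the redis prefix, the addr label, and the choice to treat every value as a gauge are assumptions.

```python
from typing import Mapping


def escape_label_value(value: str) -> str:
    """Escape backslash, double quote, and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_exposition(stats: Mapping[str, float], prefix: str,
                  labels: Mapping[str, str]) -> str:
    """Translate internal stats into Prometheus text-format lines."""
    label_str = ",".join(
        f'{k}="{escape_label_value(v)}"' for k, v in sorted(labels.items())
    )
    lines = []
    for key in sorted(stats):
        name = f"{prefix}_{key}"
        lines.append(f"# TYPE {name} gauge")
        if label_str:
            lines.append(f"{name}{{{label_str}}} {stats[key]}")
        else:
            lines.append(f"{name} {stats[key]}")
    return "\n".join(lines) + "\n"


# Hypothetical numbers, as if read from the monitored system.
sample_stats = {"connected_clients": 12, "used_memory_bytes": 1048576}
sample_output = to_exposition(sample_stats, "redis", {"addr": "10.0.0.5:6379"})
```

An exporter would serve text like sample_output over HTTP on each scrape, querying the target system at that moment so the values are fresh.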
Common exporters include node_exporter for OS metrics and mysqld_exporter for MySQL. The node_exporter runs on a Linux host, exposing CPU, memory, and disk statistics; the MySQL exporter can be deployed on a separate machine and still query the MySQL instance over the network.
The official list of exporters is available at:
https://prometheus.io/docs/instrumenting/exporters/
5. PromQL – The Query Language
PromQL plays a role similar to SQL, but it is a domain-specific, functional language for selecting and aggregating time-series data. It enables real-time queries, aggregations, and calculations directly against the stored metrics.
PromQL can be used via the Prometheus Web UI, Grafana dashboards, or API clients.
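As an example of the API client route, the sketch below builds an instant-query URL for the Prometheus HTTP API endpoint /api/v1/query and parses the JSON response for an instant vector. The server address localhost:9090, the metric http_requests_total, and the function names are assumptions for illustration; the code builds the URL and parses a response but does not perform the HTTP call itself.

```python
import json
from typing import Dict, List, Tuple
from urllib.parse import urlencode


def instant_query_url(base_url: str, promql: str) -> str:
    """Build the URL for an instant query against /api/v1/query."""
    query_string = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


def parse_vector(payload: str) -> List[Tuple[Dict[str, str], float]]:
    """Extract (labels, value) pairs from an instant-vector response."""
    doc = json.loads(payload)
    if doc.get("status") != "success" or doc["data"]["resultType"] != "vector":
        raise ValueError("expected a successful instant-vector response")
    # Each result carries its labels and a [timestamp, "value"] pair.
    return [(item["metric"], float(item["value"][1]))
            for item in doc["data"]["result"]]


# Per-second request rate over the last five minutes (hypothetical metric).
example_url = instant_query_url("http://localhost:9090",
                                "rate(http_requests_total[5m])")
```

The same PromQL expression can be pasted into the Web UI or a Grafana panel unchanged; only the transport differs.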
Grafana is often paired with Prometheus to visualize metrics.
6. Alerting Workflow
Sending Alerts
When the PromQL expression of an alerting rule returns a result, Prometheus generates an alert and sends it to Alertmanager, a separate component. Alertmanager groups and routes alerts, then forwards them to receivers such as email or DingTalk.
Metrics collection and alert evaluation are decoupled in Prometheus.
Prometheus server periodically evaluates alert rules defined in PromQL.
On a match, an alert is pushed to Alertmanager.
Alertmanager groups alerts, applies routing logic, and delivers them to configured receivers.
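The steps above can be sketched as a small simulation. This is a simplified model under stated assumptions, not the actual Prometheus or Alertmanager code: it assumes a rule must hold for a configured duration before firing (as with the for clause of an alerting rule), and that grouping is done on a chosen set of label names. All names and label values are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class AlertState:
    active_since: Optional[float] = None  # when the condition first held


def evaluate(state: AlertState, condition_holds: bool,
             now: float, for_seconds: float) -> str:
    """One rule evaluation: returns 'inactive', 'pending', or 'firing'."""
    if not condition_holds:
        state.active_since = None
        return "inactive"
    if state.active_since is None:
        state.active_since = now
    return "firing" if now - state.active_since >= for_seconds else "pending"


def group_alerts(alerts: List[Dict[str, str]],
                 group_by: Tuple[str, ...]) -> Dict[Tuple[str, ...], List[Dict[str, str]]]:
    """Group firing alerts by the values of the chosen labels."""
    groups: Dict[Tuple[str, ...], List[Dict[str, str]]] = defaultdict(list)
    for labels in alerts:
        key = tuple(labels.get(name, "") for name in group_by)
        groups[key].append(labels)
    return dict(groups)
```

Grouping means that two alerts sharing the same alertname and job would reach a receiver as one notification rather than two, which is the main reason the delivery step is handled outside the Prometheus server.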
7. Summary
Through a series of diagrams and concrete examples, the article covered Prometheus’s advantages and drawbacks, metric collection mechanisms, direct vs. indirect scraping, the role of exporters, the PromQL query language, and the end‑to‑end alerting pipeline, providing a practical foundation for building cloud‑native monitoring solutions.
Architect
Professional architect sharing high‑quality architecture insights. Topics include high‑availability, high‑performance, high‑stability architectures, big data, machine learning, Java, system and distributed architecture, AI, and practical large‑scale architecture case studies. Open to ideas‑driven architects who enjoy sharing and learning.
