
How Prometheus Transforms Cloud‑Native Monitoring: Architecture, Data Model, and PromQL Basics

This article explains Prometheus' origins, open‑source development, CNCF graduation, core components, time‑series data model, text‑based metric protocol, powerful PromQL queries, service discovery mechanisms, and alerting practices, providing a comprehensive guide for cloud‑native observability.

Java One

Origins and CNCF Graduation

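Prometheus was started at SoundCloud in 2012, is written in Go, and was released as open source on GitHub, with a public announcement in 2015. In 2016 it joined the Cloud Native Computing Foundation (CNCF) as the second hosted project after Kubernetes, and it graduated in 2018. Prometheus 2.0, released in 2017, introduced a rewritten built‑in TSDB; according to the release announcement, CPU usage dropped to 20‑40% and disk space to 33‑50% of what version 1.8 needed, with much lower disk I/O.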

Core Architecture

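The central component is the Prometheus server, which pulls (scrapes) metrics over HTTP from instrumented services or exporters and stores them in a local TSDB. The stored data can be queried through the built‑in web UI or, via the HTTP API, from tools such as Grafana and PromLens. The server finds its targets through DNS, Kubernetes, and other service‑discovery mechanisms, evaluates alerting rules, and sends the resulting alerts to Alertmanager. Samples can also be written to, and read back from, remote storage.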
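The pull model can be sketched in a few lines of Python. This is a minimal illustration, not how the Prometheus server is implemented: `parse_metrics`, `scrape_once`, and the injected `fetch` callable are hypothetical names, the parser skips escaping and trailing timestamps, and using the raw URL as the `instance` label is a simplification.

```python
import re
import time

# Metric name, optional {label set}, then the sample value.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_metrics(text):
    """Parse sample lines into (name, labels, value) tuples.

    Simplified: skips '#' metadata lines and ignores escaped quotes
    in label values and optional trailing timestamps.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, raw_value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(raw_value)))
    return samples


def scrape_once(targets, fetch, now_ms=None):
    """Pull every target once and return timestamped samples.

    fetch is a callable (url -> response text). It is injected so the
    sketch runs without a network; a real one might wrap
    urllib.request.urlopen.
    """
    ts = int(time.time() * 1000) if now_ms is None else now_ms
    scraped = []
    for url in targets:
        for name, labels, value in parse_metrics(fetch(url)):
            # Assumption: tag each sample with the URL it came from.
            scraped.append((name, {**labels, "instance": url}, ts, value))
    return scraped
```

Calling `scrape_once` on a timer, once per scrape interval, and appending the result to storage gives the basic shape of the server's collection loop.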

Data Model

Prometheus stores time series, each identified by a metric name and a set of key‑value labels. A sample consists of a 64‑bit integer timestamp in milliseconds and a 64‑bit floating‑point value. For example, the metric http_requests_total yields three separate series for the label values status="200", status="404", and status="500", with request counts of 8556, 20, and 68 respectively. Another metric, process_open_fds (type gauge), records the current number of open file descriptors, e.g., 32.
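To make the identity rule concrete, here is a small Python sketch of the model, assuming a plain in‑memory dictionary as the store. `SeriesKey`, `series_key`, `append`, and the timestamp value are illustrative names and data, not part of Prometheus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeriesKey:
    """Identity of one time series: metric name plus its label set."""
    name: str
    labels: frozenset  # frozenset of (label_name, label_value) pairs


def series_key(name, **labels):
    return SeriesKey(name, frozenset(labels.items()))


# Hypothetical store: series identity -> list of (timestamp_ms, value).
store = {}


def append(name, timestamp_ms, value, **labels):
    """Append one sample; a new label combination creates a new series."""
    key = series_key(name, **labels)
    store.setdefault(key, []).append((int(timestamp_ms), float(value)))


# The values from the example above, with a made-up timestamp.
append("http_requests_total", 1700000000000, 8556, status="200")
append("http_requests_total", 1700000000000, 20, status="404")
append("http_requests_total", 1700000000000, 68, status="500")
append("process_open_fds", 1700000000000, 32)
```

The store ends up with four series: three for http_requests_total, one per status value, and one for process_open_fds. Changing any label value produces a different series, which is why high‑cardinality labels are costly.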

Metric Transport Protocol

Targets expose a /metrics HTTP endpoint that returns plain text, one item per line. Lines beginning with # carry metadata: # HELP gives an optional description and # TYPE declares the metric type. Every other line is a sample: the metric name, an optional label set in braces, and the value. Example:

# TYPE http_requests_total counter
http_requests_total{status="500"} 68
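A target only has to produce this text. The following Python sketch renders one metric family in that shape; `render_metric` is a hypothetical helper and the help strings are invented for illustration. In practice an official client library would normally do this, including the label‑value escaping that is omitted here.

```python
def render_metric(name, metric_type, help_text, samples):
    """Render one metric family as exposition-format text.

    samples is a list of (labels_dict, value) pairs. Escaping of
    backslashes, quotes, and newlines in label values is omitted.
    """
    lines = [
        f"# HELP {name} {help_text}",
        f"# TYPE {name} {metric_type}",
    ]
    for labels, value in samples:
        if labels:
            pairs = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
            lines.append(f"{name}{{{pairs}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


body = render_metric(
    "http_requests_total",
    "counter",
    "Total HTTP requests.",  # invented help text
    [({"status": "200"}, 8556), ({"status": "404"}, 20), ({"status": "500"}, 68)],
)
body += render_metric("process_open_fds", "gauge", "Open file descriptors.", [({}, 32)])
```

Serving `body` from a /metrics handler is enough for Prometheus to scrape the process.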

PromQL Query Language

Prometheus uses PromQL for querying and processing data. Sample queries include:

Current count of requests with status 500:

http_requests_total{status="500"}

Average value of the 500‑status counter over the last 5 minutes:

avg_over_time(http_requests_total{status="500"}[5m])

Average per‑second rate of 500‑status requests over the last 5 minutes, grouped by path:

avg(rate(http_requests_total{status="500"}[5m])) by (path)

Alert condition that holds when, for a given path, 500‑status requests exceed 5% of all requests over the last 5 minutes:

(sum(rate(http_requests_total{status="500"}[5m])) by (path) / sum(rate(http_requests_total[5m])) by (path)) > 0.05
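The arithmetic behind that ratio can be worked through in Python. This is a deliberately simplified sketch: `simple_rate` takes the increase between the first and last sample in the window and divides by the elapsed time, whereas PromQL's rate() also handles counter resets and extrapolates to the window boundaries. The function names, paths, and sample values are made up.

```python
def simple_rate(samples):
    """Per-second increase of a counter across a window.

    samples is a list of (timestamp_s, value) pairs, oldest first.
    Assumes no counter resets and does no extrapolation.
    """
    if len(samples) < 2:
        return None
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        return None
    return (v1 - v0) / (t1 - t0)


def paths_over_threshold(errors_by_path, total_by_path, threshold=0.05):
    """Return {path: error_ratio} for paths whose ratio exceeds threshold."""
    firing = {}
    for path, total_samples in total_by_path.items():
        total_rate = simple_rate(total_samples)
        error_rate = simple_rate(errors_by_path.get(path, []))
        # As in PromQL, a path missing on either side yields no result.
        if not total_rate or error_rate is None:
            continue
        ratio = error_rate / total_rate
        if ratio > threshold:
            firing[path] = ratio
    return firing


# Counter values at the start and end of a 300 s (5 minute) window.
total = {"/api": [(0, 1000), (300, 1600)], "/home": [(0, 500), (300, 800)]}
errors = {"/api": [(0, 10), (300, 46)], "/home": [(0, 5), (300, 8)]}
firing = paths_over_threshold(errors, total)
```

Here /api has 36 errors out of 600 requests in the window, a ratio of 0.06, so it exceeds the 5% threshold; /home has 3 out of 300, a ratio of 0.01, and does not.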

Service Discovery

Prometheus can discover targets via static configuration files, DNS, Kubernetes, Consul, or custom mechanisms. Short‑lived jobs that may exit before they are scraped can instead push their metrics to the Pushgateway, which holds them until Prometheus scrapes the gateway like any other target.
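A push from a batch job might look like the sketch below, which uses only the Python standard library. It assumes a Pushgateway listening on its default port at localhost:9091; the job name and metric are invented, and `build_push_request` is a hypothetical helper. The request is built but not sent, so the example runs without a gateway.

```python
from urllib.parse import quote
from urllib.request import Request


def build_push_request(gateway_url, job, metrics_text):
    """Build a PUT request that replaces all metrics pushed for a job.

    metrics_text must be exposition-format text ending in a newline.
    """
    url = f"{gateway_url.rstrip('/')}/metrics/job/{quote(job, safe='')}"
    return Request(
        url,
        data=metrics_text.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )


req = build_push_request(
    "http://localhost:9091",  # assumed gateway address
    "nightly_backup",  # hypothetical job name
    "backup_last_success_timestamp_seconds 1700000000\n",
)

# To actually send it, with a Pushgateway running at that address:
# from urllib.request import urlopen
# urlopen(req)
```

PUT replaces every metric previously pushed under the same job grouping, while POST replaces only metrics with matching names. The official client libraries provide helpers for the same operation.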

Alerting

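Alerting combines PromQL expressions with Alertmanager. The Prometheus server evaluates alerting rules; for example, a rule can fire when 500‑status requests exceed a set share of all requests over the last 5 minutes. Firing alerts are sent to Alertmanager, which groups and deduplicates them and delivers notifications through the configured receivers.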
