Step-by-Step Guide to Install, Configure, and Use Prometheus for Monitoring
This tutorial walks you through downloading Prometheus, setting up self‑monitoring, starting the server, opening firewall ports, exploring the built‑in UI, adding Node Exporter targets, configuring scrape jobs, creating recording rules, and visualizing metrics with queries and graphs.
Download and Run Prometheus
wget https://github.com/prometheus/prometheus/releases/download/v2.26.0/prometheus-2.26.0.linux-amd64.tar.gz
tar xvzf prometheus-2.26.0.linux-amd64.tar.gz
cd prometheus-2.26.0.linux-amd64
ls
Before starting, create a basic configuration file named prometheus.yml in the extracted directory.
Configure Prometheus Self‑Monitoring
Prometheus scrapes its own HTTP endpoint to collect metrics about its health. Save the following as prometheus.yml:
global:
  scrape_interval: 15s # scrape targets every 15 seconds unless a job overrides it
  external_labels:
    monitor: 'codelab-monitor'
scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9090']
Refer to the full configuration documentation for more options.
Start Prometheus
# Start Prometheus (stores data under ./data by default)
./prometheus --config.file=prometheus.yml
Visit http://localhost:9090 to view the status page and http://localhost:9090/metrics to verify the metrics endpoint.
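When the server is started from a script, it can help to wait until it answers before moving on. The following is a minimal sketch, assuming curl is installed and the server listens on localhost:9090 as above; it polls the server's /-/ready endpoint. The helper name wait_for_prometheus and its three positional arguments are hypothetical, introduced only for this illustration.

```shell
# wait_for_prometheus is a hypothetical helper, not part of Prometheus.
# Usage: wait_for_prometheus [url] [attempts] [delay_seconds]
wait_for_prometheus() {
  url=${1:-http://localhost:9090/-/ready}
  attempts=${2:-30}
  delay=${3:-1}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f: fail on HTTP errors, -s: silent, -o /dev/null: discard the body
    if curl -fs -o /dev/null --max-time 2 "$url" 2>/dev/null; then
      echo "ready: $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not ready after $attempts attempts: $url" >&2
  return 1
}

# Example (needs a running server):
#   wait_for_prometheus && echo "Prometheus is up"
```

The function returns a non-zero status when the server never answers, so it can gate later steps in a script.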
Open Firewall Port
firewall-cmd --permanent --zone=public --add-port=9090/tcp
firewall-cmd --reload
Use the Expression Browser
Open http://localhost:9090/graph and, within the Graph tab, choose the Table view (shown in the classic UI) to explore metrics such as prometheus_target_interval_length_seconds, which records the actual interval between scrapes. To show only the 99th-percentile series, filter on the quantile label:
prometheus_target_interval_length_seconds{quantile="0.99"}
Count the number of returned series with:
count(prometheus_target_interval_length_seconds)
See the expression language documentation for more details.
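The same expressions can also be sent to the Prometheus HTTP API rather than typed into the browser. This is a minimal sketch, assuming curl is installed and the server from this guide is reachable on localhost:9090; prom_query and the PROM_URL variable are hypothetical names introduced here.

```shell
# prom_query is a hypothetical helper; PROM_URL is an assumed override variable.
# Usage: prom_query '<PromQL expression>'
prom_query() {
  # -G turns the --data-urlencode field into a URL-encoded query string,
  # so braces and quotes in the expression do not need manual escaping.
  curl -fsS -G --max-time 5 \
    --data-urlencode "query=$1" \
    "${PROM_URL:-http://localhost:9090}/api/v1/query"
}

# Examples (need a running server):
#   prom_query 'prometheus_target_interval_length_seconds{quantile="0.99"}'
#   prom_query 'count(prometheus_target_interval_length_seconds)'
```

The API answers with a JSON document whose data.result field holds the returned series.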
Use the Graph UI
Enter an expression like rate(prometheus_tsdb_head_chunks_created_total[1m]) to plot the per‑second rate of chunk creation.
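As a rough illustration of what the graph shows, a per-second rate is approximately the counter's increase across the window divided by the window length in seconds. The sketch below uses made-up sample values and a hypothetical helper name; it ignores the counter-reset handling and extrapolation that the real rate() function performs.

```shell
# per_second_rate is a hypothetical helper, not a Prometheus command.
# Usage: per_second_rate <first_value> <last_value> <window_seconds>
per_second_rate() {
  awk -v first="$1" -v last="$2" -v secs="$3" \
    'BEGIN { printf "%g\n", (last - first) / secs }'
}

# Made-up samples: the counter grows from 1200 to 1260 across a 60-second window.
per_second_rate 1200 1260 60   # prints 1
```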
Start Some Scrape Targets
Download and run Node Exporter to provide example targets:
# Download Node Exporter
wget https://github.com/prometheus/node_exporter/releases/download/v1.1.2/node_exporter-1.1.2.linux-amd64.tar.gz
# Extract and enter the directory
tar -xvzf node_exporter-1.1.2.linux-amd64.tar.gz
cd node_exporter-1.1.2.linux-amd64
# Run three instances on different ports, each in its own terminal
./node_exporter --web.listen-address 127.0.0.1:8001
./node_exporter --web.listen-address 127.0.0.1:8002
./node_exporter --web.listen-address 127.0.0.1:8003
These expose metrics at http://localhost:8001/metrics, http://localhost:8002/metrics, and http://localhost:8003/metrics.
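A quick way to confirm that each endpoint serves data is to count the sample lines it returns. This is a minimal sketch, assuming the text exposition format in which lines starting with # are HELP and TYPE comments; count_samples is a hypothetical helper name.

```shell
# count_samples is a hypothetical helper: it reads text-format metrics on stdin
# and prints the number of sample lines (comment and blank lines are skipped).
count_samples() {
  grep -v -e '^#' -e '^[[:space:]]*$' | wc -l | tr -d ' '
}

# Example (needs the three exporters running):
#   for port in 8001 8002 8003; do
#     printf '%s: ' "$port"
#     curl -fs "http://localhost:$port/metrics" | count_samples
#   done
```

A count of 0 for a port suggests that the exporter on that port is not running or not reachable.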
Configure Prometheus to Monitor the Example Targets
Add a new job named node that scrapes the three endpoints, labeling the first two as production and the third as canary:
global:
  scrape_interval: 15s
  external_labels:
    monitor: 'codelab-monitor'
scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'node'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:8001', 'localhost:8002']
        labels:
          group: 'production'
      - targets: ['localhost:8003']
        labels:
          group: 'canary'
After updating the configuration, restart Prometheus and view the Targets page to confirm all jobs are up.
Configure Recording Rules to Aggregate Data
Create a rule file prometheus.rules.yml that records the per-second rate of CPU time over a 5-minute window, averaged across all CPUs of each instance while keeping the job, instance, and mode labels:
groups:
  - name: cpu-node
    rules:
      - record: job_instance_mode:node_cpu_seconds:avg_rate5m
        expr: avg by (job, instance, mode) (rate(node_cpu_seconds_total[5m]))
Reference this file from prometheus.yml by adding a rule_files section:
rule_files:
  - 'prometheus.rules.yml'
Restart Prometheus and query the new metric job_instance_mode:node_cpu_seconds:avg_rate5m in the expression browser to see the aggregated results.
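Before restarting, the configuration and the rule files it references can be checked with promtool, which ships in the Prometheus tarball next to the prometheus binary. This is a minimal sketch, assuming it is run from the extracted directory; validate_config and the PROMTOOL variable are hypothetical names introduced here.

```shell
# validate_config is a hypothetical wrapper around "promtool check config".
# Usage: validate_config [config_file]
validate_config() {
  config=${1:-prometheus.yml}
  promtool=${PROMTOOL:-./promtool}   # PROMTOOL is an assumed override variable
  if [ ! -f "$config" ]; then
    echo "missing config: $config" >&2
    return 2
  fi
  if [ ! -x "$promtool" ]; then
    echo "promtool not found at $promtool; skipping validation" >&2
    return 3
  fi
  # Checks the main config and every file listed under rule_files
  "$promtool" check config "$config"
}

# Example:
#   validate_config prometheus.yml && ./prometheus --config.file=prometheus.yml
```

Chaining the check with the start command, as in the example, keeps a broken configuration from being loaded.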
Raymond Ops