
Step-by-Step Guide to Install, Configure, and Use Prometheus for Monitoring

This tutorial walks you through downloading Prometheus, setting up self‑monitoring, starting the server, opening firewall ports, exploring the built‑in UI, adding Node Exporter targets, configuring scrape jobs, creating recording rules, and visualizing metrics with queries and graphs.

Raymond Ops

Download and Run Prometheus

wget https://github.com/prometheus/prometheus/releases/download/v2.26.0/prometheus-2.26.0.linux-amd64.tar.gz
tar xvzf prometheus-2.26.0.linux-amd64.tar.gz
cd prometheus-2.26.0.linux-amd64
ls
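Release archives can be checked for integrity before they are unpacked. The sketch below assumes GNU `sha256sum` is installed and that you copy the expected digest from the `sha256sums.txt` file published with the release; the helper name `verify_sha256` is hypothetical.

```shell
# Hypothetical helper: compare a file's SHA-256 digest with an expected value.
# Assumes GNU coreutils `sha256sum` and `awk` are available.
verify_sha256() {
  file=$1
  expected=$2
  actual=$(sha256sum "$file" | awk '{print $1}')
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file" >&2
    return 1
  fi
}

# Example (replace <expected-hash> with the value from sha256sums.txt):
# verify_sha256 prometheus-2.26.0.linux-amd64.tar.gz <expected-hash>
```

The function returns a non-zero status on a mismatch, so it can guard the extraction step in a script.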

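Before starting, you need a basic configuration file named prometheus.yml in the extracted directory. The archive ships with a sample prometheus.yml; replace its contents with the configuration shown in the next section.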

Configure Prometheus Self‑Monitoring

Prometheus scrapes its own HTTP endpoint to collect metrics about its health. Save the following as prometheus.yml:

global:
  scrape_interval: 15s  # default: scrape every target every 15 seconds
  # external_labels must be nested under global
  external_labels:
    monitor: 'codelab-monitor'

scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9090']

Refer to the full configuration documentation for more options.
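A configuration file can be validated before the server is started. The sketch below uses `promtool`, which ships in the same archive as the `prometheus` binary. The wrapper name `check_prom_config` and the `PROMTOOL` override variable are assumptions made for illustration.

```shell
# Validate a Prometheus configuration file before starting the server.
# PROMTOOL is an assumed override variable; it defaults to the promtool
# binary in the current (extracted) directory.
check_prom_config() {
  "${PROMTOOL:-./promtool}" check config "${1:-prometheus.yml}"
}

# Example:
# check_prom_config prometheus.yml && ./prometheus --config.file=prometheus.yml
```

Chaining the check with `&&` means the server only starts when the file parses cleanly.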

Start Prometheus

# Start Prometheus (stores data under ./data by default)
./prometheus --config.file=prometheus.yml

Visit http://localhost:9090 to view the status page and http://localhost:9090/metrics to verify the metrics endpoint.
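When Prometheus is started from a script, it helps to wait until the server reports that it is ready before querying it. The sketch below polls the `/-/ready` management endpoint with `curl`. The helper name `wait_for_prometheus`, the default of 30 attempts, and the `CURL` override variable are assumptions.

```shell
# Poll the Prometheus readiness endpoint until it answers or attempts run out.
# wait_for_prometheus is a hypothetical helper; CURL is an assumed override.
wait_for_prometheus() {
  base_url=${1:-http://localhost:9090}
  attempts=${2:-30}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors such as 503 (not ready yet)
    if "${CURL:-curl}" -fsS -o /dev/null "$base_url/-/ready"; then
      echo "ready"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "not ready after $attempts attempts" >&2
  return 1
}

# Example:
# wait_for_prometheus http://localhost:9090 30
```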

Open Firewall Port

firewall-cmd --permanent --zone=public --add-port=9090/tcp
firewall-cmd --reload

Use the Expression Browser

Open http://localhost:9090/graph and choose the Table view within the Graph tab to explore metrics such as prometheus_target_interval_length_seconds, which records the actual interval between scrapes. To return only the 99th‑percentile series, filter on the quantile label:

prometheus_target_interval_length_seconds{quantile="0.99"}

Count the number of returned series with:

count(prometheus_target_interval_length_seconds)

See the expression language documentation for more details.
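The same expressions can also be evaluated from the command line through the HTTP API's instant-query endpoint, `/api/v1/query`. The sketch below is one way to do that with `curl`. The helper name `prom_query` and the `PROM_URL` and `CURL` variables are assumptions.

```shell
# Run an instant PromQL query against the Prometheus HTTP API.
# prom_query, PROM_URL, and CURL are hypothetical names used for illustration.
prom_query() {
  # -G sends the URL-encoded data as a query string on a GET request
  "${CURL:-curl}" -fsS -G "${PROM_URL:-http://localhost:9090}/api/v1/query" \
    --data-urlencode "query=$1"
}

# Examples:
# prom_query 'prometheus_target_interval_length_seconds{quantile="0.99"}'
# prom_query 'count(prometheus_target_interval_length_seconds)'
```

The response is JSON, so the output can be piped into a JSON processor if one is installed.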

Use the Graph UI

Enter an expression like rate(prometheus_tsdb_head_chunks_created_total[1m]) to plot the per‑second rate of chunk creation.
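To build intuition for what a per‑second rate means, the arithmetic can be sketched for two counter samples. This is a deliberate simplification: PromQL's rate() also handles counter resets and extrapolates to the window boundaries, which the hypothetical `simple_rate` helper below does not.

```shell
# Simplified per-second rate between two counter samples:
#   (v2 - v1) / (t2 - t1)
# Arguments: v1 t1 v2 t2 (values and timestamps in seconds).
# Not equivalent to PromQL rate(): no reset handling, no extrapolation.
simple_rate() {
  awk -v v1="$1" -v t1="$2" -v v2="$3" -v t2="$4" \
    'BEGIN { printf "%.2f\n", (v2 - v1) / (t2 - t1) }'
}

# A counter that grows from 1200 to 1260 over 60 seconds:
simple_rate 1200 0 1260 60   # prints 1.00
```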

Graph UI screenshot

Start Some Scrape Targets

Download and run Node Exporter to provide example targets:

# Download Node Exporter
wget https://github.com/prometheus/node_exporter/releases/download/v1.1.2/node_exporter-1.1.2.linux-amd64.tar.gz
# Extract and enter the directory
tar -xvzf node_exporter-1.1.2.linux-amd64.tar.gz
cd node_exporter-1.1.2.linux-amd64
# Run three instances on different ports (in the background, or use separate terminals)
./node_exporter --web.listen-address 127.0.0.1:8001 &
./node_exporter --web.listen-address 127.0.0.1:8002 &
./node_exporter --web.listen-address 127.0.0.1:8003 &

These expose metrics at http://localhost:8001/metrics, http://localhost:8002/metrics, and http://localhost:8003/metrics.
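A quick loop can confirm that each exporter is answering before Prometheus is pointed at it. The sketch below assumes `curl` is installed; the helper name `check_exporters` and the `CURL` override variable are hypothetical.

```shell
# Report whether the /metrics endpoint answers on each given local port.
# check_exporters is a hypothetical helper; CURL is an assumed override.
check_exporters() {
  status=0
  for port in "$@"; do
    if "${CURL:-curl}" -fsS -o /dev/null "http://localhost:$port/metrics"; then
      echo "port $port: up"
    else
      echo "port $port: down"
      status=1
    fi
  done
  # Non-zero if any endpoint failed to answer
  return "$status"
}

# Example:
# check_exporters 8001 8002 8003
```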

Configure Prometheus to Monitor the Example Targets

Add a new job named node that scrapes the three endpoints, labeling the first two as production and the third as canary:

global:
  scrape_interval: 15s
  # external_labels must be nested under global
  external_labels:
    monitor: 'codelab-monitor'

scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'node'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:8001', 'localhost:8002']
        labels:
          group: 'production'
      - targets: ['localhost:8003']
        labels:
          group: 'canary'

After updating the configuration, restart Prometheus and view the Targets page to confirm all jobs are up.

Targets page screenshot

Configure Recording Rules to Aggregate Data

Create a rule file prometheus.rules.yml that records the per‑second rate of CPU time over a 5‑minute window, averaged across CPUs for each job, instance, and mode:

groups:
  - name: cpu-node
    rules:
      - record: job_instance_mode:node_cpu_seconds:avg_rate5m
        expr: avg by (job, instance, mode) (rate(node_cpu_seconds_total[5m]))

Reference this file from prometheus.yml by adding a rule_files section:

rule_files:
  - 'prometheus.rules.yml'
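A rule file can be syntax-checked before Prometheus loads it. The sketch below uses the `promtool check rules` subcommand from the `promtool` binary shipped in the Prometheus archive. The wrapper name `check_prom_rules` and the `PROMTOOL` override variable are assumptions.

```shell
# Validate a recording/alerting rule file with promtool.
# PROMTOOL is an assumed override variable; it defaults to the promtool
# binary in the current (extracted) directory.
check_prom_rules() {
  "${PROMTOOL:-./promtool}" check rules "${1:-prometheus.rules.yml}"
}

# Example:
# check_prom_rules prometheus.rules.yml
```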

Restart Prometheus and query the new metric job_instance_mode:node_cpu_seconds:avg_rate5m in the expression browser to see the aggregated results.

Recording rule query result