Step-by-Step Installation and Configuration of Prometheus, Alertmanager, Node Exporter, and Grafana for Monitoring and Alerting
This guide walks through downloading, installing, configuring, and verifying Prometheus, Alertmanager, Node Exporter, and Grafana on a Linux server, including service setup, YAML configuration files, and a simple test to trigger and receive an alert via email.
Installation
Node Exporter
Download the node_exporter-0.17.0.linux-amd64.tar.gz package, extract it to /usr/local, create a node_exporter.service systemd unit, then start and enable the service.
tar zxf node_exporter-0.17.0.linux-amd64.tar.gz -C /usr/local
vim /etc/systemd/system/node_exporter.service
[Unit]
Description=node_exporter
After=network.target
[Service]
Restart=on-failure
ExecStart=/usr/local/node_exporter-0.17.0.linux-amd64/node_exporter
[Install]
WantedBy=multi-user.target
systemctl start node_exporter
systemctl status node_exporter
systemctl enable node_exporter
Verify that the exporter is running.
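Beyond systemctl status, one way to check the exporter is to fetch its metrics page and look for node_* samples. The sketch below assumes the exporter listens on its default port 9100 on localhost; has_node_metrics is a hypothetical helper name, not part of Node Exporter.

```shell
# Hypothetical helper: succeeds if the metrics text on stdin
# contains at least one sample whose name starts with "node_".
has_node_metrics() {
  grep -q '^node_'
}

# Usage against a running exporter (assumes the default port 9100):
#   curl -s http://localhost:9100/metrics | has_node_metrics && echo "node_exporter OK"
```

If the pipeline prints nothing, the exporter is either not running or not reachable at that address.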
Alertmanager
Extract the alertmanager-0.17.0.linux-amd64.tar.gz package, create an alertmanager.service unit, then start and enable the service.
tar zxf alertmanager-0.17.0.linux-amd64.tar.gz -C /usr/local
vim /etc/systemd/system/alertmanager.service
[Unit]
Description=Alertmanager
After=network-online.target
[Service]
Restart=on-failure
ExecStart=/usr/local/alertmanager-0.17.0.linux-amd64/alertmanager --config.file=/usr/local/alertmanager-0.17.0.linux-amd64/alertmanager.yml
[Install]
WantedBy=multi-user.target
systemctl start alertmanager
systemctl status alertmanager
systemctl enable alertmanager
netstat -anlpt | grep 9093
Verify that Alertmanager is listening on port 9093.
Prometheus
Extract the prometheus-2.9.2.linux-amd64.tar.gz package, create a prometheus.service unit, and configure the service to use the provided YAML files.
tar zxf prometheus-2.9.2.linux-amd64.tar.gz -C /usr/local
vim /etc/systemd/system/prometheus.service
[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network-online.target
[Service]
Restart=on-failure
ExecStart=/usr/local/prometheus-2.9.2.linux-amd64/prometheus --config.file=/usr/local/prometheus-2.9.2.linux-amd64/prometheus.yml --storage.tsdb.path=/var/lib/prometheus --web.external-url=http://0.0.0.0:9090
[Install]
WantedBy=multi-user.target
Start and enable the Prometheus service.
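The guide lists no commands for this step, so the following is a sketch that mirrors the Node Exporter and Alertmanager sections. The run wrapper and the DRY_RUN variable are hypothetical conveniences: by default the commands are only printed, and setting DRY_RUN=0 executes them. Creating /var/lib/prometheus and running daemon-reload are added as precautions and are not steps from the original.

```shell
# Print each command; execute it only when DRY_RUN=0
# (hypothetical convention for this sketch).
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

start_prometheus() {
  run mkdir -p /var/lib/prometheus   # matches --storage.tsdb.path in the unit
  run systemctl daemon-reload        # ask systemd to re-read unit files
  run systemctl start prometheus
  run systemctl status prometheus --no-pager
  run systemctl enable prometheus
}

start_prometheus
```

Run it once as shown to review the commands, then again with DRY_RUN=0 as root to apply them.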
Grafana
Download the Grafana RPM from the Tsinghua mirror, install it, then start and enable the Grafana server.
wget https://mirrors.tuna.tsinghua.edu.cn/grafana/yum/el7/grafana-5.4.2-1.x86_64.rpm
rpm -ivh grafana-5.4.2-1.x86_64.rpm
systemctl start grafana-server
systemctl status grafana-server
systemctl enable grafana-server
netstat -anlpt | grep 3000
Verify that Grafana is listening on port 3000.
Configuration
Alertmanager
Provide an alertmanager.yml configuration that defines global SMTP settings, routing, receivers, and inhibition rules.
global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.qq.com:465'
  smtp_from: '[email protected]'
  smtp_auth_username: '[email protected]'
  smtp_auth_password: 'xxxkbpfmygbecg'
  smtp_require_tls: false
route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'toemail'
receivers:
  - name: 'toemail'
    email_configs:
      - to: '[email protected]'
        send_resolved: true
  - name: 'web.hook'
    webhook_configs:
      - url: 'http://127.0.0.1:5001/'
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']
Prometheus
Define prometheus.yml with global scrape settings, Alertmanager endpoint, rule files, and scrape jobs for Prometheus itself and the node exporter.
# my global config
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  # scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - localhost:9093
# Load rules
rule_files:
  - "rules/host_rules.yml"
# Scrape configs
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'my target'
    static_configs:
      - targets: ['localhost:9100']
After reloading Prometheus, verify targets and alerts via the web UI or Grafana dashboard.
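The prometheus.yml above loads rules/host_rules.yml, but the guide does not show that file. The sketch below writes one hypothetical rule so that the alert test has something to fire. The alert name InstanceDown, the 1m hold time, the severity label, and the write_host_rules helper are all assumptions for illustration, not the author's rule file. Relative rule_files paths are resolved against the directory that holds prometheus.yml.

```shell
# Hypothetical helper: write a minimal rules/host_rules.yml under the
# directory that contains prometheus.yml (passed as the first argument).
write_host_rules() {
  prom_dir="$1"
  mkdir -p "$prom_dir/rules"
  # Quoted heredoc delimiter keeps {{ $labels.instance }} literal.
  cat > "$prom_dir/rules/host_rules.yml" <<'EOF'
groups:
  - name: host_rules
    rules:
      # Assumed example rule: fire when any scrape target stays down for 1 minute.
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
EOF
}

# Usage for the install path used in this guide:
#   write_host_rules /usr/local/prometheus-2.9.2.linux-amd64
```

The promtool binary shipped in the Prometheus tarball can validate the result with promtool check rules before the service is restarted.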
Alert Testing
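Before triggering a test alert, it can help to confirm that the receiver named in the route section of alertmanager.yml is actually defined under receivers, since a mismatch there stops Alertmanager from loading the file. The following is a rough text-based check, not a YAML parser; route_receiver_defined is a hypothetical helper, and it only looks at the first receiver: line in the file.

```shell
# Hypothetical helper: succeed if the first `receiver:` value in the file
# also appears as a `- name:` entry. Text matching only, not a YAML parser.
route_receiver_defined() {
  config_file="$1"
  # First `receiver:` value with quotes and whitespace stripped.
  name=$(sed -n "s/^[[:space:]]*receiver:[[:space:]]*//p" "$config_file" \
    | head -n 1 | tr -d "'\"[:space:]")
  [ -n "$name" ] || return 1
  grep -Eq "^[[:space:]]*-[[:space:]]*name:[[:space:]]*['\"]?${name}['\"]?[[:space:]]*\$" "$config_file"
}

# Usage for the install path used in this guide:
#   route_receiver_defined /usr/local/alertmanager-0.17.0.linux-amd64/alertmanager.yml && echo "receiver OK"
```

For a full validation, the amtool binary shipped in the Alertmanager tarball provides amtool check-config.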
Simulate node_exporter failure
Stop the node_exporter service to trigger an alert.
systemctl stop node_exporter
Check the configured email inbox to confirm that an alert notification is received.
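While waiting for the email, the Prometheus HTTP query API can confirm that the stopped target is seen as down. The sketch below asks for up == 0 and prints the instance label of each matching series. It assumes Prometheus is on localhost:9090 and that the JSON response is compact (no spaces around colons); down_instances is a hypothetical helper that uses plain text matching rather than a JSON parser.

```shell
# Hypothetical helper: print the `instance` label of every series in a
# Prometheus query response read from stdin (assumes compact JSON).
down_instances() {
  grep -o '"instance":"[^"]*"' | cut -d'"' -f4
}

# Usage (assumes Prometheus on localhost:9090):
#   curl -s -G --data-urlencode 'query=up == 0' \
#     http://localhost:9090/api/v1/query | down_instances
```

With node_exporter stopped, the output should include localhost:9100 once the next scrape has failed. Restart the service with systemctl start node_exporter after the test.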
Final Note
The steps above complete a basic monitoring and alerting setup using Prometheus, Alertmanager, Node Exporter, and Grafana. References: https://jianshu.com/p/e59cfd15612e
DevOps Cloud Academy
Exploring industry DevOps practices and technical expertise.