Master Prometheus: From Basics to Full-Scale Monitoring Deployment
This guide walks through Prometheus fundamentals, architecture, components, service discovery, Docker-based deployment, exporter integration, Alertmanager configuration, Grafana visualization, PromQL queries, and Consul service discovery, providing a complete end‑to‑end monitoring solution for cloud‑native environments.
Prometheus Overview
Prometheus is an open‑source monitoring and alerting system with a time‑series database, originally developed by SoundCloud and now a CNCF project.
Written in Go, Prometheus collects metrics by pulling them over HTTP from exporters, and it can monitor thousands of nodes.
System Architecture
Basic Principle
Prometheus periodically scrapes HTTP endpoints (exporters) exposed by monitored components; no SDK is required. Common exporters exist for Varnish, HAProxy, Nginx, MySQL, and system metrics.
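To make the scrape model concrete, here is a minimal Python sketch of what a scraper has to do with the plain-text payload an exporter returns. The sample payload and the function name parse_metrics are illustrative assumptions, not part of Prometheus itself, and the parser is deliberately simplified (it skips samples that carry a trailing timestamp).

```python
import re

# One sample line looks like: metric_name{label="value",...} 12.5
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse exposition-format text into (name, labels, value) tuples.

    Simplified: comment lines are ignored and lines that do not match the
    basic "name{labels} value" shape are skipped.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # blank, HELP, or TYPE line
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


# Hypothetical payload, shaped like node-exporter output.
EXAMPLE = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5
"""

if __name__ == "__main__":
    for sample in parse_metrics(EXAMPLE):
        print(sample)
```

Each parsed sample is a metric name plus a set of labels, which is the multi-dimensional data model that PromQL queries operate on.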
Workflow:
Prometheus server scrapes metrics from configured jobs or exporters, from the Pushgateway, or from other Prometheus servers.
Metrics are stored locally and alert rules are evaluated; alerts are sent to Alertmanager.
Alertmanager processes alerts (deduplication, grouping, routing) and sends notifications.
Grafana visualizes the collected data.
Key Features
Multi‑dimensional data model.
Powerful query language (PromQL).
Standalone server without external storage dependencies.
HTTP‑based pull collection.
Pushgateway for metric pushes.
Service discovery or static configuration.
Rich visualizations via Grafana and other tools.
Components
Prometheus Server – data collection, storage, and PromQL support.
Alertmanager – handles alerts.
Pushgateway – intermediate gateway for push metrics.
Exporters – expose component metrics over HTTP.
Grafana – web UI for dashboards.
Service Discovery
Because Prometheus pulls metrics, static target lists become cumbersome; service discovery (SD) mechanisms (e.g., Azure, Consul, DNS, EC2, Kubernetes, etc.) automate target discovery. Most of this guide uses static configuration; Consul‑based discovery is covered in the final section.
Deploying Prometheus Server
1. Using the Official Image
Create prometheus.yml and rules.yml locally, then run:
$ docker run -d -p 9090:9090 --name=prometheus \
    -v /root/prometheus/conf/:/etc/prometheus/ \
    prom/prometheus
2. Building a Custom Image
Pull the base image and unpack the binary package:
$ docker pull zhanganmin2017/prometheus:v2.9.0
Directory layout:
prometheus-2.9.0/
├── conf
│   ├── CentOS7-Base-163.repo
│   ├── container-entrypoint
│   ├── epel-7.repo
│   ├── prometheus-start.conf
│   ├── prometheus-start.sh
│   ├── prometheus.yml
│   ├── rules
│   │   └── service_down.yml
│   └── supervisord.conf
├── Dockerfile
└── package
    ├── console_libraries
    ├── consoles
    ├── LICENSE
    ├── NOTICE
    ├── prometheus
    ├── prometheus.yml
    └── promtool
Create prometheus-start.sh to launch Prometheus via Supervisor, and a prometheus-start.conf for Supervisor configuration.
#!/bin/bash
/bin/prometheus \
    --config.file=/data/prometheus/prometheus.yml \
    --storage.tsdb.path=/data/prometheus/data \
    --web.console.libraries=/data/prometheus/console_libraries \
    --web.enable-lifecycle \
    --web.console.templates=/data/prometheus/consoles
Supervisor configuration (prometheus-start.conf) defines how the process is started.
[program:prometheus]
command=sh /etc/supervisord.d/prometheus-start.sh
autostart=false
startsecs=10
autorestart=false
startretries=0
user=root
redirect_stderr=true
stdout_logfile=/data/prometheus/prometheus.log
stopasgroup=true
killasgroup=true
Dockerfile (simplified):
FROM centos:7
MAINTAINER [email protected]
RUN rm -rf /etc/yum.repos.d/*.repo
ADD conf/CentOS7-Base-163.repo /etc/yum.repos.d/
ADD conf/epel-7.repo /etc/yum.repos.d/
RUN yum install -y openssh-server openssh-clients net-tools vim supervisor && yum clean all
RUN ssh-keygen -q -N "" -t rsa -f /etc/ssh/ssh_host_rsa_key && \
    ssh-keygen -q -N "" -t ecdsa -f /etc/ssh/ssh_host_ecdsa_key && \
    ssh-keygen -q -N "" -t ed25519 -f /etc/ssh/ssh_host_ed25519_key && \
    sed -i 's/#UseDNS yes/UseDNS no/g' /etc/ssh/sshd_config
ENV LANG=zh_CN.UTF-8
RUN echo "export LANG=zh_CN.UTF-8" >> /etc/profile.d/lang.sh && \
    ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime && \
    localedef -c -f UTF-8 -i zh_CN zh_CN.utf8
COPY package/prometheus /bin/prometheus
COPY package/promtool /bin/promtool
COPY package/console_libraries/ /usr/local/src/console_libraries/
COPY package/consoles/ /usr/local/src/consoles/
COPY conf/prometheus.yml /usr/local/src/prometheus.yml
COPY conf/rules/ /usr/local/src/rules/
RUN echo "root:123456" | chpasswd
ADD conf/supervisord.conf /etc/supervisord.conf
ADD conf/prometheus-start.conf /etc/supervisord.d/prometheus-start.conf
ADD conf/container-entrypoint /container-entrypoint
ADD conf/prometheus-start.sh /etc/supervisord.d/prometheus-start.sh
RUN chmod +x /container-entrypoint
CMD ["/container-entrypoint"]
Build and run:
$ docker build -t zhanganmin2017/prometheus:v2.9.0 .
$ docker run -itd -h prometheus139-210 -m 8g \
    --cpuset-cpus=28-31 --name=prometheus139-210 \
    --network trust139 --ip=10.1.133.28 \
    -v /data/works/prometheus139-210:/data \
    192.168.166.229/1an/prometheus:v2.9.0
$ docker exec -it prometheus139-210 /bin/bash
$ supervisorctl start prometheus
Access the UI at IP:9090.
Deploying Exporters
1. Host Monitoring (node‑exporter)
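Run node‑exporter with the host's network and PID namespaces and the host root filesystem mounted read‑only. The node‑exporter project does not recommend running it as a Docker container, because it needs access to the host system; if you do, use a command like this: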
$ docker run -d \
    --net="host" \
    --pid="host" \
    -v "/:/host:ro,rslave" \
    quay.io/prometheus/node-exporter \
    --path.rootfs=/host
Add the target to prometheus.yml and reload.
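A reload does not need a container restart when the server runs with --web.enable-lifecycle, the flag passed in the custom start script earlier. In that case an HTTP POST to the /-/reload endpoint makes Prometheus re-read its configuration. The following Python sketch shows one way to send that request. The address http://localhost:9090 and the function names are assumptions for illustration.

```python
import urllib.request


def build_reload_request(base_url):
    """Build a POST request for the Prometheus /-/reload lifecycle endpoint."""
    url = base_url.rstrip("/") + "/-/reload"
    return urllib.request.Request(url, data=b"", method="POST")


def reload_prometheus(base_url, timeout=5):
    """Send the reload request and return the HTTP status code.

    Raises urllib.error.HTTPError if the server rejects the request, for
    example when the lifecycle API is not enabled.
    """
    request = build_reload_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Assumed address; replace with the IP and port of your Prometheus server.
    request = build_reload_request("http://localhost:9090")
    print(request.get_method(), request.full_url)
```

Calling reload_prometheus("http://localhost:9090") performs the actual reload. With the official image from the first deployment option, the flag would have to be added to the container command for this to work.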
2. Container Monitoring (cadvisor‑exporter)
$ docker run -d -h cadvisor139-216 --name=cadvisor139-216 --net=none -m 8g \
    --cpus=4 --ip=10.1.139.216 \
    --volume=/:/rootfs:ro \
    --volume=/var/run:/var/run:rw \
    --volume=/sys:/sys:ro \
    --volume=/var/lib/docker/:/var/lib/docker:ro \
    --volume=/dev/disk/:/dev/disk:ro \
    google/cadvisor:latest
Add the cadvisor job to prometheus.yml and reload.
3. Redis Monitoring (redis‑exporter)
$ docker run -d -h redis_exporter139-218 --name redis_exporter139-218 \
    --network trust139 --ip=10.1.139.218 -m 8g -p 9121:9121 \
    oliver006/redis_exporter --redis.password 123456
Configure the job in prometheus.yml and reload.
4. Application Monitoring (jmx‑exporter)
Download jmx_prometheus_javaagent-0.11.0.jar and a suitable config file, then add to JVM startup:
CATALINA_OPTS="-javaagent:/app/tomcat-8.5.23/lib/jmx_prometheus_javaagent-0.11.0.jar=12345:/app/tomcat-8.5.23/conf/config.yaml"
Add the 12345 port as a target in prometheus.yml.
5. Process Monitoring (process‑exporter)
$ wget https://github.com/ncabatoff/process-exporter/releases/download/v0.5.0/process-exporter-0.5.0.linux-amd64.tar.gz
$ tar -xzvf process-exporter-0.5.0.linux-amd64.tar.gz
# process-name.yaml example
process_names:
  - name: "{{.Matches}}"
    cmdline:
    - 'redis-shake'
$ ./process-exporter -config.path process-name.yaml &
Add the exporter (port 9256) to prometheus.yml and reload.
Deploying Alertmanager
1. Overview
Alertmanager receives alerts from Prometheus, deduplicates, groups, routes, silences, and forwards them to receivers such as email, WeChat, PagerDuty, etc.
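The deduplication and grouping steps can be pictured with a small Python sketch. This is a simplified model under stated assumptions, not Alertmanager's implementation: alerts are plain label dictionaries, the sample alerts are made up, and group_alerts is a hypothetical helper name.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Group alert label sets by the values of the group_by labels.

    Alerts with identical label sets are collapsed first, as a rough
    stand-in for deduplication. A label missing from an alert counts as
    an empty value.
    """
    groups = defaultdict(list)
    seen = set()
    for labels in alerts:
        fingerprint = tuple(sorted(labels.items()))
        if fingerprint in seen:  # the same alert was delivered again
            continue
        seen.add(fingerprint)
        key = tuple(labels.get(name, "") for name in group_by)
        groups[key].append(labels)
    return dict(groups)


# Hypothetical alerts as they might arrive from Prometheus.
ALERTS = [
    {"alertname": "HostMemoryUsage", "instance": "10.0.0.1:9100"},
    {"alertname": "HostMemoryUsage", "instance": "10.0.0.2:9100"},
    {"alertname": "HostMemoryUsage", "instance": "10.0.0.1:9100"},  # duplicate
    {"alertname": "RedisDown", "instance": "10.0.0.3:9121"},
]

if __name__ == "__main__":
    for key, members in group_alerts(ALERTS, ["alertname"]).items():
        print(key, len(members))
```

Grouping by alertname yields one notification per alert type instead of one per host. Grouping by a label that no alert carries puts every alert into a single group, which is worth keeping in mind when choosing group_by values.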
2. Configuration
global:
  resolve_timeout: 2m
  smtp_smarthost: smtp.163.com:25
  smtp_from: [email protected]
  smtp_auth_username: [email protected]
  smtp_auth_password: zxxx
templates:
  - '/data/alertmanager/template/wechat.tmpl'
route:
  group_by: ['alertname_wechat']
  group_wait: 1s
  group_interval: 1s
  receiver: 'wechat'
  repeat_interval: 1h
  routes:
    - receiver: wechat
      match_re:
        severity: wechat
receivers:
  - name: 'email'
    email_configs:
      - to: '[email protected]'
        send_resolved: true
  - name: 'wechat'
    wechat_configs:
      - corp_id: 'wwd402ce40b4720f24'
        to_party: '2'
        agent_id: '1000002'
        api_secret: '9nmYa4p12OkToCbh_oNc'
        send_resolved: true
Run Alertmanager container:
$ docker run -d -p 9093:9093 --name alertmanager \
    -m 8g --cpus=4 \
    -v /opt/alertmanager.yml:/etc/alertmanager/alertmanager.yml \
    -v /opt/template:/etc/alertmanager/template \
    prom/alertmanager:latest
Access UI at IP:9093.
Alert Rules (PromQL)
Example host‑monitoring rule (host_sys.yml):
groups:
  - name: Host
    rules:
      - alert: HostMemoryUsage
        expr: (node_memory_MemTotal_bytes - (node_memory_MemFree_bytes + node_memory_Buffers_bytes + node_memory_Cached_bytes)) / node_memory_MemTotal_bytes * 100 > 90
        for: 1m
        labels:
          name: Memory
          severity: Warning
        annotations:
          summary: "{{ $labels.appname }}"
          description: "Host memory usage exceeds 90%."
          value: "{{ $value }}"
      # Additional CPU, Load, Disk, DiskIO, Network rules omitted for brevity
Similar rule files are created for containers, Redis, and process monitoring.
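The arithmetic in the HostMemoryUsage expression can be checked by hand. The Python sketch below mirrors the expression term by term; the byte counts are made-up sample values and the function names are hypothetical.

```python
def memory_usage_percent(total, free, buffers, cached):
    """Mirror the rule: (total - (free + buffers + cached)) / total * 100."""
    return (total - (free + buffers + cached)) / total * 100


def exceeds_threshold(percent, threshold=90):
    """True when the expression's "> 90" comparison would hold."""
    return percent > threshold


GIB = 1024 ** 3

# Hypothetical host: 16 GiB total, 1.5 GiB free or reclaimable.
usage = memory_usage_percent(
    total=16 * GIB,
    free=0.5 * GIB,
    buffers=0.25 * GIB,
    cached=0.75 * GIB,
)

if __name__ == "__main__":
    print(round(usage, 3), exceeds_threshold(usage))
```

With these numbers the usage is 14.5 / 16 = 90.625%, so the comparison holds. The alert itself only fires once the condition has held for the full 1m given in the rule's for field; until then it stays pending.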
Grafana Visualization
Run Grafana container:
$ docker run -d -h grafana139-211 -m 8g \
    --network trust139 --ip=10.2.139.211 \
    --cpus=4 --name=grafana139-211 \
    -e "GF_SERVER_ROOT_URL=http://10.2.139.211" \
    -e "GF_SECURITY_ADMIN_PASSWORD=passwd" \
    grafana/grafana
Access at IP:3000 (user: admin, password: passwd). Add Prometheus as a data source and import dashboards (e.g., Node‑exporter 8919, Cadvisor 193, JMX‑exporter 8563, Redis‑exporter 2751, Process‑exporter 249).
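The data source can also be added without the UI, through the Grafana HTTP API (POST /api/datasources with basic authentication). The Python sketch below only builds such a request. The data source name "Prometheus" and the function name are assumptions, and the addresses reuse the container IPs from the earlier commands.

```python
import base64
import json
import urllib.request


def build_datasource_request(grafana_url, prometheus_url, user, password):
    """Build a request that registers Prometheus as a Grafana data source."""
    payload = {
        "name": "Prometheus",   # assumed display name
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",      # Grafana's backend queries Prometheus
        "isDefault": True,
    }
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_datasource_request(
        "http://10.2.139.211:3000", "http://10.1.133.28:9090", "admin", "passwd"
    )
    print(request.get_method(), request.full_url)
    # To send it: urllib.request.urlopen(request, timeout=5)
```

Scripting this step is mainly useful when Grafana containers are recreated often, since the data source then does not have to be re-entered by hand.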
Consul Service Discovery
Deploy a Consul cluster using Docker (3 servers, 1 client). Register services via HTTP API, e.g.:
$ curl -X PUT -d '{"id":"192.168.16.173","name":"node-exporter","address":"192.168.16.173","port":9100,"tags":["DEV"],"checks":[{"http":"http://192.168.16.173:9100/","interval":"5s"}]}' \
    http://172.17.0.4:8500/v1/agent/service/register
Prometheus configuration for Consul SD:
- job_name: 'consul'
  consul_sd_configs:
    - server: '192.168.16.173:8900'
      services: []
  relabel_configs:
    - source_labels: [__meta_consul_service]
      regex: 'consul'
      action: drop
    - source_labels: [__meta_consul_service]
      target_label: appname
    - source_labels: [__meta_consul_service_address]
      target_label: instance
    - source_labels: [__meta_consul_tags]
      target_label: job
Reload Prometheus and verify discovered targets in the UI.
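To see what the relabel rules do to a discovered target, the following Python sketch replays them on one label set. It is a simplified model, not Prometheus code: it supports only the drop and replace actions with a single source label, which is all the configuration above uses. The rule dictionaries and the apply_relabel helper are hypothetical names, and the sample target is based on the service registered with curl.

```python
import re


def apply_relabel(labels, rules):
    """Apply relabel rules to one target; return None if it is dropped."""
    labels = dict(labels)
    for rule in rules:
        source = labels.get(rule["source_label"], "")
        # Relabel regexes are anchored, so the whole value must match.
        match = re.fullmatch(rule.get("regex", "(.*)"), source, re.DOTALL)
        action = rule.get("action", "replace")
        if action == "drop":
            if match:
                return None
        elif action == "replace" and match:
            value = match.group(1)  # default replacement is the first group
            if value:
                labels[rule["target_label"]] = value
            else:
                labels.pop(rule["target_label"], None)  # empty value removes it
    return labels


RULES = [
    {"source_label": "__meta_consul_service", "regex": "consul", "action": "drop"},
    {"source_label": "__meta_consul_service", "target_label": "appname"},
    {"source_label": "__meta_consul_service_address", "target_label": "instance"},
    {"source_label": "__meta_consul_tags", "target_label": "job"},
]

# Consul SD joins tags with "," and also wraps the list in the separator.
TARGET = {
    "__meta_consul_service": "node-exporter",
    "__meta_consul_service_address": "192.168.16.173",
    "__meta_consul_tags": ",DEV,",
}

if __name__ == "__main__":
    print(apply_relabel(TARGET, RULES))
    print(apply_relabel({"__meta_consul_service": "consul"}, RULES))
```

Note that the job label ends up as ",DEV," including the separators, because the tag list is copied verbatim. If a clean value such as DEV is wanted, a regex with a capture group on __meta_consul_tags would be needed. Prometheus also discards labels that start with two underscores after relabeling, which this sketch does not model.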
