How to Build a Full‑Stack Monitoring System with Prometheus, Grafana, and OneAlert
This guide walks you through installing Prometheus, configuring node_exporter and mysqld_exporter for remote Linux and MySQL monitoring, visualizing metrics with Grafana, and setting up multi‑level alerts using Grafana integrated with OneAlert for a robust 24/7 operations monitoring solution.
Learning Objectives
After completing this guide, you should be able to:
Install a Prometheus server.
Monitor remote Linux hosts with node_exporter.
Monitor remote MySQL databases with mysqld_exporter.
Install Grafana and add Prometheus as a data source.
Create CPU load graphs and display MySQL monitoring data in Grafana.
Implement alerts using Grafana + OneAlert.
Task Background
A fast-growing e-commerce company requires 24/7 business monitoring and has assigned the operations team to implement it.
Task Requirements
Deploy a monitoring server for continuous real‑time monitoring.
Design a monitoring system for business and R&D, providing reasonable suggestions for metrics and triggers.
Establish a proactive alert mechanism with strict handling procedures.
Implement multi‑level alerts: Level 1 phone, Level 2 WeChat, Level 3 email.
Handle centralized remote monitoring, using Prometheus inside K8s.
Task Analysis
Why monitor? Real‑time data collection and alerts enable quick problem detection and provide data for optimization.
Four monitoring elements:
Monitoring objects: host status, services, resources, pages, URLs.
Monitoring tool: Prometheus (instead of Zabbix).
Monitoring schedule: 7×24 or 5×8.
Alert recipients: administrators.
1. Prometheus Overview
Prometheus, written in Go, is an open‑source monitoring, alerting, and time‑series database solution, especially suited for container and Kubernetes environments. Official site: https://prometheus.io/docs/introduction/overview/
2. Time‑Series Data
What is time‑series data? It records system or device state changes over time.
Typical scenarios include autonomous vehicle telemetry, fleet tracking, real‑time stock trades, and operational monitoring.
Characteristics
Higher write and time-range query performance than general-purpose relational databases for this kind of workload.
Low storage cost thanks to efficient compression (≈3.5 bytes per sample).
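As a rough illustration of what the per-sample figure means for capacity planning, the sketch below multiplies retention time, ingestion rate, and bytes per sample. The workload numbers (1,000 series, 15-second scrape interval, 15 days of retention) are hypothetical, and the 3.5-byte default is simply the figure quoted above; real usage depends on the data.

```python
def estimate_disk_bytes(retention_days, samples_per_second, bytes_per_sample=3.5):
    """Rough disk estimate: retention seconds * ingested samples/s * bytes per sample."""
    retention_seconds = retention_days * 24 * 60 * 60
    return retention_seconds * samples_per_second * bytes_per_sample


# Hypothetical workload: 1,000 time series, one sample each every 15 seconds.
series_count = 1000
scrape_interval_seconds = 15
samples_per_second = series_count / scrape_interval_seconds

total_bytes = estimate_disk_bytes(retention_days=15, samples_per_second=samples_per_second)
print(f"Estimated disk usage: {total_bytes / 1024 ** 2:.1f} MiB")
```

Treat the result as an order-of-magnitude estimate only; it ignores indexes, write-ahead logs, and compaction overhead.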
3. Prometheus Core Features
Multi‑dimensional data model.
Flexible query language (PromQL).
Standalone server nodes (no external storage dependency).
Pull‑based data collection over HTTP (optional push via gateway).
Service discovery or static configuration.
Rich visualization options.
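To make the multi-dimensional data model more concrete: each sample is a metric name plus a set of key/value labels and a numeric value. The sketch below parses one line of the text format that exporters serve over HTTP. It is a simplified illustration, not a full parser: it assumes label values contain no escaped quotes, and it ignores any trailing timestamp.

```python
import re

# metric_name{label="value",...} value
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'   # metric name
    r'(?:\{(?P<labels>.*)\})?'               # optional label set
    r'\s+(?P<value>\S+)'                     # sample value
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_sample(line):
    """Split one exposition-format line into (name, labels, value)."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a sample line: {line!r}")
    labels = dict(LABEL_RE.findall(match.group("labels") or ""))
    return match.group("name"), labels, float(match.group("value"))


name, labels, value = parse_sample('node_cpu_seconds_total{cpu="0",mode="idle"} 12345.67')
print(name, labels, value)
```

Each distinct combination of name and labels identifies its own time series, which is what PromQL filters and aggregates on.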
4. Experiment Environment Preparation
Ensure each host has a static IP, proper hostname, synchronized time, and firewalls/SELinux disabled.
5. Install Prometheus
Download the binary package from https://prometheus.io/download/, extract it, and run the prometheus binary; no compilation is needed.
Access the UI at http://<server_ip>:9090. By default, Prometheus scrapes only itself (the local host); view targets under Status → Targets.
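The same target list shown under Status → Targets is also available from the Prometheus HTTP API at /api/v1/targets. The sketch below summarizes such a response; the helper names are illustrative, and the sample payload is a trimmed, hand-written example rather than real output.

```python
import json
from urllib.request import urlopen


def fetch_targets(base_url):
    """Fetch the raw /api/v1/targets response from a Prometheus server."""
    with urlopen(f"{base_url}/api/v1/targets", timeout=5) as response:
        return json.load(response)


def summarize_targets(payload):
    """Return (job, instance, health) for every active target."""
    return [
        (t["labels"].get("job", ""), t["labels"].get("instance", ""), t["health"])
        for t in payload["data"]["activeTargets"]
    ]


# Hand-written sample in the shape of the API response (not real output).
sample_payload = {
    "status": "success",
    "data": {
        "activeTargets": [
            {"labels": {"job": "prometheus", "instance": "localhost:9090"}, "health": "up"},
        ],
        "droppedTargets": [],
    },
}
print(summarize_targets(sample_payload))
```

Against a running server, summarize_targets(fetch_targets("http://localhost:9090")) would give the live list, assuming Prometheus listens on the default port.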
6. Monitor Remote Linux Hosts
Install node_exporter on the remote host (download from the Prometheus site).
Run it (use nohup to keep it alive after logout).
Verify metrics at http://<remote_ip>:9100/metrics.
Add the remote host to prometheus.yml under scrape_configs.
Reload Prometheus and confirm the new target appears.
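For the prometheus.yml step above, a static scrape job is a short YAML stanza. The sketch below renders one as text so its shape is visible; the job name node and the address 192.0.2.10 are placeholders, and the output is meant to sit under the existing scrape_configs key with matching indentation.

```python
def render_scrape_job(job_name, targets):
    """Render one static scrape job as YAML text for the scrape_configs section."""
    quoted_targets = ", ".join(f"'{target}'" for target in targets)
    return (
        f"  - job_name: '{job_name}'\n"
        f"    static_configs:\n"
        f"      - targets: [{quoted_targets}]\n"
    )


# Placeholder host; 9100 is node_exporter's default port.
print("scrape_configs:")
print(render_scrape_job("node", ["192.0.2.10:9100"]), end="")
```

After editing the file, Prometheus has to re-read it, for example by restarting the process or sending it SIGHUP.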
7. Monitor Remote MySQL
Install mysqld_exporter on the MySQL host (download from the Prometheus site).
Add a mysqld scrape job to prometheus.yml.
Verify the target appears in the Prometheus UI.
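Besides checking the target in the UI, it can help to confirm that the exporter actually reaches MySQL. mysqld_exporter reports this through the mysql_up gauge (1 when the last connection succeeded). The sketch below looks up one metric in exposition text; the helper name is illustrative, the sample text is hand-written, and the parsing assumes lines without trailing timestamps.

```python
def metric_value(metrics_text, metric_name):
    """Return the first value of metric_name in exposition text, or None if absent."""
    for line in metrics_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name_and_labels, _, value = line.rpartition(" ")
        if name_and_labels.split("{", 1)[0] == metric_name:
            return float(value)
    return None


# Hand-written sample of what the exporter's /metrics page might contain.
sample_metrics = (
    "# HELP mysql_up Whether the MySQL server is up.\n"
    "# TYPE mysql_up gauge\n"
    "mysql_up 1\n"
)
print("MySQL reachable:", metric_value(sample_metrics, "mysql_up") == 1.0)
```

On a real host the text would come from the exporter's /metrics endpoint, which listens on port 9104 by default.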
8. Grafana Visualization
Grafana is an open‑source analytics and visualization tool that can query Prometheus data and generate alerts.
Install Grafana from https://grafana.com/grafana/download and log in with the default admin/admin credentials.
Add Prometheus as a data source, then create dashboards for CPU load and MySQL metrics.
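A CPU load panel ultimately runs a PromQL expression against Prometheus. The sketch below builds an instant query for node_load1 (node_exporter's 1-minute load average) against the /api/v1/query endpoint and extracts per-instance values from a response. The job label node, the address 192.0.2.10, and the sample payload are assumptions for illustration.

```python
from urllib.parse import urlencode


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


def extract_values(payload):
    """Map instance -> value from an instant-vector query response."""
    return {
        sample["metric"].get("instance", ""): float(sample["value"][1])
        for sample in payload["data"]["result"]
    }


# Same expression you would type into a Grafana panel's query editor.
promql = 'node_load1{job="node"}'
print(build_query_url("http://localhost:9090", promql))

# Hand-written sample in the shape of an instant-vector response (not real output).
sample_response = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"instance": "192.0.2.10:9100", "job": "node"},
             "value": [1700000000, "0.42"]},
        ],
    },
}
print(extract_values(sample_response))
```

Querying the API directly like this is a quick way to confirm an expression returns data before building a dashboard around it.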
9. Grafana + OneAlert Alerting
Instead of writing Prometheus alerting rules and routing them through Alertmanager, use Grafana alerts integrated with OneAlert.
Add Grafana as an application in OneAlert.
Configure notification policies (phone, WeChat, email).
Create a notification channel in Grafana pointing to OneAlert.
Define a test alert (e.g., CPU load) and verify it triggers.
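The multi-level requirement (Level 1 phone, Level 2 WeChat, Level 3 email) comes down to mapping a severity to a channel. In practice this mapping lives in OneAlert's notification policies, not in code; the sketch below only illustrates the logic, and the load-per-core thresholds are hypothetical values chosen for the example.

```python
# Level -> channel, as stated in the task requirements.
CHANNEL_BY_LEVEL = {1: "phone", 2: "wechat", 3: "email"}


def alert_level(load_per_core):
    """Classify a CPU load reading; thresholds are hypothetical examples."""
    if load_per_core >= 2.0:
        return 1  # most severe
    if load_per_core >= 1.5:
        return 2
    if load_per_core >= 1.0:
        return 3
    return None  # healthy, no alert


def route(load_per_core):
    """Return the notification channel for a reading, or None if no alert fires."""
    level = alert_level(load_per_core)
    return None if level is None else CHANNEL_BY_LEVEL[level]


for reading in (0.3, 1.2, 1.7, 2.5):
    print(reading, "->", route(reading))
```

Keeping the thresholds in one place makes it easier to review them with the business and R&D teams before encoding them in the alerting tool.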
10. Common Alert Failures
Time synchronization issues across servers.
Missing notification content.
Not saving the alert configuration.
Alert state not transitioning to alerting.
Communication problems between Grafana and OneAlert.
Open Source Linux
Focused on sharing Linux/Unix content, covering fundamentals, system development, network programming, automation/operations, cloud computing, and related professional knowledge.