How to Build a Complete MySQL Monitoring Dashboard with Prometheus and Grafana
This guide walks through deploying mysqld_exporter, configuring Prometheus and Grafana, and monitoring essential MySQL metrics such as replication health, query throughput, slow‑query counts, connection usage, and InnoDB buffer‑pool statistics, while also showing how to set up alert rules for proactive database operations.
Overview
For MySQL monitoring many solutions exist, but this article demonstrates a Prometheus + Grafana stack using the official mysqld_exporter to collect a comprehensive set of metrics from production MySQL instances.
Exporter Deployment
1. Install exporter
[root@controller2 opt]# https://github.com/prometheus/mysqld_exporter/releases/download/v0.10.0/mysqld_exporter-0.10.0.linux-amd64.tar.gz
[root@controller2 opt]# tar -xf mysqld_exporter-0.10.0.linux-amd64.tar.gz2. Add MySQL account
GRANT SELECT, PROCESS, SUPER, REPLICATION CLIENT, RELOAD ON *.* TO 'exporter'@'%' IDENTIFIED BY 'localhost';
FLUSH privileges;3. Edit configuration file
# cat /opt/mysqld_exporter-0.10.0.linux-amd64/.my.cnf
[client]
user=exporter
password=1234564. Create systemd service
# cat /etc/systemd/system/mysql_exporter.service
[Unit]
Description=mysql Monitoring System
Documentation=mysql Monitoring System
[Service]
ExecStart=/opt/mysqld_exporter-0.10.0.linux-amd64/mysqld_exporter \
-collect.info_schema.processlist \
-collect.info_schema.innodb_tablespaces \
-collect.info_schema.innodb_metrics \
-collect.perf_schema.tableiowaits \
-collect.perf_schema.indexiowaits \
-collect.perf_schema.tablelocks \
-collect.engine_innodb_status \
-collect.perf_schema.file_events \
-collect.info_schema.processlist \
-collect.binlog_size \
-collect.info_schema.clientstats \
-collect.perf_schema.eventswaits \
-config.my-cnf=/opt/mysqld_exporter-0.10.0.linux-amd64/.my.cnf
[Install]
WantedBy=multi-user.target5. Add job to Prometheus server
- job_name: 'mysql'
static_configs:
- targets: ['192.168.1.11:9104','192.168.1.12:9104']6. Test metrics endpoint
Open http://192.168.1.12:9104/metrics and verify the mysql_up gauge returns 1, indicating the exporter is reachable.
Key Monitoring Metrics
When building any monitoring solution, it is crucial to know which metrics truly reflect MySQL health. The most common categories are replication status, query throughput, slow‑query performance, connection usage, and InnoDB buffer‑pool statistics.
Replication Status
Monitor the IO and SQL threads of the slave. Both Slave_IO_Running and Slave_SQL_Running must be Yes for a healthy replica. The exporter exposes mysql_slave_status_slave_sql_running and mysql_slave_status_seconds_behind_master for these checks.
# HELP mysql_slave_status_slave_sql_running Generic metric from SHOW SLAVE STATUS.
# TYPE mysql_slave_status_slave_sql_running untyped
mysql_slave_status_slave_sql_running{channel_name="",connection_name="",master_host="172.16.1.1",master_uuid=""} 1 # HELP mysql_slave_status_seconds_behind_master Generic metric from SHOW SLAVE STATUS.
# TYPE mysql_slave_status_seconds_behind_master untyped
mysql_slave_status_seconds_behind_master{channel_name="",connection_name="",master_host="172.16.1.1",master_uuid=""} 0Query Throughput
The Questions status variable counts every client query. The exporter provides mysql_global_status_questions, which can be turned into a rate to see query growth over time.
# HELP mysql_global_status_questions Generic metric from SHOW GLOBAL STATUS.
# TYPE mysql_global_status_questions untyped
mysql_global_status_questions 13253PromQL example:
rate(mysql_global_status_questions[2m])Slow‑Query Performance
MySQL tracks slow queries via the Slow_queries variable and the long_query_time threshold. The exporter exposes mysql_global_status_slow_queries.
# HELP mysql_global_status_slow_queries Generic metric from SHOW GLOBAL STATUS.
# TYPE mysql_global_status_slow_queries untyped
mysql_global_status_slow_queries 0PromQL example:
rate(mysql_global_status_slow_queries[5m])Connection Monitoring
Key variables are max_connections, Threads_connected, Threads_running, and Aborted_connects. The exporter maps them to mysql_global_variables_max_connections, mysql_global_status_threads_connected, mysql_global_status_threads_running, and mysql_global_status_aborted_connects respectively.
# HELP mysql_global_variables_max_connections Generic gauge metric from SHOW GLOBAL VARIABLES.
# TYPE mysql_global_variables_max_connections gauge
mysql_global_variables_max_connections 151 # HELP mysql_global_status_threads_connected Generic metric from SHOW GLOBAL STATUS.
# TYPE mysql_global_status_threads_connected untyped
mysql_global_status_threads_connected 41 # HELP mysql_global_status_threads_running Generic metric from SHOW GLOBAL STATUS.
# TYPE mysql_global_status_threads_running untyped
mysql_global_status_threads_running 1 # HELP mysql_global_status_aborted_connects Generic metric from SHOW GLOBAL STATUS.
# TYPE mysql_global_status_aborted_connects untyped
mysql_global_status_aborted_connects 31InnoDB Buffer‑Pool
The buffer‑pool size, read requests, and reads from disk are critical for performance. Exporter metrics include mysql_global_variables_innodb_buffer_pool_size, mysql_global_status_innodb_buffer_pool_read_requests, and mysql_global_status_innodb_buffer_pool_reads.
# HELP mysql_global_variables_innodb_buffer_pool_size Generic gauge metric from SHOW GLOBAL VARIABLES.
# TYPE mysql_global_variables_innodb_buffer_pool_size gauge
mysql_global_variables_innodb_buffer_pool_size 1.34217728e+08 # HELP mysql_global_status_innodb_buffer_pool_read_requests Generic metric from SHOW GLOBAL STATUS.
# TYPE mysql_global_status_innodb_buffer_pool_read_requests untyped
mysql_global_status_innodb_buffer_pool_read_requests 2.7711547168e+10 # HELP mysql_global_status_innodb_buffer_pool_reads Generic metric from SHOW GLOBAL STATUS.
# TYPE mysql_global_status_innodb_buffer_pool_reads untyped
mysql_global_status_innodb_buffer_pool_reads 138Grafana Dashboard
Using the metrics above, create Grafana panels for replication health, query rate, slow‑query count, connection usage, and buffer‑pool activity. The article includes screenshots (kept as images) illustrating each panel.
Alert Rules
Define Prometheus alerting rules for critical conditions such as MySQL down, replication thread failures, high replication lag, too many connections, and excessive slow queries.
groups:
- name: MySQL-rules
rules:
- alert: MySQLStatus
expr: up == 0
for: 5s
labels:
severity: warning
annotations:
summary: "{{ $labels.instance }}: MySQL has stopped!"
description: "Detect MySQL service outage"
- alert: MySQLSlaveIOThread
expr: mysql_slave_status_slave_io_running == 0
for: 5s
labels:
severity: warning
annotations:
summary: "{{ $labels.instance }}: MySQL Slave IO Thread stopped!"
description: "Detect replication IO thread failure"
- alert: MySQLSlaveSQLThread
expr: mysql_slave_status_slave_sql_running == 0
for: 5s
labels:
severity: warning
annotations:
summary: "{{ $labels.instance }}: MySQL Slave SQL Thread stopped!"
description: "Detect replication SQL thread failure"
- alert: MySQLSlaveDelay
expr: mysql_slave_status_seconds_behind_master > 30
for: 5s
labels:
severity: warning
annotations:
summary: "{{ $labels.instance }}: Replication lag > 30s"
description: "Detect high replication delay"
- alert: MySQLTooManyConnections
expr: rate(mysql_global_status_threads_connected[5m]) > 200
for: 2m
labels:
severity: warning
annotations:
summary: "{{ $labels.instance }}: Too many MySQL connections"
description: "Current value: {{ $value }}"
- alert: MySQLTooManySlowQueries
expr: rate(mysql_global_status_slow_queries[5m]) > 3
for: 2m
labels:
severity: warning
annotations:
summary: "{{ $labels.instance }}: High slow‑query rate"
description: "Current value: {{ $value }}"Finally, add the rule files to prometheus.yml under rule_files: and reload Prometheus. The Grafana UI will then display active alerts.
Source: https://blog.51cto.com/xiaoluoge/2476375
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
MaGe Linux Operations
Founded in 2009, MaGe Education is a top Chinese high‑end IT training brand. Its graduates earn 12K+ RMB salaries, and the school has trained tens of thousands of students. It offers high‑pay courses in Linux cloud operations, Python full‑stack, automation, data analysis, AI, and Go high‑concurrency architecture. Thanks to quality courses and a solid reputation, it has talent partnerships with numerous internet firms.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
