Process‑Level Resource Monitoring and Visualization Using Python RPC, MySQL, and Grafana
This article describes a complete solution for monitoring CPU, memory, disk I/O and network usage of individual processes on Linux servers, covering the design of a Python‑based RPC collector, data aggregation, storage in MySQL, alerting, and visualization with Grafana, along with deployment and operational best‑practice notes.
Background – In environments where many instances run on a single machine, lack of per‑process monitoring makes it hard to identify which instance exhausts system resources. The goal is to collect CPU, memory, disk I/O and network metrics at the process level.
Pre‑work – Existing tools such as process_exporter require the target processes to be configured in advance and cannot monitor per‑process network usage, so an approach with automatic process discovery is needed.
Data collection – The implementation first tried reading /proc/<pid>/ files directly. Per‑process CPU usage is not exposed as a ready‑made percentage (the counters in /proc/<pid>/stat are cumulative jiffies that would have to be sampled over time); memory is read from /proc/<pid>/status, I/O from /proc/<pid>/io, and /proc/<pid>/net/dev turned out to report whole‑interface traffic rather than per‑process traffic. The final solution therefore falls back to standard Linux utilities (top, ps, iotop, iftop, df, free, lscpu, etc.) and parses their output.
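The utility‑parsing approach can be sketched as follows. This is a minimal illustration, not the project's actual collector: it only handles `ps`, and the column set and dictionary keys are assumptions chosen to match the payload shown later in the article.

```python
import subprocess

def parse_ps(output):
    """Parse `ps -eo pid,pcpu,pmem,args --no-headers` output into a per-PID dict."""
    metrics = {}
    for line in output.splitlines():
        # Split into at most 4 fields so the command line keeps its spaces.
        parts = line.split(None, 3)
        if len(parts) == 4:
            pid, cpu, mem, cmd = parts
            metrics[pid] = {"cpu": cpu, "mem": mem, "remarks": cmd}
    return metrics

def collect_ps():
    """Run ps and return the parsed per-process CPU/memory metrics."""
    out = subprocess.run(
        ["ps", "-eo", "pid,pcpu,pmem,args", "--no-headers"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_ps(out)
```

Separating the parser from the subprocess call keeps the parsing logic unit‑testable on canned output; the real collector applies the same pattern to top, iotop, and iftop.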
```
# Example: read memory usage of a process
$ grep "VmRSS:" /proc/3948/status
VmRSS: 19797780 kB

# Example: read I/O counters of a process
$ grep "bytes" /proc/3948/io
read_bytes: 7808071458816
write_bytes: 8270093250560
```

Network traffic per process could not be obtained from /proc, so the script records overall interface traffic using iftop and later filters by port.
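The "filter by port" step needs a mapping from ports back to processes. One way to build it, sketched below as an assumption rather than the article's exact method, is to parse the output of `ss -lntp`, which lists each listening TCP port together with the owning PID:

```python
import re

# Regexes matching the standard iproute2 `ss -lntp` output format.
PORT_RE = re.compile(r":(\d+)\s")     # port in the local address column
PID_RE = re.compile(r"pid=(\d+)")     # pid inside the users:(...) column

def port_to_pid(ss_output):
    """Return {port: pid} parsed from the text output of `ss -lntp`."""
    mapping = {}
    for line in ss_output.splitlines():
        port_m = PORT_RE.search(line)
        pid_m = PID_RE.search(line)
        if port_m and pid_m:
            mapping[int(port_m.group(1))] = int(pid_m.group(1))
    return mapping
```

With this table, interface‑level traffic recorded by iftop (which reports endpoints by address and port) can be attributed to individual processes.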
Data analysis – Collected metrics are stored in dictionaries (e.g., top_dic , ps_dic , iotop_dic ) and merged by PID. Process information strings are deduplicated by storing an MD5 hash.
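The merge‑and‑dedup step can be sketched like this. Field names mirror the JSON payload shown below; the exact merge logic of the project's top_dic/ps_dic/iotop_dic dictionaries is assumed:

```python
import hashlib

def merge_by_pid(top_dic, ps_dic, iotop_dic):
    """Fold per-tool dictionaries keyed by PID into one record per process."""
    merged = {}
    for pid in set(top_dic) | set(ps_dic) | set(iotop_dic):
        remarks = ps_dic.get(pid, {}).get("remarks", "")
        merged[pid] = {
            "cpu": top_dic.get(pid, {}).get("cpu", "0"),
            "mem": top_dic.get(pid, {}).get("mem", "0"),
            "io_r": iotop_dic.get(pid, {}).get("io_r", "0"),
            "io_w": iotop_dic.get(pid, {}).get("io_w", "0"),
            # Store an MD5 of the command line so identical process strings
            # are deduplicated server-side instead of stored repeatedly.
            "md5": hashlib.md5(remarks.encode()).hexdigest(),
            "remarks": remarks,
        }
    return merged
```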
```python
import pymysql

def f_connect_mysql():
    """Establish a MySQL connection; return None on failure."""
    try:
        db = pymysql.connect(...)
    except Exception as e:
        f_write_log(...)
        db = None
    return db
```

The final JSON payload sent to the server looks like:
```json
{
  "19991": {"cpu": "50.0", "mem": "12.5", "io_r": "145", "io_w": "14012", "md5": "2932fb...", "remarks": "/opt/soft/mysql57/bin/mysqld ..."},
  "58163": {"cpu": "38.9", "mem": "13.1", "io_r": "16510", "io_w": "1245", "md5": "c9e180...", "remarks": "/opt/soft/mysql57/bin/mysqld ..."}
}
```

Storage and visualization – The server writes the data into MySQL tables (e.g., tb_monitor_process_info, tb_monitor_process_io_info) and Grafana reads these tables to render dashboards for machine‑level and process‑level metrics.
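The server‑side write path can be sketched as building parameterized INSERT statements from the payload. The table name tb_monitor_process_info comes from the article, but the column names below are illustrative assumptions:

```python
def build_insert_rows(host, payload):
    """Return (sql, rows) for cursor.executemany() from the per-PID payload."""
    # Parameterized SQL: values are bound by the driver, never interpolated.
    sql = (
        "INSERT INTO tb_monitor_process_info "
        "(host, pid, cpu, mem, io_r, io_w, md5) "
        "VALUES (%s, %s, %s, %s, %s, %s, %s)"
    )
    rows = [
        (host, pid, p["cpu"], p["mem"], p["io_r"], p["io_w"], p["md5"])
        for pid, p in payload.items()
    ]
    return sql, rows
```

A batch insert via `executemany` keeps one round trip per host report, which matters when many clients report on a short interval.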
Deployment – The project is organized with a conf/config.ini file, a Python virtual environment, and scripts for server start‑up, client deployment, and MySQL initialization. Server and client communicate over SSH with password‑less login, and long MySQL connections are recommended.
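Reading conf/config.ini is straightforward with the standard library. The section and key names below are illustrative assumptions, not the project's actual schema:

```python
import configparser

# A hypothetical config fragment in the shape a conf/config.ini might take.
SAMPLE = """\
[mysql]
host = 127.0.0.1
port = 3306
"""

def load_config(text):
    """Parse INI-format configuration text into a ConfigParser object."""
    cfg = configparser.ConfigParser()
    cfg.read_string(text)
    return cfg
```

In the deployed project the same parser would read the file from disk with `cfg.read("conf/config.ini")`.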
Operational notes – Include SSH key setup, long‑connection usage, threshold‑based filtering to reduce metric volume, timeout handling for spawned subprocesses, strict error propagation in shell pipelines (set -o pipefail), and alerting via Enterprise WeChat bots.
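The subprocess‑timeout note can be sketched with the standard library alone: give every collection command a hard deadline so a hung utility cannot stall the whole client.

```python
import subprocess

def run_with_timeout(cmd, seconds=10):
    """Run cmd and return its stdout, or None if it exceeds the deadline."""
    try:
        return subprocess.run(
            cmd, capture_output=True, text=True, timeout=seconds
        ).stdout
    except subprocess.TimeoutExpired:
        # The child is killed by subprocess.run on timeout; report failure.
        return None
```

Returning None (rather than raising) lets the collection loop skip one slow tool and still report the rest of the metrics for that cycle.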
Usage steps – Clone the repository, configure config.ini , initialize MySQL schema, add host entries to tb_monitor_host_config , import Grafana JSON dashboards, and schedule the server start script via cron.
Conclusion – The solution provides automated per‑process resource monitoring, scalable data collection, and visual analysis, but users should test in a staging environment and adjust thresholds, paths, and versions to match their infrastructure.
Zhuanzhuan Tech
A platform for Zhuanzhuan R&D and industry peers to learn and exchange technology, regularly sharing frontline experience and cutting‑edge topics. We welcome practical discussions and sharing; contact waterystone with any questions.