Process‑Level Resource Monitoring and Visualization Using Python RPC, MySQL, and Grafana
This article describes a complete solution for monitoring CPU, memory, disk I/O and network usage of individual processes on Linux servers, covering the design of a Python‑based RPC collector, data aggregation, storage in MySQL, alerting, and visualization with Grafana, along with deployment and operational best‑practice notes.
Background – In environments where many instances run on a single machine, lack of per‑process monitoring makes it hard to identify which instance exhausts system resources. The goal is to collect CPU, memory, disk I/O and network metrics at the process level.
Pre‑work – Existing tools such as process_exporter require pre‑configuration and cannot monitor network usage, so an automatic discovery approach is needed.
Data collection – The implementation first tried reading the files under /proc/<pid>/ directly. CPU usage was not directly available there. Memory can be read from /proc/<pid>/status and I/O from /proc/<pid>/io, but /proc/<pid>/net/dev turned out to report whole‑interface traffic rather than per‑process traffic. The final solution therefore falls back to standard Linux utilities (top, ps, iotop, iftop, df, free, lscpu, etc.) and parses their output.
# Example: read memory usage of a process
$ grep "VmRSS:" /proc/3948/status
VmRSS: 19797780 kB
# Example: read I/O counters of a process
$ grep "bytes" /proc/3948/io
read_bytes: 7808071458816
write_bytes: 8270093250560
Network traffic per process could not be obtained from /proc, so the script records overall interface traffic using iftop and later filters by port.
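The two /proc reads shown above can also be done from Python. The following is a minimal sketch, not the project's collector: the function names parse_vm_rss_kb, parse_io_counters and read_process_metrics are hypothetical, and it assumes the standard Linux text layout of /proc/<pid>/status and /proc/<pid>/io.

```python
from pathlib import Path


def parse_vm_rss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:   19797780 kB".
            return int(line.split()[1])
    return None  # no VmRSS line, e.g. for kernel threads


def parse_io_counters(io_text):
    """Return read_bytes and write_bytes from /proc/<pid>/io text."""
    counters = {}
    for line in io_text.splitlines():
        key, _, value = line.partition(":")
        # Exact key match, so cancelled_write_bytes is not picked up.
        if key in ("read_bytes", "write_bytes"):
            counters[key] = int(value)
    return counters


def read_process_metrics(pid):
    """Read memory and I/O counters for one PID (hypothetical helper)."""
    proc_dir = Path("/proc") / str(pid)
    try:
        status_text = (proc_dir / "status").read_text()
        io_text = (proc_dir / "io").read_text()
    except OSError:
        return None  # process has exited, or the files are not readable
    metrics = parse_io_counters(io_text)
    metrics["rss_kb"] = parse_vm_rss_kb(status_text)
    return metrics
```

The counters in /proc/<pid>/io are cumulative, so a read or write rate needs two samples taken some interval apart.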
Data analysis – Collected metrics are stored in dictionaries (e.g., top_dic, ps_dic, iotop_dic) and merged by PID. Process information strings are deduplicated by storing an MD5 hash.
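As an illustration of the merge step, the sketch below combines three per‑tool dictionaries into one record per PID and attaches an MD5 hash of the command line. The dictionary shapes are assumptions made for this example (ps_dic maps a PID to its command line, top_dic to cpu and mem values, iotop_dic to io_r and io_w values), and merge_by_pid is a hypothetical name. The project's actual structures may differ.

```python
import hashlib


def merge_by_pid(top_dic, ps_dic, iotop_dic):
    """Merge per-tool dictionaries into one record per PID."""
    merged = {}
    for pid, command in ps_dic.items():
        top = top_dic.get(pid, {})
        io = iotop_dic.get(pid, {})
        merged[pid] = {
            # Missing values default to zero so every record has the same keys.
            "cpu": top.get("cpu", "0.0"),
            "mem": top.get("mem", "0.0"),
            "io_r": io.get("io_r", "0"),
            "io_w": io.get("io_w", "0"),
            # The hash is a short, fixed-length key for the long command line.
            "md5": hashlib.md5(command.encode("utf-8")).hexdigest(),
            "remarks": command,
        }
    return merged
```

Keying on the hash means a long command line can be stored once and referenced by its 32‑character digest afterwards.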
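import pymysql

def f_connect_mysql():
    """Establish a MySQL connection; return None if it fails."""
    try:
        db = pymysql.connect(...)  # connection parameters are elided in the source
    except Exception as e:
        f_write_log(...)  # project logging helper; arguments are elided in the source
        db = None
    return db
The final JSON payload sent to the server looks like: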
{
"19991": {"cpu":"50.0","mem":"12.5","io_r":"145","io_w":"14012","md5":"2932fb...","remarks":"/opt/soft/mysql57/bin/mysqld ..."},
"58163": {"cpu":"38.9","mem":"13.1","io_r":"16510","io_w":"1245","md5":"c9e180...","remarks":"/opt/soft/mysql57/bin/mysqld ..."}
}
Storage and visualization – The server writes the data into MySQL tables (e.g., tb_monitor_process_info, tb_monitor_process_io_info) and Grafana reads these tables to render dashboards for machine‑level and process‑level metrics.
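One possible shape for the server‑side write is sketched below. The column names (host, pid, cpu, mem, md5, collected_at) are assumptions for illustration only, since the real schema of tb_monitor_process_info is not shown here, and payload_to_rows and save_rows are hypothetical helpers. The sketch relies on the DB‑API executemany call, which PyMySQL cursors provide.

```python
import json

# Hypothetical column layout; the real table schema is not shown in this article.
INSERT_SQL = (
    "INSERT INTO tb_monitor_process_info "
    "(host, pid, cpu, mem, md5, collected_at) "
    "VALUES (%s, %s, %s, %s, %s, %s)"
)


def payload_to_rows(host, collected_at, payload_json):
    """Turn the JSON payload (keyed by PID) into parameter tuples."""
    payload = json.loads(payload_json)
    rows = []
    for pid, m in payload.items():
        rows.append(
            (host, int(pid), float(m["cpu"]), float(m["mem"]), m["md5"], collected_at)
        )
    return rows


def save_rows(db, rows):
    """Write the rows through an open connection, e.g. one from f_connect_mysql()."""
    with db.cursor() as cursor:
        # Parameterized statement: values are passed separately from the SQL text.
        cursor.executemany(INSERT_SQL, rows)
    db.commit()
```

Passing the values as parameters rather than formatting them into the SQL string keeps command lines with quotes or other special characters from breaking the statement.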
Deployment – The project is organized around a conf/config.ini file, a Python virtual environment, and scripts for server start‑up, client deployment, and MySQL initialization. Server and client communicate over SSH with password‑less (key‑based) login, and long‑lived MySQL connections are recommended.
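Operational notes – These cover SSH key setup, use of long‑lived connections, threshold‑based filtering to reduce metric volume, timeout handling for subprocesses (with set -o pipefail so that a failed or timed‑out command inside a shell pipeline is not masked by a later command that succeeds), and alerting via Enterprise WeChat bots.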
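Two of these notes, subprocess timeouts and threshold filtering, can be sketched in Python. This is an illustration under stated assumptions rather than the project's code. It uses the timeout argument of subprocess.run instead of the shell‑level approach mentioned above, and the names run_with_timeout and filter_by_threshold, as well as the threshold parameters, are hypothetical.

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run a command and return its stdout, or None on timeout or failure."""
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s, check=True
        )
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError, OSError):
        # Timed out, exited non-zero, or could not be started at all.
        return None
    return result.stdout


def filter_by_threshold(metrics, cpu_min, mem_min):
    """Keep only processes whose CPU or memory percentage reaches a threshold."""
    return {
        pid: m
        for pid, m in metrics.items()
        if float(m["cpu"]) >= cpu_min or float(m["mem"]) >= mem_min
    }
```

Returning None instead of raising lets a collection loop skip one slow or missing utility and still report the metrics it did obtain.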
Usage steps – Clone the repository, configure config.ini, initialize MySQL schema, add host entries to tb_monitor_host_config, import Grafana JSON dashboards, and schedule the server start script via cron.
Conclusion – The solution provides automated per‑process resource monitoring, scalable data collection, and visual analysis, but users should test in a staging environment and adjust thresholds, paths, and versions to match their infrastructure.
Zhuanzhuan Tech
A platform for Zhuanzhuan R&D and industry peers to learn and exchange technology, regularly sharing frontline experience and cutting‑edge topics. We welcome practical discussions and sharing; contact waterystone with any questions.
