How to Build a MySQL Connection & SQL Analyzer for Fast Issue Diagnosis
This article explains how to create a lightweight MySQL tool that extracts connection details from information_schema.processlist, groups them by user, host, database, command and state, fingerprints SQL statements, and correlates them with InnoDB transaction data to quickly pinpoint performance problems.
1. Introduction
When MySQL experiences a sudden spike in active connections, reading through SHOW PROCESSLIST or the information_schema.processlist table row by row becomes impractical. The author built a small utility that automatically gathers connection metrics, groups them by several dimensions, and adds SQL and transaction analysis to help identify the root cause of overload.
2. Connection Analysis
The tool queries information_schema.processlist (or performance_schema.threads when available) and extracts five key dimensions: user, host, db, command, and state. The host field is trimmed to remove the port number so that connections from the same client are aggregated. The processlist columns are:
ID – thread identifier (not used for statistics)
USER – account name, used to count connections per user
HOST – client IP/hostname + port, used for client‑side aggregation
DB – default database, useful for service‑level distribution
COMMAND – current action (Sleep, Query, Connect, Statistics, etc.)
TIME – seconds in current state (used later for analysis)
STATE – detailed status string
INFO – the SQL being executed (handled in the next section)
After grouping by the five dimensions, the tool outputs a table sorted by count. The output can be filtered or reordered dynamically, allowing users to hide idle connections, slave threads, or other irrelevant sessions.
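The grouping step can be sketched in Python as follows. This is a minimal illustration, not the author's implementation: the function names `strip_port` and `group_connections`, the `hide_commands` parameter, and the sample rows are all hypothetical, and rows are assumed to arrive as dictionaries keyed by the processlist column names.

```python
from collections import Counter

def strip_port(host):
    """Drop a trailing ':port' from a processlist HOST value.

    Assumes IPv4 addresses or hostnames; IPv6 literals are not handled.
    """
    name, sep, port = host.rpartition(":")
    # Hosts without a port (e.g. 'localhost') are returned unchanged.
    return name if sep and port.isdigit() else host

def group_connections(rows, hide_commands=()):
    """Count connections per (user, host, db, command, state), largest first."""
    counts = Counter()
    for row in rows:
        if row["COMMAND"] in hide_commands:
            continue  # e.g. hide idle 'Sleep' sessions
        key = (row["USER"], strip_port(row["HOST"]), row["DB"],
               row["COMMAND"], row["STATE"])
        counts[key] += 1
    return counts.most_common()

# Hypothetical sample rows, shaped like information_schema.processlist output.
rows = [
    {"USER": "app", "HOST": "10.0.0.5:51234", "DB": "shop",
     "COMMAND": "Query", "STATE": "Sending data"},
    {"USER": "app", "HOST": "10.0.0.5:51240", "DB": "shop",
     "COMMAND": "Query", "STATE": "Sending data"},
    {"USER": "app", "HOST": "10.0.0.6:40000", "DB": "shop",
     "COMMAND": "Sleep", "STATE": ""},
]

for key, count in group_connections(rows, hide_commands=("Sleep",)):
    print(count, key)
```

Because the port is stripped before counting, the two sessions from 10.0.0.5 collapse into one row, and passing `hide_commands` drops the idle session from the report.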
3. SQL Analysis
To understand what the active connections are doing, the INFO column is examined. Because raw SQL statements often differ only by literal values, the tool normalizes them using a regular‑expression‑based fingerprint similar to pt‑query‑digest. All constant values (string literals, numbers, LIMIT row counts) are replaced with placeholders, turning semantically identical statements into a single canonical form. For example,
SELECT * FROM `xxxxxxxxxxxxxxxxxxxx` `t` WHERE `t`.`ucid`='1000000020018048' LIMIT 1
becomes
SELECT * FROM `xxxxxxxxxxxxxxxxxxxx` `t` WHERE `t`.`ucid`=? LIMIT ?
The fingerprinted statements are then grouped and counted, producing a concise view of the most frequent query patterns.
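A regular-expression fingerprint of this kind might look like the sketch below. It is a simplified stand-in, not the author's rules and not pt‑query‑digest's full algorithm (for instance, it does not collapse IN lists). The names `fingerprint` and `top_patterns` and the table name `orders` are hypothetical.

```python
import re
from collections import Counter

# Each rule is (pattern, replacement), applied in order.
_RULES = [
    (re.compile(r"'(?:[^'\\]|\\.)*'"), "?"),  # single-quoted string literals
    (re.compile(r'"(?:[^"\\]|\\.)*"'), "?"),  # double-quoted string literals
    (re.compile(r"\b\d+(?:\.\d+)?\b"), "?"),  # standalone numeric literals
    (re.compile(r"\s+"), " "),                # collapse runs of whitespace
]

def fingerprint(sql):
    """Replace literal values with '?' so equivalent statements compare equal."""
    out = sql.strip()
    for pattern, replacement in _RULES:
        out = pattern.sub(replacement, out)
    return out

def top_patterns(statements):
    """Group statements by fingerprint and count them, most frequent first."""
    # Rows with an empty or NULL INFO column are skipped.
    return Counter(fingerprint(s) for s in statements if s).most_common()

statements = [
    "SELECT * FROM `orders` `t` WHERE `t`.`ucid`='1000000020018048' LIMIT 1",
    "SELECT * FROM `orders` `t` WHERE `t`.`ucid`='1000000020018049' LIMIT 5",
    None,
]
for pattern, count in top_patterns(statements):
    print(count, pattern)
```

Note that this sketch folds different LIMIT values into the same pattern, which matches the behavior described in the article.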
4. Transaction Analysis
For environments where slow queries are caused by long‑running transactions, the tool joins information_schema.processlist with information_schema.INNODB_TRX on the thread ID (processlist.id = INNODB_TRX.trx_mysql_thread_id). The difference between the current time and trx_started gives an approximate runtime for each transaction. The query below uses TIMESTAMPDIFF rather than NOW() - t.trx_started, because subtracting two DATETIME values directly in MySQL operates on their numeric YYYYMMDDHHMMSS form and does not yield seconds.
SELECT p.*, TIMESTAMPDIFF(SECOND, t.trx_started, NOW()) AS runtime
FROM information_schema.processlist p
JOIN information_schema.INNODB_TRX t
ON p.id = t.trx_mysql_thread_id;
The aggregated results include total runtime (RT) and average runtime (AVGRT) per fingerprinted SQL, helping to surface the truly problematic statements.
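The RT and AVGRT aggregation could be computed along these lines. This is a hedged sketch: it assumes each input sample is a (fingerprinted SQL, runtime in seconds) pair already produced by the earlier steps, and the function name `aggregate_runtime`, the report field names, and the sample data are hypothetical.

```python
from collections import defaultdict

def aggregate_runtime(samples):
    """Summarize transaction runtime per fingerprinted statement.

    samples: iterable of (fingerprint, runtime_seconds) pairs.
    Returns a list of dicts sorted by total runtime, largest first.
    """
    totals = defaultdict(lambda: [0, 0])  # fingerprint -> [count, total seconds]
    for pattern, runtime in samples:
        entry = totals[pattern]
        entry[0] += 1
        entry[1] += runtime
    report = [
        {"sql": pattern, "count": count, "rt": rt, "avgrt": rt / count}
        for pattern, (count, rt) in totals.items()
    ]
    report.sort(key=lambda row: row["rt"], reverse=True)
    return report

# Hypothetical samples: two transactions share a pattern, one stands alone.
samples = [
    ("UPDATE `stock` SET `qty`=? WHERE `id`=?", 10),
    ("SELECT * FROM `orders` WHERE `ucid`=? LIMIT ?", 3),
    ("UPDATE `stock` SET `qty`=? WHERE `id`=?", 20),
]
for row in aggregate_runtime(samples):
    print(row["rt"], row["avgrt"], row["count"], row["sql"])
```

Sorting by total runtime puts patterns that hold transactions open longest at the top; sorting by `avgrt` instead would highlight statements that are individually slow even if rare.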
5. Conclusion
The utility extracts essentially all actionable data from MySQL’s processlist, enriches it with SQL fingerprinting and transaction timing, and presents flexible grouping, filtering, and ordering options. In practice it speeds up incident triage, though future work includes handling varying LIMIT values separately and improving read‑write transaction accuracy.
dbaplus Community
