Unlock Full‑SQL Statistics in MySQL: Using Performance Schema with myawr
By enabling MySQL's Performance Schema and integrating its comprehensive SQL execution metrics into the open‑source myawr tool, this guide shows how to collect, store, and visualize full‑query statistics—beyond slow‑log—through table design, data ingestion, and multi‑dimensional reporting.
Background
When the slow‑log no longer shows performance problems because most queries have been optimized, DBAs still need a way to see the full workload, especially during cache‑miss spikes. The solution is to enable MySQL Performance Schema (PS) and feed its statement‑level statistics into the open‑source myawr tool (https://github.com/noodba/myawr) for collection and analysis.
Enable Performance Schema
Add the following line to my.cnf (or the appropriate MySQL configuration file) and restart the server:
performance_schema=ONActivate Required Consumers and Instruments
Enable the statement‑summary consumer:
UPDATE performance_schema.setup_consumers
SET ENABLED = 'YES'
WHERE NAME LIKE 'events_statements%';Enable the statement instruments and request timing information:
UPDATE performance_schema.setup_instruments
SET ENABLED = 'YES', TIMED = 'YES'
WHERE NAME LIKE 'statement%';Extend myawr to Accept PS Data
A new table myawr_snapshot_events_statements_summary_by_digest is created. It mirrors the structure of performance_schema.events_statements_summary_by_digest and adds the three identifier columns required by myawr ( id, snap_id, host_id). Key digest‑related columns include: SCHEMA_NAME – default database of the statement DIGEST – hash representing the statement pattern DIGEST_TEXT – normalized SQL text without literal values COUNT_STAR – total execution count (cumulative until server restart) SUM_TIMER_WAIT – total execution time (nanoseconds) SUM_ROWS_EXAMINED – total rows scanned SUM_ROWS_SENT – total rows returned to the client SUM_SORT_ROWS – total rows sorted (if available) FIRST_SEEN – timestamp of first execution LAST_SEEN – timestamp of most recent execution
Schema illustration:
Data ingestion filters out stale statements by using the LAST_SEEN column. Only statements executed in the last 24 hours are inserted:
SELECT *
FROM performance_schema.events_statements_summary_by_digest
WHERE LAST_SEEN > DATE_SUB(NOW(), INTERVAL 24 HOUR);The ingestion script runs periodically, writes the result set into the new table, and partitions the table by snapshot time to keep write amplification low.
Managing Data Volume
Full‑statement statistics can generate large data sets. It is recommended to partition myawr_snapshot_events_statements_summary_by_digest by day (or another suitable interval) and to tune the collection frequency so that storage cost and freshness are balanced.
Multi‑Dimensional Reporting
Total Execution Time
SELECT $tid AS host_id,
$start_snap_id,
$end_snap_id,
SUM(SUM_TIMER_WAIT) AS total_exec_time
FROM myawr_snapshot_events_statements_summary_by_digest
WHERE snap_id BETWEEN $start_snap_id AND $end_snap_id
GROUP BY $tid;Total Execution Count
SELECT $tid,
SUM(COUNT_STAR) AS total_exec_cnt
FROM myawr_snapshot_events_statements_summary_by_digest
WHERE snap_id BETWEEN $start_snap_id AND $end_snap_id
GROUP BY $tid;Total Rows Examined
SELECT $tid,
SUM(SUM_ROWS_EXAMINED) AS total_rows_examined
FROM myawr_snapshot_events_statements_summary_by_digest
WHERE snap_id BETWEEN $start_snap_id AND $end_snap_id
GROUP BY $tid;Total Rows Sent
SELECT $tid,
SUM(SUM_ROWS_SENT) AS total_rows_sent
FROM myawr_snapshot_events_statements_summary_by_digest
WHERE snap_id BETWEEN $start_snap_id AND $end_snap_id
GROUP BY $tid;Total Sort Rows
SELECT $tid,
SUM(SUM_SORT_ROWS) AS total_sort_rows
FROM myawr_snapshot_events_statements_summary_by_digest
WHERE snap_id BETWEEN $start_snap_id AND $end_snap_id
GROUP BY $tid;Each query can be visualized in the myawr HTML report, with tables sorted by the chosen metric.
Conclusion
Enabling Performance Schema, capturing its full‑statement statistics, and loading them into myawr gives DBAs a comprehensive view of MySQL workload that complements slow‑log analysis. The approach also opens the door to additional PS‑based insights such as wait events, I/O statistics, and connection metrics, while incurring modest memory and I/O overhead that should be evaluated in a test environment before production rollout.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
