Databases 11 min read

Unlock Full‑SQL Statistics in MySQL: Using Performance Schema with myawr

By enabling MySQL's Performance Schema and integrating its comprehensive SQL execution metrics into the open‑source myawr tool, this guide shows how to collect, store, and visualize full‑query statistics—beyond slow‑log—through table design, data ingestion, and multi‑dimensional reporting.

dbaplus Community
dbaplus Community
dbaplus Community
Unlock Full‑SQL Statistics in MySQL: Using Performance Schema with myawr

Background

When the slow‑log no longer shows performance problems because most queries have been optimized, DBAs still need a way to see the full workload, especially during cache‑miss spikes. The solution is to enable MySQL Performance Schema (PS) and feed its statement‑level statistics into the open‑source myawr tool (https://github.com/noodba/myawr) for collection and analysis.

Enable Performance Schema

Add the following line to my.cnf (or the appropriate MySQL configuration file) and restart the server:

performance_schema=ON

Activate Required Consumers and Instruments

Enable the statement‑summary consumer:

UPDATE performance_schema.setup_consumers
SET ENABLED = 'YES'
WHERE NAME LIKE 'events_statements%';

Enable the statement instruments and request timing information:

UPDATE performance_schema.setup_instruments
SET ENABLED = 'YES', TIMED = 'YES'
WHERE NAME LIKE 'statement%';

Extend myawr to Accept PS Data

A new table myawr_snapshot_events_statements_summary_by_digest is created. It mirrors the structure of performance_schema.events_statements_summary_by_digest and adds the three identifier columns required by myawr ( id, snap_id, host_id). Key digest‑related columns include: SCHEMA_NAME – default database of the statement DIGEST – hash representing the statement pattern DIGEST_TEXT – normalized SQL text without literal values COUNT_STAR – total execution count (cumulative until server restart) SUM_TIMER_WAIT – total execution time (nanoseconds) SUM_ROWS_EXAMINED – total rows scanned SUM_ROWS_SENT – total rows returned to the client SUM_SORT_ROWS – total rows sorted (if available) FIRST_SEEN – timestamp of first execution LAST_SEEN – timestamp of most recent execution

Schema illustration:

myawr_snapshot_events_statements_summary_by_digest schema
myawr_snapshot_events_statements_summary_by_digest schema

Data ingestion filters out stale statements by using the LAST_SEEN column. Only statements executed in the last 24 hours are inserted:

SELECT *
FROM performance_schema.events_statements_summary_by_digest
WHERE LAST_SEEN > DATE_SUB(NOW(), INTERVAL 24 HOUR);

The ingestion script runs periodically, writes the result set into the new table, and partitions the table by snapshot time to keep write amplification low.

Managing Data Volume

Full‑statement statistics can generate large data sets. It is recommended to partition myawr_snapshot_events_statements_summary_by_digest by day (or another suitable interval) and to tune the collection frequency so that storage cost and freshness are balanced.

Multi‑Dimensional Reporting

Total Execution Time

SELECT $tid AS host_id,
       $start_snap_id,
       $end_snap_id,
       SUM(SUM_TIMER_WAIT) AS total_exec_time
FROM myawr_snapshot_events_statements_summary_by_digest
WHERE snap_id BETWEEN $start_snap_id AND $end_snap_id
GROUP BY $tid;

Total Execution Count

SELECT $tid,
       SUM(COUNT_STAR) AS total_exec_cnt
FROM myawr_snapshot_events_statements_summary_by_digest
WHERE snap_id BETWEEN $start_snap_id AND $end_snap_id
GROUP BY $tid;

Total Rows Examined

SELECT $tid,
       SUM(SUM_ROWS_EXAMINED) AS total_rows_examined
FROM myawr_snapshot_events_statements_summary_by_digest
WHERE snap_id BETWEEN $start_snap_id AND $end_snap_id
GROUP BY $tid;

Total Rows Sent

SELECT $tid,
       SUM(SUM_ROWS_SENT) AS total_rows_sent
FROM myawr_snapshot_events_statements_summary_by_digest
WHERE snap_id BETWEEN $start_snap_id AND $end_snap_id
GROUP BY $tid;

Total Sort Rows

SELECT $tid,
       SUM(SUM_SORT_ROWS) AS total_sort_rows
FROM myawr_snapshot_events_statements_summary_by_digest
WHERE snap_id BETWEEN $start_snap_id AND $end_snap_id
GROUP BY $tid;

Each query can be visualized in the myawr HTML report, with tables sorted by the chosen metric.

Conclusion

Enabling Performance Schema, capturing its full‑statement statistics, and loading them into myawr gives DBAs a comprehensive view of MySQL workload that complements slow‑log analysis. The approach also opens the door to additional PS‑based insights such as wait events, I/O statistics, and connection metrics, while incurring modest memory and I/O overhead that should be evaluated in a test environment before production rollout.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

mysqlPerformance SchemaSQL statisticsmyawr
dbaplus Community
Written by

dbaplus Community

Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.