Databases 17 min read

Is MySQL CPU Spike a Database Issue or an Application Issue? Troubleshooting Guide

When MySQL CPU usage spikes above 80% or hits 100%, this guide walks you through a systematic investigation—from confirming the MySQL process consumes CPU, checking system and MySQL status, analyzing connection counts, slow queries, lock waits, and configuration settings, to applying short‑term mitigations and long‑term architectural fixes.

Ops Community
Ops Community
Ops Community
Is MySQL CPU Spike a Database Issue or an Application Issue? Troubleshooting Guide

Background and Problem

MySQL CPU usage soaring above 80% or even reaching 100% is a common emergency in production. If the alert triggers, business response slows or times out, and rapid root‑cause identification is essential to avoid service outage.

Operators often fall into two extremes: blaming the database and blindly adding hardware or restarting, or blaming the application and demanding immediate SQL optimizations. Both are unprofessional; the real cause can be misconfiguration, a single slow query, excessive concurrent connections, lock‑wait spin, or legitimate traffic growth.

This article examines the issue from system tools, MySQL status, and query analysis perspectives.

1. Discover and Confirm the Problem

1.1 Verify CPU Usage at the System Level

First ensure the MySQL process is actually consuming CPU, ruling out other processes:

# View MySQL process CPU usage
 top -bn1 | grep mysql
 ps aux | grep mysql

# View CPU usage trend for MySQL
 sar -u 1 10

# View per‑CPU core usage
 mpstat -P ALL 1 5

# View system load
 uptime

If MySQL's CPU share is low, the problem likely lies elsewhere (e.g., web or app servers). If high, continue the investigation.

1.2 Quick Internal State Check

Log into MySQL and run quick status queries:

# Show current connections
 SHOW STATUS LIKE 'Threads_connected';
 SHOW STATUS LIKE 'Max_used_connections';

# Show currently running queries
 SHOW STATUS LIKE 'Threads_running';

# Show query cache hit rate (MySQL 5.7 and earlier)
 SHOW STATUS LIKE 'Qcache%';

# Show InnoDB buffer pool status
 SHOW STATUS LIKE 'Innodb_buffer_pool%';

# Show lock wait information
 SHOW ENGINE INNODB STATUS\G

# Show transaction and lock details
 SELECT * FROM information_schema.INNODB_TRX;
 SELECT * FROM information_schema.INNODB_LOCK_WAITS;
Threads_running

is a key indicator; values consistently exceeding 2‑3× the number of CPU cores signal overload.

2. Concurrent Connection Issues

2.1 Too Many Connections

High concurrency itself consumes CPU:

# List all connections
 SHOW PROCESSLIST;
 SHOW FULL PROCESSLIST;

# Count connections per user
 SELECT user, COUNT(*) FROM information_schema.processlist GROUP BY user;

# Count connections per host
 SELECT host, COUNT(*) FROM information_schema.processlist GROUP BY host;

# View max_connections setting
 SHOW VARIABLES LIKE 'max_connections';

# View connection‑related memory usage
 SHOW STATUS LIKE 'Connection_errors%';

Solution:

# Temporarily raise max_connections
 SET GLOBAL max_connections = 2000;

# Kill idle connections (Sleep > 1h)
 SELECT CONCAT('KILL ', id, ';') FROM information_schema.processlist WHERE Command='Sleep' AND Time > 3600 INTO OUTFILE '/tmp/kill_connections.sql';
 SOURCE /tmp/kill_connections.sql;

# Persist configuration
 [mysqld]
 max_connections = 2000
 wait_timeout = 600
 interactive_timeout = 600

2.2 Insufficient Connection Reuse

If the application does not use a connection pool, each request creates a new connection, wasting CPU on handshakes:

# Check connection creation rate
 SHOW STATUS LIKE 'Connections';

# Check aborted connections
 SHOW STATUS LIKE 'Aborted%';

# Average new connections per second
 SHOW STATUS LIKE 'Connections' LIKE '%s';

3. Slow Queries Causing High CPU

3.1 Identify the Most CPU‑Intensive Queries

# List currently executing queries
 SHOW FULL PROCESSLIST;

# Sort by execution time, show top 10
 SELECT ID, USER, HOST, DB, COMMAND, TIME, STATE, LEFT(INFO,100) AS query
 FROM information_schema.processlist
 WHERE COMMAND!='Sleep'
 ORDER BY TIME DESC
 LIMIT 10;

3.2 Use Performance Schema (MySQL 5.6+)

Enable relevant instruments and consumers, then query the summary:

# Enable statement monitoring
 UPDATE performance_schema.setup_instruments SET ENABLED='YES' WHERE NAME LIKE 'statement/%';

# Enable event consumers
 UPDATE performance_schema.setup_consumers SET ENABLED='YES' WHERE NAME LIKE 'events_statements%';

# Show top CPU‑consuming statements (MySQL 8.0+)
 SELECT SCHEMA_NAME, DIGEST, COUNT_STAR,
        SUM_TIMER_WAIT/1000000000000 AS total_seconds,
        AVG_TIMER_WAIT/1000000000000 AS avg_seconds,
        SUM_ROWS_EXAMINED, SUM_ROWS_SENT,
        SUBSTR(DIGEST_TEXT,1,200) AS query
 FROM performance_schema.events_statements_summary_by_digest
 WHERE SCHEMA_NAME='your_database'
 ORDER BY SUM_TIMER_WAIT DESC
 LIMIT 10;

3.3 Common CPU‑Heavy Query Patterns

Full table scans are the most frequent CPU killers:

# Find possible full scans
 EXPLAIN SELECT * FROM orders WHERE status='pending';

# If type is ALL, create an index
 CREATE INDEX idx_orders_status ON orders(status);

Sorting operations also consume CPU; identify queries that trigger filesort:

# Find queries with filesort
 SELECT COUNT(*) AS execution_count,
        SUM_TIMER_WAIT/1000000000000 AS total_time,
        AVG_TIMER_WAIT/1000000000000 AS avg_time,
        SUBSTR(DIGEST_TEXT,1,100) AS query
 FROM performance_schema.events_statements_summary_by_digest
 WHERE DIGEST_TEXT LIKE '%ORDER BY%'
 ORDER BY SUM_TIMER_WAIT DESC
 LIMIT 10;

4. Lock‑Wait Issues

4.1 Lock Waits Cause CPU Spin

# Show current lock waits
 SELECT r.trx_id AS waiting_trx_id, r.trx_mysql_thread_id AS waiting_thread,
        r.trx_query AS waiting_query, b.trx_id AS blocking_trx_id,
        b.trx_mysql_thread_id AS blocking_thread, b.trx_query AS blocking_query,
        b.trx_started AS blocking_started, b.trx_rows_locked AS blocking_rows_locked
 FROM information_schema.INNODB_LOCK_WAITS w
 JOIN information_schema.INNODB_TRX b ON b.trx_id = w.blocking_trx_id
 JOIN information_schema.INNODB_TRX r ON r.trx_id = w.requesting_trx_id;

4.2 Long‑Running Transactions

# Find transactions running long
 SELECT trx_id, trx_mysql_thread_id, trx_state, trx_started,
        TIMESTAMPDIFF(SECOND, trx_started, NOW()) AS running_seconds,
        trx_rows_locked, trx_query
 FROM information_schema.INNODB_TRX
 ORDER BY trx_started;

# Find transactions longer than 10 minutes
 SELECT trx_mysql_thread_id, trx_started,
        TIMESTAMPDIFF(MINUTE, trx_started, NOW()) AS running_minutes,
        trx_query
 FROM information_schema.INNODB_TRX
 WHERE TIMESTAMPDIFF(MINUTE, trx_started, NOW()) > 10;

4.3 Resolve Lock Waits

Force‑kill a blocking connection only after careful analysis:

# Show specific connections
 SHOW PROCESSLIST;

# Kill the thread holding the lock
 KILL <thread_id>;

# Kill all long‑running transactions (very cautious)
 SELECT CONCAT('KILL ', trx_mysql_thread_id, ';') FROM information_schema.INNODB_TRX WHERE TIMESTAMPDIFF(MINUTE, trx_started, NOW()) > 30 INTO OUTFILE '/tmp/kill_long_trx.sql';
 SOURCE /tmp/kill_long_trx.sql;

5. Configuration Checks

5.1 Buffer Pool Mis‑configuration

# View current buffer pool settings
 SHOW VARIABLES LIKE 'innodb_buffer_pool%';

# View buffer pool usage
 SHOW STATUS LIKE 'Innodb_buffer_pool%';

# Example: set buffer pool to 16 GB (≈70‑80% of RAM)
 SET GLOBAL innodb_buffer_pool_size = 17179869184;  -- 16GB

5.2 Thread and I/O Settings

# View thread cache size
 SHOW VARIABLES LIKE 'thread_cache_size';

# View thread creation count
 SHOW STATUS LIKE 'Threads_created';

# Set a reasonable thread cache
 SET GLOBAL thread_cache_size = 64;

# View InnoDB I/O thread counts
 SHOW VARIABLES LIKE 'innodb_write_io_threads';
 SHOW VARIABLES LIKE 'innodb_read_io_threads';

# Increase I/O threads for SSDs
 SET GLOBAL innodb_write_io_threads = 16;
 SET GLOBAL innodb_read_io_threads = 16;

5.3 Log Settings

# View log variables
 SHOW VARIABLES LIKE 'innodb_log%';

# Recommended log file size: 256M‑1G
 SHOW VARIABLES LIKE 'innodb_log_file_size';

# Flush policy (affects performance vs durability)
 SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';
 # 1 = safest, lowest performance
 # 2 = OS‑buffered, balanced
 # 0 = flush once per second, highest performance

6. Business‑Layer Analysis

6.1 Normal vs. Abnormal Traffic

# Check MySQL QPS
 mysqladmin ext -i 1 | grep Questions

# Check SELECT QPS
 mysqladmin ext -i 1 | grep "Com_select"

# Compare with historical data to spot abnormal spikes

If traffic growth is legitimate, consider architectural changes (read/write splitting, sharding). If abnormal, investigate possible attacks or crawlers.

6.2 Analyze Request Characteristics

# Most frequently queried tables
 SELECT OBJECT_SCHEMA, OBJECT_NAME, COUNT_STAR,
        SUM_TIMER_WAIT/1000000000000 AS total_seconds
 FROM performance_schema.objects_summary_global_by_type
 WHERE OBJECT_TYPE='TABLE'
 ORDER BY COUNT_STAR DESC
 LIMIT 20;

# Most used indexes
 SELECT OBJECT_SCHEMA, OBJECT_NAME, INDEX_NAME, COUNT_STAR,
        SUM_TIMER_WAIT/1000000000000 AS total_seconds
 FROM performance_schema.table_io_waits_summary_by_index_usage
 WHERE INDEX_NAME IS NOT NULL
 ORDER BY COUNT_STAR DESC
 LIMIT 20;

6.3 Traffic Source Analysis

# Show client IP distribution
 mysql -e "SHOW PROCESSLIST" | awk '{print $3}' | grep -v "localhost" | sort | uniq -c | sort -rn

# Review application‑side connection‑pool settings

7. Emergency Mitigations

7.1 Short‑Term Relief

# Switch to read‑only mode (if a replica exists)
 SET GLOBAL read_only = ON;
 SET GLOBAL super_read_only = ON;

# Kill top CPU‑consuming queries
 SELECT CONCAT('KILL QUERY ', ID, ';') FROM information_schema.processlist WHERE COMMAND!='Sleep' ORDER BY TIME DESC LIMIT 5;

7.2 Connection Throttling

# Limit total connections (MySQL 8.0+)
 SET GLOBAL max_connections = 500;

# Per‑user limits must be enforced at the application layer or via ProxySQL/MySQL Router

7.3 Query Throttling

# Temporarily raise slow‑query threshold
 SET GLOBAL long_query_time = 0.5;

# Use pt‑query‑digest or Percona Server to reject specific patterns

8. Root‑Cause Solutions

8.1 Read/Write Splitting

# Check replica sync status
 SHOW SLAVE STATUS\G

# Deploy MySQL Router or ProxySQL for read/write routing

8.2 Sharding

# Example: time‑based partitioning for orders
 -- orders_202401, orders_202402, ...

# Or use middleware such as Vitess, TiDB, ProxySQL

8.3 Caching Layer

# Cache hot data in Redis
 # Session cache
 # Config cache
 # Aggregated statistics cache

9. Preventive Measures

9.1 Monitoring & Alerting

# Prometheus + mysqld_exporter
 # Key metrics:
 # - mysql_global_status_threads_connected
 # - mysql_global_status_threads_running
 # - rate(mysql_global_status_questions[5m])
 # - rate(mysql_global_status_slow_queries[5m])

9.2 Regular Slow‑Query Review

# Daily slow‑query analysis
 pt-query-digest --since '24h' /var/log/mysql/slow.log > /tmp/slow_query_report.txt

9.3 Capacity Planning

# Monitor CPU & connection trends
 # Plan hardware upgrades or architectural scaling ahead of demand

10. Conclusion

MySQL CPU spikes require a holistic analysis. Operators should first verify that MySQL itself is the CPU consumer, then examine internal metrics—connection counts, running queries, lock waits, and configuration parameters. After pinpointing the root cause, apply short‑term mitigations and long‑term architectural improvements such as read/write splitting, sharding, or caching. Robust monitoring, slow‑query governance, and capacity planning are essential to prevent recurrence.

References:

MySQL 8.0 Reference Manual – InnoDB Performance

MySQL 8.0 Reference Manual – Performance Schema

Percona Blog – MySQL CPU Performance

SHOW ENGINE INNODB STATUS documentation

MySQL connection management and optimization guide

Performance TuningMySQLTroubleshootingCPUDatabase Operations
Ops Community
Written by

Ops Community

A leading IT operations community where professionals share and grow together.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.