How to Tame MySQL CPU Spikes: A Complete 4‑Step Emergency Guide

When MySQL CPU usage spikes to 500%, this guide walks you through a four‑step emergency process: quickly stop the overload, diagnose the root cause, apply targeted SQL and configuration optimizations, and set up monitoring to prevent future spikes.

Ray's Galactic Tech

MySQL CPU Spike Scenario

When MySQL CPU usage jumps to 500% (top reports per‑process CPU relative to a single core by default, so this is roughly five cores fully busy), the database is under extreme pressure and applications may see timeouts or blocked requests. The guiding principle is "stop the loss first, then cure", which breaks down into a four‑step strategy: emergency stop‑bleeding, root‑cause investigation, targeted optimization, and preventive measures.

Step 1 – Emergency Stop‑Bleeding

Goal: Reduce CPU usage and restore service availability.

Locate high‑CPU processes

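Use top -Hp $(pidof mysqld) or htop to confirm that mysqld, and not some other process on the host, is the one consuming CPU. The -H flag lists individual threads, which shows whether the load comes from a few hot threads or is spread across many.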

Identify and kill problematic sessions

SHOW FULL PROCESSLIST;
KILL <process_id>;

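Prioritize sessions with a long Time value, an abnormal State (e.g., Sending data, Copying to tmp table, Sorting result, or a lock wait such as Waiting for table metadata lock), or complex queries shown in the Info column. Recent MySQL 8.0 releases report Sending data as executing.

Killing the few most resource‑hungry sessions usually brings CPU down quickly. Be aware that killing a session in the middle of a large write transaction triggers a rollback, which itself takes time to finish.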
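The selection rule above can be sketched as a small Python helper. This is only an illustration: it assumes the rows of SHOW FULL PROCESSLIST have already been fetched as dictionaries, and the 30‑second threshold, the limit of five, and the protected account names are assumptions, not values from a real system.

```python
# Commands that should not be killed automatically (idle or replication threads).
PROTECTED_COMMANDS = {"Sleep", "Binlog Dump", "Binlog Dump GTID", "Daemon", "Connect"}
# Hypothetical account names to leave alone; adjust to your environment.
PROTECTED_USERS = {"system user", "event_scheduler", "repl"}


def kill_candidates(rows, min_seconds=30, limit=5):
    """Return KILL statements for the longest-running active sessions."""
    active = [
        r for r in rows
        if r["Command"] not in PROTECTED_COMMANDS
        and r["User"] not in PROTECTED_USERS
        and r["Time"] >= min_seconds
        and r.get("Info")  # only sessions that are currently running a statement
    ]
    active.sort(key=lambda r: r["Time"], reverse=True)
    return ["KILL {};".format(r["Id"]) for r in active[:limit]]


# Made-up sample rows using the column names of SHOW FULL PROCESSLIST.
SAMPLE_ROWS = [
    {"Id": 11, "User": "app", "Command": "Query", "Time": 420,
     "State": "Sending data", "Info": "SELECT ... FROM orders ..."},
    {"Id": 12, "User": "app", "Command": "Sleep", "Time": 900,
     "State": "", "Info": None},
    {"Id": 13, "User": "system user", "Command": "Connect", "Time": 99999,
     "State": "Waiting for source to send event", "Info": None},
    {"Id": 14, "User": "app", "Command": "Query", "Time": 2,
     "State": "executing", "Info": "SELECT 1"},
]

print(kill_candidates(SAMPLE_ROWS))
```

Review the generated statements before running them; the helper only narrows the list and does not decide whether a query is safe to kill.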

Emergency scaling (cloud environments)

Temporarily increase instance CPU/IOPS to relieve pressure.

After the issue is resolved, downscale to control costs.

Step 2 – Deep Investigation and Root‑Cause Identification

Goal: Find the underlying cause of the CPU spike to prevent recurrence.

Check slow‑query log settings

SHOW VARIABLES LIKE 'slow_query_log%';
SHOW VARIABLES LIKE 'long_query_time';
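If those variables show that the slow‑query log is off or the threshold is too high, the fix is a pair of SET GLOBAL statements. A minimal sketch, assuming the variables have been read into a dictionary and that a 1‑second threshold (an assumption, not a recommendation from the source) is acceptable:

```python
def slow_log_fixes(variables, target_seconds=1.0):
    """Return the SET GLOBAL statements needed to enable useful slow logging."""
    statements = []
    if str(variables.get("slow_query_log", "OFF")).upper() != "ON":
        statements.append("SET GLOBAL slow_query_log = 'ON';")
    # MySQL's default long_query_time is 10 seconds.
    if float(variables.get("long_query_time", 10)) > target_seconds:
        statements.append("SET GLOBAL long_query_time = {};".format(target_seconds))
    return statements


print(slow_log_fixes({"slow_query_log": "OFF", "long_query_time": "10.000000"}))
```

A changed global long_query_time only applies to connections opened afterwards, so long‑lived pooled connections keep the old threshold until they reconnect.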

Analyze slow queries

mysqldumpslow -s c -t 10 /path/to/slow.log   # most frequent queries (sort by count)
mysqldumpslow -s r -t 10 /path/to/slow.log   # most rows sent
mysqldumpslow -s t -t 10 -g "LEFT JOIN" /path/to/slow.log   # slowest queries matching a pattern

For deeper analysis, use pt-query-digest from Percona Toolkit, which groups queries by fingerprint and ranks them by total time:

pt-query-digest /path/to/slow.log > slow_report.txt
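To make clear what these tools do with the log, here is a much simplified Python sketch of the same idea: group statements by a normalized fingerprint and rank the groups by total query time. It is not how mysqldumpslow or pt-query-digest are implemented, it assumes the standard slow‑log header line, and the log excerpt is made up for the example.

```python
import re

# Matches the statistics line that MySQL writes before each slow statement.
STATS = re.compile(
    r"# Query_time: (?P<qt>[\d.]+)\s+Lock_time: [\d.]+\s+"
    r"Rows_sent: \d+\s+Rows_examined: (?P<rows>\d+)"
)


def fingerprint(sql):
    """Collapse literals so queries that differ only in values group together."""
    sql = re.sub(r"'[^']*'", "?", sql)   # quoted strings
    sql = re.sub(r"\b\d+\b", "?", sql)   # standalone numbers
    return re.sub(r"\s+", " ", sql).strip().lower()


def summarize(log_text):
    """Return (fingerprint, stats) pairs sorted by total query time, highest first."""
    entries = []  # each entry: [query_time, rows_examined, sql_lines]
    for line in log_text.splitlines():
        match = STATS.match(line)
        if match:
            entries.append([float(match["qt"]), int(match["rows"]), []])
        elif entries and not line.startswith(("#", "SET timestamp=", "use ")):
            entries[-1][2].append(line)

    stats = {}
    for query_time, rows_examined, sql_lines in entries:
        if not sql_lines:
            continue
        item = stats.setdefault(
            fingerprint(" ".join(sql_lines)),
            {"count": 0, "total_time": 0.0, "rows_examined": 0},
        )
        item["count"] += 1
        item["total_time"] += query_time
        item["rows_examined"] += rows_examined
    return sorted(stats.items(), key=lambda kv: kv[1]["total_time"], reverse=True)


# Made-up log excerpt for illustration only.
SAMPLE_LOG = """\
# Time: 2024-01-01T00:00:00.000000Z
# User@Host: app[app] @ localhost []  Id:    12
# Query_time: 3.500000  Lock_time: 0.000100 Rows_sent: 10  Rows_examined: 500000
SET timestamp=1704067200;
SELECT * FROM orders WHERE user_id = 42;
# Time: 2024-01-01T00:00:05.000000Z
# User@Host: app[app] @ localhost []  Id:    13
# Query_time: 1.500000  Lock_time: 0.000100 Rows_sent: 7  Rows_examined: 300000
SET timestamp=1704067205;
SELECT * FROM orders WHERE user_id = 77;
# Time: 2024-01-01T00:00:09.000000Z
# User@Host: app[app] @ localhost []  Id:    14
# Query_time: 0.800000  Lock_time: 0.000050 Rows_sent: 1  Rows_examined: 90000
SET timestamp=1704067209;
SELECT name FROM users WHERE email = 'a@example.com';
"""

for text, item in summarize(SAMPLE_LOG):
    print(item["count"], item["total_time"], item["rows_examined"], text)
```

The two orders queries collapse into one group, which is why a query that looks harmless on its own can still dominate the total time.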

Real‑time diagnostics

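SHOW FULL PROCESSLIST;
SHOW GLOBAL STATUS LIKE 'Handler%';
SHOW GLOBAL STATUS LIKE 'Threads_running';
SHOW GLOBAL STATUS LIKE 'Sort%';
SHOW GLOBAL STATUS LIKE 'Created_tmp%';
SHOW GLOBAL STATUS LIKE 'Innodb_rows_read%';

Pay attention to the running thread count (Threads_running), sort activity (Sort_merge_passes, Sort_rows), temporary tables (Created_tmp_disk_tables), and row‑read counters (Handler_read_rnd_next, Innodb_rows_read). A rapidly growing Handler_read_rnd_next usually points to table scans.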
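Most of these status values are cumulative since server start, so a single reading says little; what matters is how fast they grow. A minimal sketch, assuming two snapshots of SHOW GLOBAL STATUS taken a known number of seconds apart and loaded into dictionaries (the numbers are invented):

```python
def rates_per_second(before, after, interval_seconds):
    """Convert two snapshots of cumulative counters into per-second rates."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return {
        name: (int(after[name]) - int(before[name])) / interval_seconds
        for name in after
        if name in before
    }


# Invented snapshots taken 10 seconds apart; values arrive as strings from MySQL.
BEFORE = {"Innodb_rows_read": "1000000", "Handler_read_rnd_next": "500000", "Sort_merge_passes": "10"}
AFTER = {"Innodb_rows_read": "4000000", "Handler_read_rnd_next": "2500000", "Sort_merge_passes": "30"}

for name, rate in rates_per_second(BEFORE, AFTER, 10).items():
    print(name, rate)
```

Threads_running is the exception: it is a point‑in‑time gauge, so read it directly instead of taking a difference.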

Determine load type

CPU‑intensive: sorting, grouping, or join queries, often combined with full‑table scans caused by missing indexes; user CPU time is high while %iowait stays low.

IO‑intensive: insufficient memory leading to frequent disk reads/writes (watch %iowait in top or iostat).
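On Linux the distinction can be read from the aggregate cpu line of /proc/stat. The sketch below compares two readings of that line and applies a rough classification; the 20% iowait and 50% busy thresholds are assumptions chosen for the example, not established cut‑offs.

```python
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def cpu_breakdown(stat_before, stat_after):
    """Percent of CPU time per category between two 'cpu' lines from /proc/stat."""
    a = [int(x) for x in stat_before.split()[1:9]]
    b = [int(x) for x in stat_after.split()[1:9]]
    delta = [y - x for x, y in zip(a, b)]
    total = sum(delta)
    if total <= 0:
        raise ValueError("the two samples must be taken at different times")
    return {name: 100.0 * d / total for name, d in zip(FIELDS, delta)}


def load_type(breakdown, iowait_threshold=20.0, busy_threshold=50.0):
    """Very rough classification; thresholds are illustrative assumptions."""
    if breakdown["iowait"] >= iowait_threshold:
        return "IO-intensive"
    if breakdown["user"] + breakdown["system"] >= busy_threshold:
        return "CPU-intensive"
    return "inconclusive"


# Invented readings of the first line of /proc/stat, a few seconds apart.
BEFORE_LINE = "cpu  1000 0 500 8000 500 0 0 0 0 0"
AFTER_LINE = "cpu  1800 0 600 8050 550 0 0 0 0 0"

usage = cpu_breakdown(BEFORE_LINE, AFTER_LINE)
print(usage["user"], usage["iowait"], load_type(usage))
```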

Step 3 – Targeted Optimization

Goal: Eliminate the performance bottleneck.

1. SQL Optimization

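Add indexes on the columns that hot queries use in WHERE, ORDER BY, GROUP BY, and JOIN clauses. Prefer one composite index that matches the query over several single‑column indexes, since every extra index adds write cost.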

Avoid SELECT * in production queries.

Break complex joins or sub‑queries into simpler statements.

Regularly review slow‑query logs to prevent problematic SQL from reaching production.
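Whether a query needs this kind of attention is usually visible in its EXPLAIN output. As a rough sketch, assuming the EXPLAIN rows have been fetched as dictionaries keyed by the standard column names, the following helper flags the patterns that most often burn CPU; the 10,000‑row threshold is an arbitrary assumption.

```python
def explain_warnings(plan_rows, row_threshold=10000):
    """Flag full scans, filesorts, and temporary tables in EXPLAIN output."""
    warnings = []
    for row in plan_rows:
        table = row.get("table")
        extra = row.get("Extra") or ""
        rows = row.get("rows") or 0
        if row.get("type") == "ALL" and rows >= row_threshold:
            warnings.append("{}: full table scan (~{} rows)".format(table, rows))
        for marker in ("Using filesort", "Using temporary"):
            if marker in extra:
                warnings.append("{}: {}".format(table, marker))
    return warnings


# Invented plan for a join between orders and users.
SAMPLE_PLAN = [
    {"table": "orders", "type": "ALL", "key": None, "rows": 1200000,
     "Extra": "Using where; Using filesort"},
    {"table": "users", "type": "eq_ref", "key": "PRIMARY", "rows": 1, "Extra": None},
]

for warning in explain_warnings(SAMPLE_PLAN):
    print(warning)
```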

2. Database Configuration Tuning

InnoDB buffer pool

SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

Set to 50‑70% of available memory.
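As a worked example of that sizing rule, the sketch below computes the 50‑70% range for a given amount of memory and, as an extra check that the source does not mention, estimates the buffer pool hit ratio from the Innodb_buffer_pool_read_requests and Innodb_buffer_pool_reads status counters. The 32 GiB host and the counter values are invented.

```python
GIB = 1024 ** 3


def buffer_pool_range(total_memory_bytes, low_pct=50, high_pct=70):
    """Return the (low, high) buffer pool size in bytes for the 50-70% rule."""
    return (total_memory_bytes * low_pct // 100, total_memory_bytes * high_pct // 100)


def buffer_pool_hit_ratio(status):
    """Share of logical reads served from memory rather than from disk."""
    requests = int(status["Innodb_buffer_pool_read_requests"])
    disk_reads = int(status["Innodb_buffer_pool_reads"])
    return 1.0 if requests == 0 else 1.0 - disk_reads / requests


low, high = buffer_pool_range(32 * GIB)
print(low // GIB, "GiB to about", round(high / GIB, 1), "GiB")

# Invented counters, as returned by SHOW GLOBAL STATUS.
print(buffer_pool_hit_ratio({
    "Innodb_buffer_pool_read_requests": "1000000",
    "Innodb_buffer_pool_reads": "1000",
}))
```

A hit ratio that stays well below 1.0 under normal load suggests the pool is too small for the working set, which pushes the workload toward the IO‑intensive case.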

Temporary table and sort buffers

tmp_table_size
max_heap_table_size
sort_buffer_size

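Avoid excessive on‑disk temporary tables to reduce CPU and I/O pressure. For in‑memory temporary tables that use the MEMORY engine, the effective limit is the smaller of tmp_table_size and max_heap_table_size, so raise both together. sort_buffer_size is allocated per session, so keep increases modest instead of setting a large global value.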
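How "excessive" the on‑disk share is can be estimated from two status counters, Created_tmp_tables and Created_tmp_disk_tables. A minimal sketch with invented values; what counts as too high depends on the workload, so no threshold is built in.

```python
def disk_tmp_table_ratio(status):
    """Fraction of internal temporary tables that were created on disk."""
    total = int(status["Created_tmp_tables"])
    on_disk = int(status["Created_tmp_disk_tables"])
    return 0.0 if total == 0 else on_disk / total


# Invented counters, as returned by SHOW GLOBAL STATUS LIKE 'Created_tmp%'.
print(disk_tmp_table_ratio({"Created_tmp_tables": "1000", "Created_tmp_disk_tables": "250"}))
```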

3. Architecture & Business Optimizations

Cache hot data with Redis or Memcached.

Implement read/write splitting; run heavy reporting queries on read‑only replicas.

Apply business‑level throttling to limit non‑core request traffic.

Archive historical data to shrink table size and improve query efficiency.
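Of the measures above, business‑level throttling is the easiest to show in a few lines. One common way to implement it is a token bucket, sketched here under the assumption that each non‑core request calls allow() before touching the database; the rate and capacity are placeholders.

```python
import time


class TokenBucket:
    """Allow at most `rate_per_second` requests on average, with bursts up to `capacity`."""

    def __init__(self, rate_per_second, capacity, clock=time.monotonic):
        self.rate = rate_per_second
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to the time elapsed since the last call.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(rate_per_second=2, capacity=2)
print([bucket.allow() for _ in range(3)])  # the third call exceeds the burst
```

Requests that are refused can be rejected with a retry hint or served from cache, so core traffic keeps its share of the database.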

Step 4 – Preventive Mechanisms

Goal: Avoid future CPU spikes.

Monitoring & Alerts

Metrics: CPU usage, Threads_running, slow‑query count, TPS/QPS.

Tools: Prometheus + Grafana with appropriate alert rules.
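A useful alert rule fires only when a metric stays above its threshold for some time, which is what the for clause of a Prometheus alerting rule expresses. The same logic in Python, as an illustration only; the threshold of 32 and the sample values are invented.

```python
def sustained_breach(samples, threshold, consecutive):
    """True if `consecutive` samples in a row exceed `threshold`."""
    run = 0
    for value in samples:
        run = run + 1 if value > threshold else 0
        if run >= consecutive:
            return True
    return False


# Invented Threads_running samples, one per scrape interval.
print(sustained_breach([5, 40, 45, 50, 8], threshold=32, consecutive=3))
print(sustained_breach([5, 40, 8, 45, 50], threshold=32, consecutive=3))
```

Requiring several consecutive breaches keeps a single short burst from paging anyone while still catching a real, sustained spike.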

SQL Review

Conduct performance reviews before deployment using tools such as Archery or Yearning.

Stress Testing

Simulate high‑concurrency loads before major releases or promotional events.

Practical Tips

Run EXPLAIN regularly to inspect execution plans of hot queries.

Enable log_queries_not_using_indexes to spot queries that bypass indexes; on busy servers, cap the log volume with log_throttle_queries_not_using_indexes.

Use performance_schema for more precise real‑time analysis than SHOW PROCESSLIST.

In cloud environments, combine read/write splitting with elastic scaling for rapid pressure relief.

Adjust connection pool size and thread count in high‑concurrency scenarios to avoid sudden CPU peaks.
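For the performance_schema tip, the statement digest summary table is a good starting point. The sketch below holds one possible query and a helper that ranks the fetched rows; it assumes the rows arrive as dictionaries, and the sample rows are invented. Timer columns in performance_schema are in picoseconds, hence the conversion.

```python
# One possible query against the statement digest summary table.
DIGEST_SQL = """
SELECT DIGEST_TEXT, COUNT_STAR, SUM_TIMER_WAIT, SUM_ROWS_EXAMINED
FROM performance_schema.events_statements_summary_by_digest
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 10
"""

PICOSECONDS_PER_SECOND = 10 ** 12


def top_digests(rows, limit=3):
    """Rank digest rows by total execution time and add derived per-call figures."""
    ranked = sorted(rows, key=lambda r: r["SUM_TIMER_WAIT"], reverse=True)
    return [
        {
            "digest": r["DIGEST_TEXT"],
            "calls": r["COUNT_STAR"],
            "total_seconds": r["SUM_TIMER_WAIT"] / PICOSECONDS_PER_SECOND,
            "rows_per_call": r["SUM_ROWS_EXAMINED"] / r["COUNT_STAR"] if r["COUNT_STAR"] else 0,
        }
        for r in ranked[:limit]
    ]


# Invented rows shaped like the result of DIGEST_SQL.
SAMPLE_DIGESTS = [
    {"DIGEST_TEXT": "SELECT * FROM `users` WHERE `id` = ?", "COUNT_STAR": 50000,
     "SUM_TIMER_WAIT": 20 * 10 ** 12, "SUM_ROWS_EXAMINED": 50000},
    {"DIGEST_TEXT": "SELECT * FROM `orders` WHERE `status` = ?", "COUNT_STAR": 400,
     "SUM_TIMER_WAIT": 900 * 10 ** 12, "SUM_ROWS_EXAMINED": 480000000},
]

for item in top_digests(SAMPLE_DIGESTS):
    print(item["total_seconds"], item["calls"], item["rows_per_call"], item["digest"])
```

A digest with few calls but a very high rows‑per‑call figure, like the orders query in the sample, is a typical candidate for a missing index.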

Conclusion

CPU spikes are symptoms, not root causes. Follow the four‑step workflow: stop the bleeding to restore service, investigate to locate the SQL or operation that is burning CPU, cure the root cause with SQL, configuration, and architectural changes, and finally set up monitoring, SQL review, and stress testing to prevent recurrence.

