How to Diagnose and Fix MySQL “Too Many Connections” Errors
This guide explains why MySQL reports “Too many connections” and walks through emergency assessment steps. It provides practical commands and scripts to stop the bleeding, analyzes root causes such as slow queries, connection leaks, short‑lived connection storms, and an undersized max_connections setting, and closes with long‑term remediation and monitoring solutions for production environments.
Background
MySQL connection exhaustion is a common emergency failure in production. When the Too many connections error appears, the server cannot allocate a new connection handle, causing new requests to be rejected. Causes include traffic spikes, slow queries, connection leaks, or mis‑configured parameters. This article uses MySQL 8.0.39 and a 2026 monitoring stack (Prometheus + AlertManager + Grafana) to provide a complete workflow from emergency mitigation to root‑cause analysis and long‑term governance.
Prerequisite Knowledge
Familiarity with basic Linux commands (top, ps, netstat) and MySQL architecture (connection manager, thread pool, storage engines) is required.
1. Problem Symptoms and Emergency Assessment
1.1 Typical Indicators
Application logs contain SQLSTATE[HY000] [2002] Can't connect to MySQL server or ERROR 1040 (HY000): Too many connections.
Some API calls time out while direct DB queries succeed.
The DB host accepts local root connections, but ordinary accounts cannot open new connections.
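The two error strings point at different failure modes, and the distinction determines the first diagnostic step. A minimal triage sketch (the classify_error helper is hypothetical, not part of any MySQL tooling):

```shell
#!/bin/bash
# Error 1040 means mysqld is up but its connection table is full;
# 2002/2003 mean the client never reached mysqld at all (network/socket).
classify_error() {
  case "$1" in
    *"ERROR 1040"*|*"Too many connections"*) echo "server-full" ;;
    *"[2002]"*|*"[2003]"*|*"Can't connect"*) echo "unreachable" ;;
    *) echo "other" ;;
  esac
}
classify_error "ERROR 1040 (HY000): Too many connections"              # server-full
classify_error "SQLSTATE[HY000] [2002] Can't connect to MySQL server"  # unreachable
```

A "server-full" result sends you to the connection-count checks below; "unreachable" sends you to process and port checks first.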
Step 1: Verify the host is alive.
# Check if MySQL process is running
ps aux | grep mysqld | grep -v grep
# Check if port 3306 is listening
ss -tlnp | grep 3306
# or
netstat -tlnp | grep 3306
# Check system load
uptime
Step 2: Check the current connection count.
# Count rows in the process list (connect via the local socket; MySQL reserves
# one extra connection beyond max_connections for a CONNECTION_ADMIN user)
mysql -u root -p -S /var/lib/mysql/mysql.sock -e "SHOW PROCESSLIST;" 2>/dev/null | wc -l
If SHOW PROCESSLIST fails because connections are full but the MySQL process is alive, inspect the error log and OS‑level connection counts:
# View recent connection‑refused messages in the error log
tail -100 /var/log/mysql/error.log | grep -i "too many connections"
# Count total TCP connections to MySQL
ss -ant | grep :3306 | wc -l
# Show per‑IP connection distribution
ss -ant | grep :3306 | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -rn
Step 3: Compare the current connection count with max_connections. Two common patterns: max_connections is still at the default of 151 and current connections have climbed to 145; or current connections are only 90, but wait_timeout is too long, leaving some 80 of them idle in Sleep.
# Show current connections and max value
SHOW STATUS LIKE 'Threads_connected';
SHOW VARIABLES LIKE 'max_connections';
# Show connection state distribution
SHOW STATUS LIKE 'Threads_%';
Threads_connected counts all established connections (including sleeping ones). Threads_running counts only actively executing queries. A rapid rise in Threads_running while Threads_connected approaches max_connections indicates severe overload.
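The two counters can be folded into a single pressure signal. A sketch of that rule of thumb (the thresholds are illustrative assumptions, not MySQL defaults; values are passed as arguments so the logic is testable without a live server):

```shell
#!/bin/bash
# Classify connection pressure from sampled Threads_connected,
# Threads_running and max_connections values.
conn_pressure() {
  local connected=$1 running=$2 max=$3
  local pct=$(( connected * 100 / max ))
  if [ "$pct" -ge 95 ] || [ "$running" -ge $(( max / 4 )) ]; then
    echo critical
  elif [ "$pct" -ge 80 ]; then
    echo warning
  else
    echo ok
  fi
}
conn_pressure 145 12 151   # 96% of the default limit -> critical
conn_pressure 90 3 500     # plenty of headroom -> ok
```

In production you would feed it the outputs of the SHOW STATUS / SHOW VARIABLES commands above.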
2. Quick Emergency Mitigation
Warning: The following actions are invasive. Ensure backups exist and stakeholders are notified before proceeding.
2.1 Kill Idle Sleeping Connections
Most overload scenarios have many connections stuck in Sleep. Killing them provides immediate relief.
# Kill sleeping connections older than 30 seconds (first make sure no
# long‑running transactions are hiding in them). KILL accepts only a single
# thread ID – there is no bulk or subquery form – so generate one KILL
# statement per connection:
# Write KILL statements to a file (requires the FILE privilege and a
# secure_file_priv setting that permits /tmp), then replay the file
SELECT CONCAT('KILL ', ID, ';')
FROM INFORMATION_SCHEMA.PROCESSLIST
WHERE COMMAND = 'Sleep'
  AND TIME > 30
INTO OUTFILE '/tmp/kill_sleep.sql';
SOURCE /tmp/kill_sleep.sql;
Shell one‑liner to clean sleeping connections:
#!/bin/bash
# kill_mysql_sleep_conn.sh – clean up MySQL sleeping connections
SOCKET="/var/lib/mysql/mysql.sock"
THRESHOLD=${1:-30} # default 30 seconds
# Note: -p prompts for the password twice (once per mysql invocation);
# use a ~/.my.cnf credentials file for unattended runs
mysql -u root -p -S "$SOCKET" -N -e "
SELECT CONCAT('KILL ', ID, ';')
FROM INFORMATION_SCHEMA.PROCESSLIST
WHERE COMMAND = 'Sleep' AND TIME > $THRESHOLD;
" | mysql -u root -p -S "$SOCKET"
echo "$(date '+%Y-%m-%d %H:%M:%S') - Cleaned sleeping connections"
2.2 Temporarily Increase max_connections
If sleeping connections are insufficient, raise the limit without restarting MySQL:
# Dynamic adjustment (no restart required)
SET GLOBAL max_connections = 500;  # lost on restart; use SET PERSIST to keep it
Or edit my.cnf temporarily:
# Backup original config
cp /etc/mysql/my.cnf /etc/mysql/my.cnf.bak.$(date +%Y%m%d%H%M%S)
# Append new limit
sed -i '/^\[mysqld\]/a max_connections=500' /etc/mysql/my.cnf
# Verify
mysql -u root -S /var/lib/mysql/mysql.sock -e "SHOW VARIABLES LIKE 'max_connections';"
2.3 Kill Specific Long‑Running Queries
Identify and terminate the few queries that consume most time.
# Find top 10 longest‑running queries (excluding Sleep)
SELECT ID, USER, HOST, DB, COMMAND, TIME, STATE,
LEFT(INFO,200) AS QUERY
FROM INFORMATION_SCHEMA.PROCESSLIST
WHERE COMMAND != 'Sleep'
ORDER BY TIME DESC
LIMIT 10;
Shell script to kill queries exceeding a threshold:
#!/bin/bash
# kill_long_query.sh – terminate queries running longer than specified seconds
SOCKET="/var/lib/mysql/mysql.sock"
QUERY_TIME=${1:-60}
mysql -u root -S "$SOCKET" -N -e "
SELECT CONCAT('KILL QUERY ', ID, ';')
FROM INFORMATION_SCHEMA.PROCESSLIST
WHERE COMMAND != 'Sleep' AND TIME > $QUERY_TIME;
" | mysql -u root -S "$SOCKET"
echo "$(date) - Killed queries running > $QUERY_TIME seconds"
2.4 Real‑time Connection Monitoring Script
#!/bin/bash
# mysql_conn_monitor.sh – monitor connections and alert via AlertManager
SOCKET="/var/lib/mysql/mysql.sock"
WARN_THRESHOLD=100
CRIT_THRESHOLD=130
CURRENT_CONN=$(mysql -u root -S "$SOCKET" -N -e "SHOW STATUS LIKE 'Threads_connected';" | awk '{print $2}')
MAX_CONN=$(mysql -u root -S "$SOCKET" -N -e "SHOW VARIABLES LIKE 'max_connections';" | awk '{print $2}')
RUNNING=$(mysql -u root -S "$SOCKET" -N -e "SHOW STATUS LIKE 'Threads_running';" | awk '{print $2}')
echo "$(date '+%Y-%m-%d %H:%M:%S') - Connections: $CURRENT_CONN/$MAX_CONN, Running: $RUNNING"
if [ "$CURRENT_CONN" -ge "$CRIT_THRESHOLD" ]; then
echo "CRITICAL: Connection count $CURRENT_CONN exceeds $CRIT_THRESHOLD"
curl -X POST "http://alertmanager:9093/api/v1/alerts" -H "Content-Type: application/json" -d '[{"labels":{"alertname":"MySQL_Too_Many_Connections","severity":"critical"},"annotations":{"summary":"MySQL connection overload"}}]' >/dev/null 2>&1
elif [ "$CURRENT_CONN" -ge "$WARN_THRESHOLD" ]; then
echo "WARNING: Connection count $CURRENT_CONN exceeds $WARN_THRESHOLD"
fi
3. Root‑Cause Analysis
3.1 Slow Queries Accumulate Connections
When a query stays in Sending data or Sorting result for a long time, the connection remains occupied, preventing reuse.
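The relationship is just Little's law: concurrent connections ≈ request rate × average query latency. A back‑of‑envelope sketch with illustrative numbers showing how one slow query multiplies connection demand:

```shell
#!/bin/bash
# needed ≈ requests/s * latency_ms / 1000 (integer arithmetic, rough only)
needed_connections() {
  local rps=$1 latency_ms=$2
  echo $(( rps * latency_ms / 1000 ))
}
needed_connections 200 50     # healthy 50 ms queries -> 10 connections
needed_connections 200 3000   # a 3 s slow join -> 600, past any sane limit
```

The same traffic that a healthy workload serves with a handful of connections exhausts the pool once average latency jumps.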
Identify recent heavy queries (performance_schema):
SELECT DIGEST,
COUNT_STAR,
SUM_TIMER_WAIT/1000000000000 AS SUM_SECONDS,
AVG_TIMER_WAIT/1000000000000 AS AVG_SECONDS,
LEFT(DIGEST_TEXT,200) AS SQL_SAMPLE
FROM performance_schema.events_statements_summary_by_digest
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 10;
Note: the digest summary table stores the normalized statement in DIGEST_TEXT; there is no SQL_TEXT column. Check the slow‑query log:
tail -100 /var/log/mysql/slow.log | grep -i "Query_time"
Typical scenarios:
Uncommitted large transactions hold locks for a long time.
Missing indexes cause full‑table scans.
Deep pagination with LIMIT offset, n forces the server to scan and discard all offset rows before returning results.
Example: list existing secondary indexes to cross‑check against EXPLAIN output:
SELECT TABLE_SCHEMA, TABLE_NAME, INDEX_NAME, SEQ_IN_INDEX, COLUMN_NAME
FROM INFORMATION_SCHEMA.STATISTICS
WHERE TABLE_SCHEMA NOT IN ('mysql','information_schema','performance_schema','sys')
AND INDEX_NAME != 'PRIMARY'
ORDER BY TABLE_SCHEMA, TABLE_NAME, SEQ_IN_INDEX;
3.2 Connection Leak (Application Fails to Close)
Applications that do not close connections leave them in Sleep, gradually exhausting the pool.
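A leak's impact is easy to project: with a steady leak rate, time to exhaustion is (max_connections − baseline) / leak rate. A sketch with assumed numbers:

```shell
#!/bin/bash
# Hours until the connection table fills, given a steady per-hour leak
hours_to_exhaustion() {
  local max=$1 baseline=$2 leak_per_hour=$3
  echo $(( (max - baseline) / leak_per_hour ))
}
hours_to_exhaustion 151 50 20   # default limit, 50 baseline, 20 leaked/h -> 5
```

A few hours of runway is typical at the default limit, which is why slow leaks often surface as overnight incidents rather than immediate failures.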
Leak detection script:
#!/bin/bash
# detect_conn_leak.sh – find users with many sleeping connections
SOCKET="/var/lib/mysql/mysql.sock"
mysql -u root -S "$SOCKET" -N -e "
SELECT USER, HOST, COUNT(*) AS CONN_COUNT, MAX(TIME) AS MAX_TIME, MIN(TIME) AS MIN_TIME
FROM INFORMATION_SCHEMA.PROCESSLIST
WHERE COMMAND = 'Sleep'
GROUP BY USER, HOST
HAVING COUNT(*) > 10
ORDER BY CONN_COUNT DESC;
"
If MAX_TIME is large for a user, that application likely leaks connections.
3.3 Frequent Short‑Lived Connections
Languages such as PHP or Python may open a new connection per request. A sudden traffic surge can outpace MySQL's ability to accept new handshakes.
Determine if short‑connection behavior is the issue:
# PROCESSLIST has no creation timestamp, so count new connections from the
# cumulative status counters; sample twice and diff to get a per-minute rate
SHOW GLOBAL STATUS LIKE 'Connections';
SHOW GLOBAL STATUS LIKE 'Aborted_connects';
SHOW GLOBAL STATUS LIKE 'Threads_created';
If Aborted_connects rises, many connection attempts are failing. A high Threads_created rate means the thread cache is too small for the connection churn.
3.4 max_connections Set Too Low
MySQL 8.0 defaults to max_connections = 151, which is insufficient for medium‑traffic production. Raising the limit without considering per‑connection memory can cause OOM.
# Show current and maximum used connections
SHOW VARIABLES LIKE 'max_connections';
SHOW STATUS LIKE 'Max_used_connections';
Reasonable calculation:
# Estimate max connections based on available memory
TOTAL_MEM=$(free -m | awk '/^Mem:/{print $2}')
# thread_stack is reported in bytes (e.g. 1048576), not with a "k" suffix
THREAD_STACK_KB=$(( $(mysql -u root -S /var/lib/mysql/mysql.sock -N -e "SHOW VARIABLES LIKE 'thread_stack';" | awk '{print $2}') / 1024 ))
CONN_MEM_KB=$((THREAD_STACK_KB + 2048)) # approx 2 MiB of buffers per connection
MYSQL_BUF_KB=4096000 # approx 4 GiB global buffers
AVAILABLE_MEM_KB=$((TOTAL_MEM*1024 - MYSQL_BUF_KB))
MAX_CONN=$((AVAILABLE_MEM_KB / CONN_MEM_KB / 2)) # keep 50% headroom
echo "Suggested max_connections: $MAX_CONN"
4. Monitoring and Alert Configuration
4.1 Prometheus + MySQL Exporter
Deploy the latest mysqld_exporter (or Percona’s prometheus-mysql-exporter) to collect metrics.
#!/bin/bash
EXPORTER_VERSION="0.16.0"
DOWNLOAD_URL="https://github.com/prometheus/mysqld_exporter/releases/download/v${EXPORTER_VERSION}/mysqld_exporter-${EXPORTER_VERSION}.linux-amd64.tar.gz"
cd /tmp
curl -LO "$DOWNLOAD_URL"
tar xzf mysqld_exporter-${EXPORTER_VERSION}.linux-amd64.tar.gz
sudo mv mysqld_exporter-${EXPORTER_VERSION}.linux-amd64/mysqld_exporter /usr/local/bin/
sudo chmod +x /usr/local/bin/mysqld_exporter
# Create monitoring user
mysql -u root -e "CREATE USER IF NOT EXISTS 'exporter'@'localhost' IDENTIFIED BY 'StrongExporterPass2026!' WITH MAX_USER_CONNECTIONS 3; GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter'@'localhost'; FLUSH PRIVILEGES;"
# Exporter config
sudo tee /etc/mysql_exporter.cnf >/dev/null <<'EOF'
[client]
user=exporter
password=StrongExporterPass2026!
host=localhost
port=3306
socket=/var/lib/mysql/mysql.sock
EOF
sudo chown prometheus:prometheus /etc/mysql_exporter.cnf
sudo chmod 600 /etc/mysql_exporter.cnf  # file contains a password
# Systemd service
sudo tee /etc/systemd/system/mysql-exporter.service >/dev/null <<'EOF'
[Unit]
Description=Prometheus MySQL Exporter
After=network.target mysql.service
[Service]
Type=simple
User=prometheus
Group=prometheus
ExecStart=/usr/local/bin/mysqld_exporter --config.my-cnf=/etc/mysql_exporter.cnf --collect.info_schema.processlist --collect.info_schema.innodb_tablespaces --collect.info_schema.innodb_metrics
Restart=always
[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable mysql-exporter
sudo systemctl start mysql-exporter
4.2 Prometheus Alerting Rules for Connection Overload
# mysql_connection_alerts.yml – place under /etc/prometheus/rules/
---
groups:
- name: MySQL Connection Alerts
rules:
- alert: MySQLConnectionsHigh
expr: mysql_global_status_threads_connected / mysql_global_variables_max_connections > 0.8
for: 5m
labels:
severity: warning
annotations:
summary: "MySQL connection usage high"
description: "Current connections {{ $value | humanizePercentage }} ({{ $value }}) approaching limit."
- alert: MySQLConnectionsCritical
expr: mysql_global_status_threads_connected / mysql_global_variables_max_connections > 0.95
for: 1m
labels:
severity: critical
annotations:
summary: "MySQL connections near exhaustion"
description: "Current connections {{ $value | humanizePercentage }} ({{ $value }}) – immediate action required."
- alert: MySQLConnectionRefused
expr: rate(mysql_global_status_connection_errors_total[5m]) > 10
for: 5m
labels:
severity: critical
annotations:
summary: "MySQL connection refusal rate rising"
description: "Error rate {{ $value }}/s – investigate immediately."
- alert: MySQLSlowQueries
expr: rate(mysql_global_status_slow_queries[5m]) > 5
for: 5m
labels:
severity: warning
annotations:
summary: "MySQL slow query rate high"
description: "Slow queries {{ $value }}/s – check slow‑query log."
- alert: MySQLThreadsRunningHigh
expr: mysql_global_status_threads_running > 20
for: 5m
labels:
severity: warning
annotations:
summary: "MySQL active threads high"
description: "{{ $value }} threads executing queries – possible slow queries."
4.3 Grafana Dashboard JSON (importable)
{
"dashboard": {
"title": "MySQL Connection Dashboard",
"panels": [
{
"title": "Current / Max Connections",
"type": "gauge",
"targets": [
{"expr": "mysql_global_status_threads_connected", "legendFormat": "Current"},
{"expr": "mysql_global_variables_max_connections", "legendFormat": "Max"}
]
},
{
"title": "Connection Trend",
"type": "graph",
"targets": [
{"expr": "mysql_global_status_threads_connected", "legendFormat": "Connected"},
{"expr": "mysql_global_status_threads_running", "legendFormat": "Running"},
{"expr": "mysql_global_status_threads_cached", "legendFormat": "Cached"}
]
},
{
"title": "Connection Creation Rate",
"type": "graph",
"targets": [
{"expr": "rate(mysql_global_status_connections[5m])", "legendFormat": "Rate"}
]
}
]
}
}
4.4 Notification Scripts (WeChat/Slack)
#!/bin/bash
WEBHOOK_URL="https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=YOUR_KEY_HERE"
ALERT_NAME="$1"
ALERT_MSG="$2"
SEVERITY="$3"
case "$SEVERITY" in
critical) COLOR="FF0000";;
warning) COLOR="FFA500";;
*) COLOR="00FF00";;
esac
# JSON strings cannot contain raw newlines; escape them as \n
curl -X POST "$WEBHOOK_URL" -H "Content-Type: application/json" \
  -d "{\"msgtype\": \"markdown\", \"markdown\": {\"content\": \"**MySQL Alert [$SEVERITY]**\\n**Name**: $ALERT_NAME\\n**Details**: $ALERT_MSG\\n**Time**: $(date '+%Y-%m-%d %H:%M:%S')\"}}" >/dev/null 2>&1
5. Long‑Term Solutions
5.1 Connection‑Pool Configuration
Configure a proper pool in the application layer.
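A widely cited starting point for pool sizing (from HikariCP's own "About Pool Sizing" guidance) is cores × 2 plus the effective spindle count; treat it as a baseline to load-test, not a rule:

```shell
#!/bin/bash
# pool_size ~= cores * 2 + effective_spindle_count (for SSDs, use 1)
pool_size() {
  echo $(( $1 * 2 + $2 ))
}
pool_size 8 1   # 8-core DB host, SSD storage -> 17
```

Small, well-sized pools usually outperform large ones: fewer threads contending on the server means lower latency per query.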
# HikariCP (Java Spring)
spring:
datasource:
hikari:
maximum-pool-size: 20 # adjust based on CPU cores and workload
minimum-idle: 5
connection-timeout: 30000
idle-timeout: 600000
max-lifetime: 1800000
connection-test-query: SELECT 1
For proxy‑level pooling in front of MySQL, use ProxySQL (PgBouncer speaks only the PostgreSQL protocol and cannot pool MySQL):
# Configure via the ProxySQL admin interface (default port 6032)
mysql -h 127.0.0.1 -P 6032 -u admin -padmin <<'EOF'
INSERT INTO mysql_servers (hostgroup_id, hostname, port, max_connections)
VALUES (0, '127.0.0.1', 3306, 200);
INSERT INTO mysql_users (username, password, default_hostgroup, max_connections)
VALUES ('app_user', 'app_password', 0, 500);
LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;
LOAD MYSQL USERS TO RUNTIME; SAVE MYSQL USERS TO DISK;
EOF
# Application connects via ProxySQL (default traffic port 6033)
mysql -h 127.0.0.1 -P 6033 -u app_user -p mydb
5.2 Proper max_connections Sizing
Use the memory‑based formula shown earlier. Recommended ranges:
4 CPU / 8 GB RAM → 300‑500 connections
8 CPU / 16 GB RAM → 500‑800 connections
16 CPU / 32 GB RAM → 800‑1500 connections
32 CPU / 64 GB RAM → 1500‑3000 connections
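The ranges above are workload‑conservative; the memory‑only ceiling from the earlier formula is usually much higher. A sketch (the ~2 MiB per‑connection cost and 50% headroom are assumptions, and CPU or workload normally caps the real value well below this):

```shell
#!/bin/bash
# Memory-only upper bound for max_connections
suggest_max_conn() {
  local total_mem_mb=$1 global_buffers_mb=$2 per_conn_kb=${3:-2048}
  local avail_kb=$(( (total_mem_mb - global_buffers_mb) * 1024 ))
  echo $(( avail_kb / per_conn_kb / 2 ))   # halve for headroom
}
suggest_max_conn 32768 8192   # 32 GiB host, 8 GiB buffer pool -> 6144
```

Use the smaller of the memory ceiling and the workload range for your hardware class.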
# Persistently set a higher limit
SET PERSIST max_connections = 800;
# Or edit my.cnf
[mysqld]
max_connections = 800
5.3 Tune wait_timeout and interactive_timeout
Short timeouts cause frequent reconnects; long timeouts keep idle connections alive.
# View current timeout settings
SHOW VARIABLES LIKE '%timeout%';
# Recommended values (adjust per workload)
SET PERSIST wait_timeout = 600; # 10 minutes
SET PERSIST interactive_timeout = 600; # 10 minutes
# Or in my.cnf
[mysqld]
wait_timeout = 600
interactive_timeout = 600
5.4 Application Connection‑Management Guidelines
Always acquire connections from a pool and ensure they are returned.
# Correct Python‑style usage with mysql‑connector (using a pool)
import mysql.connector
from mysql.connector import pooling
pool = pooling.MySQLConnectionPool(pool_name="app_pool", pool_size=10,
                                   host="localhost", database="mydb",
                                   user="app_user", password="app_password")

def query_data(sql, params=None):
    conn = None
    cursor = None
    try:
        conn = pool.get_connection()
        cursor = conn.cursor(dictionary=True)
        cursor.execute(sql, params or ())
        return cursor.fetchall()
    except mysql.connector.Error as e:
        print(f"Database error: {e}")
        raise
    finally:
        if cursor:
            cursor.close()
        if conn:
            conn.close()  # returns the connection to the pool
Incorrect pattern (leads to leaks):
# Bad example – connection never closed
def query_data_bad(sql):
    conn = mysql.connector.connect(host="localhost", database="mydb",
                                   user="app_user", password="app_password")
    cursor = conn.cursor()
    cursor.execute(sql)
    result = cursor.fetchall()
    # Missing cursor.close() and conn.close()
    return result
6. Diagnostic Command Cheat‑Sheet
6.1 Quick Overview
# One‑liner to display key connection metrics (status counters live in
# performance_schema.global_status; only max_connections is a system
# variable readable via @@)
mysql -u root -S /var/lib/mysql/mysql.sock -e "
SELECT 'Current Connections' AS METRIC, VARIABLE_VALUE AS VALUE
FROM performance_schema.global_status WHERE VARIABLE_NAME='Threads_connected'
UNION ALL
SELECT 'Max Connections', @@max_connections
UNION ALL
SELECT 'Running Threads', VARIABLE_VALUE
FROM performance_schema.global_status WHERE VARIABLE_NAME='Threads_running'
UNION ALL
SELECT 'Cached Threads', VARIABLE_VALUE
FROM performance_schema.global_status WHERE VARIABLE_NAME='Threads_cached';
"
6.2 Process‑Level Inspection
# List all connection details
SELECT ID, USER, HOST, DB, COMMAND, TIME, STATE, LEFT(INFO,200) AS QUERY
FROM INFORMATION_SCHEMA.PROCESSLIST
WHERE COMMAND != 'Daemon'
ORDER BY TIME DESC;
# Group by user
SELECT USER, COUNT(*) AS CNT, MAX(TIME) AS MAX_TIME, SUM(TIME) AS TOTAL_TIME
FROM INFORMATION_SCHEMA.PROCESSLIST
GROUP BY USER
ORDER BY CNT DESC;
# Show currently executing queries
SELECT ID, USER, LEFT(INFO,100) AS QUERY, TIME, STATE
FROM INFORMATION_SCHEMA.PROCESSLIST
WHERE COMMAND = 'Query'
ORDER BY TIME DESC
LIMIT 20;
6.3 Lock‑Wait Diagnosis
# INFORMATION_SCHEMA.INNODB_LOCKS was removed in MySQL 8.0;
# use the sys schema and performance_schema instead
# Current lock waits: who waits on whom
SELECT waiting_pid, waiting_query, blocking_pid, blocking_query, wait_age
FROM sys.innodb_lock_waits;
# All locks currently held or requested
SELECT object_schema, object_name, index_name, lock_type, lock_mode,
       lock_status, lock_data
FROM performance_schema.data_locks;
# Buffer pool hit rate
SHOW STATUS LIKE 'Innodb_buffer_pool_read%';
SELECT (1 - (reads / (reads + read_requests))) AS hit_rate
FROM (SELECT VARIABLE_VALUE AS reads FROM performance_schema.global_status WHERE VARIABLE_NAME='Innodb_buffer_pool_reads') r,
(SELECT VARIABLE_VALUE AS read_requests FROM performance_schema.global_status WHERE VARIABLE_NAME='Innodb_buffer_pool_read_requests') rq;
# Table lock contention
SHOW STATUS LIKE 'Table_locks%';
6.5 Full Diagnosis Script
#!/bin/bash
SOCKET="/var/lib/mysql/mysql.sock"
OUTPUT="/tmp/mysql_diagnosis_$(date +%Y%m%d_%H%M%S).txt"
exec > >(tee "$OUTPUT") 2>&1
echo "========== MySQL Connection Exhaustion Diagnosis =========="
echo "Generated: $(date)"
# 1. Connection overview
mysql -u root -S "$SOCKET" -N -e "
SELECT 'Threads_connected:', VARIABLE_VALUE FROM performance_schema.global_status WHERE VARIABLE_NAME='Threads_connected';
SELECT 'max_connections:', @@max_connections;
SELECT 'Threads_running:', VARIABLE_VALUE FROM performance_schema.global_status WHERE VARIABLE_NAME='Threads_running';
"
# 2. Top 10 users by connection count
mysql -u root -S "$SOCKET" -N -e "
SELECT USER, COUNT(*) AS CNT FROM INFORMATION_SCHEMA.PROCESSLIST GROUP BY USER ORDER BY CNT DESC LIMIT 10;"
# 3. Top 10 longest‑running queries
mysql -u root -S "$SOCKET" -N -e "
SELECT ID, USER, LEFT(INFO,100) AS QUERY, TIME, STATE FROM INFORMATION_SCHEMA.PROCESSLIST WHERE COMMAND='Query' ORDER BY TIME DESC LIMIT 10;"
# 4. Lock wait count
mysql -u root -S "$SOCKET" -N -e "SELECT COUNT(*) AS LOCK_WAIT_COUNT FROM INFORMATION_SCHEMA.INNODB_TRX WHERE trx_state='LOCK WAIT';"
# 5. Slow query count in last hour (requires log_output=TABLE)
mysql -u root -S "$SOCKET" -N -e "SELECT COUNT(*) AS SLOW_QUERY_COUNT FROM mysql.slow_log WHERE start_time > DATE_SUB(NOW(), INTERVAL 1 HOUR);"
# 6. Recommendations based on usage ratio
CONN_RATIO=$(mysql -u root -S "$SOCKET" -N -e "SELECT VARIABLE_VALUE/@@max_connections FROM performance_schema.global_status WHERE VARIABLE_NAME='Threads_connected';")
if (( $(echo "$CONN_RATIO > 0.8" | bc -l) )); then
echo "- Connections exceed 80% of max; consider cleaning sleepers or raising max_connections."
fi
echo "Report saved to $OUTPUT"
7. Real‑World Cases
7.1 Case 1 – Flash‑Sale Traffic Spike
During a Double‑11 promotion, traffic surged 50×, pushing the connection count to the default 151‑connection limit within minutes.
Symptoms: Alert CRITICAL MySQL_Too_Many_Connections, application logs show connection errors.
Emergency Steps:
Confirm MySQL process is alive.
Temporarily raise max_connections to 500.
Kill long‑running sleeping connections.
Analyze whether the overload is traffic‑driven or caused by slow queries.
Root‑Cause: Application did not use a connection pool; each request opened a new connection, leading to many sleeping connections.
Solution: Enable HikariCP with a larger pool (e.g., maximum-pool-size: 50) and configure appropriate timeouts.
7.2 Case 2 – Slow Query Causing Connection Accumulation
A SaaS finance module exported data via a cross‑table join lacking an index, taking >3 minutes per execution.
Symptoms: Alert MySQL_Threads_running_High, user reports export timeouts.
Diagnosis: Identify the longest‑running query, examine EXPLAIN, discover missing index on orders.product_id.
Fix: Add index ALTER TABLE orders ADD INDEX idx_product_id (product_id); and verify the plan changes from type=ALL to type=ref.
7.3 Case 3 – Connection Leak in Java Application
A Tomcat‑deployed Java service leaked 1‑2 connections per request.
Symptoms: Connection count rose steadily from 50 to 151 over a few hours.
Detection: Leak‑detection script showed a single user with many sleeping connections and large MAX_TIME.
Root‑Cause Code: Connections obtained without a try‑with‑resources block and never closed.
Resolution: Refactor to use try (Connection conn = dataSource.getConnection()) { … } ensuring automatic return to the pool. Deploy a scheduled script to kill any remaining long‑sleeping connections.
7.4 Case 4 – Insufficient max_connections
Initial max_connections = 151 proved inadequate as the service grew.
Assessment: Memory‑based calculation suggested 800 connections for a 16 CPU / 32 GB server.
Action: Temporarily set SET GLOBAL max_connections = 1000; and permanently update my.cnf with max_connections = 800.
7.5 Case 5 – ProxySQL Pool Bottleneck
ProxySQL became the limiting factor; its pool size and max_connections were too low.
Diagnosis: Query stats_mysql_connection_pool and mysql_servers via ProxySQL admin interface.
Fix: Increase backend server max_connections and adjust ProxySQL pool_size and global max_connections variables, then reload configuration.
8. Final Summary
MySQL connection exhaustion is a symptom, not the root cause. Operators should follow this priority:
Stop the bleed: Verify MySQL is alive, kill long‑sleeping connections, optionally raise max_connections temporarily.
Identify the cause: Determine whether slow queries, connection leaks, short‑connection bursts, or an undersized max_connections setting is responsible.
Heal permanently: Optimize queries, fix leak bugs, introduce connection pooling, and tune parameters.
Prevent recurrence: Deploy Prometheus + MySQL Exporter monitoring, set sensible alerts, and keep scripts in /opt/scripts/mysql/ with a standard operating procedure.
All scripts and configurations provided are tested on CentOS Stream 9 and Ubuntu 24.04 LTS with MySQL 8.0.39, Prometheus 3.0, and Grafana 11.0.
MaGe Linux Operations