Master MySQL in Production: From Configuration Tuning to SQL Performance Optimization
This comprehensive guide walks you through a real‑world MySQL outage, then details step‑by‑step configuration tweaks, InnoDB parameter tuning, connection and thread settings, index design, query rewrites, monitoring scripts, backup strategies, high‑availability replication, and essential tooling to keep your database fast and reliable.
Introduction: A Real Production Incident
Last month a core business system suffered massive timeouts; MySQL CPU spiked to 99% and slow queries piled up. After aggressive configuration adjustments and SQL tuning, the service recovered within 30 minutes, highlighting the critical role of MySQL optimization for operations engineers.
Part 1 – MySQL Configuration Tuning
1.1 Memory Settings
# my.cnf core memory settings
[mysqld]
# Buffer pool size: usually 70‑80% of physical RAM
innodb_buffer_pool_size = 8G
# Number of buffer pool instances to improve concurrency
innodb_buffer_pool_instances = 8
# Log buffer to reduce disk I/O
innodb_log_buffer_size = 64M
# Query cache (removed in MySQL 8.0 but shown for older versions)
query_cache_size = 256M
query_cache_type = 1

Practical tip: Determine the optimal innodb_buffer_pool_size by checking the buffer‑pool hit rate; aim for >99%.
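The hit rate can be derived from two counters reported by SHOW GLOBAL STATUS: Innodb_buffer_pool_read_requests (logical reads) and Innodb_buffer_pool_reads (reads that had to go to disk). A minimal sketch of the calculation, using made‑up counter values for illustration:

```python
# Buffer-pool hit rate from SHOW GLOBAL STATUS counters:
#   read_requests = Innodb_buffer_pool_read_requests (logical reads)
#   disk_reads    = Innodb_buffer_pool_reads (reads that went to disk)

def buffer_pool_hit_rate(read_requests, disk_reads):
    """Return the hit rate as a percentage (100% = fully served from memory)."""
    if read_requests == 0:
        return 100.0
    return (1 - disk_reads / read_requests) * 100

# Illustrative counter values, not real measurements:
print(f"{buffer_pool_hit_rate(10_000_000, 25_000):.2f}%")  # 99.75%
```

If the rate stays below 99% under normal load, the buffer pool is likely undersized for the working set.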
1.2 Connection and Thread Optimization
# Connection‑related settings
max_connections = 2000
max_connect_errors = 10000
connect_timeout = 60
wait_timeout = 28800
interactive_timeout = 28800
# Thread cache settings
thread_cache_size = 64
thread_concurrency = 16

Note: thread_concurrency was deprecated in MySQL 5.6 and removed in 5.7; on modern versions look at innodb_thread_concurrency instead. max_connections should be sized to server capacity, not set arbitrarily high.
Monitor Threads_connected and Threads_running to avoid sudden connection spikes.
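A simple way to act on those status variables is a threshold check against max_connections; the 80% alert level below is an arbitrary example, not a MySQL default:

```python
# Flag connection pressure from Threads_connected vs. max_connections.
# alert_pct=80 is an illustrative threshold, not a MySQL default.

def connection_pressure(threads_connected, max_connections, alert_pct=80):
    usage_pct = threads_connected / max_connections * 100
    return usage_pct, usage_pct >= alert_pct

# Example values: 1700 of 2000 connections in use
usage, alert = connection_pressure(threads_connected=1700, max_connections=2000)
print(f"{usage:.1f}% in use, alert={alert}")  # 85.0% in use, alert=True
```

Wire this into whatever alerting you already run so a connection flood is caught before max_connections is exhausted.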
1.3 InnoDB Core Parameters
# InnoDB core configuration
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_file_size = 1G
innodb_log_files_in_group = 2
innodb_max_dirty_pages_pct = 75
innodb_io_capacity = 2000
innodb_read_io_threads = 8
innodb_write_io_threads = 8

Performance gain example: Changing innodb_flush_log_at_trx_commit from 1 to 2 increased TPS by ~40% (trade‑off: slightly lower durability).
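To make that trade‑off concrete, the three allowed values of innodb_flush_log_at_trx_commit differ in when the redo log is written and fsynced; the summaries below follow the MySQL documentation:

```python
# Durability semantics of innodb_flush_log_at_trx_commit (per MySQL docs).
FLUSH_MODES = {
    0: "redo log written and fsynced ~once per second; up to ~1s of commits "
       "can be lost on a mysqld crash",
    1: "redo log written and fsynced at every commit; full ACID durability "
       "(the default)",
    2: "redo log written at every commit, fsynced ~once per second; up to "
       "~1s can be lost on an OS crash or power failure",
}

def describe_flush_mode(value):
    return FLUSH_MODES.get(value, "unknown value")
```

Setting 2 is the common middle ground: a mysqld crash alone loses nothing, since the OS still holds the written log pages.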
Part 2 – SQL Performance Optimization
2.1 Index Optimization
Golden rule for composite indexes: Build them according to query patterns, not per column.
-- Bad: single‑column indexes
CREATE INDEX idx_user_id ON orders(user_id);
CREATE INDEX idx_status ON orders(status);
CREATE INDEX idx_create_time ON orders(create_time);
-- Good: composite index matching the WHERE clause
CREATE INDEX idx_user_status_time ON orders(user_id, status, create_time);

2.2 Index Usage Analysis
-- Find indexes that have never been used (candidates for removal)
SELECT object_schema, object_name, index_name,
       count_star, count_read, count_insert, count_update, count_delete
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE index_name IS NOT NULL
  AND count_star = 0
  AND object_schema = 'your_database'
ORDER BY object_schema, object_name;

2.3 Query Optimization Cases
Case 1 – Pagination on a table with millions of rows
-- Poor pagination (slow)
SELECT * FROM user_logs
WHERE user_id = 12345
ORDER BY created_at DESC
LIMIT 1000000, 20;
-- Optimized pagination (≈100× faster)
-- Note: assumes id increases monotonically with created_at
-- (e.g. an auto-increment primary key)
SELECT * FROM user_logs
WHERE user_id = 12345
  AND id < (
      SELECT id FROM user_logs
      WHERE user_id = 12345
      ORDER BY created_at DESC
      LIMIT 1000000, 1
  )
ORDER BY created_at DESC
LIMIT 20;

Case 2 – Replacing a sub‑query with a JOIN
-- Slow sub‑query
SELECT * FROM orders o
WHERE o.user_id IN (SELECT u.id FROM users u WHERE u.level = 'VIP');
-- Faster JOIN
SELECT o.* FROM orders o
INNER JOIN users u ON o.user_id = u.id
WHERE u.level = 'VIP';

2.4 Slow‑Query Log Analysis
# Enable slow‑query logging
slow_query_log = ON
slow_query_log_file = /var/log/mysql/slow.log
long_query_time = 1
log_queries_not_using_indexes = ON

Use pt‑query‑digest from Percona Toolkit to generate a report:
# Install Percona Toolkit
sudo apt-get install percona-toolkit
# Analyze the slow‑query log
pt-query-digest /var/log/mysql/slow.log > slow_query_report.txt

Part 3 – Advanced Tuning Techniques
3.1 Read‑Write Splitting (Python example)
# Python example using PyMySQL for read‑write splitting
import random

import pymysql

class MySQLPool:
    def __init__(self):
        # Master (write)
        self.master = {
            'host': '192.168.1.10',
            'user': 'root',
            'password': 'password',
            'database': 'mydb'
        }
        # Slaves (read)
        self.slaves = [
            {'host': '192.168.1.11', 'user': 'root', 'password': 'password', 'database': 'mydb'},
            {'host': '192.168.1.12', 'user': 'root', 'password': 'password', 'database': 'mydb'}
        ]

    def get_read_connection(self):
        # Spread read traffic across replicas at random
        slave_cfg = random.choice(self.slaves)
        return pymysql.connect(**slave_cfg)

    def get_write_connection(self):
        # All writes go to the master
        return pymysql.connect(**self.master)

3.2 Monitoring Script
#!/bin/bash
# MySQL performance monitoring script
mysql_status() {
    mysql -e "SHOW GLOBAL STATUS;" | grep -E "(Connections|Questions|Threads_running|Slow_queries)"
}

innodb_status() {
    mysql -e "SHOW ENGINE INNODB STATUS\G" | grep -A 10 "BUFFER POOL AND MEMORY"
}

current_queries() {
    mysql -e "SELECT * FROM information_schema.processlist WHERE COMMAND != 'Sleep' ORDER BY TIME DESC;"
}

echo "=== MySQL Performance Monitor ==="
echo "1. Connection Status:"
mysql_status
echo "2. InnoDB Buffer Pool:"
innodb_status
echo "3. Current Running Queries:"
current_queries

3.3 Sharding Strategy
Horizontal sharding examples:
-- Time‑based partition
CREATE TABLE user_logs_202501 LIKE user_logs;
CREATE TABLE user_logs_202502 LIKE user_logs;
-- Hash‑based partition
CREATE TABLE user_data_0 LIKE user_data;
CREATE TABLE user_data_1 LIKE user_data;
# Routing logic (Python)
def get_table_name(user_id, table_count=10):
    # Hash (modulo) routing to one of table_count shard tables,
    # e.g. get_table_name(12345) -> "user_data_5"
    return f"user_data_{user_id % table_count}"

Part 4 – Production Experience
4.1 MySQL Fault‑Isolation Workflow
# 1. Check MySQL service status
systemctl status mysql
# 2. View error log
tail -f /var/log/mysql/error.log
# 3. Verify disk space
df -h
# 4. Inspect current connections
mysql -e "SHOW PROCESSLIST;"
# 5. Analyze long‑running queries
mysql -e "SELECT * FROM information_schema.processlist WHERE TIME > 10;"

4.2 Backup & Recovery Best Practices
# Hot backup script (logical backup)
DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="/backup/mysql"
DB_NAME="production_db"
mysqldump --single-transaction --routines --triggers \
--master-data=2 --databases $DB_NAME \
| gzip > $BACKUP_DIR/${DB_NAME}_${DATE}.sql.gz
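Dumps accumulate quickly, so pair the backup job with retention cleanup. A minimal sketch; the 7‑day window is an illustrative policy, and the .sql.gz suffix matches the script above:

```python
# Prune logical backups older than retention_days from backup_dir.
# The 7-day default is an assumed policy choice, not a requirement.
import os
import time

def prune_backups(backup_dir, retention_days=7, suffix=".sql.gz"):
    cutoff = time.time() - retention_days * 86400
    removed = []
    for name in sorted(os.listdir(backup_dir)):
        path = os.path.join(backup_dir, name)
        if name.endswith(suffix) and os.path.getmtime(path) < cutoff:
            os.remove(path)          # delete the expired dump
            removed.append(name)
    return removed
```

Run it from cron right after the dump completes, and alert if the newest remaining file is older than one backup interval.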
# Physical backup with Percona XtraBackup (recommended)
xtrabackup --backup --target-dir=$BACKUP_DIR/full_${DATE}

4.3 High‑Availability Architecture (Master‑Slave Replication)
Master configuration:
[mysqld]
server-id = 1
log-bin = mysql-bin
binlog-format = ROW
gtid-mode = ON
enforce-gtid-consistency = ON

Slave configuration:
[mysqld]
server-id = 2
relay-log = relay-bin
read-only = 1

Part 5 – Performance Tuning Toolbox
5.1 Essential Monitoring Tools
Prometheus + Grafana configuration snippet (scraping mysqld_exporter on its default port, 9104):
# prometheus.yml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'mysql'
    static_configs:
      - targets: ['localhost:9104']

pt‑stalk for automatic diagnostics:
# Collect diagnostics when MySQL performance degrades
pt-stalk --function=processlist --variable=Threads_running \
--threshold=25 --match-command=Query \
--collect-oprofile --collect-strace --collect-tcpdump

5.2 Load‑Testing Tools
sysbench example:
# Prepare test data
sysbench oltp_read_write --table-size=1000000 \
--mysql-host=localhost --mysql-user=root \
--mysql-password=password --mysql-db=testdb prepare
# Run the benchmark
sysbench oltp_read_write --table-size=1000000 \
--mysql-host=localhost --mysql-user=root \
--mysql-password=password --mysql-db=testdb \
--threads=16 --time=300 run

Conclusion – Core Mindset for MySQL Tuning
Monitor first: Build a complete monitoring system to detect issues early.
Configuration as foundation: Proper parameter settings are the base of performance.
Index is king: Good index design solves the majority of speed problems.
Architecture matters: Read‑write splitting and sharding address high‑concurrency demands.
Continuous improvement: Performance tuning is an ongoing process.
Raymond Ops
Linux ops automation, cloud-native, Kubernetes, SRE, DevOps, Python, Golang and related tech discussions.
