
Master MySQL in Production: From Configuration Tuning to SQL Performance Optimization

This comprehensive guide walks you through a real‑world MySQL outage, then details step‑by‑step configuration tweaks, InnoDB parameter tuning, connection and thread settings, index design, query rewrites, monitoring scripts, backup strategies, high‑availability replication, and essential tooling to keep your database fast and reliable.

Raymond Ops

Introduction: A Real Production Incident

Last month a core business system suffered massive timeouts; MySQL CPU spiked to 99% and slow queries piled up. After aggressive configuration adjustments and SQL tuning, the service recovered within 30 minutes, highlighting the critical role of MySQL optimization for operations engineers.

Part 1 – MySQL Configuration Tuning

1.1 Memory Settings

# my.cnf core memory settings
[mysqld]
# Buffer pool size: usually 70‑80% of physical RAM
innodb_buffer_pool_size = 8G

# Number of buffer pool instances to improve concurrency
innodb_buffer_pool_instances = 8

# Log buffer to reduce disk I/O
innodb_log_buffer_size = 64M

# Query cache (removed in MySQL 8.0 but shown for older versions)
query_cache_size = 256M
query_cache_type = 1

Practical tip: Determine the optimal innodb_buffer_pool_size by checking the buffer‑pool hit rate; aim for >99%.
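The hit rate follows from two counters in SHOW GLOBAL STATUS: Innodb_buffer_pool_read_requests (logical reads) and Innodb_buffer_pool_reads (misses that went to disk). A minimal sketch of the arithmetic (the helper name is illustrative):

```python
def buffer_pool_hit_rate(read_requests: int, disk_reads: int) -> float:
    """Fraction of logical reads served from the buffer pool.

    read_requests: Innodb_buffer_pool_read_requests
    disk_reads:    Innodb_buffer_pool_reads (misses that hit disk)
    """
    if read_requests == 0:
        return 1.0  # no reads yet, so nothing has missed
    return 1.0 - disk_reads / read_requests

# A pool serving 1,000,000 logical reads with only 5,000 disk reads
# has a 99.5% hit rate, just above the 99% target:
rate = buffer_pool_hit_rate(1_000_000, 5_000)
```

If the rate stays below target after warm-up, the working set does not fit and innodb_buffer_pool_size is the first knob to revisit.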

1.2 Connection and Thread Optimization

# Connection‑related settings
max_connections = 2000
max_connect_errors = 10000
connect_timeout = 60
wait_timeout = 28800
interactive_timeout = 28800

# Thread cache settings
thread_cache_size = 64
# thread_concurrency is deprecated and was removed in MySQL 5.7; kept here for older versions
thread_concurrency = 16

max_connections should be sized to actual server capacity, not set arbitrarily high.

Monitor Threads_connected and Threads_running to avoid sudden connection spikes.
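As a sketch of that check, the ratio of Threads_connected to max_connections gives the remaining headroom; alerting around 80% usage is a common, if arbitrary, threshold (the function name and threshold below are illustrative):

```python
def connection_pressure(threads_connected: int, max_connections: int,
                        warn_ratio: float = 0.8) -> bool:
    """Return True when connection usage crosses the warning threshold."""
    return threads_connected / max_connections >= warn_ratio

connection_pressure(1700, 2000)  # True: 85% of max_connections in use
connection_pressure(400, 2000)   # False: plenty of headroom
```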

1.3 InnoDB Core Parameters

# InnoDB core configuration
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_file_size = 1G
innodb_log_files_in_group = 2
innodb_max_dirty_pages_pct = 75
innodb_io_capacity = 2000
innodb_read_io_threads = 8
innodb_write_io_threads = 8

Performance gain example: Changing innodb_flush_log_at_trx_commit from 1 to 2 increased TPS by ~40% (trade‑off: slightly lower durability).

Part 2 – SQL Performance Optimization

2.1 Index Optimization

Golden rule for composite indexes: Build them according to query patterns, not per column.

-- Bad: single‑column indexes
CREATE INDEX idx_user_id ON orders(user_id);
CREATE INDEX idx_status ON orders(status);
CREATE INDEX idx_create_time ON orders(create_time);

-- Good: composite index matching the WHERE clause
CREATE INDEX idx_user_status_time ON orders(user_id, status, create_time);
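Whether a composite index can serve a query comes down to the leftmost-prefix rule: the index helps only while the query's equality columns match the index columns from the left, without gaps. A simplified, hypothetical model of that rule:

```python
def usable_prefix(index_cols, where_eq_cols):
    """Columns of a composite index usable for equality lookups.

    MySQL walks the index left to right and stops at the first
    column the WHERE clause does not constrain.
    """
    prefix = []
    for col in index_cols:
        if col in where_eq_cols:
            prefix.append(col)
        else:
            break
    return prefix

idx = ["user_id", "status", "create_time"]
usable_prefix(idx, {"user_id", "status"})      # ['user_id', 'status']
usable_prefix(idx, {"status", "create_time"})  # [] -- user_id missing, index unusable
```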

2.2 Index Usage Analysis

SELECT object_schema, object_name, index_name, count_star, count_read,
       count_insert, count_update, count_delete
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE index_name IS NOT NULL
  AND count_star = 0
  AND object_schema = 'your_database'
ORDER BY object_schema, object_name;

2.3 Query Optimization Cases

Case 1 – Pagination on a table with millions of rows

-- Poor pagination (slow)
SELECT * FROM user_logs
WHERE user_id = 12345
ORDER BY created_at DESC
LIMIT 1000000, 20;

-- Optimized pagination (≈100× faster)
SELECT * FROM user_logs
WHERE user_id = 12345
  AND id < (
    SELECT id FROM user_logs
    WHERE user_id = 12345
    ORDER BY created_at DESC
    LIMIT 1000000, 1
  )
ORDER BY created_at DESC
LIMIT 20;
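The subquery trick above assumes id order matches created_at order. Keyset (seek) pagination avoids the offset scan entirely when the client remembers the last row it saw. A hedged sketch of building the next-page query (the helper name is illustrative; parameters are passed separately to avoid string interpolation):

```python
def next_page_query(page_size: int = 20) -> str:
    """Keyset pagination: seek past the last (created_at, id) seen.

    Returns a parameterized query; supply (user_id, last_created_at,
    last_id) as the query parameters. The (created_at, id) tuple
    comparison keeps ordering stable when rows share a created_at.
    """
    return (
        "SELECT * FROM user_logs "
        "WHERE user_id = %s AND (created_at, id) < (%s, %s) "
        "ORDER BY created_at DESC, id DESC "
        "LIMIT {}".format(page_size)
    )
```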

Case 2 – Replacing a sub‑query with a JOIN

-- Slow sub‑query
SELECT * FROM orders o
WHERE o.user_id IN (SELECT u.id FROM users u WHERE u.level = 'VIP');

-- Faster JOIN
SELECT o.* FROM orders o
INNER JOIN users u ON o.user_id = u.id
WHERE u.level = 'VIP';

2.4 Slow‑Query Log Analysis

# Enable slow‑query logging
slow_query_log = ON
slow_query_log_file = /var/log/mysql/slow.log
long_query_time = 1
log_queries_not_using_indexes = ON

Use pt-query-digest from Percona Toolkit to generate a report:

# Install Percona Toolkit
sudo apt-get install percona-toolkit

# Analyze the slow‑query log
pt-query-digest /var/log/mysql/slow.log > slow_query_report.txt

Part 3 – Advanced Tuning Techniques

3.1 Read‑Write Splitting (Python example)

# Python example using PyMySQL for read‑write splitting
import pymysql, random

class MySQLPool:
    def __init__(self):
        # Master (write)
        self.master = {
            'host': '192.168.1.10',
            'user': 'root',
            'password': 'password',
            'database': 'mydb'
        }
        # Slaves (read)
        self.slaves = [
            {'host': '192.168.1.11', 'user': 'root', 'password': 'password', 'database': 'mydb'},
            {'host': '192.168.1.12', 'user': 'root', 'password': 'password', 'database': 'mydb'}
        ]
    def get_read_connection(self):
        slave_cfg = random.choice(self.slaves)
        return pymysql.connect(**slave_cfg)
    def get_write_connection(self):
        return pymysql.connect(**self.master)

3.2 Monitoring Script

#!/bin/bash
# MySQL performance monitoring script
mysql_status() {
    mysql -e "SHOW GLOBAL STATUS;" | grep -E "(Connections|Questions|Threads_running|Slow_queries)"
}
innodb_status() {
    mysql -e "SHOW ENGINE INNODB STATUS\G" | grep -A 10 "BUFFER POOL AND MEMORY"
}
current_queries() {
    mysql -e "SELECT * FROM information_schema.processlist WHERE COMMAND != 'Sleep' ORDER BY TIME DESC;"
}

echo "=== MySQL Performance Monitor ==="
echo "1. Connection Status:"
mysql_status

echo "2. InnoDB Buffer Pool:"
innodb_status

echo "3. Current Running Queries:"
current_queries

3.3 Sharding Strategy

Horizontal sharding examples:

-- Time-based sharding (one table per month)
CREATE TABLE user_logs_202501 LIKE user_logs;
CREATE TABLE user_logs_202502 LIKE user_logs;

-- Hash-based sharding (table chosen by user_id modulo)
CREATE TABLE user_data_0 LIKE user_data;
CREATE TABLE user_data_1 LIKE user_data;

# Routing logic (Python)
def get_table_name(user_id, table_count=10):
    return f"user_data_{user_id % table_count}"
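The monthly tables shown earlier need routing too; a companion helper (name illustrative) picks the table from the event timestamp:

```python
from datetime import datetime

def get_monthly_table(base: str, when: datetime) -> str:
    """Route to a per-month shard table, e.g. user_logs_202501."""
    return f"{base}_{when:%Y%m}"

get_monthly_table("user_logs", datetime(2025, 1, 15))  # 'user_logs_202501'
```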

Part 4 – Production Experience

4.1 MySQL Fault‑Isolation Workflow

# 1. Check MySQL service status
systemctl status mysql

# 2. View error log
tail -f /var/log/mysql/error.log

# 3. Verify disk space
df -h

# 4. Inspect current connections
mysql -e "SHOW PROCESSLIST;"

# 5. Analyze long‑running queries
mysql -e "SELECT * FROM information_schema.processlist WHERE TIME > 10;"

4.2 Backup & Recovery Best Practices

# Hot backup script (logical backup)
DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="/backup/mysql"
DB_NAME="production_db"

mysqldump --single-transaction --routines --triggers \
    --master-data=2 --databases $DB_NAME \
    | gzip > $BACKUP_DIR/${DB_NAME}_${DATE}.sql.gz
# Note: --master-data was renamed --source-data in MySQL 8.0.26+

# Physical backup with Percona XtraBackup (recommended)
xtrabackup --backup --target-dir=$BACKUP_DIR/full_${DATE}

4.3 High‑Availability Architecture (Master‑Slave Replication)

Master configuration:

[mysqld]
server-id = 1
log-bin = mysql-bin
binlog-format = ROW
gtid-mode = ON
enforce-gtid-consistency = ON

Slave configuration:

[mysqld]
server-id = 2
relay-log = relay-bin
read-only = 1
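On the replica, health boils down to a few fields of SHOW REPLICA STATUS (SHOW SLAVE STATUS before MySQL 8.0.22): both replication threads running and lag under budget. A sketch of the decision logic, taking the status row as a dict keyed by the field names MySQL reports:

```python
def replication_healthy(status: dict, max_lag_seconds: int = 5) -> bool:
    """Judge a replica from its SHOW REPLICA STATUS row.

    Seconds_Behind_Source is NULL (None) when the SQL thread is
    stopped or the replica is still connecting -- treat that as bad.
    """
    if status.get("Replica_IO_Running") != "Yes":
        return False
    if status.get("Replica_SQL_Running") != "Yes":
        return False
    lag = status.get("Seconds_Behind_Source")
    return lag is not None and lag <= max_lag_seconds

healthy = {"Replica_IO_Running": "Yes", "Replica_SQL_Running": "Yes",
           "Seconds_Behind_Source": 1}
replication_healthy(healthy)  # True
```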

Part 5 – Performance Tuning Toolbox

5.1 Essential Monitoring Tools

Prometheus + Grafana configuration snippet:

# prometheus.yml
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'mysql'
    static_configs:
      - targets: ['localhost:9104']

pt-stalk for automatic diagnostics:

# Collect diagnostics when MySQL performance degrades
pt-stalk --function=processlist --variable=Threads_running \
    --threshold=25 --match-command=Query \
    --collect-oprofile --collect-strace --collect-tcpdump

5.2 Load‑Testing Tools

sysbench example:

# Prepare test data
sysbench oltp_read_write --table-size=1000000 \
    --mysql-host=localhost --mysql-user=root \
    --mysql-password=password --mysql-db=testdb prepare

# Run the benchmark
sysbench oltp_read_write --table-size=1000000 \
    --mysql-host=localhost --mysql-user=root \
    --mysql-password=password --mysql-db=testdb \
    --threads=16 --time=300 run

Conclusion – Core Mindset for MySQL Tuning

Monitor first: Build a complete monitoring system to detect issues early.

Configuration as foundation: Proper parameter settings are the base of performance.

Indexes are king: Good index design solves the majority of speed problems.

Architecture matters: Read-write splitting and sharding address high-concurrency demands.

Continuous improvement: Performance tuning is an ongoing process.

Written by Raymond Ops

Linux ops automation, cloud-native, Kubernetes, SRE, DevOps, Python, Golang and related tech discussions.