Databases 30 min read

Mastering MySQL: History, Pitfalls, Best Practices, and Advanced Ops

This comprehensive guide covers MySQL's evolution, its strengths and limitations, detailed development and operational standards, backup and recovery strategies, replication techniques, performance tuning, and practical Q&A for database administrators seeking to optimize large‑scale MySQL deployments.

21CTO

Aug 10, 2015

Mastering MySQL: History, Pitfalls, Best Practices, and Advanced Ops

Preface

MySQL is widely known, and after Alibaba’s "去IOE" movement, MySQL has attracted increasing attention.

MySQL History

1979: Monty Widenius wrote the first version, released 1.0 in 1996.

1995‑2000: MySQL AB founded, introduced BDB.

April 2000: Integrated MyISAM and replication.

2001: Heikki Tuuri suggested integrating InnoDB.

2003: Released 5.0 with views, stored procedures, etc.

2008: MySQL AB acquired by Sun, 5.1 released in 2009.

April 2009: Oracle acquired Sun; 5.5 released Dec 2010.

Feb 2013: 5.6 GA released, 5.7 in development.

Advantages of MySQL

Easy to use

Open‑source and free

Good scalability at certain stages

Active community

Performance meets most internet storage and latency requirements when supported by proper hardware

These factors explain why many companies choose MySQL, although it also has numerous limitations.

Problems of MySQL

Optimizer struggles with complex SQL

Poor SQL‑standard compliance

Large‑scale clustering solutions are immature, especially middleware

Global auto‑increment ID generation issues

Asynchronous logical replication poses data‑safety concerns

Online DDL support is limited

HA solutions are incomplete

Backup and recovery are complex and rely on external tools

Insufficient diagnostic information for users

Proliferation of forks makes selection difficult

DBAs must address many of these challenges during operations.

Database Development Standards

Development standards are a set of recommendations or rules defined by DBAs for internal projects.

Purpose of Standards

Ensure online schema consistency

Reduce the probability of errors

Facilitate automation

Provide a win‑win for developers and DBAs

Without standards, developers may write full‑table scans or bizarre SQL, leading to instability and constant firefighting for DBAs.

Basic Naming and Constraint Standards

Use UTF8 charset; for emojis use UTF8mb4 (MySQL 5.5.3+)

Storage engine: InnoDB

Prefer VARCHAR/VARBINARY for variable‑length strings

Do not store images or files in the database

Keep single‑table row count below 100 million

Avoid reserved words in database, table, column, and index names

Use lowercase letters and underscores for identifiers, making them self‑descriptive

Keep object names concise

Field Standards

Define all columns as NOT NULL unless NULL is truly needed

Choose the smallest suitable data type; use UNSIGNED for non‑negative integers

Store timestamps with TIMESTAMP type

Use VARCHAR for variable strings (note that the length parameter is character count, not bytes); store IPv4 addresses as UNSIGNED INT instead of CHAR(15)

Prefer DECIMAL for precise floating‑point numbers

Avoid BLOB/TEXT when possible

Why avoid NULL

It wastes storage because InnoDB needs an extra byte per NULL column.

Excessive NULL defaults can mislead the optimizer’s execution‑plan choice.

Since MySQL 5.6.4, the storage difference between DATETIME and TIMESTAMP has become negligible, and TIMESTAMP’s range is limited to 2038.

Index Standards

Limit a single index to no more than five columns; limit total indexes per table to five; follow B‑Tree left‑most prefix rule.

Choose high‑cardinality columns for indexes.

Design indexes to cover ~80% of main queries.

Index columns used in DML, ORDER BY, and GROUP BY.

Avoid implicit type conversion in indexes.

Avoid redundant indexes.

Indexes accelerate reads but add write overhead and lock contention, so excessive indexes should be avoided.

Redundant index example

idx_abc(a,b,c)

idx_a(a) – redundant

idx_ab(a,b) – redundant

Implicit conversion example

Column: remark VARCHAR(50) NOT NULL Query with integer literal causes full‑table scan because the value type does not match the column type.

SQL Class Standards

Avoid stored procedures, triggers, and functions when possible.

Avoid large‑table JOINs; MySQL’s optimizer handles them poorly.

Do not perform heavy calculations inside the database.

Combine multiple VALUES in DML to reduce round‑trips.

Implement efficient pagination, especially for large offsets.

Never use LIMIT with UPDATE/DELETE to prevent master‑slave inconsistency.

Database Operations Standards

Operation Content

SQL, DDL review and timing, especially Online DDL.

High‑risk operation checks; backup before DROP.

Permission control and auditing.

Log analysis (slow‑query and error logs).

High‑availability solutions.

Backup strategies.

Version Selection

MySQL Community Edition – largest user base.

MySQL Enterprise Edition – commercial.

Percona Server – many new features.

MariaDB – less popular domestically.

Recommended priority: Community > Percona > MariaDB > Enterprise.

When using RDS, the community edition is usually the default.

Online DDL Issues

Native MySQL DDL locks tables, blocking writes; large tables make DDL painful. Solutions include Facebook OSC and Percona’s pt‑online‑schema‑change.

Facebook OSC and Percona’s pt‑online‑schema‑change provide non‑blocking schema changes.

No write blockage.

Comprehensive condition checks and load‑delay controls.

Limitations of pt‑online‑schema‑change: longer alteration time, requires a unique or primary key, and limited concurrent modifications per port.

Availability

Seamless master switch using slave start until and a dual‑master setup, handling auto‑increment concerns.

MySQL Cluster Solutions

Organize MySQL instances into clusters.

Most deployments still rely on native MySQL replication.

Native master‑slave sync has performance and safety drawbacks.

MySQL Semi‑Synchronous Replication

Introduced in 5.5, it improves data safety by requiring ACK from at least one replica.

From MySQL 5.7 onward, multi‑replica semi‑sync is configurable.

Master‑Slave Delay Issues

Read‑write splitting often makes the replica busier than the master, causing lag.

Solutions

Identify the bottleneck.

If I/O bound, upgrade hardware (e.g., SSD).

If not I/O/CPU, consider parallel replication.

If still problematic, evaluate sharding.

InnoDB Optimization

InnoDB is a mature ACID‑compliant engine with MVCC, row‑level locking, and O_DIRECT support.

innodb_file_per_table=1

innodb_buffer_pool_size set according to data size and memory

innodb_flush_log_at_trx_commit=0/1/2

innodb_log_file_size larger values improve performance

innodb_page_size configurable

innodb_flush_method=O_DIRECT

innodb_io_capacity tuned for SSD

innodb_buffer_pool_dump and at_shutdown dump

Key InnoDB features: buffer‑pool warm‑up, dynamic resizing (5.7+), custom page size, compression, transportable tablespaces, full‑text, GIS, etc.

InnoDB on SSD

Increase innodb_write_io_threads and innodb_read_io_threads (5.5+).

Raise innodb_io_capacity.

Place redo logs on HDD, undo on SSD (optional).

Enable atomic writes, skip doublewrite buffer.

Leverage InnoDB compression.

Run multiple instances per host.

TokuDB

Transactional, MVCC‑enabled storage engine.

Fractal Tree Index – ideal for write‑heavy workloads.

High compression ratios and native Online DDL.

Supported by major forks; comparable to InnoDB.

Issues: limited official support, enterprise‑only hot‑backup, occasional bugs, and slower recovery for very large tables.

MySQL Optimization Cases

Query cache is largely obsolete due to design flaws.

InnoDB default isolation is REPEATABLE READ; switching to READ COMMITTED can improve concurrency.

Classic deadlock example illustrated.

About SSD

Flash storage has been the biggest performance driver for databases in the past decade; RAID5/10 configurations on SATA or PCIe SSDs are recommended.

Large Table Challenges

Example: a 60 billion‑row (≈1.2 TB) table with many indexes. Historical neglect of sharding caused this size; performance was still acceptable, but future growth demands partitioning.

DBA "Lazy" Experiments

Testing InnoDB limits (e.g., adding a column/index to a 1.2 TB table) is risky; using TokuDB with LZMA compression reduced a 1 TB+ table to ~80 GB, enabling online DDL.

Q&A

Q1: Does using USE schema; SELECT ... differ from SELECT ... FROM schema.table for replication? Impact on replication is minimal, but replication filters require careful configuration (e.g., ReplicateWildIgnoreTable).

Q2: SSD configuration recommendations? Use SATA SSDs in RAID5/50/10; see the accompanying diagram.

Q3: How to enforce database standards? Provide regular training and embed standards into automated review tools.

Q4: How to maximize InnoDB hit rate? Ensure sufficient memory for hot data, avoid full‑table scans, and keep queries cache‑friendly.

Q5: Should CAS operations read from the master? Yes, to avoid stale reads caused by replication lag.

Q6: Is a national standard necessary for DB development? National standards are academic; practical engineering may not benefit directly.

Q7: Can master‑slave clusters be further refined? Typically use one‑master‑multiple‑slaves per sub‑cluster, with monitoring, backup, and HA per sub‑cluster.

Q8: How to track changes to a specific column? Analyze row‑based binlogs with custom tools.

Q9: Advice on sharding large tables with middleware? Experience is limited; 360’s Atlas is a viable open‑source option.

Q10: Why does MySQL proxy not load‑balance under low load? Low traffic may not trigger noticeable load distribution.

Q11: Why recommend ROW binlog format over MIXED? ROW provides stronger data consistency; MIXED can be unpredictable.

Q12: Should read‑write splitting be done in code or via proxy? Implementing in application code offers better decoupling; choose proxy carefully.

Q13: When to use MySQL thread pool and best‑practice settings? Short‑lived connections benefit from thread pools; specific parameters depend on workload.

Q14: Data recovery steps after accidental deletion (including from replica)? Identify deletion time, restore from backup to a new instance, replay binlogs up to the desired point, then apply the recovered data.

Q15: Recommended backup method for large data sets? Physical hot backup with XtraBackup; logical backup is useful for single‑table restores.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Performance Optimization MySQL Database Administration backup and recovery

Written by

21CTO

21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.