MySQL Group Commit, DDL Fast Fail, GTID & Auto‑Increment Optimizations Explained
This article details MySQL performance enhancements—including Group Commit redesign, DDL fast‑fail lock handling, GTID allocation refinements, persistent auto‑increment storage, and replication parallelism improvements—illustrating the underlying mechanisms, implementation steps, and benchmark results that demonstrate measurable throughput gains.
Background
Since 2010 Taobao has been using MySQL at massive scale, replacing earlier IOE‑based core databases. The database team continuously improves MySQL performance, scalability, and reliability, focusing on single‑node enhancements, cluster extensions, and I/O storage research.
MySQL Performance Optimization – Group Commit
Prior to MySQL 5.6, Binlog and InnoDB commit were serialized, causing a bottleneck when sync_binlog=1. MariaDB introduced a Binlog Group Commit queue, and Oracle MySQL 5.6 adopted a three‑stage approach (flush, sync, commit) that allows concurrent execution.
When sync_binlog is set to 1000, it now means 1000 transaction groups rather than 1000 individual transactions.
XA Recover – MySQL uses MySQL_BIN_LOG as the XA coordinator, scanning the last Binlog file for XIDs and reconciling with InnoDB’s prepared transaction list.
Optimization – By delaying redo‑log writes until after Group Commit’s first stage, the implementation reduces log_sys->mutex contention. The change was merged into MySQL 5.7.6.
Performance Data
并发线程 原生 修改后
32 25600 27000
64 30000 35000
128 33000 39000
256 29800 38000MySQL New Feature – DDL Fast Fail
To avoid full‑system blockage when a DDL statement cannot acquire an exclusive MDL lock, the RDS branch adds a NO_WAIT/WAIT 1 syntax.
alter table test.t1 no_wait/wait 1 add extra int;
If the lock cannot be obtained, the client receives
ERROR: Lock wait timeout exceeded; try restarting transaction. The feature also applies to RENAME, TRUNCATE, DROP, OPTIMIZE, and CREATE INDEX statements.
Compared with Oracle 10g’s ORA‑00054 error, MySQL’s fast‑fail avoids the avalanche effect caused by long‑running MDL locks.
MySQL Performance Optimization – GTID Allocation
GTID (global transaction ID) became available in MySQL 5.6. Automatic allocation stores a placeholder during the flush stage and assigns the final GTID at commit, updating gtid_owned and gtid_executed sets.
Explicit allocation via session.gtid_next directly inserts a chosen GTID into gtid_owned before the transaction runs.
Problem – Maintaining gtid_owned after commit introduces a mutex that degrades TPS under high concurrency.
Solution – Skip the gtid_owned set for automatic allocation; directly add the GTID to gtid_executed, eliminating the lock.
Performance Data
并发线程 原生 修改后
32 24500 25000
64 27900 29000
128 30800 31500
256 29700 32000
512 29300 31700
1024 27000 31000MySQL Bug Fix – InnoDB Auto‑Increment Duplicate Values
After a server restart, InnoDB may reuse auto‑increment values because the in‑memory counter is lost and MySQL recomputes MAX(id)+1. This can cause primary‑key conflicts when historical tables retain the old values.
MyISAM stores the counter persistently in the .MYI file, avoiding the issue.
Fix – Persist the auto‑increment maximum in the clustered index root page’s trx_id field, writing a tiny redo log entry each time the value changes. A new system variable innodb_autoinc_persistent controls this behavior.
Performance Impact
set global innodb_autoinc_persistent=off;
TPS: 22199
set global innodb_autoinc_persistent=on;
TPS: 22003
-- less than 1% overheadAn additional variable innodb_autoinc_persistent_interval allows batching persistence (e.g., every 100 inserts).
MySQL Replication Performance Improvements
From MySQL 5.5 to 5.7, replication evolved from a single‑threaded SQL thread to multi‑threaded parallel replication and finally to a sequence‑number based parallelism in 5.7, dramatically reducing lag.
5.5
Two threads: io_thread pulls binlog to relay log; sql_thread replays events sequentially, causing bottlenecks under high load.
5.6
Introduced a coordinator thread that distributes events to multiple worker threads based on database name, allowing parallel execution for different databases.
sql_thread:
exec_relay_log_event
apply_event_and_update_pos
apply_event
rows_log_event::apply_event
storage_engine operation
update_pos5.7
Added a second stage in ordered_commit that tags a batch of commits with the same sequence number, enabling truly parallel execution of transactions sharing the same timestamp, regardless of database.
Overall, parallel replication in 5.6 and 5.7 improves throughput but introduces new bugs (memory leaks, relay‑log issues, transaction loss) that have been tracked and fixed.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
21CTO
21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
