Deep Dive into MySQL 8.0.33 Two‑Phase Commit: Source‑Code Analysis
This article provides a detailed source‑code walkthrough of MySQL 8.0.33’s two‑phase commit process, explaining the Prepare and Commit stages, internal structures such as binlog queues, GTID generation, redo‑log flushing, sync handling, and the interactions between InnoDB and the binary log.
MySQL's commit command triggers a two‑phase commit (Prepare and Commit). This article analyzes the implementation in MySQL 8.0.33, showing the complete logic flow and the underlying source code.
Overall Logic
The process consists of a Prepare phase followed by a Commit phase, each divided into several sub‑stages that coordinate binlog handling, GTID generation, redo‑log flushing, and synchronization.
Prepare Phase
1. Binlog Prepare
Retrieves the largest committed sequence number (a logical timestamp, not wall‑clock time) left by previously committed transactions.
2. InnoDB Prepare
Set the transaction state to prepared.
Release gap locks if the isolation level is READ COMMITTED (RC) or lower.
Change the undo‑log segment state from TRX_UNDO_ACTIVE to TRX_UNDO_PREPARED.
Write the transaction XID into the undo log.
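The InnoDB prepare steps listed above amount to a small state transition. The following is a minimal sketch, assuming hypothetical simplified types (Trx, UndoSegment) and a hypothetical function innodb_prepare_sketch; none of these are InnoDB's real structures, and the steps are applied in the order this article lists them.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-ins for InnoDB's internal structures.
enum class TrxState { Active, Prepared };
enum class UndoState { Active, Prepared };  // mirrors TRX_UNDO_ACTIVE / TRX_UNDO_PREPARED
enum class Isolation { ReadUncommitted, ReadCommitted, RepeatableRead, Serializable };

struct UndoSegment {
  UndoState state = UndoState::Active;
  bool xid_written = false;  // set once the XID is stored in the undo log
  std::uint64_t xid = 0;
};

struct Trx {
  TrxState state = TrxState::Active;
  Isolation isolation = Isolation::RepeatableRead;
  bool holds_gap_locks = true;
  UndoSegment undo;
};

// Apply the four prepare steps described in the text.
inline void innodb_prepare_sketch(Trx &trx, std::uint64_t xid) {
  trx.state = TrxState::Prepared;
  if (trx.isolation <= Isolation::ReadCommitted) {
    trx.holds_gap_locks = false;  // RC and lower give up gap locks at prepare
  }
  trx.undo.state = UndoState::Prepared;
  trx.undo.xid = xid;
  trx.undo.xid_written = true;
}
```

With a REPEATABLE READ transaction the sketch leaves holds_gap_locks untouched, which is the only branch that depends on the isolation level.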
Commit Phase
1. Stage 0
Ensures commit order for replica instances.
2. Flush Stage
Flush redo logs according to innodb_flush_log_at_trx_commit.
Clear the BINLOG_FLUSH_STAGE and COMMIT_ORDER_FLUSH_STAGE queues.
Write prepare‑state redo logs based on the same flush parameter.
Generate the GTID via get_server_sidno() and Gtid_state::get_automatic_gno().
Flush binlog caches (stmt, trx) and update last_committed, sequence_number, and GTID log events.
Register the flushed binlog file/position with semi‑synchronous replication plugins.
If sync_binlog != 1, update the binlog position and broadcast an update signal.
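The redo‑log steps in this stage depend on innodb_flush_log_at_trx_commit. As a rough illustration, the documented meaning of its three values can be written as a lookup; RedoCommitAction and redo_action_at_commit are hypothetical names used only for this sketch and do not appear in the MySQL source.

```cpp
// What happens to the redo log at commit time (hypothetical helper type).
struct RedoCommitAction {
  bool write_to_os;    // log buffer is written to the OS file cache at commit
  bool flush_to_disk;  // redo log is flushed (fsync) to disk at commit
};

// Documented behaviour of innodb_flush_log_at_trx_commit values 0, 1 and 2.
// Any other value is treated like the default (1) in this sketch.
inline RedoCommitAction redo_action_at_commit(int innodb_flush_log_at_trx_commit) {
  switch (innodb_flush_log_at_trx_commit) {
    case 0:
      return {false, false};  // written and flushed roughly once per second
    case 2:
      return {true, false};   // written at commit, flushed roughly once per second
    default:
      return {true, true};    // 1: written and flushed at every commit
  }
}
```

Only the value 1 makes every commit durable in the redo log before the flush stage continues; 0 and 2 trade that guarantee for throughput.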
3. Sync Stage
Wait according to sync_binlog and invoke fsync() to persist the binlog file.
If sync_binlog == 1, update the binlog position and broadcast the update signal.
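The interplay between sync_binlog, fsync(), and the point at which the binlog position is published can be modelled with a small counter. This is a sketch under stated assumptions: BinlogSyncPolicy and its methods are hypothetical, and the counter simply follows the documented rule that sync_binlog = 0 never forces a sync and sync_binlog = N syncs once every N commit groups.

```cpp
// Hypothetical model of the sync_binlog decision, not the server's own class.
class BinlogSyncPolicy {
 public:
  explicit BinlogSyncPolicy(unsigned sync_binlog) : sync_period_(sync_binlog) {}

  // Call once per commit group; returns true when fsync() should run.
  bool should_fsync() {
    if (sync_period_ == 0) return false;  // 0: flushing is left to the OS
    if (++counter_ >= sync_period_) {
      counter_ = 0;  // start counting the next batch of groups
      return true;
    }
    return false;
  }

  // The flush stage publishes the position unless sync_binlog == 1,
  // in which case publication waits until after the sync stage.
  bool publish_in_flush_stage() const { return sync_period_ != 1; }
  bool publish_in_sync_stage() const { return sync_period_ == 1; }

 private:
  unsigned sync_period_;
  unsigned counter_ = 0;
};
```

Deferring publication when sync_binlog == 1 means readers of the binlog only see events that have already been persisted.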
4. Commit Stage
Execute after_sync hook (for semi‑sync replication).
Update the global m_max_committed_transaction (it only ever moves forward) and reset the transaction’s sequence number to its uninitialized value, since it is no longer needed.
Binlog layer commit (no‑op).
Storage‑engine commit, which includes:
  Allocate update undo segment for persistent GTID.
  Update table update_time in the data dictionary.
  Allocate mini‑transaction handle and buffer.
  Set undo state (e.g., TRX_UNDO_TO_FREE for inserts, TRX_UNDO_TO_PURGE for updates).
  Append undo‑log header to the history list.
  Record binlog position in the system‑transaction table.
  Close MVCC read view, persist GTID, release undo logs, and wake background threads (master, purge, page cleaner).
Update the executed GTID group.
Decrease the prepared XID counter after engine commit.
Broadcast m_stage_cond_binlog to wake waiting followers.
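The last_committed and sequence_number values written in the flush stage, and the m_max_committed_transaction update in the commit stage, behave like a logical clock. The sketch below illustrates the idea only; CommitClockSketch and its method names are hypothetical, and the server uses its own logical‑clock class rather than this code.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical logical clock: one counter hands out sequence numbers,
// another remembers the largest sequence number that has committed.
class CommitClockSketch {
 public:
  // Assign the next sequence number (the first one is 1).
  std::int64_t next_sequence_number() { return counter_.fetch_add(1) + 1; }

  // Value a newly preparing transaction would record as last_committed.
  std::int64_t max_committed() const { return max_committed_.load(); }

  // Raise max_committed to seq, but never lower it.
  void update_max_committed(std::int64_t seq) {
    std::int64_t cur = max_committed_.load();
    while (cur < seq && !max_committed_.compare_exchange_weak(cur, seq)) {
      // compare_exchange_weak reloaded cur; loop while it is still smaller.
    }
  }

 private:
  std::atomic<std::int64_t> counter_{0};
  std::atomic<std::int64_t> max_committed_{0};
};
```

Because the maximum never moves backwards, two transactions that record the same last_committed value did not depend on each other's commit, which is what lets a replica apply them in parallel.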
Stage Transition Logic
The function change_stage (which calls enroll_for) manages the movement of threads between stages. The first thread in a queue becomes the leader; subsequent threads become followers and wait on condition variables until the leader finishes its work. Special handling ensures that the binlog flush leader and the commit‑order leader cooperate without deadlock.
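The leader election rule described here (first thread to enter an empty queue leads, the rest follow) can be sketched in a few lines. StageQueueSketch is a hypothetical class written for illustration; it covers only the enrollment and the leader taking over the whole group, and leaves out the condition‑variable wait that real followers perform.

```cpp
#include <mutex>
#include <vector>

// Hypothetical stage queue: sessions are identified by plain integers.
class StageQueueSketch {
 public:
  // Returns true if the caller found the queue empty and so becomes leader.
  bool enroll(int session_id) {
    std::lock_guard<std::mutex> guard(mutex_);
    const bool is_leader = queue_.empty();
    queue_.push_back(session_id);
    return is_leader;
  }

  // The leader takes the whole group and leaves the queue empty,
  // so the next session to enroll leads the following group.
  std::vector<int> fetch_and_empty() {
    std::lock_guard<std::mutex> guard(mutex_);
    std::vector<int> group;
    group.swap(queue_);
    return group;
  }

 private:
  std::mutex mutex_;
  std::vector<int> queue_;
};
```

A leader would process every session returned by fetch_and_empty() on the followers' behalf and then signal them, which is how one fsync() can cover many transactions.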
Key Code Snippets
int ha_commit_trans(THD *thd, bool all, bool ignore_global_read_lock) {
  // ...
  // Prepare phase
  if (!trn_ctx->no_2pc(trx_scope) && (trn_ctx->rw_ha_count(trx_scope) > 1))
    error = tc_log->prepare(thd, all);
  // Commit phase
  if (error || (error = tc_log->commit(thd, all))) {
    ha_rollback_trans(thd, all);
    error = 1;
    goto end;
  }
  // ...
}

int MYSQL_BIN_LOG::ordered_commit(THD *thd, bool all, bool skip_commit) {
  // Stage #0: preserve commit order for replicas
  if (Commit_order_manager::wait_for_its_turn_before_flush_stage(thd) ||
      ending_trans(thd, all) ||
      Commit_order_manager::get_rollback_status(thd)) {
    if (Commit_order_manager::wait(thd)) {
      return thd->commit_error;
    }
  }
  // Stage #1: flush to binary log
  if (change_stage(thd, Commit_stage_manager::BINLOG_FLUSH_STAGE, thd, nullptr,
                   &LOCK_log)) {
    return finish_commit(thd);
  }
  // ... flush logic ...
  // Stage #2: sync to disk
  if (change_stage(thd, Commit_stage_manager::SYNC_STAGE, wait_queue,
                   &LOCK_log, &LOCK_sync)) {
    return finish_commit(thd);
  }
  // Stage #3: commit all transactions in order
  if (change_stage(thd, Commit_stage_manager::COMMIT_STAGE, final_queue,
                   &LOCK_sync, &LOCK_commit)) {
    return finish_commit(thd);
  }
  // ... commit logic ...
  return thd->commit_error == THD::CE_COMMIT_ERROR;
}