Understanding MySQL Online DDL: Process, Row Log Management, and Log Replay
This article explains MySQL's online DDL mechanism, detailing the copy and inplace methods, the three execution phases, the allocation and structure of row_log_t, how log entries are written during DDL, and the replay process that applies those logs to the rebuilt table.
MySQL supports online DDL through two approaches: copy and inplace. Inplace DDL is further divided into rebuild (requiring table reorganization, e.g., adding indexes or columns) and no‑rebuild (metadata‑only changes). The online DDL workflow consists of three major stages: prepare, execution, and commit, with the rebuild path adding an extra execution phase.
Prepare stage: MySQL creates a temporary .frm file, acquires an EXCLUSIVE‑MDL lock, determines the execution method (copy, online‑rebuild, or online‑norebuild), updates in‑memory data‑dictionary objects, allocates a row_log object for incremental changes, and generates a temporary .ibd file.
Execution stage: The EXCLUSIVE‑MDL lock is downgraded to allow reads/writes, the old table’s clustered index is scanned, new clustered and secondary indexes are built record by record, index entries are constructed and inserted into a sort buffer, and the buffer is flushed into the new indexes. For rebuild operations, any accumulated increments are also processed.
Commit stage: The lock is upgraded back to EXCLUSIVE‑MDL, the remaining row‑log increments are reapplied, the InnoDB data‑dictionary is updated, the transaction is committed (redo log flushed), statistics are refreshed, temporary files are renamed, and the operation completes.
The row log is managed by struct row_log_t, which is part of struct dict_index_t:
struct dict_index_t{
...
row_log_t* online_log; /*!< the log of modifications during online index creation; valid when online_status is ONLINE_INDEX_CREATION */
...
}Allocation of a row_log_t occurs when a rebuild is needed for the clustered index or when a new secondary index (excluding full‑text) is created. The allocation call stack is:
mysql_alter_table → handler::ha_prepare_inplace_alter_alter_table → ha_innobase::prepare_inplace_alter_table → prepare_inplace_alter_table_dict → row_log_allocateThe row_log_t structure is defined as:
struct row_log_t {
int fd; /*!< file descriptor */
ib_mutex_t mutex; /*!< protects error, max_trx and tail */
page_no_map* blobs; /*!< map of off‑page column pages freed during rebuild */
dict_table_t* table; /*!< table being rebuilt or NULL for secondary index */
bool same_pk; /*!< whether PRIMARY KEY definition unchanged */
const dtuple_t* add_cols; /*!< default values for added columns */
const ulint* col_map; /*!< mapping old column IDs to new IDs */
dberr_t error; /*!< error during online rebuild */
trx_id_t max_trx; /*!< largest trx_id observed */
row_log_buf_t tail; /*!< writer context */
row_log_buf_t head; /*!< reader context */
ulint n_old_col; /*!< number of non‑virtual columns in old table */
ulint n_old_vcol; /*!< number of virtual columns in old table */
const char* path; /*!< temporary file path for log data */
};The writer buffer row_log_buf_t used by tail is:
struct row_log_buf_t {
byte* block; /*!< file block buffer */
ut_new_pfx_t block_pfx; /*!< opaque descriptor for block */
mrec_buf_t buf; /*!< buffer for records spanning two blocks */
ulint blocks; /*!< current block count */
ulint bytes; /*!< current byte offset within block */
ulonglong total; /*!< logical byte position from start of log */
};During DDL, modifications (INSERT, UPDATE, DELETE) are recorded via the functions:
row_log_table_insert();
row_log_table_update();
row_log_table_delete();Writing a log entry involves calculating its length, calling row_log_table_open() to obtain a write position, constructing the record, and finally invoking row_log_table_close(). The open/close functions manage the row_log_buf_t buffers, allocate temporary files when needed, and update the total byte count.
Log replay occurs twice: once during the DDL execution phase and again in the commit phase. The head buffer provides a read‑only context that never overtakes tail. The replay routine row_log_table_apply() acquires an X‑lock on the index, then calls row_log_table_apply_ops(), which iterates over blocks and records, handling cross‑block records with the auxiliary buf. Individual records are applied by row_log_table_apply_op(), which dispatches to row_log_table_apply_insert(), row_log_table_apply_delete(), or row_log_table_apply_update() based on the operation type.
In summary, the article outlines the overall flow of MySQL online DDL, the internal row‑log structures, how incremental changes are captured during the rebuild, and the mechanisms that safely replay those changes without interfering with concurrent writes.
Tencent Database Technology
Tencent's Database R&D team supports internal services such as WeChat Pay, WeChat Red Packets, Tencent Advertising, and Tencent Music, and provides external support on Tencent Cloud for TencentDB products like CynosDB, CDB, and TDSQL. This public account aims to promote and share professional database knowledge, growing together with database enthusiasts.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
