Databases 18 min read

Understanding MySQL Row‑Based Replication: RelayLog Replay, binlog_row_image, and slave_rows_search_algorithms

This article analyzes MySQL data‑update loss caused by writes on master‑slave pairs, explains BEFORE and AFTER IMAGE handling in row‑based binlogs, examines the binlog_row_image and slave_rows_search_algorithms parameters, and details the implementation of hash, index, and table scans during slave replay.

Aikesheng Open Source Community
Aikesheng Open Source Community
Aikesheng Open Source Community
Understanding MySQL Row‑Based Replication: RelayLog Replay, binlog_row_image, and slave_rows_search_algorithms

1. Problem origin : Users reported data‑update loss when writing to two mutually replicated MySQL servers; analysis showed that writes on both master and slave caused inconsistent data after the slave synchronized back to the master.

2. Row format RelayLog replay : For DML statements MySQL records BEFORE IMAGE (the row before change) and AFTER IMAGE (the row after change). The binlog_row_image variable controls which columns are logged: full (all columns), minimal (only key columns and changed columns), and noblob (all columns except unchanged BLOB/TEXT).

3. slave_rows_search_algorithms : Determines how the slave searches for rows during replay. Available algorithms are INDEX_SCAN, TABLE_SCAN, and HASH_SCAN, which can be combined (e.g., INDEX_SCAN,TABLE_SCAN is the default). The choice affects latency, especially when tables lack primary or unique keys.

4. Source‑code analysis : The function Rows_log_event::decide_row_lookup_algorithm_and_key decides the lookup algorithm and the index to use. Relevant excerpt:

9745 void
9746 Rows_log_event::decide_row_lookup_algorithm_and_key()
9747 {
... ...
9781   /* PK or UK => use LOOKUP_INDEX_SCAN */
9782   this->m_key_index= search_key_in_table(table, cols, (PRI_KEY_FLAG | UNIQUE_KEY_FLAG));
9783   if (this->m_key_index != MAX_KEY)
9784   {
9785     DBUG_PRINT("info", ("decide_row_lookup_algorithm_and_key: decided - INDEX_SCAN"));
9786     this->m_rows_lookup_algorithm= ROW_LOOKUP_INDEX_SCAN;
9787     goto end;
9788   }
... ...
9790 TABLE_OR_INDEX_HASH_SCAN:
... ...
9808 TABLE_OR_INDEX_FULL_SCAN:
... ...
9827 end:
... ...
}

5. Replay functions : do_hash_scan_and_update: Implements hash‑scan lookup, optionally using a secondary index (hash‑scan‑over‑index). It builds a hash table from the BEFORE IMAGE and matches rows during replay. do_index_scan_and_update: Uses primary/unique key or secondary index to locate rows; primary/unique key lookup skips column‑by‑column comparison, while secondary index lookup compares all columns. do_table_scan_and_update: Performs a full table scan when no usable index exists, comparing each row with the BEFORE IMAGE.

6. Key findings :

When a primary or unique key exists, the slave replays the binlog without comparing all columns, so data inconsistency on the slave can be silently overwritten.

Without primary/unique keys, hash‑scan (and hash‑scan‑over‑index) is theoretically more efficient (O(1) look‑ups) than table or index scans (O(n²) comparisons).

The slave selects the first index whose columns are all present in the BEFORE IMAGE; this index may differ from the optimizer’s choice on the master, potentially causing larger replication lag.

7. Conclusion : The article clarifies why data loss occurs in certain replication scenarios, explains how MySQL parameters influence binlog size and replay performance, and provides practical guidance for tuning replication in environments lacking primary keys.

8. References (selected): MySQL documentation for binlog_row_image and slave_rows_search_algorithms, Percona blog on row‑based event performance, and several community articles analyzing slave lag and hash‑scan bugs.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

mysqlReplicationindexHashScanrow-based
Aikesheng Open Source Community
Written by

Aikesheng Open Source Community

The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.