Deep Dive into MySQL Seconds_Behind_Master Calculation in Dual‑Master Replication
This article investigates why the Seconds_Behind_Master metric fluctuates in a dual‑master MySQL setup, explains the underlying replication logic, reproduces the issue, and analyses the handling of ROTATE_EVENT and other binlog events in both parallel and non‑parallel replication modes using source code excerpts.
The author, a senior database systems engineer at NetEase Games, presents an original analysis of a dual‑master MySQL cluster where the Seconds_Behind_Master (SBM) value jumps from 0 to large numbers (e.g., 10000) despite normal operation.
Problem background: The cluster uses a two‑node active‑active (dual‑master) configuration with VIP failover. All reads and writes go to the primary node da, which replicates to the secondary node dp; because the setup is dual‑master, each node is also a replica of the other. Monitoring shows frequent SBM spikes.
Investigation conclusion: The SBM calculation in sql/rpl_slave.cc depends on the last_master_timestamp variable. When Exec_Master_Log_Pos is less than Read_Master_Log_Pos, the code enters the else branch and computes a time difference based on the last ROTATE_EVENT timestamp.
Reproduction steps:
1. On dp, run FLUSH LOGS;
2. On da, stop the slave SQL thread (STOP SLAVE SQL_THREAD);
3. Execute DML on da (its Read_Master_Log_Pos jumps ahead because the writes round‑trip through dp and are read back, and ignored, by da's IO thread);
4. Start the slave on da (START SLAVE) and watch SBM increase.
Analysis of ROTATE_EVENT handling:
When dp flushes logs, a ROTATE_EVENT is generated. The IO thread receives it ( handle_slave_io ) and queues it to the relay log. The relevant code (excerpted below) shows the event being logged:
```
handle_slave_io: info: IO thread received event of type Rotate
```

In sql/rpl_slave.cc (line 5819) the event is queued:

```cpp
event_buf = (const char*)mysql->net.read_pos + 1;
```

The SQL thread later processes the event via exec_relay_log_event. For non‑parallel replication, last_master_timestamp is set to the sum of the event header timestamp and exec_time:

```cpp
rli->last_master_timestamp = ev->common_header->when.tv_sec + (time_t)ev->exec_time;
```

In parallel replication the logic is more complex. The coordinator updates a low‑water mark (LWM) and uses reset_notified_checkpoint to set last_master_timestamp only when a real event (non‑artificial, with a non‑zero server_id) is processed:
```cpp
if (rli->is_parallel_exec()) {
  bool real_event = server_id && !is_artificial_event();
  rli->reset_notified_checkpoint(0,
      real_event ? common_header->when.tv_sec + (time_t)exec_time : 0,
      true, real_event);
}
```

For a ROTATE_EVENT in parallel mode, process_io_rotate updates the master info, and Rotate_log_event::do_update_pos calls reset_notified_checkpoint with the ROTATE timestamp.
Handling of binlog events with matching server_id:
The IO thread discards events whose server_id equals the local server_id (unless replicate_same_server_id is true). It still advances master_log_pos and signals the SQL thread:
```
queue_event: info: master_log_pos: 219, event originating from 236 server, ignored
```

The SQL thread, when encountering such ignored segments, generates a synthetic ROTATE_EVENT with server_id = 0 to advance positions, but this event does not update last_master_timestamp because real_event is false.
Summary of findings:
In parallel replication, last_master_timestamp reflects the time of the most recent ROTATE_EVENT; when Exec_Master_Log_Pos lags behind Read_Master_Log_Pos, SBM spikes to the difference between the current time and that ROTATE timestamp.
In non‑parallel replication, last_master_timestamp is reset to 0 in next_event once the SQL thread has applied everything in the relay log, so SBM reads 0 and does not exhibit the same spikes.
Conclusion: The observed SBM jumps are caused by the way MySQL updates last_master_timestamp during ROTATE_EVENT processing, and the behavior differs between parallel and non‑parallel replication modes.
Aikesheng Open Source Community