Why Did MySQL Slave SQL Thread Stop? Diagnosing Relay Log Corruption and Restoring Replication

This guide walks through a real-world MySQL master-slave replication failure caused by a corrupted relay log. It explains the replication mechanics, shows how to analyze the slave status, uses GTID to skip the affected transaction, and restores synchronization with step-by-step commands.

ITPUB

Symptom

The slave’s SQL thread stopped with the error

Relay log read failure: Could not parse relay log event entry

This means the SQL thread cannot read the next event from its local relay log, so it stops applying the changes received from the master.

Replication Architecture

MySQL replication uses two threads on the slave:

Slave_IO_Running : connects to the master, reads the master’s binlog, and writes the events to a local relay log.

Slave_SQL_Running : reads the relay log, parses each event into SQL statements, and executes them.

The master runs a dump thread that streams the binlog to the slave’s I/O thread.

Troubleshooting Steps

1. Analyse slave status

Run SHOW SLAVE STATUS\G and examine the key fields:

Master_Log_File : master binlog file the I/O thread is currently reading (e.g., mysql-bin.000956).

Read_Master_Log_Pos : position in that file up to which the I/O thread has read.

Relay_Log_File : relay-log file the SQL thread is currently reading on the slave.

Relay_Log_Pos : position within that relay log up to which the SQL thread has executed.

Relay_Master_Log_File : master binlog file containing the most recent event executed by the SQL thread.

Exec_Master_Log_Pos : position in that master binlog file up to which the SQL thread has executed.

Slave_IO_Running : Yes means the I/O thread is active.

Slave_SQL_Running : No indicates the SQL thread has stopped.

If Master_Log_File differs from Relay_Master_Log_File, or Read_Master_Log_Pos differs from Exec_Master_Log_Pos, the SQL thread has not yet applied everything the I/O thread has fetched, so the slave is behind the master.
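The comparison above can be scripted when the status output is captured as text. The following is a minimal Python sketch, not part of the original procedure: the helper names parse_slave_status and diagnose are hypothetical, and the positions in the sample are made up for illustration (only the file names come from this article).

```python
def parse_slave_status(text: str) -> dict:
    """Parse the vertical (\\G) output of SHOW SLAVE STATUS into a dict."""
    status = {}
    for line in text.splitlines():
        # Row separators such as "*** 1. row ***" carry no field.
        if ":" not in line or line.lstrip().startswith("*"):
            continue
        # Split on the first colon only, so error messages keep their colons.
        key, _, value = line.partition(":")
        status[key.strip()] = value.strip()
    return status


def diagnose(status: dict) -> list:
    """Return findings based on the thread states and log coordinates."""
    findings = []
    if status.get("Slave_IO_Running") != "Yes":
        findings.append("I/O thread is not running")
    if status.get("Slave_SQL_Running") != "Yes":
        findings.append("SQL thread is not running")
    same_file = status.get("Master_Log_File") == status.get("Relay_Master_Log_File")
    same_pos = status.get("Read_Master_Log_Pos") == status.get("Exec_Master_Log_Pos")
    if not (same_file and same_pos):
        findings.append("SQL thread is behind the I/O thread")
    return findings


# Hypothetical sample: the positions are invented for illustration.
SAMPLE = """\
*************************** 1. row ***************************
          Master_Log_File: mysql-bin.000956
      Read_Master_Log_Pos: 1000
           Relay_Log_File: relay-bin.000094
    Relay_Master_Log_File: mysql-bin.000955
      Exec_Master_Log_Pos: 500
         Slave_IO_Running: Yes
        Slave_SQL_Running: No
"""

if __name__ == "__main__":
    for finding in diagnose(parse_slave_status(SAMPLE)):
        print(finding)
```

The sketch only compares the fields discussed in this step; a real check would usually also read Last_SQL_Errno and Last_SQL_Error from the same output.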

2. Restart attempts

Two restart approaches were tried. Neither brought the SQL thread back, because the corrupted relay log was still on disk:

STOP SLAVE; followed by START SLAVE;

Restarting the MySQL instance or its container.

3. Inspect master binlog

On the master, run:

mysqlbinlog /var/lib/mysql/log/mysql-bin.000955

No parsing errors were reported.

4. Inspect slave relay log

On the slave, open the relay log at which replication stopped (relay-bin.000094):

mysqlbinlog /var/lib/mysql/log/relay-bin.000094

The output contains:

ERROR: Error in Log_event::read_log_event(): 'read error', data_len: 7644, event_type: 31
ERROR: Could not read entry at offset 243899899: Error in log format or read error.

This indicates the relay log file is corrupted.
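If several relay logs need checking, the mysqlbinlog output can be scanned for the offset line instead of being read by eye. This is a small Python sketch under the assumption that the error text has the same wording as the output shown above; the function name first_unreadable_offset is hypothetical, and other mysqlbinlog versions may phrase the message differently.

```python
import re

# Matches the offset line quoted in this article; the wording is assumed
# to be the same in the output being scanned.
OFFSET_RE = re.compile(
    r"^ERROR: Could not read entry at offset (\d+):", re.MULTILINE
)


def first_unreadable_offset(mysqlbinlog_output: str):
    """Return the byte offset of the first unreadable entry, or None."""
    match = OFFSET_RE.search(mysqlbinlog_output)
    return int(match.group(1)) if match else None


# The two error lines reported for relay-bin.000094 in this article.
SAMPLE_OUTPUT = (
    "ERROR: Error in Log_event::read_log_event(): 'read error', "
    "data_len: 7644, event_type: 31\n"
    "ERROR: Could not read entry at offset 243899899: "
    "Error in log format or read error.\n"
)

if __name__ == "__main__":
    print(first_unreadable_offset(SAMPLE_OUTPUT))
```

A return value of None only means this particular message was not found; it is not proof that the file is intact.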

5. GTID analysis

The offending binlog entry carries GTID c5d74746-d7ec-11ec-bf8f-0242ac110002:8634832. The slave’s Executed_Gtid_Set ends at 8634831, so 8634832 is the first transaction the slave failed to apply.

6. Root cause

The SQL thread stopped because it could not parse a corrupted relay‑log file.

GTID‑Based Replication Overview

In GTID mode the master computes the set difference between its GTID collection (x) and the slave’s collection (y) and streams only the missing transactions (x − y). The slave’s I/O thread writes those events to a new relay log, and the SQL thread executes them. This eliminates manual position handling.
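The set difference x − y can be illustrated outside the server with plain interval arithmetic. The sketch below is an illustration in Python, not how MySQL implements it; the function names are hypothetical, and the range 1-9101426 from this article is used as a stand-in for the master’s set. On a live server the built-in GTID_SUBTRACT() function performs the same calculation.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-5:8,uuid2:3' into {uuid: [(start, end), ...]} (inclusive)."""
    result = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *ranges = part.split(":")
        intervals = result.setdefault(uuid, [])
        for r in ranges:
            start, _, end = r.partition("-")
            intervals.append((int(start), int(end or start)))
    for intervals in result.values():
        intervals.sort()
    return result


def subtract_intervals(xs, ys):
    """Return the parts of the intervals in xs that ys does not cover."""
    out = []
    for start, end in xs:
        cur = start
        for y_start, y_end in ys:  # ys is sorted by start
            if y_end < cur or y_start > end:
                continue
            if y_start > cur:
                out.append((cur, y_start - 1))
            cur = max(cur, y_end + 1)
            if cur > end:
                break
        if cur <= end:
            out.append((cur, end))
    return out


def gtid_subtract(x_text, y_text):
    """Compute x - y for two GTID sets given as text; returns text."""
    x, y = parse_gtid_set(x_text), parse_gtid_set(y_text)
    parts = []
    for uuid in sorted(x):
        rest = subtract_intervals(x[uuid], y.get(uuid, []))
        if rest:
            ranges = ":".join(str(s) if s == e else f"{s}-{e}" for s, e in rest)
            parts.append(f"{uuid}:{ranges}")
    return ",".join(parts)


if __name__ == "__main__":
    uuid = "c5d74746-d7ec-11ec-bf8f-0242ac110002"
    # x: stand-in for the master's set, y: the slave's executed set.
    print(gtid_subtract(f"{uuid}:1-9101426", f"{uuid}:1-8634831"))
    # prints c5d74746-d7ec-11ec-bf8f-0242ac110002:8634832-9101426
```

Intervals are used instead of expanding every sequence number, since the sets in this incident span millions of transactions.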

Restoring Slave Synchronisation

1. Check GTID progress

Run SHOW SLAVE STATUS\G and note:

Retrieved_Gtid_Set : all GTIDs received by the slave (e.g., 1-9101426).

Executed_Gtid_Set : GTIDs already executed (e.g., 1-8634831).

The next GTID to apply is 8634832.

2. Skip the corrupted GTID

Reset the slave to discard the broken relay log and then inject the missing GTID as an empty transaction:

STOP SLAVE;
RESET SLAVE;
SET GTID_NEXT='c5d74746-d7ec-11ec-bf8f-0242ac110002:8634832';
BEGIN;
COMMIT;
SET GTID_NEXT=AUTOMATIC;
START SLAVE;

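The empty transaction adds GTID 8634832 to the slave’s executed set without replaying the corrupted event. The changes that transaction made on the master are therefore not applied on the slave, so check the affected rows afterwards and repair them if the data differs.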
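When the skip has to be repeated or reviewed before it is run, the statements can be generated from the GTID rather than typed by hand. The following Python sketch is an assumption-laden helper, not part of the original procedure: the name empty_transaction_statements is hypothetical, and it only renders the same statements listed above after checking that the GTID has the uuid:number shape.

```python
import re

# A single GTID: a UUID, a colon, and a positive transaction number.
GTID_RE = re.compile(
    r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}:[1-9]\d*$"
)


def empty_transaction_statements(gtid: str) -> list:
    """Render the statements that commit an empty transaction under gtid."""
    if not GTID_RE.match(gtid):
        raise ValueError(f"not a single GTID: {gtid!r}")
    return [
        f"SET GTID_NEXT='{gtid}';",
        "BEGIN;",
        "COMMIT;",
        "SET GTID_NEXT=AUTOMATIC;",
    ]


if __name__ == "__main__":
    gtid = "c5d74746-d7ec-11ec-bf8f-0242ac110002:8634832"
    print("\n".join(empty_transaction_statements(gtid)))
```

Rejecting anything that is not a single GTID guards against pasting a whole range such as 1-8634831 into SET GTID_NEXT, which accepts only one transaction identifier.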

3. Verify replication

After START SLAVE, both Slave_IO_Running and Slave_SQL_Running should show Yes. Monitor Seconds_Behind_Master until it reaches 0, which indicates the slave has caught up with the master.

Conclusion

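When the slave’s SQL thread stops with “Relay log read failure”, the usual cause is relay-log corruption. Verify the integrity of the master binlog and the relay log, stop and reset the slave to discard the damaged relay log, use GTID to skip the offending transaction, and restart replication. This restores normal master-slave synchronization. Because the skipped transaction is never applied on the slave, confirm afterwards that the data it touched matches the master.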

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
