Databases 7 min read

Why Does a DELETE in a Stored Procedure Trigger Massive MySQL Replication Lag?

A detailed case study shows how a DELETE statement executed inside a MySQL stored procedure can switch binlog format to ROW, causing billions of bitmap operations on a partitioned table and resulting in severe master‑slave replication delay, and presents three practical solutions.

dbaplus Community
dbaplus Community
dbaplus Community
Why Does a DELETE in a Stored Procedure Trigger Massive MySQL Replication Lag?

Root Cause of Replication Lag

A user observed a large master‑slave lag: show slave status\G reported a gap of over 60 binlog files. The Retrieved_Gtid_Set kept increasing while Executed_Gtid_Set stalled, pinpointing a massive transaction that deleted rows from table A.

The Relay_Log_File was relay-bin.000010 at position 95133771. The table involved was a large partitioned table with hundreds of columns, a primary key, and many partitions, yet the lag persisted.

Investigation revealed that the DELETE was executed inside a stored procedure. When run directly, the DELETE is logged in STATEMENT format and causes no lag. Inside the procedure, MySQL rewrites the statement, adds a NAME_CONST for the local variable B_DATE, and forces ROW binlog format, which dramatically slows replay.

Why ROW Binlog Is Slow for This Delete

The DELETE affects roughly 3 million rows across 7 200 partitions. During replay, the SQL thread repeatedly calls bitmap_get_next_set() for each partition, resulting in about 2.16 billion calls. Profiling with pstack and perf confirmed the bottleneck in this bitmap function.

Further code inspection showed that try_semi_consistent_read() invokes get_next_used_partition() for every partition, amplifying the cost.

Solutions

Force the session binlog_format=STATEMENT inside the stored procedure so the DELETE is logged as a statement, avoiding the ROW‑format bug.

Patch the MySQL kernel to fix the underlying issue.

Replace the stored‑procedure‑based deletion with an external shell script that runs the DELETE directly.

By applying the first solution—setting binlog_format=STATEMENT within the procedure—the replication lag was eliminated.

delete from xxxxx
where update_datetime < DATE_ADD(B_DATE,INTERVAL -1 day)
and DATE_FORMAT(update_datetime,'%i') not in ('00','05','10','15','20','25','30');
Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

databasemysqlBinlogReplicationStored Procedure
dbaplus Community
Written by

dbaplus Community

Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.