How a Failed Delete Timer Crashed a MySQL PXC Cluster and the Partition‑Based Fix
A large‑scale DELETE in a MySQL PXC cluster triggered a transaction rollback, table locks, and flow control, which slowed the service. The fix was to redesign the cleanup mechanism around partitioned tables, a stored procedure, and a scheduled event.
In performance‑monitoring workloads, massive data imports and rapid table‑space growth require regular cleanup of historical records to free disk space and maintain query efficiency. Simple timer‑based deletions work initially, but as data volume spikes, the timer can fail, holding long‑running table locks that block threads and trigger Galera flow control.
Problem Recap
In early 2018, the OP performance database (a three‑node PXC cluster) failed when a scheduled DELETE tried to remove two‑day‑old rows from perf_biz_vm. The statement targeted ~200 million rows, which exceeded the cluster's write‑set limit (1 GB). The transaction could not be replicated to the other nodes and was rolled back. While the rollback ran, it kept its X‑lock on the table, and a cascade of blocked threads (many in the System lock state) built up until flow control kicked in, slowing cloud‑monitor updates on the North China node.
Failure Mechanism
Galera replicates each transaction as a write‑set that is broadcast to all nodes at commit time. A large DELETE generates a massive write‑set; when its size surpasses the configured limit, the write‑set cannot be replicated and certified, so the transaction is rolled back on the originating node, and it keeps holding its locks while the rollback runs. The resulting queue buildup triggers the cluster's flow‑control mechanism, which pauses replication until the slow node's queue shrinks.
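A back-of-the-envelope check makes the size mismatch concrete. The sketch below is only an estimate: the 1 GB limit and the ~200 million rows come from the incident, while BYTES_PER_ROW is a hypothetical average that would need to be measured on a real table. In PXC this kind of limit is controlled by a variable such as wsrep_max_ws_size, though the article does not say which setting was involved.

```python
# Rough estimate of whether one DELETE fits into a single Galera write-set.
# BYTES_PER_ROW is a hypothetical average, NOT a value from the incident.

WS_LIMIT_BYTES = 1 * 1024**3   # 1 GB write-set limit cited in the article
BYTES_PER_ROW = 100            # assumption: write-set bytes per deleted row


def estimated_write_set_bytes(row_count, bytes_per_row=BYTES_PER_ROW):
    """Estimate the write-set size of a transaction touching row_count rows."""
    return row_count * bytes_per_row


def exceeds_limit(row_count, bytes_per_row=BYTES_PER_ROW, limit=WS_LIMIT_BYTES):
    """True if the estimated write-set would be larger than the limit."""
    return estimated_write_set_bytes(row_count, bytes_per_row) > limit


def max_rows_per_transaction(bytes_per_row=BYTES_PER_ROW, limit=WS_LIMIT_BYTES):
    """Largest row count whose estimated write-set still fits in the limit."""
    return limit // bytes_per_row


if __name__ == "__main__":
    rows = 200_000_000  # approximate row count from the incident
    print("estimated bytes:", estimated_write_set_bytes(rows))
    print("exceeds limit:", exceeds_limit(rows))
    print("max rows per transaction:", max_rows_per_transaction())
```

Under that assumed row size, the DELETE would be roughly twenty times over the limit, so no tuning of the timer itself would have made the single-statement approach safe.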
Contributing Factors
The DELETE consumed the entire InnoDB buffer pool (128 GB) and pushed memory usage over the 80 % alert threshold.
During flow control, the table’s X‑lock prevented new monitoring data from being written, further increasing queue length.
Rebuilding the Cleanup Mechanism
To avoid massive DELETEs, the team switched to a partition‑by‑range strategy on the CREATE_TIME column, allowing old partitions to be dropped (DDL) instead of row‑by‑row deletion (DML). The solution consists of:
Step 1: Create a new partitioned table perf_biz_vm_new with daily partitions and appropriate indexes.
Step 2: Rename the old table to perf_biz_vm_old and rename the new table to perf_biz_vm during a low‑traffic window.
Step 3: Copy recent data from the old table back into the new table using a Bash script that runs hourly.
Step 4: Deploy a stored procedure clean_partition that creates future partitions and drops partitions older than a configurable retention period.
Step 5: Create a MySQL event clean_perf_biz_vm that calls the stored procedure daily at 00:30.
Step 6: After data migration, drop the old table perf_biz_vm_old during a maintenance window, freeing ~150 GB of disk space.
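Step 3 can be illustrated with a small generator of copy statements. This is a sketch in Python rather than the team's Bash script, which the article does not show. It assumes the old and new tables have identical column order (so SELECT * is safe) and that copying in one-hour windows is acceptable; the window size is a hypothetical choice meant to keep each transaction, and therefore each write-set, small.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"


def hourly_windows(start, end):
    """Yield half-open (lo, hi) windows of at most one hour covering [start, end)."""
    lo = start
    while lo < end:
        hi = min(lo + timedelta(hours=1), end)
        yield lo, hi
        lo = hi


def copy_sql(lo, hi, src="perf_biz_vm_old", dst="perf_biz_vm", col="CREATE_TIME"):
    """Build one INSERT ... SELECT for a single time window.

    Assumes src and dst share the same column layout (hypothetical).
    """
    return (
        f"INSERT INTO {dst} SELECT * FROM {src} "
        f"WHERE {col} >= '{lo.strftime(FMT)}' AND {col} < '{hi.strftime(FMT)}'"
    )


if __name__ == "__main__":
    start = datetime(2018, 3, 10, 0, 0)
    end = datetime(2018, 3, 10, 2, 30)
    for lo, hi in hourly_windows(start, end):
        print(copy_sql(lo, hi) + ";")
```

Each generated statement would be run as its own transaction, so a failure in one window leaves the others unaffected and can be retried in isolation.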
The partition‑drop operation completes in minutes and, being DDL, does not produce the huge row‑level write‑set of a mass DELETE, which prevents replication bottlenecks and flow control.
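The article does not include the body of clean_partition, so the following is only a sketch of the decision it has to make: which daily partitions to add ahead of time and which to drop once they fall outside the retention period. It is written in Python to keep the logic easy to check. The partition naming scheme (pYYYYMMDD), the use of TO_DAYS on a DATETIME column, and the parameters retention_days and days_ahead are all assumptions for illustration.

```python
from datetime import date, timedelta

TABLE = "perf_biz_vm"


def partition_name(day):
    """Hypothetical naming scheme: one partition per day, e.g. p20180310."""
    return "p" + day.strftime("%Y%m%d")


def plan_partitions(existing, today, retention_days, days_ahead):
    """Return (to_add, to_drop) as sorted lists of dates.

    existing: dates that already have a partition.
    Partitions for today..today+days_ahead are created if missing;
    partitions older than today - retention_days are dropped.
    """
    existing_days = set(existing)
    wanted = [today + timedelta(days=i) for i in range(days_ahead + 1)]
    to_add = [d for d in wanted if d not in existing_days]
    cutoff = today - timedelta(days=retention_days)
    to_drop = sorted(d for d in existing_days if d < cutoff)
    return to_add, to_drop


def add_partition_sql(day):
    """ADD PARTITION for one day; the upper bound is the start of the next day.

    Assumes RANGE partitioning on TO_DAYS(CREATE_TIME) and no MAXVALUE
    partition, so new partitions can be appended at the end of the range.
    """
    upper = day + timedelta(days=1)
    return (
        f"ALTER TABLE {TABLE} ADD PARTITION "
        f"(PARTITION {partition_name(day)} "
        f"VALUES LESS THAN (TO_DAYS('{upper.isoformat()}')))"
    )


def drop_partition_sql(day):
    """DROP PARTITION for one day; removes that day's rows as a DDL operation."""
    return f"ALTER TABLE {TABLE} DROP PARTITION {partition_name(day)}"


if __name__ == "__main__":
    today = date(2018, 3, 10)
    existing = [date(2018, 3, 5) + timedelta(days=i) for i in range(6)]
    to_add, to_drop = plan_partitions(existing, today, retention_days=2, days_ahead=2)
    for d in to_add:
        print(add_partition_sql(d) + ";")
    for d in to_drop:
        print(drop_partition_sql(d) + ";")
```

Creating partitions a few days ahead is a deliberate margin: if the daily event fails once, incoming rows still have a partition to land in.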
Conclusion
The new architecture provides a safe, robust, and efficient cleanup process for large‑scale time‑series tables. While stored procedures simplify automation, they should remain simple; complex business logic belongs in the application layer. In scenarios where partitioning is unsuitable, table sharding may be considered.
Key takeaways include the importance of understanding Galera write‑set limits, avoiding massive DELETEs on large tables, and leveraging partitioning to achieve near‑instant data removal.
dbaplus Community
