Databases 10 min read

Using pt-online-schema-change to Quickly Clean Data and Resolve Disk‑Space Alerts in MySQL

This article explains how to handle MySQL disk‑space alerts by using the pt‑online‑schema‑change (pt‑osc) tool to selectively retain needed rows, migrate data to a temporary table, and drop the original table, thereby achieving fast, low‑impact data cleanup.

Aikesheng Open Source Community
Aikesheng Open Source Community
Aikesheng Open Source Community
Using pt-online-schema-change to Quickly Clean Data and Resolve Disk‑Space Alerts in MySQL

1 Preface

Disk space alert is a common situation for DBAs during MySQL operations. In a recent inspection the author encountered such an alert and uses this opportunity to demonstrate how to quickly clean data with the pt-osc (pt‑online‑schema‑change) tool.

Before cleaning data, always coordinate with the business owners to understand the impact.

2 Decision

After communication it was discovered that a large table only needs to keep rows matching a specific condition, which may represent less than 10% of the total data.

Can we delete the unwanted rows directly?

Deleting a massive amount of rows in one transaction can cause large transactions and affect master‑slave replication. Some operators prefer pt-archiver or similar tools to delete data in small batches, but this approach is slow and generates many binlog entries.

Can we try a different approach?

The proposed method focuses on the data that must be retained. First, copy the required rows into a temporary table, then swap the tables and drop the old one. This method is fast and does not generate large binlog traffic, but it requires a brief business pause because new writes cannot be synchronized to the temporary table in real time.

Why choose pt‑osc?

The logic is similar to Online DDL: migrate data and then switch tables, but pt‑osc also handles synchronization of newly written rows, so a skilled user could implement a custom tool, yet the latest pt‑osc version already provides this functionality.

Percona‑toolkit v3.6.0 adds the --where parameter to pt‑osc, allowing you to copy only rows that satisfy a condition.

3 Practical Steps

Prepare a table with data covering the range 2025‑02‑01 to 2025‑02‑10.

mysql> select time,count(*) from test group by time;
+---------------------+----------+
| time                | count(*) |
+---------------------+----------+
| 2025-02-01 00:00:00 |    10000 |
| 2025-02-02 00:00:00 |    10000 |
| 2025-02-03 00:00:00 |    10000 |
| 2025-02-04 00:00:00 |    10000 |
| 2025-02-05 00:00:00 |    10000 |
| 2025-02-06 00:00:00 |    10000 |
| 2025-02-07 00:00:00 |    10000 |
| 2025-02-08 00:00:00 |    10000 |
| 2025-02-09 00:00:00 |    10000 |
| 2025-02-10 00:00:00 |    10000 |
| 2025-02-11 00:00:00 |    10000 |
| 2025-02-12 00:00:00 |    10000 |
+---------------------+----------+

If we only need to keep the rows of 2025‑02‑12, we can use the following condition:

--where 'time>="2025-02-12"' . When using --where , you must also specify --force unless you also provide --no-drop-new-table and --no-swap-tables . The DDL operation only needs engine=innodb .

[root@192-168-13-131 bin]# ./pt-online-schema-change -u root -p 123456 -h 10.186.60.84 -P 3306 --execute --force --where 'time>="2025-02-12"' --alter 'engine=innodb' D=kk,t=test
Found 1 slaves:
10-186-60-87 -> 10.186.60.87:3306
Will check slave lag on:
10-186-60-87 -> 10.186.60.87:3306
Operation, tries, wait:
analyze_table, 10, 1
copy_rows, 10, 0.25
create_triggers, 10, 1
drop_triggers, 10, 1
swap_tables, 10, 1
update_foreign_keys, 10, 1
Altering `kk`.`test`...
Creating new table...
Created new table kk._test_new OK.
Altering new table...
Altered `kk`.`_test_new` OK.
2025-02-13T14:47:50 Creating triggers...
2025-02-13T14:47:51 Created triggers OK.
2025-02-13T14:47:51 Copying approximately 109467 rows...
2025-02-13T14:48:02 Copied rows OK.
2025-02-13T14:48:02 Analyzing new table...
2025-02-13T14:48:02 Swapping tables...
2025-02-13T14:48:03 Swapped original and new tables OK.
2025-02-13T14:48:03 Dropping old table...
2025-02-13T14:48:03 Dropped old table `kk`.`_test_old` OK.
2025-02-13T14:48:03 Dropping triggers...
2025-02-13T14:48:03 Dropped triggers OK.
Successfully altered `kk`.`test`.

After the operation, only the rows of 2025‑02‑12 remain. If you need to keep the original data, add the --no-drop-old-table option.

mysql> select time,count(*) from test group by time;
+---------------------+----------+
| time                | count(*) |
+---------------------+----------+
| 2025-02-12 00:00:00 |    10000 |
+---------------------+----------+

Note that pt‑osc creates triggers to sync incremental data. If rows that do not satisfy the --where condition are written or updated during the migration, they will also be copied to the new table; you can decide whether to keep or delete them.

Reference

[1] pt‑osc: https://docs.percona.com/percona-toolkit/pt-online-schema-change.html

[2] pt‑archiver: https://docs.percona.com/percona-toolkit/pt-archiver.html

MySQLData CleanupDatabase OperationsOnline Schema Changedisk spacept-osc
Aikesheng Open Source Community
Written by

Aikesheng Open Source Community

The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.