
How to Efficiently Clean and Partition a 200M‑Row MySQL Table Using Online DDL

This article explains how to handle a 200‑million‑row MySQL 5.6 table by adding indexes, cleaning 99% of old data, converting it to a partitioned table, and using online DDL and auxiliary tables to maintain continuous business operations.

ITPUB

Business Requirement Overview

The original system ran on Oracle; the new requirement concerns a MySQL 5.6 table containing over 200 million rows. The business needs to keep only the most recent data for an upcoming activity, which means deleting roughly 99% of the rows, while the statistics job keeps running every ten minutes and data ingestion is never interrupted.

Key Tasks Identified

Optimize queries by adding an index on the time‑range column.

Clean up the majority of old (cold) data.

Keep the business running throughout: the statistics job must still run every ten minutes and ingestion must not stop.

Convert the table to a partitioned layout, separating old and new data so that old partitions can be dropped quickly.

Using MySQL Online DDL

MySQL 5.6 introduced online DDL for InnoDB, so a secondary index can be built in place while the table stays available for reads and writes. On MySQL 5.5, a tool such as pt-online-schema-change (pt-osc) achieves a similar result.
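As a minimal sketch of what such an index build might look like, the helper below assembles the `ALTER TABLE` statement. The table name `serverlog`, the index name, and the `log_time` column are assumptions for illustration; the article does not give the real names. The `ALGORITHM=INPLACE, LOCK=NONE` clauses are standard MySQL 5.6 syntax: if the server cannot honor them, the statement fails with an error instead of silently taking a blocking lock.

```python
def add_index_online(table: str, index: str, column: str) -> str:
    """Build an ALTER TABLE that requests an in-place, non-locking index build.

    Identifiers are assumed to be trusted (they are interpolated directly).
    """
    return (
        f"ALTER TABLE `{table}` "
        f"ADD INDEX `{index}` (`{column}`), "
        "ALGORITHM=INPLACE, LOCK=NONE;"
    )


# Hypothetical names: the source table and its time column are not named in the article.
print(add_index_online("serverlog", "idx_log_time", "log_time"))
```

Even with `LOCK=NONE`, the build still has to read the whole table, so on 200 million rows it is worth running during a quiet period.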

Solution Architecture

1. **Shadow Table**: Create a shadow table serverlog_read that mirrors changes from the source table.

2. **Materialized‑View Emulation**: MySQL has no native materialized views, so one has to be emulated, either with a tool such as Flexviews or with the approach pt-online-schema-change uses: three triggers (INSERT, UPDATE, DELETE) on the source table that keep a view‑like table up to date.

3. **Auxiliary Tables**: serverlog_par_old is a partitioned table that stores data refreshed from the emulated materialized view; serverlog_host holds the incremental and real‑time data streams.
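The three-trigger idea can be sketched as follows. This is an illustration rather than the author's actual setup: the source table name `serverlog`, the primary key `id`, and the column list are hypothetical, and the sketch assumes the primary key is never updated. Each trigger body is a single statement, so no `DELIMITER` change is needed when running the output in the mysql client.

```python
from typing import List


def mirror_triggers(source: str, shadow: str, pk: str, columns: List[str]) -> List[str]:
    """Build INSERT/UPDATE/DELETE triggers that mirror `source` into `shadow`.

    Assumes `shadow` has the same columns as `source` and `pk` is its primary key.
    """
    cols = ", ".join(f"`{c}`" for c in columns)
    new_vals = ", ".join(f"NEW.`{c}`" for c in columns)
    # REPLACE covers both new rows and changed rows, keyed on the primary key.
    upsert = f"REPLACE INTO `{shadow}` ({cols}) VALUES ({new_vals})"
    return [
        f"CREATE TRIGGER `{source}_ai` AFTER INSERT ON `{source}` FOR EACH ROW {upsert};",
        f"CREATE TRIGGER `{source}_au` AFTER UPDATE ON `{source}` FOR EACH ROW {upsert};",
        f"CREATE TRIGGER `{source}_ad` AFTER DELETE ON `{source}` FOR EACH ROW "
        f"DELETE FROM `{shadow}` WHERE `{pk}` = OLD.`{pk}`;",
    ]


# Hypothetical schema; only serverlog_read is named in the article.
for stmt in mirror_triggers("serverlog", "serverlog_read", "id", ["id", "log_time", "message"]):
    print(stmt)
```

MySQL 5.6 allows only one trigger per table for each combination of timing and event, so this approach conflicts with any `AFTER` triggers the source table already has.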

Data is categorized into:

Cold data (old, to be archived).

Incremental data (e.g., the last month’s records).

Real‑time data (continuously ingested).

With partitioning, old data sits in its own partition and can be dropped almost instantly, because dropping a partition is a DDL operation rather than a row‑by‑row delete; new data stays in the active partition.
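A possible shape for the old/new split is sketched below. The column name `log_time`, the cutoff date, and the partition names `p_old` and `p_new` are assumptions; the sketch also assumes the column is a `DATE` or `DATETIME` and is part of every unique key on the table, which MySQL requires of a partitioning column.

```python
from datetime import date


def range_partition_clause(column: str, cutoff: date) -> str:
    """Build a two-partition RANGE clause: rows before `cutoff` go to p_old."""
    return (
        f"PARTITION BY RANGE (TO_DAYS(`{column}`)) (\n"
        f"  PARTITION p_old VALUES LESS THAN (TO_DAYS('{cutoff.isoformat()}')),\n"
        "  PARTITION p_new VALUES LESS THAN MAXVALUE\n"
        ")"
    )


def drop_partition(table: str, partition: str) -> str:
    """Build the statement that discards one partition and all rows in it."""
    return f"ALTER TABLE `{table}` DROP PARTITION `{partition}`;"


# Hypothetical column and cutoff; serverlog_par_old is the table named in the article.
clause = range_partition_clause("log_time", date(2017, 1, 1))
print(f"ALTER TABLE `serverlog_par_old`\n{clause};")
print(drop_partition("serverlog_par_old", "p_old"))
```

The same clause can be appended to a `CREATE TABLE` statement instead, which avoids rebuilding a table that already holds data.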

Additional MySQL Advantages

1. **Table Structure Copy** – MySQL can duplicate a table’s definition, including its indexes, with one statement: `create table test1 like test;` Alternatively, use `SHOW CREATE TABLE` or `mysqldump --no-data` to export the DDL.

2. **Data Copy** – Insert data from the original table into the new one with a single statement: `insert into test1 select * from test;`

3. **Backup & Archiving** – Instead of Oracle's more involved user‑rename procedures, MySQL can move a table into another database with a single `RENAME TABLE` statement (for example `rename table test to archive.test;`, where `archive` is a hypothetical archive database), which takes old data out of the working schema quickly without copying it.
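Combining the structure copy and data copy above gives one common way to keep only recent rows: copy the small retained slice into a fresh table, then swap the names. This is a sketch of that general technique, not necessarily the procedure the author used; the `_new` and `_old` suffixes, the column name, and the cutoff value are assumptions.

```python
from typing import List


def keep_recent_and_swap(table: str, time_column: str, cutoff: str) -> List[str]:
    """Build statements that copy rows at or after `cutoff` and swap the tables."""
    new, old = f"{table}_new", f"{table}_old"
    return [
        # Same columns and indexes as the original, but empty.
        f"CREATE TABLE `{new}` LIKE `{table}`;",
        # Copy only the rows worth keeping (about 1% in the article's case).
        f"INSERT INTO `{new}` SELECT * FROM `{table}` WHERE `{time_column}` >= '{cutoff}';",
        # A multi-table RENAME is atomic, so readers never see a missing table.
        f"RENAME TABLE `{table}` TO `{old}`, `{new}` TO `{table}`;",
    ]


# Hypothetical table, column, and cutoff.
for stmt in keep_recent_and_swap("serverlog", "log_time", "2017-01-01"):
    print(stmt)
```

Rows written to the original table between the copy and the rename are not carried over by these three statements alone, so the gap has to be covered by triggers or a short pause in writes.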

Conclusion

By combining MySQL’s online DDL, partitioning, and a trigger‑based emulation of materialized views, the 200‑million‑row table can be indexed, cleaned, and partitioned with minimal downtime, while the ten‑minute statistics job and data ingestion continue uninterrupted.

