How to Detect and Eliminate MySQL Table Fragmentation for Better Performance
Learn why MySQL tables develop fragmentation, how to measure it using Data_free in INFORMATION_SCHEMA, and the appropriate commands (OPTIMIZE TABLE for MyISAM and ALTER TABLE ... ENGINE=InnoDB for InnoDB) to safely reclaim space, with practical scheduling tips to minimize impact on production workloads.
Causes of Fragmentation
(1) Table storage can become fragmented; when rows are deleted, the freed space remains empty, and repeated deletions can make the total empty space larger than the space occupied by existing rows.
(2) During insert operations, MySQL tries to reuse empty space, but if a gap never receives suitably sized data, it remains unused, creating fragmentation.
(3) When MySQL scans a table, it reads up to the high-water mark of the allocated space (the largest extent the table has reached), not just the regions that still hold rows.
Example:
A table has 10,000 rows, each 10 bytes, occupying 100,000 bytes. After deleting rows and leaving only one row (10 bytes), MySQL still treats the table as 100,000 bytes during reads, so increasing fragmentation degrades query performance.
Viewing Table Fragmentation Size
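The sizes reported in this section are easier to judge as a share of the allocated space. As a rough sketch, the arithmetic from the 10,000-row example can be reproduced in a few lines of shell. The helper name frag_summary and its three arguments are made up for illustration, and the percentage is truncated to an integer.

```shell
#!/bin/sh
# frag_summary is a hypothetical helper, not a MySQL tool.
# Arguments: rows before deletion, bytes per row, rows left afterwards.
frag_summary() {
  local rows_before="$1" row_bytes="$2" rows_after="$3"
  local allocated=$(( rows_before * row_bytes ))  # space the table still occupies
  local live=$(( rows_after * row_bytes ))        # space holding remaining rows
  local free=$(( allocated - live ))              # gaps left by deleted rows
  local free_pct=$(( free * 100 / allocated ))    # integer percentage, truncated
  echo "allocated=${allocated} live=${live} free=${free} free_pct=${free_pct}"
}

# The example from the text: 10,000 rows of 10 bytes, one row remaining.
frag_summary 10000 10 1
# prints: allocated=100000 live=10 free=99990 free_pct=99
```

In this example nearly all of the allocated space is empty, which is the situation the commands below are meant to detect.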
(1) Check the fragmentation size of a specific table:
mysql> SHOW TABLE STATUS LIKE 'table_name';
The Data_free column in the result shows the amount of free space (fragmentation).
(2) List all tables that have generated fragmentation:
mysql> SELECT table_schema db, table_name, data_free, engine
FROM information_schema.tables
WHERE table_schema NOT IN ('information_schema','mysql')
AND data_free > 0;
Removing Table Fragmentation
(1) For MyISAM tables:
mysql> OPTIMIZE TABLE table_name;
(2) For InnoDB tables:
mysql> ALTER TABLE table_name ENGINE=InnoDB;
Because the storage engines differ, the OPTIMIZE operation behaves differently on each. MyISAM stores indexes separately from data, so OPTIMIZE can reorganize the data file and rebuild the indexes. On InnoDB, OPTIMIZE TABLE is carried out as a table rebuild, which is why the ALTER TABLE form is shown for it. The OPTIMIZE command locks the table temporarily; larger tables take longer, so it is not suitable to run directly from application code.
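The engine-specific choice above can be captured in a small shell helper. This is only a sketch: defrag_statement is a hypothetical name, it prints the statement rather than running it, and it assumes schema and table names are simple identifiers that need no quoting.

```shell
#!/bin/sh
# defrag_statement is a hypothetical helper: it only prints the statement
# the article recommends for the given engine; it does not run anything.
# Arguments: engine, schema, table (plain identifiers assumed).
defrag_statement() {
  local engine="$1" schema="$2" table="$3"
  # Compare engine names case-insensitively.
  case "$(printf '%s' "$engine" | tr '[:upper:]' '[:lower:]')" in
    myisam) printf 'OPTIMIZE TABLE %s.%s;\n' "$schema" "$table" ;;
    innodb) printf 'ALTER TABLE %s.%s ENGINE=InnoDB;\n' "$schema" "$table" ;;
    *)
      # Engines the article does not cover are reported and rejected.
      echo "no recommendation for engine: $engine" >&2
      return 1
      ;;
  esac
}
```

For example, defrag_statement InnoDB shop orders prints ALTER TABLE shop.orders ENGINE=InnoDB; where shop and orders are placeholder names.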
A better approach is to create a shell script that periodically checks the DATA_FREE column of the information_schema.TABLES table. When the value exceeds a chosen threshold, run the appropriate optimization command.
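A minimal sketch of such a script follows. It assumes the mysql client can log in from an option file, that table names need no quoting, and that the threshold (in bytes) is whatever the operator picks. The function name plan_defrag is hypothetical, and it only prints statements so they can be reviewed before anything is executed.

```shell
#!/bin/sh
# plan_defrag (hypothetical name) reads tab-separated rows on stdin:
#   schema <TAB> table <TAB> engine <TAB> data_free
# and prints a statement for each table whose data_free exceeds the
# threshold passed as the first argument (bytes).
plan_defrag() {
  local threshold="$1" schema table engine data_free
  local tab
  tab=$(printf '\t')
  while IFS="$tab" read -r schema table engine data_free; do
    # Ignore rows whose data_free is missing or not a plain number.
    case "$data_free" in ''|*[!0-9]*) continue ;; esac
    # Skip tables at or below the threshold.
    [ "$data_free" -gt "$threshold" ] || continue
    case "$engine" in
      MyISAM) printf 'OPTIMIZE TABLE %s.%s;\n' "$schema" "$table" ;;
      InnoDB) printf 'ALTER TABLE %s.%s ENGINE=InnoDB;\n' "$schema" "$table" ;;
      *)      continue ;;  # other engines are outside the article's scope
    esac
  done
}

# Possible usage (assumed 100 MB threshold; -N drops the header row and
# -B gives tab-separated output):
#   mysql -N -B -e "SELECT table_schema, table_name, engine, data_free
#                   FROM information_schema.tables
#                   WHERE table_schema NOT IN ('information_schema','mysql')
#                   AND data_free > 0" | plan_defrag 104857600
```

Printing the statements instead of executing them keeps the locking step under the operator's control; the output can be piped back into mysql once the list looks right.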
Recommendation
Since optimization locks the table, schedule the cleanup during low-traffic periods (e.g., early Wednesday morning). Automate the process: regularly query DATA_FREE, compare it to a warning level, and run OPTIMIZE TABLE (MyISAM) or ALTER TABLE with ENGINE=InnoDB (InnoDB) when needed.
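One way to enforce the low-traffic rule is a guard at the top of the cleanup script, sketched below. The window used here (Wednesday, 02:00 to 04:59) is an assumption standing in for "early Wednesday morning", and in_maintenance_window and the script path in the cron comment are hypothetical names.

```shell
#!/bin/sh
# in_maintenance_window (hypothetical name) succeeds only on Wednesday
# (ISO weekday 3) between 02:00 and 04:59. The hours are an assumed
# window; adjust them to the site's real quiet period.
# Arguments: ISO weekday (1-7), hour (0-23, a leading zero is accepted).
in_maintenance_window() {
  local dow="$1" hour="${2#0}"   # strip one leading zero, e.g. 03 -> 3
  hour="${hour:-0}"              # a bare "0" becomes empty above; restore it
  [ "$dow" -eq 3 ] && [ "$hour" -ge 2 ] && [ "$hour" -lt 5 ]
}

# Possible usage at the top of the cleanup script:
#   in_maintenance_window "$(date +%u)" "$(date +%H)" || exit 0
#
# A matching cron entry (03:00 every Wednesday; the path is a placeholder):
#   0 3 * * 3 /path/to/defrag_check.sh
```

The guard duplicates what cron already enforces, but it also protects against someone running the script by hand during business hours.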
Java Backend Technology
Focus on Java-related technologies: SSM, Spring ecosystem, microservices, MySQL, MyCat, clustering, distributed systems, middleware, Linux, networking, multithreading. Occasionally cover DevOps tools like Jenkins, Nexus, Docker, and ELK. Also share technical insights from time to time, committed to Java full-stack development!
