Why MySQL Delete Doesn’t Free Space and How InnoDB Reclaims It
This article explains why MySQL’s DELETE only sets a delete‑mark, how InnoDB’s MVCC and purge thread reuse freed pages, the impact of B+‑tree storage on I/O, page merge and split mechanisms, and the proper way to rebuild tables to recover space.
1. Deleting Is Not Real Deletion
In MySQL InnoDB, executing DELETE does not physically remove rows; it only sets a delete‑mark flag (deleteMark) on the record. The row remains on disk, so space is not immediately released, which often leads to the confusion “I deleted data but the storage size didn’t shrink.”
15M 7 6 18:46 user_info.ibd # before delete
15M 10 4 16:47 user_info.ibd # after delete
2. Why Use a Delete Mark?
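InnoDB supports MVCC (multi-version concurrency control). Each update writes the previous version of the row to an undo log, so a transaction can roll back and other transactions can read a consistent snapshot without taking locks. The current row and its undo record are linked by a roll pointer, so the engine can locate the old version when needed. A delete follows the same pattern: the row is only marked, because transactions that started earlier may still need to see it.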
3. Space Reuse via the Purge Thread
InnoDB runs a background purge thread that processes delete-marked rows. Once such a row is no longer visible to any active transaction, the thread physically removes it, and the space it occupied in the page becomes reusable. Because leaf pages are ordered by key, new rows whose keys fall into the same page can reuse that space, reducing the need to allocate new pages. The data file itself does not shrink.
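The delete-mark and purge cycle can be illustrated with a toy model. This is a minimal sketch, not InnoDB's implementation: the `Page` class, its slot capacity, and the integer transaction ids are all hypothetical, and visibility is reduced to a single "oldest active transaction" comparison. It only shows that a delete leaves the row in place, that purge frees the slot once no older transaction remains, and that a later insert reuses the slot.

```python
class Page:
    """Toy page with a fixed number of row slots (hypothetical model)."""

    def __init__(self, capacity):
        self.capacity = capacity
        # key -> id of the transaction that delete-marked the row, or None
        self.rows = {}

    def free_slots(self):
        return self.capacity - len(self.rows)

    def insert(self, key):
        if self.free_slots() == 0:
            raise RuntimeError("page full: a new page would be needed")
        self.rows[key] = None

    def delete(self, key, trx_id):
        # DELETE only sets the mark; the row keeps occupying its slot.
        self.rows[key] = trx_id

    def purge(self, oldest_active_trx):
        # Remove rows whose deleting transaction is older than every
        # transaction that is still active, so no snapshot can need them.
        for key, deleted_by in list(self.rows.items()):
            if deleted_by is not None and deleted_by < oldest_active_trx:
                del self.rows[key]


page = Page(capacity=4)
for key in (1, 2, 3, 4):
    page.insert(key)

page.delete(2, trx_id=10)
page.delete(3, trx_id=11)
free_after_delete = page.free_slots()         # still 0

page.purge(oldest_active_trx=11)              # only trx 10's delete is purgeable
free_after_first_purge = page.free_slots()

page.purge(oldest_active_trx=12)
free_after_second_purge = page.free_slots()

page.insert(5)                                # reuses a freed slot in the same page
print(free_after_delete, free_after_first_purge, free_after_second_purge)  # 0 1 2
```

The page object is never released in this model, which mirrors why the .ibd file keeps its size after a large delete.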
4. Page‑Based Storage and I/O
Data is stored on disk in pages (16 KB by default). Random I/O, which accesses non-adjacent sectors, is slower than sequential I/O. InnoDB uses a B+-tree index; the tree height roughly equals the number of I/O operations needed to reach a leaf. Because the root (and often the second level) fits in memory, the actual number of disk reads is often lower than the tree height.
For a BIGINT primary key (8 bytes) plus a 6-byte page pointer, one non-leaf page can hold about 1170 entries (16 KB / (8 + 6) bytes). In a three-level tree the root therefore points to about 1170 second-level pages, which together occupy roughly 18 MB (1170 × 16 KB) and usually stay in memory, further reducing I/O.
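The arithmetic above can be checked with a few lines of Python. This is a rough sketch that ignores page headers and per-record overhead; the 1 KB average row size is an assumption added here to estimate how many rows a three-level tree can address, and is not a figure from the article.

```python
PAGE_SIZE = 16 * 1024      # InnoDB default page size in bytes
KEY_BYTES = 8              # BIGINT primary key
POINTER_BYTES = 6          # child page pointer
ROW_BYTES = 1024           # assumption: average row size, not from the article

fanout = PAGE_SIZE // (KEY_BYTES + POINTER_BYTES)    # entries per non-leaf page
rows_per_leaf = PAGE_SIZE // ROW_BYTES               # rows per leaf page

second_level_bytes = fanout * PAGE_SIZE              # all pages under the root
three_level_rows = fanout * fanout * rows_per_leaf   # root -> level 2 -> leaves

print(fanout)                                        # 1170
print(round(second_level_bytes / 2**20, 1))          # 18.3 (MiB)
print(three_level_rows)                              # 21902400
```

Under these assumptions a three-level tree covers roughly 21.9 million rows while needing at most one disk read beyond the cached upper levels.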
5. Page Merge to Reclaim Fragmented Space
When many rows are deleted, pages become sparsely filled. Whenever a delete, or an update that shortens a row, drops a page's fill level below MERGE_THRESHOLD (50 % by default), InnoDB tries to merge the page with an adjacent one: if the neighbor has room for the remaining records, it moves them there and frees the emptied page for future use.
If the neighboring page does not have enough free space, merging would cost more (data movement) than the benefit, so the threshold prevents unnecessary merges.
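The merge rule can be written as a small decision function. This is a simplified sketch under stated assumptions: the function names are hypothetical, page headers are ignored, and only one neighbor is considered.

```python
PAGE_SIZE = 16 * 1024
MERGE_THRESHOLD = 50  # percent, InnoDB default


def should_attempt_merge(used_bytes):
    """A page becomes a merge candidate once its fill drops below the threshold."""
    return used_bytes * 100 < PAGE_SIZE * MERGE_THRESHOLD


def can_merge(used_bytes, neighbor_used_bytes):
    """Merge only if the page is a candidate and the neighbor can absorb its records."""
    neighbor_free = PAGE_SIZE - neighbor_used_bytes
    return should_attempt_merge(used_bytes) and used_bytes <= neighbor_free


print(can_merge(6 * 1024, 9 * 1024))    # True: 6 KB fits into 7 KB of free space
print(can_merge(6 * 1024, 12 * 1024))   # False: neighbor has only 4 KB free
print(can_merge(10 * 1024, 2 * 1024))   # False: page is still above 50 %
```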
6. Page Split When No Space
If a leaf page is full and its neighbor also lacks space, InnoDB must split the page: it creates a new page, moves a portion of the records from the original page, and updates the linked list of leaf pages. This operation reduces page utilization and incurs extra I/O and locking.
Typical causes of splits include:
Highly scattered inserts that break data continuity.
Updating a row to a larger size that no longer fits.
Both merges and splits are relatively expensive because they involve moving data and acquiring index‑tree locks.
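The effect of a split on page utilization can be seen in a toy model. This is a minimal sketch, assuming a hypothetical capacity of four records per page and a plain midpoint split; the real engine chooses split points with more care and also updates parent pages, which this model ignores.

```python
import bisect

PAGE_CAPACITY = 4  # hypothetical: records per page, kept tiny for readability


def insert(pages, key):
    """Insert key into an ordered list of sorted, non-empty pages."""
    # Pick the last page whose first key is <= key (or the first page).
    idx = 0
    for i, page in enumerate(pages):
        if page[0] <= key:
            idx = i
    page = pages[idx]
    if len(page) < PAGE_CAPACITY:
        bisect.insort(page, key)
        return
    # Page is full: split it at the midpoint into two neighboring pages.
    mid = len(page) // 2
    left, right = page[:mid], page[mid:]
    bisect.insort(left if key < right[0] else right, key)
    pages[idx:idx + 1] = [left, right]


def utilization(pages):
    """Fraction of record slots in use across all pages."""
    return sum(len(p) for p in pages) / (len(pages) * PAGE_CAPACITY)


pages = [[10, 20, 30, 40]]
before = utilization(pages)
insert(pages, 25)
after = utilization(pages)
print(pages)            # [[10, 20, 25], [30, 40]]
print(before, after)    # 1.0 0.625
```

One insert into a full page doubles the page count and drops utilization from 100 % to 62.5 % in this model, which is why scattered inserts fragment a table faster than ordered ones.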
7. Manually Rebuilding a Table
When a table accumulates many fragmented pages, a common remedy is to rebuild it. The simplest command is:
ALTER TABLE xx ENGINE=InnoDB;
This creates a new copy of the table in a temporary file, copies all rows in primary-key order, records changes made during the copy in a row log, applies those changes to the temporary file, and finally swaps the temporary file with the original.
The old and new copies coexist until the swap, so the rebuild needs free disk space at least as large as the table itself. Run it during low-traffic periods because, even with online DDL, it consumes significant CPU and I/O.
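Before a rebuild it can help to confirm that the disk has room for the second copy. The sketch below is one possible pre-check, not an official tool: the function names are hypothetical, the 10 % headroom for the row log and temporary files is an assumption, and it presumes a file-per-table .ibd file with the temporary copy written to the same filesystem.

```python
import os
import shutil


def enough_space(table_bytes, free_bytes, headroom=1.1):
    """True if free space covers a full second copy plus some headroom."""
    return free_bytes >= table_bytes * headroom


def can_rebuild(ibd_path):
    """Compare the .ibd file size with free space on the same filesystem."""
    table_bytes = os.path.getsize(ibd_path)
    directory = os.path.dirname(os.path.abspath(ibd_path))
    free_bytes = shutil.disk_usage(directory).free
    return enough_space(table_bytes, free_bytes)


print(enough_space(15 * 2**20, 100 * 2**20))   # True
print(enough_space(15 * 2**20, 10 * 2**20))    # False
```

In practice `can_rebuild` would be called with the path to the table's .ibd file inside the MySQL data directory.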
8. Rebuild May Not Shrink Space
InnoDB deliberately leaves about 1/16 of each page free after a rebuild to accommodate future row growth without immediate page splits. Consequently, a rebuild can sometimes increase the total table size, especially if the original table was already tightly packed.
Two scenarios where space grows after a rebuild:
The table is already compact; the reserved 1/16 adds extra pages.
After the first rebuild, the reserved space is partially used; a second rebuild adds another 1/16 reservation, leading to further growth.
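Both scenarios, and the usual case where a fragmented table shrinks, follow from the 1/16 reserve. The following is a rough sketch that ignores page headers; `pages_after_rebuild` and the page counts are hypothetical values chosen for illustration.

```python
PAGE_SIZE = 16 * 1024
USABLE = PAGE_SIZE - PAGE_SIZE // 16    # bytes filled per page after a rebuild


def pages_after_rebuild(pages, fill):
    """Pages needed once `pages` pages at average `fill` (0..1) are repacked."""
    data_bytes = int(pages * PAGE_SIZE * fill)
    return -(-data_bytes // USABLE)     # ceiling division


print(pages_after_rebuild(15000, 0.5))   # 8000: fragmented table shrinks
print(pages_after_rebuild(15000, 1.0))   # 16000: compact table grows
print(pages_after_rebuild(16000, 1.0))   # 17067: reserve refilled, grows again
```

A rebuild therefore pays off only when the average page fill is well below 15/16.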
dbaplus Community
