Exploring the Implementation of the DELETE Statement in MySQL Source Code
This article details a two‑month journey of reading MySQL 8.0.20 source code to uncover how the DELETE statement performs mark‑delete, describing the experimental setup, trace‑log analysis, query and delete phases, and the exact functions that set the deletion flag in InnoDB.
The author spent two months studying MySQL source code and writes this article to share the methodology and pitfalls encountered while exploring how the DELETE statement is implemented, aiming to help others investigate MySQL internals more easily.
Guided by the question "How is mark‑delete implemented for DELETE?", the article outlines a four‑step learning process and lists the goals: using mysql.trace logs, locating where the delete mark is set, and fully understanding the server‑side and InnoDB execution flow.
Experimental environment: MySQL 8.0.20, InnoDB tables, a test table t_del with three columns, sample data inserted, and the server started with the --debug flag so that trace logs are written to /tmp/mysqld.trace.
Execution overview is obtained from the trace log and a hand‑drawn diagram. Key methods such as dispatch_command, mysql_parse, and especially row_search_mvcc are identified as the entry points for locating records.
Query phase : The core routine is row_search_mvcc, which calls btr_pcur_open_with_no_init → btr_cur_search_to_nth_level → buf_page_get_gen → page_cur_search_with_match_bytes. This chain loads the appropriate page from the buffer pool or disk and searches within the page to find the target record.
Delete phase : After the record is located, the handler chain is invoked: handler::ha_delete_row → ha_innobase::delete_row → row_update_for_mysql_using_upd_graph. The deletion is split into two steps:
Primary‑key (clustered) index deletion via row_upd_clust_step.
Secondary‑index deletion via a loop calling row_upd_sec_step.
For the primary key, a mini‑transaction is started, the record header is fetched, and the delete flag (bit 37) is set using rec_set_deleted_flag_new. Undo and redo logs are written through trx_undo_report_row_operation and btr_cur_del_mark_set_clust_rec_log.
For secondary indexes, the process is similar but skips the undo log; the flag is set with btr_rec_set_deleted_flag and the redo log is written via btr_cur_del_mark_set_sec_rec_log.
The article concludes with several take‑aways: the trace log is a powerful debugging tool, DELETE consists of a query stage and a mark‑delete stage, the function btr_cur_search_to_nth_level is central to many DML operations, InnoDB stores each record header in 5 bytes (40 bits) with the delete flag at the 37th bit, and understanding this low‑level structure simplifies further source‑code exploration.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Xueersi Online School Tech Team
The Xueersi Online School Tech Team, dedicated to innovating and promoting internet education technology.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
