Databases 12 min read

MySQL 8.0 Iterator Executor: Design, Implementation, and Future Outlook

The article explains MySQL 8.0’s new iterator‑based executor, detailing the underlying iterator model, its advantages over the legacy executor, the specific RowIterator classes introduced, supported query features, and future enhancements for more advanced algebraic operations.

Tencent Database Technology

Jul 10, 2019

MySQL 8.0 Iterator Executor: Design, Implementation, and Future Outlook

In a database management system the most important modules include the SQL optimizer, SQL executor, and transaction manager. The SQL processing pipeline is: input → parsing → semantic checking → logical optimization → physical optimization → execution. In MySQL 8.0 the executor has been substantially rewritten, moving closer to the classic iterator model, which we refer to as the iterator executor.

The iterator model is a common data‑traversal abstraction used when the size of a data source is unknown or too large to load entirely. It typically defines two interfaces: hasNext() to test for completion and next() to retrieve the next element.

Goetz Graefe extended this idea in the Volcano model, where each relational algebra operation is implemented as an iterator exposing an open() → next() → close() protocol. A query execution tree is transformed into an iterator tree; execution proceeds by the root iterator calling open(), then repeatedly invoking next() on child iterators until the leaf nodes produce rows, after which close() is called. This pull‑based data flow moves rows from leaves up to the root.

The iterator model offers several benefits:

Each iterator is independent, allowing simple composition of different query operators.

Uniform interfaces make extensions straightforward.

Rows flow between iterators with minimal resource usage, conserving memory.

It is easy to parallelize across processes or threads.

The goal of the MySQL 8.0 executor redesign is to create a generic C++ iterator API (named RowIterator) that can replace all existing record iterators and eventually supersede the legacy executor.

The legacy record iterators include:

QUICK_SELECT_I – abstracts index‑based access methods such as range scans and index merges.

READ_RECORD – abstracts QUICK_SELECT_I, full table scans, full index scans, and sort‑buffer results.

QEP_TAB – holds function pointers for READ_RECORD and various join access types (REF, EQ_REF, full‑text, etc.).

QEP_operation – abstracts temporary tables or join buffering (BNL, BKA).

QEP_TAB::next_select – abstracts nested‑loop joins, GROUP BY, and other operations.

Query_result – abstracts UNION, EXISTS short‑circuiting, result delivery, and other operations.

These interfaces are a mix of C++ classes and C function pointers, use both push and pull data models, and suffer from poor extensibility, inconsistent implementation, unclear abstraction layers, and high coupling, making future optimization difficult.

The new implementation introduces a universal C++ class RowIterator with the following members:

Constructor and destructor. Init(QEP_TAB*) – opens required resources and can be called repeatedly to reset the iterator. Read() – reads a row into the record buffer, similar to the old read_record(). UnlockRow() – releases locks on a row after it has been filtered, analogous to rr_unlock_row.

The execution flow with this generic iterator is illustrated in the following diagram:

MySQL 8.0.16 implements a variety of iterator types, including but not limited to:

TableScanIterator – sequential scan using ha_rnd_next.

IndexScanIterator – full index scan using ha_index_next or ha_index_prev.

IndexRangeScanIterator – range index scan wrapping QUICK_SELECT_I.

SortingIterator – sorts the output of another iterator.

SortBufferIterator – reads sorted results from a buffer (used by SortingIterator).

SortBufferIndirectIterator – reads row IDs from a buffer and fetches rows from the table.

SortFileIterator – reads sorted results from a file.

SortFileIndirectIterator – reads row IDs from a file and fetches rows from the table.

RefIterator – fetches rows from the right side of a join using a specific key.

RefOrNullIterator – fetches rows or NULL rows from the right side of a join.

EQRefIterator – uses a unique key to fetch rows from the right side of a join.

ConstIterator – reads a single row from a constant table.

FullTextSearchIterator – reads rows using a full‑text index.

DynamicRangeIterator – invokes a range optimizer per row and wraps QUICK_SELECT_I or a table scan as needed.

PushedJoinRefIterator – reads output pushed to NDB joins.

FilterIterator – filters rows to implement WHERE/HAVING.

LimitOffsetIterator – implements LIMIT/OFFSET.

AggregateIterator – implements aggregation and optional grouping.

NestedLoopIterator – performs nested‑loop joins (inner, outer, anti).

MaterializeIterator – materializes results into a temporary table before reading.

FakeSingleRowIterator – returns a single row for const‑table cases.

The new executor currently supports queries that involve primary and const tables, various joins (except semi‑joins), filtering, grouping (except roll‑up), LIMIT/OFFSET, and certain materializations. It does not yet support SELECT DISTINCT, CTEs, window functions, semi‑joins, loose scans, or BNL/BKA; unsupported queries fall back to the legacy executor. Both executors will coexist for a period before the old one is retired.

Future work based on the new executor includes richer algebraic query support, more precise EXPLAIN output in tree form, and better execution diagnostics. An example of the enhanced EXPLAIN format is shown below:

-> Aggregate: sum((lineitem.l_extendedprice * (1 - lineitem.l_discount)))
    -> Nested loop inner join
        -> Filter: (((part.p_brand = 'Brand#22') and (part.p_container ...
            -> Table scan: part
        -> Filter: (((lineitem.l_shipinstruct = 'DELIVER IN PERSON') and ...
            -> Reference lookup on lineitem using i_l_partkey (l_partkey=part.p_partkey)

We look forward to seeing these enhancements in action.

database Query Optimization C++MySQL Executor Iterator

Written by

Tencent Database Technology

Tencent's Database R&D team supports internal services such as WeChat Pay, WeChat Red Packets, Tencent Advertising, and Tencent Music, and provides external support on Tencent Cloud for TencentDB products like CynosDB, CDB, and TDSQL. This public account aims to promote and share professional database knowledge, growing together with database enthusiasts.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.