Understanding MySQL GROUP BY Execution Strategies

This article explains why GROUP BY can dominate the cost of a MySQL query, describes the four ways MySQL executes it (index‑ordered, external‑sort or filesort, temporary‑table, and index‑skip‑scan for MIN/MAX), shows how filtering interacts with grouping, and offers practical tips such as using FORCE INDEX when the optimizer picks a poor plan.

Aikesheng Open Source Community
In everyday queries, the cost of GROUP BY can far exceed that of the index look‑ups that feed it; GROUP BY can account for 90% or more of a query's execution time.

The main difficulty is that an aggregate function needs all the values of one group before it can produce that group's result. When the source rows do not arrive already grouped, values from different groups are interleaved, and MySQL must perform an extra step to bring the rows of each group together.

1. Index‑Ordered GROUP BY

If the GROUP BY column has an index, MySQL can scan the index in order and compute aggregates on‑the‑fly, which is cheap. This works especially well with LIMIT or a covering index because the ordered index scan is very fast.

mysql> select k, count(*) c from tbl group by k order by k limit 5;
+---+---+
| k | c |
+---+---+
| 2 | 3 |
| 4 | 1 |
| 5 | 2 |
| 8 | 1 |
| 9 | 1 |
+---+---+
5 rows in set (0.00 sec)

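When this method is used, EXPLAIN reports type: index with the index on k as the key, and Extra contains neither Using temporary nor Using filesort. In this example Extra shows Using index because the query is also covered by the index.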
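To make the idea concrete, here is a minimal Python sketch of aggregating over input that already arrives in group order. It is an illustration of the technique, not MySQL's implementation; the list `index_scan_k` and the function name are made up for this example, with values chosen to mirror the result shown above.

```python
from itertools import groupby, islice

# Hypothetical sample: values of column k as an ordered index scan would
# return them (already sorted). Not data from the original article.
index_scan_k = [2, 2, 2, 4, 5, 5, 8, 9, 11, 11]

def ordered_group_count(sorted_keys, limit=None):
    """Yield (key, count) pairs from keys that arrive in sorted order."""
    # groupby merges only adjacent equal keys, which is enough for sorted
    # input, so only one group is held in memory at a time.
    groups = ((key, sum(1 for _ in rows)) for key, rows in groupby(sorted_keys))
    # islice stops pulling from the scan once `limit` groups are complete.
    return islice(groups, limit)

print(list(ordered_group_count(index_scan_k, limit=5)))
# [(2, 3), (4, 1), (5, 2), (8, 1), (9, 1)]
```

Because the groups are produced lazily, a LIMIT stops the scan shortly after the last requested group, which is why this plan combines so well with LIMIT.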

2. External‑Sort (Filesort) GROUP BY

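If no index delivers the rows in group order, MySQL can sort them with an external sort (called filesort in MySQL) and then aggregate. The SQL_BIG_RESULT hint used below tells the optimizer to prefer sorting over a temporary table; without it, the optimizer may pick the temporary‑table plan described in the next section.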

mysql> explain select SQL_BIG_RESULT g, count(*) c from tbl group by g limit 5 \G
type: ALL
possible_keys: NULL
Extra: Using filesort

This method scans the whole table, sorts the rows by the GROUP BY column, and then aggregates.
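The scan, sort, aggregate sequence can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: `sorted()` stands in for filesort (which may spill to disk in MySQL), and the sample values in `table_scan_g` are invented.

```python
from itertools import groupby

# Hypothetical values of the unindexed column g, in table (arrival) order.
table_scan_g = [7, 3, 7, 1, 3, 7, 1, 9]

def sort_group_count(keys):
    """Sort all keys first, then aggregate adjacent equal keys in one pass."""
    ordered = sorted(keys)  # stand-in for filesort; reads every row
    return [(key, sum(1 for _ in rows)) for key, rows in groupby(ordered)]

print(sort_group_count(table_scan_g))
# [(1, 2), (3, 2), (7, 3), (9, 1)]
```

Unlike the index‑ordered method, nothing can be returned until the whole input has been read and sorted, so a LIMIT does not shorten the scan.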

3. Temporary‑Table GROUP BY

In this plan MySQL also scans the whole table, but instead of sorting the rows it creates a temporary table that stores one row per group and updates the matching row for each incoming row.

mysql> explain select g, sum(g) s from tbl group by g limit 5 \G
type: ALL
Extra: Using temporary

If the temporary table grows too large to stay in memory, the per‑row updates make the plan I/O‑heavy; in such cases the external‑sort plan is usually faster.
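A rough Python analogue of this plan uses a dictionary as the temporary table, keyed by the grouping column. The sketch assumes the same invented sample values as a stand‑in for column g and computes SUM(g) per group, as in the query above; it is not how MySQL stores its temporary tables.

```python
# Hypothetical values of column g, in table (arrival) order.
table_scan_g = [7, 3, 7, 1, 3, 7, 1, 9]

def temp_table_group_sum(keys):
    """Aggregate SUM(g) per group using one 'temporary table' row per group."""
    temp = {}  # plays the role of the temporary table: group key -> running sum
    for g in keys:
        # One lookup and one update per incoming row.
        temp[g] = temp.get(g, 0) + g
    return temp

print(temp_table_group_sum(table_scan_g))
# {7: 21, 3: 6, 1: 2, 9: 9}
```

The cost is one update per source row, which is cheap while the table of groups fits in memory. Note that the groups come out in first‑seen order rather than sorted order.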

4. Index‑Skip‑Scan for MIN()/MAX()

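For the special aggregates MIN() and MAX(), MySQL can skip through the index instead of reading every entry, an optimization the MySQL documentation calls Loose Index Scan. The index must start with the grouped column, followed by the aggregated column; MySQL can then jump directly to the smallest or largest entry of each group without reading the rest of the group. In the example below, an index on k qualifies for MAX(id) assuming id is the primary key of an InnoDB table, because InnoDB secondary indexes implicitly end with the primary key.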

mysql> explain select k, max(id) from tbl group by k \G
type: range
Extra: Using index for group-by

This optimization pays off only when each group contains many rows; otherwise the optimizer tends to prefer the index‑ordered method.
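The jump from group to group can be imitated in Python with a binary search over sorted index entries. This is a sketch of the idea only: the list `index_k_id` is an invented stand‑in for an index on (k, id), and `bisect_right` plays the role of the index seek.

```python
from bisect import bisect_right

# Hypothetical index entries (k, id), in index order.
index_k_id = [(2, 1), (2, 6), (2, 8), (4, 3), (5, 2), (5, 7), (8, 4), (9, 5)]

def loose_scan_max(index):
    """Return (k, max id) per group, seeking past each group instead of walking it."""
    result = []
    pos = 0
    while pos < len(index):
        k = index[pos][0]
        # Seek to the first entry after the last (k, *) entry.
        end = bisect_right(index, (k, float("inf")), lo=pos)
        result.append(index[end - 1])  # last entry of the group holds MAX(id)
        pos = end  # continue with the next group
    return result

print(loose_scan_max(index_k_id))
# [(2, 8), (4, 3), (5, 7), (8, 4), (9, 5)]
```

Each group costs one seek regardless of its size, so the saving grows with the number of rows per group. With only one or two rows per group, the seeks cost more than simply reading the entries in order.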

Filtering and Grouping

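The same methods apply when a WHERE clause is present. MySQL may use an index to find the matching rows first and then group them with a temporary table, as in the plan below. Alternatively, if the grouping column is indexed, it may scan that index in group order and check the WHERE condition row by row, giving up the index look‑up for the filter.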

mysql> explain select g, sum(g) s from tbl where k>4 group by g limit 5 \G
type: range
key: k
Extra: Using index condition; Using temporary; Using filesort

Depending on the constants in the WHERE clause, the optimizer may choose either an index‑ordered scan for the grouping (applying the WHERE condition as a plain row filter) or an index range scan for the filter followed by a temporary table for the grouping.
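The two alternatives can be compared side by side in a small Python sketch. Everything here is illustrative: the `rows` list of (k, g) pairs is invented, `sorted()` stands in for reading an index in order, and the function names are hypothetical. Both functions compute SUM(g) per g for rows with k greater than a threshold.

```python
from bisect import bisect_right
from itertools import groupby

# Hypothetical table rows as (k, g) pairs.
rows = [(2, 7), (9, 3), (5, 7), (4, 1), (8, 3), (5, 1), (2, 9), (2, 3)]

def filter_by_index_then_temp(rows, k_min):
    """Range scan on k for the filter, temporary table for grouping, then a sort."""
    by_k = sorted(rows)  # stand-in for an index on k
    start = bisect_right(by_k, (k_min, float("inf")))  # first entry with k > k_min
    temp = {}
    for _, g in by_k[start:]:
        temp[g] = temp.get(g, 0) + g
    return sorted(temp.items())

def scan_group_index_with_filter(rows, k_min):
    """Scan in g order for grouping, checking the WHERE condition row by row."""
    by_g = sorted(rows, key=lambda r: r[1])  # stand-in for an index on g
    out = []
    for g, group in groupby(by_g, key=lambda r: r[1]):
        matched = [r[1] for r in group if r[0] > k_min]
        if matched:  # groups with no qualifying row are dropped
            out.append((g, sum(matched)))
    return out

print(filter_by_index_then_temp(rows, 4))
print(scan_group_index_with_filter(rows, 4))
# both print [(1, 1), (3, 6), (7, 7)]
```

The first function reads only the qualifying rows but needs the extra table and sort; the second reads every row but needs neither. Which one wins depends on how selective the filter is, which is why the chosen plan changes with the constant in the WHERE clause.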

Practical Tips

Because MySQL does not always pick the best plan, you may need to steer the optimizer with hints such as SQL_BIG_RESULT, SQL_SMALL_RESULT, or FORCE INDEX to get the execution strategy you want.

Overall, the four GROUP BY methods are:

Index‑ordered GROUP BY (fast when an index on the grouping column exists)

External‑sort (filesort) GROUP BY (used when no suitable index exists; usually the better choice when there are many groups)

Temporary‑table GROUP BY (works well while the table of groups stays small; becomes I/O‑heavy when it grows large)

Index‑skip‑scan for MIN()/MAX() (special case for these aggregates when each group has many rows)

Choosing the right method can dramatically reduce query latency, especially on large tables.