Understanding MySQL GROUP BY Execution Strategies
This article explains why MySQL GROUP BY can dominate query cost, describes the four execution methods—index‑ordered, external‑sort (filesort), temporary‑table, and index‑skip‑scan for MIN/MAX—shows how filtering interacts with grouping, and offers practical tips such as using FORCE INDEX to obtain the optimal plan.
In everyday queries the cost of a GROUP BY operation can far exceed that of index look‑ups; in many cases MySQL GROUP BY accounts for more than 90 % of execution time.
The main complexity of GROUP BY is evaluating the aggregate functions for each group. When source rows are not already grouped, MySQL must perform an extra step to collect values belonging to the same group.
1. Index‑Ordered GROUP BY
If the GROUP BY column has an index, MySQL can scan the index in order and compute aggregates on‑the‑fly, which is cheap. This works especially well with LIMIT or a covering index because the ordered index scan is very fast.
mysql> select k, count(*) c from tbl group by k order by k limit 5; +---+---+
| k | c |
+---+---+
| 2 | 3 |
| 4 | 1 |
| 5 | 2 |
| 8 | 1 |
| 9 | 1 |
+---+---+
5 rows in set (0.00 sec)When the index can be used, MySQL reports type: index and Extra: Using index in the EXPLAIN output.
2. External‑Sort (Filesort) GROUP BY
If no suitable index exists, MySQL falls back to an external sort (filesort). Adding the SQL_BIG_RESULT hint forces this plan.
mysql> explain select SQL_BIG_RESULT g, count(*) c from tbl group by g limit 5 \G type: ALL
possible_keys: NULL
Extra: Using filesortThis method scans the whole table, sorts the rows by the GROUP BY column, and then aggregates.
3. Temporary‑Table GROUP BY
When the optimizer cannot use an index or filesort efficiently, it creates a temporary table that stores one row per group and updates it for each incoming row.
mysql> explain select g, sum(g) s from tbl group by g limit 5 \G type: ALL
Extra: Using temporaryIf the temporary table grows large, the plan becomes I/O‑heavy; in such cases the external‑sort plan is usually faster.
4. Index‑Skip‑Scan for MIN()/MAX()
For the special aggregates MIN() and MAX() , MySQL can use a “skip‑scan” if an index on the grouped column exists. The optimizer can jump directly to the smallest or largest indexed value without scanning the whole group.
mysql> select k, max(id) from tbl group by k \G type: range
Extra: Using index for group-byThis optimization is only chosen when each group contains many rows; otherwise the index‑ordered method is preferred.
Filtering and Grouping
The same concepts apply when a WHERE clause is present. MySQL may use an index to filter rows first and then apply a temporary table for grouping, or it may use the index for grouping and discard the WHERE condition during the scan.
mysql> explain select g, sum(g) s from tbl where k>4 group by g limit 5 \G type: range
key: k
Extra: Using index condition; Using temporary; Using filesortDepending on the constants in the WHERE clause, the optimizer may choose either an index‑ordered scan (discarding the WHERE ) or an index‑based filter with a temporary table.
Practical Tips
Because MySQL does not always pick the optimal plan, you may need to hint the optimizer, e.g., SQL_BIG_RESULT , SQL_SMALL_RESULT , or FORCE INDEX , to enforce the desired execution strategy.
Overall, the four GROUP BY methods are:
Index‑ordered GROUP BY (fast when an index on the grouping column exists)
External‑sort (filesort) GROUP BY (used when no suitable index)
Temporary‑table GROUP BY (fallback for large result sets)
Index‑skip‑scan for MIN()/MAX() (special case for these aggregates)
Choosing the right method can dramatically reduce query latency, especially on large tables.
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.