When to Use DISTINCT vs GROUP BY in MySQL: Performance Insights
Both DISTINCT and GROUP BY can use an index when one is available, and in that case they perform similarly. Without an index, DISTINCT is usually faster in MySQL 5.7 and earlier because GROUP BY may trigger an implicit sort. MySQL 8.0 removed that sort, so the two now perform almost identically. GROUP BY still offers clearer semantics and more flexibility for complex queries.
DISTINCT and GROUP BY Overview
Both DISTINCT and GROUP BY can be used to eliminate duplicate rows. DISTINCT applies to all selected columns, while GROUP BY groups by the listed columns and can be combined with HAVING, aggregate functions, and other post‑group processing.
Basic Syntax
SELECT DISTINCT col1, col2 FROM tbl WHERE condition;
SELECT col1, col2 FROM tbl WHERE condition GROUP BY col1, col2;
Examples
mysql> SELECT DISTINCT age FROM student;
+------+
| age  |
+------+
|   10 |
|   12 |
|   11 |
| NULL |
+------+
mysql> SELECT age FROM student GROUP BY age;
+------+
| age  |
+------+
|   10 |
|   12 |
|   11 |
| NULL |
+------+
Index Interaction
When an appropriate index covers the columns used in DISTINCT or GROUP BY, MySQL can resolve the query from the index alone, without a temporary table (EXPLAIN shows Using index for group-by in the Extra column). In that case the execution cost of the two forms is essentially the same.
EXPLAIN SELECT col FROM tbl GROUP BY col;
-- Extra: Using index for group-by
EXPLAIN SELECT DISTINCT col FROM tbl;
-- Extra: Using index for group-by
If no usable index exists, MySQL 5.7 and earlier will sort the rows for GROUP BY (implicit filesort), which may require a temporary table. DISTINCT does not trigger this extra sort, so it is usually faster.
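One way to move a query from the unindexed case into the indexed one is to add an index on the deduplicated column. The sketch below reuses the student table and age column from the examples; the index name idx_student_age is a hypothetical choice, and whether the optimizer actually uses the index depends on the table and the server version.

```sql
-- Hypothetical index name; student and age come from the examples above.
CREATE INDEX idx_student_age ON student (age);

-- Re-check both plans and compare the Extra column of each.
EXPLAIN SELECT DISTINCT age FROM student;
EXPLAIN SELECT age FROM student GROUP BY age;
```

If both plans report the same access method after the index is added, the choice between the two forms comes down to readability rather than speed.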
Implicit Sorting (MySQL 5.7 and earlier)
GROUP BY implicitly sorts the result set by the grouping columns unless the query specifies a different order (ORDER BY NULL suppresses the sort altogether). This can cause:
Creation of a temporary table.
Filesort operation (shown as Using temporary; Using filesort in EXPLAIN).
Potential disk spill when the temporary table exceeds the in‑memory limit.
EXPLAIN SELECT col FROM tbl GROUP BY col;
-- Extra: Using temporary; Using filesort
Reference: https://dev.mysql.com/doc/refman/5.7/en/order-by-optimization.html
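On MySQL 5.7 and earlier, the sort can be suppressed with ORDER BY NULL when the caller does not need ordered output. The following is a minimal sketch using the same placeholder names (tbl, col) as above; a temporary table may still be needed for the grouping itself, so this is expected to remove only the filesort step.

```sql
-- MySQL 5.7 and earlier: skip the implicit GROUP BY sort.
SELECT col FROM tbl GROUP BY col ORDER BY NULL;

-- Compare the Extra column with the plan of the query without ORDER BY NULL.
EXPLAIN SELECT col FROM tbl GROUP BY col ORDER BY NULL;
```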
Behavior in MySQL 8.0
MySQL 8.0 removed the implicit sorting for GROUP BY. When no index can be used, DISTINCT and GROUP BY have nearly identical performance.
Reference: https://dev.mysql.com/doc/refman/8.0/en/order-by-optimization.html
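A practical consequence is that code which relied on GROUP BY returning sorted rows needs an explicit ORDER BY on MySQL 8.0. A minimal sketch, reusing the student table from the examples:

```sql
-- MySQL 8.0: GROUP BY no longer guarantees any row order,
-- so request the order explicitly when it matters.
SELECT age
FROM student
GROUP BY age
ORDER BY age;
```

The same ORDER BY works with SELECT DISTINCT, so the two forms stay interchangeable for plain deduplication.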
When to Prefer GROUP BY
Clearer intent when the goal is to group rows rather than merely deduplicate.
Allows post-group processing such as HAVING filters, aggregate functions (SUM, COUNT, etc.), and expressions that reference grouped columns.
Can group on a subset of the selected columns while aggregating the rest; DISTINCT always applies to every selected column.
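The points above can be illustrated with a query that DISTINCT alone cannot express. This sketch reuses the student table and age column from the examples; the alias num_students is only illustrative.

```sql
-- Ages shared by more than one student, with the head count per age.
SELECT age, COUNT(*) AS num_students
FROM student
GROUP BY age
HAVING COUNT(*) > 1;
```

Here the grouping is on age only, while COUNT(*) summarizes the rows inside each group and HAVING filters the groups after aggregation.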
Performance Summary
With a suitable index, GROUP BY and DISTINCT have comparable execution time because both can leverage the index.
Without an index, DISTINCT is typically faster on MySQL 5.7 because GROUP BY may incur an implicit sort (filesort) and temporary table.
On MySQL 8.0 the implicit sort was removed, so the performance gap disappears; both constructs behave similarly when no index is available.
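These claims are straightforward to check against your own data. The sketch below assumes MySQL 8.0.18 or later, where EXPLAIN ANALYZE is available; on older servers plain EXPLAIN shows the plan without measured timings. Table and column names are taken from the examples.

```sql
-- Run both forms and compare the reported plans and actual timings.
EXPLAIN ANALYZE SELECT DISTINCT age FROM student;
EXPLAIN ANALYZE SELECT age FROM student GROUP BY age;
```

EXPLAIN ANALYZE executes the statement, so it is best run against a test copy when the table is large.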
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
