When to Use DISTINCT vs GROUP BY in MySQL: Performance Insights
This article explains how DISTINCT and GROUP BY differ in MySQL, comparing their index usage, execution speed with and without indexes, the impact of implicit sorting in older versions, and why GROUP BY is often preferred for clearer semantics and advanced aggregations.
In MySQL, when the query semantics are the same and an index is available, both GROUP BY and DISTINCT can use the index and perform comparably. Without a usable index, DISTINCT is usually faster in MySQL 5.7 and earlier, because GROUP BY sorts its result implicitly there, which can trigger a filesort.
DISTINCT Usage
The basic syntax is:
SELECT DISTINCT columns FROM table_name WHERE where_conditions;
Example:
mysql> SELECT DISTINCT age FROM student;
+------+
| age |
+------+
| 10 |
| 12 |
| 11 |
| NULL |
+------+
4 rows in set (0.01 sec)
DISTINCT returns unique values; when a column contains NULL, MySQL keeps a single NULL row because DISTINCT treats all NULL values as equal to one another.
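To try the queries in this article, a table along the following lines is enough. This is only a sketch: the column types and the individual rows are assumptions chosen to reproduce the distinct values shown above, not the author's actual schema. The last query illustrates a related point: SELECT DISTINCT keeps one NULL row, whereas COUNT(DISTINCT ...) skips NULL values entirely.

```sql
-- Hypothetical schema and rows; the article does not show the real table definition.
CREATE TABLE student (
    id   INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(50) NOT NULL,
    sex  VARCHAR(10) NOT NULL,
    age  INT NULL            -- nullable so that NULL handling can be observed
);

INSERT INTO student (name, sex, age) VALUES
    ('s1', 'male',   10),
    ('s2', 'female', 12),
    ('s3', 'male',   11),
    ('s4', 'male',   NULL),
    ('s5', 'female', 11),
    ('s6', 'male',   10),
    ('s7', 'male',   NULL);

-- One row per distinct value, including a single row for NULL.
SELECT DISTINCT age FROM student;

-- COUNT(DISTINCT ...) ignores NULL, so with the rows above only 10, 11 and 12 are counted.
SELECT COUNT(DISTINCT age) AS distinct_non_null_ages FROM student;
```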
For multiple columns, DISTINCT removes rows only when all specified columns are identical:
SELECT DISTINCT column1, column2 FROM table_name WHERE where_conditions;
mysql> SELECT DISTINCT sex, age FROM student;
+--------+------+
| sex | age |
+--------+------+
| male | 10 |
| female | 12 |
| male | 11 |
| male | NULL |
| female | 11 |
+--------+------+
5 rows in set (0.02 sec)
GROUP BY Usage
Single‑column grouping:
SELECT columns FROM table_name WHERE where_conditions GROUP BY columns;
mysql> SELECT age FROM student GROUP BY age;
+------+
| age |
+------+
| 10 |
| 12 |
| 11 |
| NULL |
+------+
4 rows in set (0.02 sec)
Multi‑column grouping:
SELECT columns FROM table_name WHERE where_conditions GROUP BY column1, column2;
mysql> SELECT sex, age FROM student GROUP BY sex, age;
+--------+------+
| sex | age |
+--------+------+
| male | 10 |
| female | 12 |
| male | 11 |
| male | NULL |
| female | 11 |
+--------+------+
5 rows in set (0.03 sec)
The results match those of DISTINCT, but the mechanics differ: GROUP BY first collects rows into groups on the grouping columns and then returns one row per group, and in MySQL 5.7 and earlier this also involves an implicit sort on the grouping columns.
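Because each output row stands for a whole group, GROUP BY can also compute per-group values, which DISTINCT cannot express. A minimal sketch against the student table used in the examples above (the column alias is illustrative):

```sql
-- One row per age, plus how many students share that age.
SELECT age, COUNT(*) AS students
FROM student
GROUP BY age;

-- HAVING filters whole groups after aggregation:
-- keep only the ages shared by more than one student.
SELECT age, COUNT(*) AS students
FROM student
GROUP BY age
HAVING COUNT(*) > 1;
```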
Implicit Sorting and MySQL Versions
In MySQL 5.7 and earlier, GROUP BY implicitly sorts the result set when no ASC / DESC is specified, which can cause a filesort and degrade performance, especially when the optimizer cannot use an index for the sort.
The MySQL 5.7 Reference Manual describes the behavior this way: GROUP BY implicitly sorts by default (that is, in the absence of ASC or DESC designators for GROUP BY columns). However, relying on this implicit sorting is deprecated; to produce a given sort order, provide an explicit ORDER BY clause.
MySQL 8.0 removed this implicit sorting, so GROUP BY no longer triggers a filesort in the same situations, making its performance comparable to DISTINCT even without an index.
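One way to check which case applies on a given server is to compare execution plans. The sketch below assumes the student table from the earlier examples with no index on age. The exact contents of the Extra column vary with version and data, so the comments describe what is commonly seen rather than guaranteed output.

```sql
-- On MySQL 5.7 and earlier, Extra for the GROUP BY query commonly includes
-- "Using temporary; Using filesort" when age is not indexed.
EXPLAIN SELECT age FROM student GROUP BY age;

-- The DISTINCT form typically needs only a temporary table, with no filesort.
EXPLAIN SELECT DISTINCT age FROM student;

-- On 5.7 and earlier, ORDER BY NULL suppresses the implicit sort
-- when the order of the groups does not matter.
EXPLAIN SELECT age FROM student GROUP BY age ORDER BY NULL;

-- When order does matter, state it explicitly rather than relying on GROUP BY.
SELECT age FROM student GROUP BY age ORDER BY age;
```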
Practical Recommendations
Use GROUP BY when you need clear grouping semantics, the ability to apply HAVING filters, or aggregate functions.
Use DISTINCT for a simple deduplication of all selected columns.
When an appropriate index exists, both constructs can use it and perform comparably.
When no index is available, DISTINCT may be faster on MySQL versions prior to 8.0, where GROUP BY still sorts implicitly; from 8.0 onward the implicit sort is gone and the two perform comparably.
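To see the indexed case, one can add an index on the deduplicated column and inspect the plans again. The index name idx_student_age is made up for illustration, and the student table is the one from the earlier examples; what the optimizer actually chooses depends on table size and statistics.

```sql
-- Hypothetical index name; any index whose leading column is age serves the same purpose.
CREATE INDEX idx_student_age ON student (age);

-- With the index in place, both forms can read the distinct values from the index.
-- Extra may then show "Using index" or "Using index for group-by"
-- instead of "Using temporary".
EXPLAIN SELECT DISTINCT age FROM student;
EXPLAIN SELECT age FROM student GROUP BY age;
```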
Java High-Performance Architecture
