When to Use DISTINCT vs GROUP BY in MySQL: Performance Insights
This article compares MySQL's DISTINCT and GROUP BY clauses, explaining how index availability, implicit sorting, and MySQL version differences affect their execution speed, and provides practical guidance on choosing the right clause for optimal query performance.
Conclusion
When the semantics are identical and a usable index exists, GROUP BY and DISTINCT can both use the index and perform the same. When the semantics are identical but no index is available, DISTINCT can be faster than GROUP BY on versions before MySQL 8.0, because GROUP BY there sorts its result implicitly, which may trigger a filesort operation.
DISTINCT Usage
Basic syntax:
SELECT DISTINCT columns FROM table_name WHERE where_conditions;
Example:
mysql> select distinct age from student;
+------+
| age  |
+------+
|   10 |
|   12 |
|   11 |
| NULL |
+------+
4 rows in set (0.01 sec)
The DISTINCT keyword returns unique values. If the column contains NULL values, MySQL keeps a single NULL in the result and removes the others, because DISTINCT treats all NULL values as duplicates of one another.
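The article does not show how the student table is defined. To try the queries yourself, a table along the following lines would do. This is only a sketch: the column types, the id and name columns, and the sample rows are assumptions chosen so that duplicates and NULL ages are present.

```sql
-- Hypothetical schema: the real definition of `student` is not given in the article.
CREATE TABLE student (
    id   INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(32) NOT NULL,
    sex  VARCHAR(8)  NOT NULL,
    age  INT NULL
);

-- Sample rows (assumed), including repeated (sex, age) pairs and NULL ages.
INSERT INTO student (name, sex, age) VALUES
    ('s1', 'male',   10),
    ('s2', 'female', 12),
    ('s3', 'male',   11),
    ('s4', 'male',   NULL),
    ('s5', 'female', 11),
    ('s6', 'male',   10),
    ('s7', 'female', 12),
    ('s8', 'male',   NULL);

-- Duplicates collapse, and the two NULL ages collapse into a single NULL row.
SELECT DISTINCT age FROM student;
```

With these rows the query returns four distinct ages, one of them NULL. The order of the rows is not guaranteed without an ORDER BY clause.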
Multi‑column DISTINCT
SELECT DISTINCT column1, column2 FROM table_name WHERE where_conditions;
mysql> select distinct sex,age from student;
+--------+------+
| sex    | age  |
+--------+------+
| male   |   10 |
| female |   12 |
| male   |   11 |
| male   | NULL |
| female |   11 |
+--------+------+
5 rows in set (0.02 sec)
GROUP BY Usage
Single‑column GROUP BY
SELECT columns FROM table_name WHERE where_conditions GROUP BY columns;
mysql> select age from student group by age;
+------+
| age  |
+------+
|   10 |
|   12 |
|   11 |
| NULL |
+------+
4 rows in set (0.02 sec)
Multi‑column GROUP BY
SELECT columns FROM table_name WHERE where_conditions GROUP BY column1, column2;
mysql> select sex,age from student group by sex,age;
+--------+------+
| sex    | age  |
+--------+------+
| male   |   10 |
| female |   12 |
| male   |   11 |
| male   | NULL |
| female |   11 |
+--------+------+
5 rows in set (0.03 sec)
Difference Between DISTINCT and GROUP BY
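Both clauses are implemented as grouping operations, and both can be resolved from an index (for example, the Extra column of EXPLAIN may show "Using index for group-by" for either one). When a query only needs unique rows and uses no aggregate functions, the two are usually interchangeable.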
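One way to check whether the two forms get the same plan is to compare their EXPLAIN output. The statements below are a minimal sketch that assumes the student table used in the examples. The index name idx_student_age is hypothetical.

```sql
-- Hypothetical index on the grouped column; the name is an assumption.
CREATE INDEX idx_student_age ON student (age);

-- Compare the plans: look at the key and Extra columns of each result.
EXPLAIN SELECT DISTINCT age FROM student;
EXPLAIN SELECT age FROM student GROUP BY age;
```

If the optimizer picks the index for both statements, the Extra column typically reports "Using index" or "Using index for group-by". The exact plan depends on the MySQL version, table size, and statistics, so treat the output on your own server as the authority.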
However, before MySQL 8.0, GROUP BY performed an implicit sort when no index could satisfy the ordering, which caused a filesort and reduced performance. DISTINCT does not trigger this extra sort, so it can be faster on tables without suitable indexes.
From MySQL 8.0 onward, implicit sorting by GROUP BY was removed, making its performance comparable to DISTINCT when no index is present.
Implicit Sorting
The MySQL 5.7 documentation describes the old behavior as follows: GROUP BY implicitly sorts by default (that is, in the absence of ASC or DESC designators for GROUP BY columns). However, relying on implicit GROUP BY sorting or explicit sorting for GROUP BY is deprecated. To produce a given sort order, provide an ORDER BY clause.
Because implicit sorting can cause temporary tables and filesort, MySQL 8.0 eliminated this behavior, simplifying optimization.
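In practice this means stating the ordering explicitly whenever it matters. The sketch below assumes the same student table. The second statement uses ORDER BY NULL, which the MySQL 5.7 manual documents as a way to suppress the implicit GROUP BY sort.

```sql
-- Works the same on every version: ask for the ordering explicitly.
SELECT age FROM student GROUP BY age ORDER BY age;

-- MySQL 5.7 and earlier: skip the implicit sort when the order is irrelevant.
SELECT age FROM student GROUP BY age ORDER BY NULL;
```

On MySQL 8.0 and later the second form is still valid but no longer needed, since GROUP BY does not sort on its own.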
Why Prefer GROUP BY
GROUP BY expresses the intent of grouping more clearly. It also supports more complex processing, such as aggregate functions and HAVING filters on each group.
While DISTINCT applies to all selected columns, GROUP BY offers finer control and flexibility for advanced queries.
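As an illustration of that extra flexibility, the following sketch counts students per age and keeps only the ages that occur more than once. It assumes the student table from the earlier examples. The alias student_count is chosen here for readability.

```sql
-- Count students per age and keep only ages shared by more than one student.
SELECT age, COUNT(*) AS student_count
FROM student
GROUP BY age
HAVING COUNT(*) > 1;
```

A plain SELECT DISTINCT age cannot express this, because it only removes duplicate rows and has no notion of a per-group count or filter.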
