How to Remove Duplicate Rows in SQL: DISTINCT, GROUP BY, and ROW_NUMBER Explained
This article demonstrates three SQL techniques—DISTINCT, GROUP BY, and the ROW_NUMBER window function—for deduplicating records and counting unique tasks, comparing their syntax, performance, and behavior across MySQL, Hive, and Oracle environments.
Using DISTINCT for Deduplication
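In MySQL, SELECT DISTINCT column returns the unique values of that column, and COUNT(DISTINCT column) counts them, which gives the number of distinct tasks. DISTINCT applies to the whole select list, however, so when several columns are selected it deduplicates on the combination of columns rather than on any single one. It can also be slower than the alternatives on large tables, depending on the engine and the available indexes.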
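One detail worth checking on real data is how NULL is treated. The following is a minimal sketch, not the article's own code: it assumes a small hypothetical Task table and uses Python's built-in sqlite3 module as a stand-in engine, so behavior on MySQL, Hive, or Oracle should be confirmed separately.

```python
import sqlite3

# Hypothetical sample data; SQLite is only a convenient stand-in engine here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Task (task_id INTEGER, start_time TEXT)")
conn.executemany(
    "INSERT INTO Task VALUES (?, ?)",
    [(1, "2024-01-01"), (1, "2024-01-02"), (2, "2024-01-01"), (None, "2024-01-03")],
)

# SELECT DISTINCT keeps NULL as one of the distinct values.
distinct_ids = [row[0] for row in conn.execute("SELECT DISTINCT task_id FROM Task")]

# COUNT(DISTINCT ...) ignores NULL, so it can be smaller than the list above.
task_num = conn.execute("SELECT COUNT(DISTINCT task_id) FROM Task").fetchone()[0]

print(len(distinct_ids), task_num)
```

With this data the list of distinct values has three entries (1, 2, and NULL) while the count is 2, so the two queries answer slightly different questions whenever task_id can be NULL.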
-- List all unique task_id values
SELECT DISTINCT task_id FROM Task;
-- Count distinct task_id
SELECT COUNT(DISTINCT task_id) AS task_num FROM Task;
Using GROUP BY for Deduplication
GROUP BY groups rows by the specified columns, so selecting only the grouped column(s) yields one row per distinct value, with NULL forming its own group. Counting those groups requires wrapping the query in a subquery.
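As a rough check of how the GROUP BY form compares with COUNT(DISTINCT), the sketch below runs both against a hypothetical Task table. The data and the use of Python's sqlite3 module are assumptions made for illustration; the SQL itself follows the queries in this section.

```python
import sqlite3

# Hypothetical sample data, including one NULL task_id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Task (task_id INTEGER)")
conn.executemany("INSERT INTO Task VALUES (?)", [(1,), (1,), (2,), (None,)])

# GROUP BY returns one row per group, and NULL forms a group of its own.
groups = conn.execute("SELECT task_id FROM Task GROUP BY task_id").fetchall()

# COUNT(task_id) over the grouped subquery skips the NULL group.
via_group_by = conn.execute(
    "SELECT COUNT(task_id) FROM (SELECT task_id FROM Task GROUP BY task_id) tmp"
).fetchone()[0]

# COUNT(*) over the same subquery counts the NULL group as well.
all_groups = conn.execute(
    "SELECT COUNT(*) FROM (SELECT task_id FROM Task GROUP BY task_id) tmp"
).fetchone()[0]

via_distinct = conn.execute("SELECT COUNT(DISTINCT task_id) FROM Task").fetchone()[0]

print(len(groups), via_group_by, all_groups, via_distinct)
```

Here COUNT(task_id) over the subquery agrees with COUNT(DISTINCT task_id), while COUNT(*) is one higher because it includes the NULL group.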
-- List unique task_id (including NULL)
SELECT task_id FROM Task GROUP BY task_id;
-- Count distinct task_id
SELECT COUNT(task_id) AS task_num FROM (
    SELECT task_id FROM Task GROUP BY task_id
) tmp;
Using ROW_NUMBER Window Function
When the SQL engine supports window functions (e.g., Hive, Oracle, and MySQL from version 8.0), ROW_NUMBER() assigns a sequential number to the rows within each partition, in the order given by ORDER BY. Keeping only the rows where the row number equals 1 leaves one row per partition key, which deduplicates the dataset.
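To show the "keep only row number 1" step returning the rows themselves rather than a count, here is a minimal sketch. It assumes a hypothetical Task table and runs on Python's sqlite3 module, which needs an underlying SQLite of version 3.25 or later for window functions; it is an illustration of the idea, not the article's own query.

```python
import sqlite3

# Hypothetical sample data: task 1 appears twice with different start times.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Task (task_id INTEGER, start_time TEXT)")
conn.executemany(
    "INSERT INTO Task VALUES (?, ?)",
    [(1, "2024-01-02"), (1, "2024-01-01"), (2, "2024-01-03")],
)

# Number the rows inside each task_id partition by start_time,
# then keep only the first row of every partition.
dedup_sql = """
SELECT task_id, start_time
FROM (
    SELECT task_id,
           start_time,
           ROW_NUMBER() OVER (PARTITION BY task_id ORDER BY start_time) AS rn
    FROM Task
) tmp
WHERE rn = 1
ORDER BY task_id
"""
first_rows = conn.execute(dedup_sql).fetchall()
print(first_rows)
```

Unlike DISTINCT or GROUP BY on task_id alone, this keeps the other columns of the surviving row, and the ORDER BY inside the window decides which duplicate survives (here, the earliest start_time).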
SELECT COUNT(CASE WHEN rn = 1 THEN task_id ELSE NULL END) AS task_num
FROM (
    SELECT task_id,
           ROW_NUMBER() OVER (PARTITION BY task_id ORDER BY start_time) AS rn
    FROM Task
) tmp;
Additional examples with a test table illustrate how DISTINCT and GROUP BY behave when selecting one or multiple columns.
SELECT DISTINCT user_id FROM Test; -- returns 1, 2
SELECT DISTINCT user_id, user_type FROM Test; -- returns 1,1; 1,2; 2,1
SELECT user_id FROM Test GROUP BY user_id; -- returns 1, 2
SELECT user_id, user_type FROM Test GROUP BY user_id, user_type; -- returns 1,1; 1,2; 2,1
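The article does not show the contents of Test. The sketch below assumes four hypothetical rows chosen so that the results match the comments above, and runs the four queries through Python's sqlite3 module. An ORDER BY is added to each query only to make the output order deterministic.

```python
import sqlite3

# Assumed contents of Test: (user_id, user_type), with one exact duplicate row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Test (user_id INTEGER, user_type INTEGER)")
conn.executemany("INSERT INTO Test VALUES (?, ?)", [(1, 1), (1, 1), (1, 2), (2, 1)])

def run(sql):
    """Run a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# Single column: DISTINCT and GROUP BY both collapse to the unique user_id values.
distinct_one = run("SELECT DISTINCT user_id FROM Test ORDER BY user_id")
group_one = run("SELECT user_id FROM Test GROUP BY user_id ORDER BY user_id")

# Two columns: both deduplicate on the (user_id, user_type) combination.
distinct_two = run(
    "SELECT DISTINCT user_id, user_type FROM Test ORDER BY user_id, user_type"
)
group_two = run(
    "SELECT user_id, user_type FROM Test "
    "GROUP BY user_id, user_type ORDER BY user_id, user_type"
)

print(distinct_one, distinct_two)
```

Note that user_id 1 still appears twice in the two-column results, because the pairs (1, 1) and (1, 2) are different combinations.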
Java Architect Essentials
