Where Does GROUP BY Fall Short? Using Window Functions to Bridge Advanced Data Analysis Gaps
This article reviews MySQL's GROUP BY fundamentals, highlights its limitations and execution order, then demonstrates how window functions—including ROW_NUMBER, RANK, DENSE_RANK, cumulative totals, moving averages, and TOP N techniques—can overcome those gaps for richer analytical queries.
GROUP BY Basics and HAVING
GROUP BY groups rows and applies aggregate functions such as SUM, COUNT, and AVG. Basic syntax:
SELECT col1, col2, AGG_FUNC(col3)
FROM table_name
GROUP BY col1, col2;Multi‑level grouping can compute department‑wise statistics, e.g., employee count, average salary, and total salary, ordered by average salary descending.
HAVING for Post‑Group Filtering
HAVING filters grouped results after aggregation. Example:
SELECT department,
COUNT(*) AS employee_count,
AVG(salary) AS avg_salary,
MIN(salary) AS min_salary,
MAX(salary) AS max_salary
FROM employees
WHERE hire_date > '2020-01-01'
GROUP BY department
HAVING avg_salary > 5000
AND employee_count > 3
AND MAX(salary) < 20000
ORDER BY avg_salary DESC;Execution order: WHERE → GROUP BY → aggregate functions → HAVING.
Window Functions Overview
Window functions compute over a set of rows without collapsing them, preserving the original row count. Basic syntax:
function_name() OVER (
[PARTITION BY col1, col2, ...]
[ORDER BY col1 [ASC|DESC], col2, ...]
[frame_clause]
);Ranking Functions Comparison
ROW_NUMBER()
Assigns a unique sequential number within each partition, ordered by a specified column.
SELECT employee_id, name, department, salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
FROM employees;RANK()
Gives the same rank to ties and skips subsequent ranks.
SELECT employee_id, name, department, salary,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
FROM employees;DENSE_RANK()
Gives the same rank to ties but does not skip ranks.
SELECT employee_id, name, department, salary,
DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
FROM employees;Comparative Example
SELECT employee_id, name, salary,
ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num,
RANK() OVER (ORDER BY salary DESC) AS rank_num,
DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank_num
FROM employees
WHERE department = 'IT'
ORDER BY salary DESC;Result shows how the three functions differ for identical salary values.
Cumulative Calculations and Moving Averages
Running Total
SELECT salesperson_id, sale_month, monthly_amount,
SUM(monthly_amount) OVER (
PARTITION BY salesperson_id
ORDER BY sale_month
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS cumulative_amount
FROM monthly_sales
ORDER BY salesperson_id, sale_month;The window frame accumulates sales from the first month up to the current row.
7‑Day Moving Average
SELECT sale_date, daily_amount,
AVG(daily_amount) OVER (
ORDER BY sale_date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS moving_avg_7days,
AVG(daily_amount) OVER (
ORDER BY sale_date
RANGE BETWEEN INTERVAL 6 DAY PRECEDING AND CURRENT ROW
) AS moving_avg_7days_range
FROM daily_sales
ORDER BY sale_date; ROWScounts rows; RANGE uses actual date intervals.
Window Frame Variants
SELECT sale_date, daily_amount,
AVG(daily_amount) OVER (ORDER BY sale_date ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS avg_3days,
SUM(daily_amount) OVER (ORDER BY sale_date ROWS UNBOUNDED PRECEDING) AS cumulative_total,
MIN(daily_amount) OVER (ORDER BY sale_date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS min_last_3days
FROM daily_sales;Solving TOP N Within Groups
ROW_NUMBER for Strict TOP N
WITH ranked_employees AS (
SELECT employee_id, name, department, salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
FROM employees
)
SELECT *
FROM ranked_employees
WHERE salary_rank <= 3
ORDER BY department, salary_rank;DENSE_RANK for Ties
WITH ranked_employees AS (
SELECT employee_id, name, department, salary,
DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
FROM employees
)
SELECT *
FROM ranked_employees
WHERE salary_rank <= 3
ORDER BY department, salary_rank, employee_id;Complex TOP N: Monthly Top‑3 Products
WITH monthly_product_sales AS (
SELECT DATE_FORMAT(order_date, '%Y-%m') AS sale_month,
product_id, product_name,
SUM(amount) AS monthly_amount,
COUNT(DISTINCT order_id) AS order_count,
ROW_NUMBER() OVER (
PARTITION BY DATE_FORMAT(order_date, '%Y-%m')
ORDER BY SUM(amount) DESC
) AS monthly_rank
FROM sales
WHERE order_date >= '2023-01-01'
GROUP BY DATE_FORMAT(order_date, '%Y-%m'), product_id, product_name
)
SELECT sale_month, product_id, product_name, monthly_amount, order_count, monthly_rank
FROM monthly_product_sales
WHERE monthly_rank <= 3
ORDER BY sale_month DESC, monthly_rank;Advanced: Highest and Lowest Salaries per Department
WITH extreme_salaries AS (
SELECT employee_id, name, department, salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS highest_rank,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary ASC) AS lowest_rank
FROM employees
WHERE salary IS NOT NULL
)
SELECT employee_id, name, department, salary,
CASE WHEN highest_rank <= 2 THEN '最高工资'
WHEN lowest_rank <= 2 THEN '最低工资'
END AS salary_type
FROM extreme_salaries
WHERE highest_rank <= 2 OR lowest_rank <= 2
ORDER BY department, salary_type, salary DESC;Key Takeaways
Execution Order : FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY.
When to Use GROUP BY vs. Window Functions : GROUP BY reduces rows; window functions keep the original row count for richer analysis.
Choosing Ranking Functions : ROW_NUMBER for unique sequencing, RANK for gaps on ties, DENSE_RANK for continuous ties.
Performance Consideration : Use appropriate indexes to avoid full‑table scans.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Senior Xiao Ying
Dedicated to sharing Java backend technical experience and original tutorials, offering career transition advice and resume editing. Recognized as a rising star in CSDN's Java backend community and ranked Top 3 in the 2022 New Star Program for Java backend.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
