Using MySQL Window Functions for Ranking, Aggregation, and Data Analysis
This article explains how MySQL 8.x window functions such as OVER, PARTITION BY, and ORDER BY can simplify complex ranking and aggregation queries, demonstrates creating a sample scores table, and provides practical examples of functions like ROW_NUMBER, RANK, DENSE_RANK, NTILE, LAG, and LEAD with their results.
First, a complex SQL query that calculates a score ranking is shown, which is hard to read and maintain.
SELECT
s1.name,
s1.subject,
s1.score,
sub.avg_score AS average_score_per_subject,
(SELECT COUNT(DISTINCT s2.score) + 1 FROM scores s2 WHERE s2.score > s1.score) AS score_rank
FROM scores s1
JOIN (
SELECT subject, AVG(score) AS avg_score
FROM scores
GROUP BY subject
) sub ON s1.subject = sub.subject
ORDER BY s1.score DESC;The purpose of this query is to compute a ranking for each student's score, which is unnecessarily complicated.
A simplified version using window functions is presented:
SELECT
name,
subject,
score,
AVG(score) OVER (PARTITION BY subject) AS average_score_per_subject,
RANK() OVER (ORDER BY score DESC) AS score_rank
FROM scores
ORDER BY score DESC;This version is much clearer.
What are Window Functions
MySQL 8.x introduces window functions, which perform calculations over a defined window of rows in the result set. Before MySQL 8.0, similar tasks required nested subqueries and joins, leading to unreadable "SQL as a piece of ancient script".
Window functions are mainly used for statistical and analytical calculations, offering better performance than traditional subqueries.
OVER() Clause
The OVER() clause defines the window for a function. It must be combined with a function such as SUM, AVG, ROW_NUMBER, etc., and specifies the range and ordering of rows.
function_name(...) OVER (
[PARTITION BY expr_list]
[ORDER BY expr_list]
[range]
)PARTITION BY
Specifies the column(s) used to partition the data. Each partition is processed independently.
ORDER BY
Orders rows within each partition, which can be combined with range or rows windows.
Range and Row Windows
Define the type of window: RANGE uses logical value ranges, ROWS uses physical row offsets.
Range Window Example
SUM(salary) OVER (ORDER BY id
RANGE BETWEEN 5 PRECEDING AND 5 FOLLOWING)This calculates the sum of salary for the current row plus the five preceding and five following rows.
Row Window Example
SUM(salary) OVER (ORDER BY id
ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING)This calculates the sum of salary for the current row plus the two preceding and two following rows.
Combining with Aggregate Functions
1. Partition by subject and get the maximum score per subject:
SELECT subject,score, MAX(score) OVER (PARTITION BY subject) as `subject_max_score`
FROM scores;Result:
subject
score
subject_max_score
化学
75
75
化学
58
75
数学
68
90
数学
90
90
数学
45
90
物理
87
87
物理
79
87
生物
92
92
英语
91
91
2. Count the number of students per subject:
SELECT subject,score, count(name) OVER (PARTITION BY subject) as `subject_participants`
FROM scores;Result:
subject
score
subject_participants
化学
75
2
化学
58
2
数学
68
3
数学
90
3
数学
45
3
物理
87
2
物理
79
2
生物
92
1
英语
91
1
3. Total score per subject:
SELECT subject, SUM(score) OVER (PARTITION BY subject) as `subject_total`
FROM scores;Result:
subject
subject_total
化学
133
数学
203
物理
166
生物
92
英语
91
4. Cumulative sum ordered by score:
SELECT name,subject,score, SUM(score) OVER (order BY score) as `cumulative_score`
FROM scores;Result (partial):
name
subject
score
cumulative_score
Student9
数学
45
45
Student6
化学
58
103
Student4
数学
68
171
5. Cumulative sum with a sliding window (current row plus one preceding and one following row):
SELECT name,subject,score, SUM(score) OVER (order BY `score` ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) as `cumulative_score`
FROM scores;Result (partial):
name
subject
score
cumulative_score
Student9
数学
45
103
Student6
化学
58
171
Student4
数学
68
201
Combining with Ranking Functions
ROW_NUMBER()
Assigns a unique sequential number to each row based on the specified order.
SELECT name,subject,score, ROW_NUMBER() OVER (order BY `score` desc) as `rank`
FROM scores;Result (partial):
name
subject
score
rank
Student2
生物
92
1
Student5
英语
91
2
Student8
数学
90
3
In MySQL 5.7, achieving the same result requires a self‑join and GROUP BY, which is far more complex.
RANK()
Provides ranking with gaps when there are ties (e.g., two rows share rank 1, the next rank is 3).
SELECT name,subject,score, RANK() OVER (order BY `score` desc) as `rank`
FROM scores;Result (partial):
name
subject
score
rank
Student1
化学
92
1
Student2
生物
92
1
Student5
英语
91
3
DENSE_RANK()
Similar to RANK() but without gaps; after a tie, the next rank increments by 1.
SELECT name,subject,score, DENSE_RANK() OVER (order BY `score` desc) as `rank`
FROM scores;Result (partial):
name
subject
score
rank
Student1
化学
92
1
Student2
生物
92
1
Student5
英语
91
2
NTILE()
Divides the result set into a specified number of buckets and assigns a bucket number to each row.
SELECT name,subject,score, NTILE(4) OVER (order BY `score` desc) as `group`
FROM scores;Result (partial):
name
subject
score
group
Student1
化学
92
1
Student2
生物
92
1
Student5
英语
91
1
Student8
数学
90
2
LAG()
Accesses data from a preceding row within the same result set, useful for comparing adjacent rows.
SELECT name,subject,score,
ABS(score - LAG(score, 1, score) OVER (ORDER BY score DESC)) AS `score_diff`
FROM scores;Result (partial):
name
subject
score
score_diff
Student1
化学
92
0
Student2
生物
92
0
Student5
英语
91
1
LEAD()
Similar to LAG() but looks forward to following rows.
SELECT name,subject,score,
score - LEAD(score, 1, score) OVER (ORDER BY score DESC) AS `score_diff`
FROM scores;Result (partial):
name
subject
score
score_diff
Student1
化学
92
0
Student2
生物
92
1
Student5
英语
91
1
The article also includes a sample CREATE TABLE statement for the scores table and several INSERT statements that populate it with example data.
CREATE TABLE `scores` (
`name` varchar(20) COLLATE utf8_bin NOT NULL,
`subject` varchar(20) COLLATE utf8_bin NOT NULL,
`score` int(3) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin; INSERT INTO scores (name, subject, score) VALUES ('Student1', '化学', 75);
INSERT INTO scores (name, subject, score) VALUES ('Student2', '生物', 92);
INSERT INTO scores (name, subject, score) VALUES ('Student3', '物理', 87);
INSERT INTO scores (name, subject, score) VALUES ('Student4', '数学', 68);
INSERT INTO scores (name, subject, score) VALUES ('Student5', '英语', 91);
INSERT INTO scores (name, subject, score) VALUES ('Student6', '化学', 58);
INSERT INTO scores (name, subject, score) VALUES ('Student7', '物理', 79);
INSERT INTO scores (name, subject, score) VALUES ('Student8', '数学', 90);
INSERT INTO scores (name, subject, score) VALUES ('Student9', '数学', 45);Overall, the article demonstrates how MySQL window functions can replace cumbersome subqueries, improve readability, and boost performance for common analytical tasks such as ranking, cumulative totals, and comparative calculations.
IT Services Circle
Delivering cutting-edge internet insights and practical learning resources. We're a passionate and principled IT media platform.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.