Databases 18 min read

Using MySQL Window Functions for Ranking, Aggregation, and Data Analysis

This article explains how MySQL 8.x window functions such as OVER, PARTITION BY, and ORDER BY can simplify complex ranking and aggregation queries, demonstrates creating a sample scores table, and provides practical examples of functions like ROW_NUMBER, RANK, DENSE_RANK, NTILE, LAG, and LEAD with their results.

IT Services Circle
IT Services Circle
IT Services Circle
Using MySQL Window Functions for Ranking, Aggregation, and Data Analysis

First, a complex SQL query that calculates a score ranking is shown, which is hard to read and maintain.

SELECT
  s1.name,
  s1.subject,
  s1.score,
  sub.avg_score AS average_score_per_subject,
  (SELECT COUNT(DISTINCT s2.score) + 1 FROM scores s2 WHERE s2.score > s1.score) AS score_rank
FROM scores s1
JOIN (
  SELECT subject, AVG(score) AS avg_score
  FROM scores
  GROUP BY subject
) sub ON s1.subject = sub.subject
ORDER BY s1.score DESC;

The purpose of this query is to compute a ranking for each student's score, which is unnecessarily complicated.

A simplified version using window functions is presented:

SELECT
  name,
  subject,
  score,
  AVG(score) OVER (PARTITION BY subject) AS average_score_per_subject,
  RANK() OVER (ORDER BY score DESC) AS score_rank
FROM scores
ORDER BY score DESC;

This version is much clearer.

What are Window Functions

MySQL 8.x introduces window functions, which perform calculations over a defined window of rows in the result set. Before MySQL 8.0, similar tasks required nested subqueries and joins, leading to unreadable "SQL as a piece of ancient script".

Window functions are mainly used for statistical and analytical calculations, offering better performance than traditional subqueries.

OVER() Clause

The OVER() clause defines the window for a function. It must be combined with a function such as SUM, AVG, ROW_NUMBER, etc., and specifies the range and ordering of rows.

function_name(...) OVER (
    [PARTITION BY expr_list]
    [ORDER BY expr_list]
    [range]
)

PARTITION BY

Specifies the column(s) used to partition the data. Each partition is processed independently.

ORDER BY

Orders rows within each partition, which can be combined with range or rows windows.

Range and Row Windows

Define the type of window: RANGE uses logical value ranges, ROWS uses physical row offsets.

Range Window Example

SUM(salary) OVER (ORDER BY id
                   RANGE BETWEEN 5 PRECEDING AND 5 FOLLOWING)

This calculates the sum of salary for the current row plus the five preceding and five following rows.

Row Window Example

SUM(salary) OVER (ORDER BY id
                   ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING)

This calculates the sum of salary for the current row plus the two preceding and two following rows.

Combining with Aggregate Functions

1. Partition by subject and get the maximum score per subject:

SELECT subject,score, MAX(score) OVER (PARTITION BY subject) as `subject_max_score`
FROM scores;

Result:

subject

score

subject_max_score

化学

75

75

化学

58

75

数学

68

90

数学

90

90

数学

45

90

物理

87

87

物理

79

87

生物

92

92

英语

91

91

2. Count the number of students per subject:

SELECT subject,score, count(name) OVER (PARTITION BY subject) as `subject_participants`
FROM scores;

Result:

subject

score

subject_participants

化学

75

2

化学

58

2

数学

68

3

数学

90

3

数学

45

3

物理

87

2

物理

79

2

生物

92

1

英语

91

1

3. Total score per subject:

SELECT subject, SUM(score) OVER (PARTITION BY subject) as `subject_total`
FROM scores;

Result:

subject

subject_total

化学

133

数学

203

物理

166

生物

92

英语

91

4. Cumulative sum ordered by score:

SELECT name,subject,score, SUM(score) OVER (order BY score) as `cumulative_score`
FROM scores;

Result (partial):

name

subject

score

cumulative_score

Student9

数学

45

45

Student6

化学

58

103

Student4

数学

68

171

5. Cumulative sum with a sliding window (current row plus one preceding and one following row):

SELECT name,subject,score, SUM(score) OVER (order BY `score` ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) as `cumulative_score`
FROM scores;

Result (partial):

name

subject

score

cumulative_score

Student9

数学

45

103

Student6

化学

58

171

Student4

数学

68

201

Combining with Ranking Functions

ROW_NUMBER()

Assigns a unique sequential number to each row based on the specified order.

SELECT name,subject,score, ROW_NUMBER() OVER (order BY `score` desc) as `rank`
FROM scores;

Result (partial):

name

subject

score

rank

Student2

生物

92

1

Student5

英语

91

2

Student8

数学

90

3

In MySQL 5.7, achieving the same result requires a self‑join and GROUP BY, which is far more complex.

RANK()

Provides ranking with gaps when there are ties (e.g., two rows share rank 1, the next rank is 3).

SELECT name,subject,score, RANK() OVER (order BY `score` desc) as `rank`
FROM scores;

Result (partial):

name

subject

score

rank

Student1

化学

92

1

Student2

生物

92

1

Student5

英语

91

3

DENSE_RANK()

Similar to RANK() but without gaps; after a tie, the next rank increments by 1.

SELECT name,subject,score, DENSE_RANK() OVER (order BY `score` desc) as `rank`
FROM scores;

Result (partial):

name

subject

score

rank

Student1

化学

92

1

Student2

生物

92

1

Student5

英语

91

2

NTILE()

Divides the result set into a specified number of buckets and assigns a bucket number to each row.

SELECT name,subject,score, NTILE(4) OVER (order BY `score` desc) as `group`
FROM scores;

Result (partial):

name

subject

score

group

Student1

化学

92

1

Student2

生物

92

1

Student5

英语

91

1

Student8

数学

90

2

LAG()

Accesses data from a preceding row within the same result set, useful for comparing adjacent rows.

SELECT name,subject,score,
       ABS(score - LAG(score, 1, score) OVER (ORDER BY score DESC)) AS `score_diff`
FROM scores;

Result (partial):

name

subject

score

score_diff

Student1

化学

92

0

Student2

生物

92

0

Student5

英语

91

1

LEAD()

Similar to LAG() but looks forward to following rows.

SELECT name,subject,score,
       score - LEAD(score, 1, score) OVER (ORDER BY score DESC) AS `score_diff`
FROM scores;

Result (partial):

name

subject

score

score_diff

Student1

化学

92

0

Student2

生物

92

1

Student5

英语

91

1

The article also includes a sample CREATE TABLE statement for the scores table and several INSERT statements that populate it with example data.

CREATE TABLE `scores` (
  `name` varchar(20) COLLATE utf8_bin NOT NULL,
  `subject` varchar(20) COLLATE utf8_bin NOT NULL,
  `score` int(3) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
INSERT INTO scores (name, subject, score) VALUES ('Student1', '化学', 75);
INSERT INTO scores (name, subject, score) VALUES ('Student2', '生物', 92);
INSERT INTO scores (name, subject, score) VALUES ('Student3', '物理', 87);
INSERT INTO scores (name, subject, score) VALUES ('Student4', '数学', 68);
INSERT INTO scores (name, subject, score) VALUES ('Student5', '英语', 91);
INSERT INTO scores (name, subject, score) VALUES ('Student6', '化学', 58);
INSERT INTO scores (name, subject, score) VALUES ('Student7', '物理', 79);
INSERT INTO scores (name, subject, score) VALUES ('Student8', '数学', 90);
INSERT INTO scores (name, subject, score) VALUES ('Student9', '数学', 45);

Overall, the article demonstrates how MySQL window functions can replace cumbersome subqueries, improve readability, and boost performance for common analytical tasks such as ranking, cumulative totals, and comparative calculations.

SQLRankingMySQLPartitionwindow-functionsAggregationOVER
IT Services Circle
Written by

IT Services Circle

Delivering cutting-edge internet insights and practical learning resources. We're a passionate and principled IT media platform.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.