Databases 18 min read

Using MySQL Window Functions for Ranking, Aggregation, and Data Analysis

This article explains how MySQL 8.x window functions such as OVER, PARTITION BY, and ORDER BY can simplify complex ranking and aggregation queries, demonstrates creating a sample scores table, and provides practical examples of functions like ROW_NUMBER, RANK, DENSE_RANK, NTILE, LAG, and LEAD with their results.

IT Services Circle

Jul 16, 2023

Using MySQL Window Functions for Ranking, Aggregation, and Data Analysis

First, a complex SQL query that calculates a score ranking is shown, which is hard to read and maintain.

SELECT
  s1.name,
  s1.subject,
  s1.score,
  sub.avg_score AS average_score_per_subject,
  (SELECT COUNT(DISTINCT s2.score) + 1 FROM scores s2 WHERE s2.score > s1.score) AS score_rank
FROM scores s1
JOIN (
  SELECT subject, AVG(score) AS avg_score
  FROM scores
  GROUP BY subject
) sub ON s1.subject = sub.subject
ORDER BY s1.score DESC;

The purpose of this query is to compute a ranking for each student's score, which is unnecessarily complicated.

A simplified version using window functions is presented:

SELECT
  name,
  subject,
  score,
  AVG(score) OVER (PARTITION BY subject) AS average_score_per_subject,
  RANK() OVER (ORDER BY score DESC) AS score_rank
FROM scores
ORDER BY score DESC;

This version is much clearer.

What are Window Functions

MySQL 8.x introduces window functions, which perform calculations over a defined window of rows in the result set. Before MySQL 8.0, similar tasks required nested subqueries and joins, leading to unreadable "SQL as a piece of ancient script".

Window functions are mainly used for statistical and analytical calculations, offering better performance than traditional subqueries.

OVER() Clause

The OVER() clause defines the window for a function. It must be combined with a function such as SUM, AVG, ROW_NUMBER, etc., and specifies the range and ordering of rows.

function_name(...) OVER (
    [PARTITION BY expr_list]
    [ORDER BY expr_list]
    [range]
)

PARTITION BY

Specifies the column(s) used to partition the data. Each partition is processed independently.

ORDER BY

Orders rows within each partition, which can be combined with range or rows windows.

Range and Row Windows

Define the type of window: RANGE uses logical value ranges, ROWS uses physical row offsets.

Range Window Example

SUM(salary) OVER (ORDER BY id
                   RANGE BETWEEN 5 PRECEDING AND 5 FOLLOWING)

This calculates the sum of salary for the current row plus the five preceding and five following rows.

Row Window Example

SUM(salary) OVER (ORDER BY id
                   ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING)

This calculates the sum of salary for the current row plus the two preceding and two following rows.

Combining with Aggregate Functions

1. Partition by subject and get the maximum score per subject:

SELECT subject,score, MAX(score) OVER (PARTITION BY subject) as `subject_max_score`
FROM scores;

Result:

subject

score

subject_max_score

化学

数学

物理

生物

英语

2. Count the number of students per subject:

SELECT subject,score, count(name) OVER (PARTITION BY subject) as `subject_participants`
FROM scores;

Result:

subject

score

subject_participants

化学

数学

物理

生物

英语

3. Total score per subject:

SELECT subject, SUM(score) OVER (PARTITION BY subject) as `subject_total`
FROM scores;

Result:

subject

subject_total

化学

133

数学

203

物理

166

生物

英语

4. Cumulative sum ordered by score:

SELECT name,subject,score, SUM(score) OVER (order BY score) as `cumulative_score`
FROM scores;

Result (partial):

name

subject

score

cumulative_score

Student9

数学

Student6

化学

103

Student4

数学

171

5. Cumulative sum with a sliding window (current row plus one preceding and one following row):

SELECT name,subject,score, SUM(score) OVER (order BY `score` ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) as `cumulative_score`
FROM scores;

Result (partial):

name

subject

score

cumulative_score

Student9

数学

103

Student6

化学

171

Student4

数学

201

Combining with Ranking Functions

ROW_NUMBER()

Assigns a unique sequential number to each row based on the specified order.

SELECT name,subject,score, ROW_NUMBER() OVER (order BY `score` desc) as `rank`
FROM scores;

Result (partial):

name

subject

score

rank

Student2

生物

Student5

英语

Student8

数学

In MySQL 5.7, achieving the same result requires a self‑join and GROUP BY, which is far more complex.

RANK()

Provides ranking with gaps when there are ties (e.g., two rows share rank 1, the next rank is 3).

SELECT name,subject,score, RANK() OVER (order BY `score` desc) as `rank`
FROM scores;

Result (partial):

name

subject

score

rank

Student1

化学

Student2

生物

Student5

英语

DENSE_RANK()

Similar to RANK() but without gaps; after a tie, the next rank increments by 1.

SELECT name,subject,score, DENSE_RANK() OVER (order BY `score` desc) as `rank`
FROM scores;

Result (partial):

name

subject

score

rank

Student1

化学

Student2

生物

Student5

英语

NTILE()

Divides the result set into a specified number of buckets and assigns a bucket number to each row.

SELECT name,subject,score, NTILE(4) OVER (order BY `score` desc) as `group`
FROM scores;

Result (partial):

name

subject

score

group

Student1

化学

Student2

生物

Student5

英语

Student8

数学

LAG()

Accesses data from a preceding row within the same result set, useful for comparing adjacent rows.

SELECT name,subject,score,
       ABS(score - LAG(score, 1, score) OVER (ORDER BY score DESC)) AS `score_diff`
FROM scores;

Result (partial):

name

subject

score

score_diff

Student1

化学

Student2

生物

Student5

英语

LEAD()

Similar to LAG() but looks forward to following rows.

SELECT name,subject,score,
       score - LEAD(score, 1, score) OVER (ORDER BY score DESC) AS `score_diff`
FROM scores;

Result (partial):

name

subject

score

score_diff

Student1

化学

Student2

生物

Student5

英语

The article also includes a sample CREATE TABLE statement for the scores table and several INSERT statements that populate it with example data.

CREATE TABLE `scores` (
  `name` varchar(20) COLLATE utf8_bin NOT NULL,
  `subject` varchar(20) COLLATE utf8_bin NOT NULL,
  `score` int(3) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

INSERT INTO scores (name, subject, score) VALUES ('Student1', '化学', 75);
INSERT INTO scores (name, subject, score) VALUES ('Student2', '生物', 92);
INSERT INTO scores (name, subject, score) VALUES ('Student3', '物理', 87);
INSERT INTO scores (name, subject, score) VALUES ('Student4', '数学', 68);
INSERT INTO scores (name, subject, score) VALUES ('Student5', '英语', 91);
INSERT INTO scores (name, subject, score) VALUES ('Student6', '化学', 58);
INSERT INTO scores (name, subject, score) VALUES ('Student7', '物理', 79);
INSERT INTO scores (name, subject, score) VALUES ('Student8', '数学', 90);
INSERT INTO scores (name, subject, score) VALUES ('Student9', '数学', 45);

Overall, the article demonstrates how MySQL window functions can replace cumbersome subqueries, improve readability, and boost performance for common analytical tasks such as ranking, cumulative totals, and comparative calculations.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

SQL Ranking MySQL Partition Window Functions Aggregation OVER

Written by

IT Services Circle

Delivering cutting-edge internet insights and practical learning resources. We're a passionate and principled IT media platform.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.