Master SQL Window Functions: From Basics to Advanced Ranking Queries
This article explains what SQL window functions are, shows their syntax, and provides step‑by‑step basic and advanced examples—including how to rank rows within groups and reset rankings when key columns change—complete with runnable SQL scripts and result screenshots.
Introduction
The author shares a practical tutorial on SQL window functions, emphasizing their importance for interview preparation and data analysis.
What are window functions?
Window functions include dedicated functions such as row_number(), rank(), dense_rank() and aggregate functions like sum(), avg(). They allow calculations across a set of rows related to the current row without collapsing the result set.
Window function syntax
row_number() over([partition by dimension] order by dimension asc [desc])
partition bygroups rows by the specified dimension (optional), while order by defines the ordering within each partition (required).
Basic example
Given a table test with columns time, id, and cat, the following script ranks rows within each (id, cat) group by time:
# First step – create sample data
insert into test values('2020-10-02 12:30:45','A','AAA');
insert into test values('2020-10-02 12:30:55','A','AAA');
insert into test values('2020-10-02 14:39:45','A','BBB');
insert into test values('2020-10-02 14:40:55','A','BBB');
insert into test values('2020-10-02 15:30:05','A','AAA');
insert into test values('2020-10-02 16:30:45','B','AAA');
insert into test values('2020-10-02 17:04:45','B','BBB');
# Query using window function
select
time,
id,
cat,
row_number() over(partition by id, cat order by time asc) as rnk
from test
order by time asc;The resulting ranking is illustrated in the following image:
Advanced example
Problem: rank rows by time, resetting the rank whenever either id or cat changes compared to the previous row.
# 1. Construct the same sample data as above (omitted for brevity)
with temp1 as (
select
time,
id,
category,
concat_ws('-', id, category) as add_col,
row_number() over(order by time asc) as order_rnk
from test
),
temp2 as (
select
time,
id,
category,
add_col,
order_rnk,
order_rnk - lag(order_rnk, 1, order_rnk-1) over(partition by add_col order by time asc) as order_rnk_lag1
from temp1
),
temp3 as (
select
time,
id,
catgory,
row_number() over(partition by concat(add_col, order_rnk_lag1) order by time asc) as rnk
from temp2
)
select * from temp3;This script uses lag() to detect changes in the auxiliary column, creates a composite grouping key, and then applies row_number() to generate the desired ranking. The final ranking is shown in the image below:
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Python Crawling & Data Mining
Life's short, I code in Python. This channel shares Python web crawling, data mining, analysis, processing, visualization, automated testing, DevOps, big data, AI, cloud computing, machine learning tools, resources, news, technical articles, tutorial videos and learning materials. Join us!
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
