10 Advanced SQL Concepts Every Data Scientist Should Master

This guide walks through ten essential advanced SQL techniques—including CTEs, recursive CTEs, temporary functions, CASE‑WHEN pivots, EXCEPT vs NOT IN, self‑joins, ranking functions, delta calculations with LAG/LEAD, cumulative sums, and date‑time manipulation—to help data professionals ace interview challenges and write cleaner, more powerful queries.

Liangxu Linux

1. Common Table Expressions (CTEs)

CTEs let you break complex sub-queries into named temporary result sets, making queries easier to read and maintain. Consider this query, which nests two sub-queries in its WHERE clause:

SELECT
    name,
    salary
FROM
    People
WHERE
    name IN (SELECT DISTINCT name FROM population WHERE country = 'Canada' AND city = 'Toronto')
  AND salary >= (
    SELECT AVG(salary)
    FROM salaries
    WHERE gender = 'Female'
);

The same query rewritten with two chained CTEs, each sub-query given its own name:

WITH toronto_ppl AS (
    SELECT DISTINCT name FROM population WHERE country = 'Canada' AND city = 'Toronto'
), avg_female_salary AS (
    SELECT AVG(salary) AS avgSalary FROM salaries WHERE gender = 'Female'
)
SELECT name, salary
FROM People
WHERE name IN (SELECT name FROM toronto_ppl)
  AND salary >= (SELECT avgSalary FROM avg_female_salary);
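
To check that the nested and CTE forms really are interchangeable, here is a minimal sketch that runs both against an in-memory SQLite database through Python's built-in sqlite3 module. The rows are made-up sample data (an assumption, not from the original article), and the table layouts are guessed from the column names the queries use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data, for illustration only.
conn.executescript("""
    CREATE TABLE People     (name TEXT, salary INTEGER);
    CREATE TABLE population (name TEXT, country TEXT, city TEXT);
    CREATE TABLE salaries   (name TEXT, gender TEXT, salary INTEGER);

    INSERT INTO People VALUES ('Ana', 90000), ('Bo', 50000), ('Cy', 95000);
    INSERT INTO population VALUES
        ('Ana', 'Canada', 'Toronto'),
        ('Bo',  'Canada', 'Toronto'),
        ('Cy',  'Canada', 'Vancouver');
    INSERT INTO salaries VALUES
        ('Ana', 'Female', 90000),
        ('Dee', 'Female', 70000);
""")

nested_sql = """
SELECT name, salary
FROM People
WHERE name IN (SELECT DISTINCT name FROM population
               WHERE country = 'Canada' AND city = 'Toronto')
  AND salary >= (SELECT AVG(salary) FROM salaries WHERE gender = 'Female')
ORDER BY name
"""

cte_sql = """
WITH toronto_ppl AS (
    SELECT DISTINCT name FROM population
    WHERE country = 'Canada' AND city = 'Toronto'
), avg_female_salary AS (
    SELECT AVG(salary) AS avgSalary FROM salaries WHERE gender = 'Female'
)
SELECT name, salary
FROM People
WHERE name IN (SELECT name FROM toronto_ppl)
  AND salary >= (SELECT avgSalary FROM avg_female_salary)
ORDER BY name
"""

nested_rows = conn.execute(nested_sql).fetchall()
cte_rows = conn.execute(cte_sql).fetchall()
print(nested_rows == cte_rows, cte_rows)
```

With this sample data the average female salary is 80000, so only Ana (in Toronto, earning 90000) passes both filters in either form.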

2. Recursive CTEs

Recursive CTEs reference themselves, much like recursive functions in programming languages, and are ideal for traversing hierarchical data such as org charts or file systems. A recursive CTE has three parts:

Anchor member – returns the base rows.

Recursive member – joins the CTE to itself to produce the next level.

Termination condition – stops recursion.

Example that retrieves each employee’s manager ID:

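-- PostgreSQL, MySQL 8+, and SQLite spell this WITH RECURSIVE; SQL Server uses plain WITH.
WITH RECURSIVE org_structure AS (
    -- Anchor member: employees with no manager (top of the hierarchy)
    SELECT id, manager_id FROM staff_members WHERE manager_id IS NULL
    UNION ALL
    -- Recursive member: employees who report to someone already in the result
    SELECT sm.id, sm.manager_id
    FROM staff_members sm
    INNER JOIN org_structure os ON os.id = sm.manager_id
)
-- Recursion stops once the recursive member returns no new rows.
SELECT id, manager_id FROM org_structure;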
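
To watch the anchor and recursive members at work, here is a small sketch using Python's sqlite3 module on a four-person hierarchy. The sample rows and the extra depth column are assumptions added for illustration; depth simply counts how many recursion steps it took to reach each employee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical org chart: 1 is the top, 2 and 3 report to 1, 4 reports to 2.
conn.executescript("""
    CREATE TABLE staff_members (id INTEGER PRIMARY KEY, manager_id INTEGER);
    INSERT INTO staff_members VALUES (1, NULL), (2, 1), (3, 1), (4, 2);
""")

org_sql = """
WITH RECURSIVE org_structure(id, manager_id, depth) AS (
    -- Anchor member: rows with no manager start at depth 0
    SELECT id, manager_id, 0
    FROM staff_members
    WHERE manager_id IS NULL
    UNION ALL
    -- Recursive member: direct reports of rows found in the previous step
    SELECT sm.id, sm.manager_id, os.depth + 1
    FROM staff_members sm
    INNER JOIN org_structure os ON os.id = sm.manager_id
)
SELECT id, manager_id, depth
FROM org_structure
ORDER BY depth, id
"""

org_rows = conn.execute(org_sql).fetchall()
for row in org_rows:
    print(row)
```

The recursion ends on its own once a step finds no further reports, which is the termination condition in practice.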

3. Temporary Functions

Temporary functions let you give a piece of reusable logic a name, improving readability and avoiding repetition. Without one, the CASE logic has to be written inline in the query:

SELECT name,
       CASE WHEN tenure < 1 THEN "analyst"
            WHEN tenure BETWEEN 1 AND 3 THEN "associate"
            WHEN tenure BETWEEN 3 AND 5 THEN "senior"
            WHEN tenure > 5 THEN "vp"
            ELSE "n/a"
       END AS seniority
FROM employees;

Using a temporary function (the CREATE TEMPORARY FUNCTION syntax and the INT64 type shown here are BigQuery's):

CREATE TEMPORARY FUNCTION get_seniority(tenure INT64) AS (
   CASE WHEN tenure < 1 THEN "analyst"
        WHEN tenure BETWEEN 1 AND 3 THEN "associate"
        WHEN tenure BETWEEN 3 AND 5 THEN "senior"
        WHEN tenure > 5 THEN "vp"
        ELSE "n/a"
   END
);
SELECT name, get_seniority(tenure) AS seniority FROM employees;
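
Not every engine supports CREATE TEMPORARY FUNCTION. As a rough analogue, and not the article's own method, SQLite lets you register a Python callable as a SQL function that lives only as long as the connection. The sketch below mirrors the CASE branches in order; the sample employees are made up.

```python
import sqlite3


def get_seniority(tenure):
    """Mirror the CASE expression; branches are tested in the same order."""
    if tenure is None:
        return "n/a"          # a NULL tenure falls through to ELSE
    if tenure < 1:
        return "analyst"
    if 1 <= tenure <= 3:
        return "associate"    # tenure = 3 lands here, as in the CASE version
    if 3 <= tenure <= 5:
        return "senior"
    return "vp"               # everything above 5


conn = sqlite3.connect(":memory:")
# Register the callable under a SQL name for this connection only.
conn.create_function("get_seniority", 1, get_seniority)

# Hypothetical sample rows.
conn.executescript("""
    CREATE TABLE employees (name TEXT, tenure INTEGER);
    INSERT INTO employees VALUES
        ('Ana', 0), ('Bo', 3), ('Cy', 4), ('Dee', 8), ('Eli', NULL);
""")

seniority_rows = conn.execute(
    "SELECT name, get_seniority(tenure) AS seniority "
    "FROM employees ORDER BY name"
).fetchall()
print(seniority_rows)
```

Note that the two BETWEEN ranges overlap at 3; because CASE stops at the first matching branch, a tenure of exactly 3 is labelled associate, not senior.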

4. Pivoting Data with CASE WHEN

CASE WHEN can be used to transform rows into columns. Example: turning a month column into separate revenue columns for each month.

-- Input table
+----+---------+-------+
| id | revenue | month |
+----+---------+-------+
| 1  | 8000    | Jan   |
| 2  | 9000    | Jan   |
| 3  | 10000   | Feb   |
| 1  | 7000    | Feb   |
| 1  | 6000    | Mar   |
+----+---------+-------+

-- Pivot query
SELECT
    id,
    MAX(CASE WHEN month = 'Jan' THEN revenue END) AS Jan_Revenue,
    MAX(CASE WHEN month = 'Feb' THEN revenue END) AS Feb_Revenue,
    MAX(CASE WHEN month = 'Mar' THEN revenue END) AS Mar_Revenue
FROM sales
GROUP BY id;
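
Running the pivot against the input table above shows the shape of the result. This sketch loads those five rows into an in-memory SQLite database via Python's sqlite3 module; the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, revenue INTEGER, month TEXT)")
# The five rows from the input table.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, 8000, "Jan"), (2, 9000, "Jan"), (3, 10000, "Feb"),
     (1, 7000, "Feb"), (1, 6000, "Mar")],
)

pivot_sql = """
SELECT
    id,
    MAX(CASE WHEN month = 'Jan' THEN revenue END) AS Jan_Revenue,
    MAX(CASE WHEN month = 'Feb' THEN revenue END) AS Feb_Revenue,
    MAX(CASE WHEN month = 'Mar' THEN revenue END) AS Mar_Revenue
FROM sales
GROUP BY id
ORDER BY id
"""

pivot_rows = conn.execute(pivot_sql).fetchall()
for row in pivot_rows:
    print(row)
```

Months with no row for a given id come back as NULL (None in Python), because a CASE with no ELSE yields NULL and MAX ignores NULLs. If an id could have several rows in the same month, SUM would usually be the better aggregate than MAX.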

5. EXCEPT vs NOT IN

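Both operators find rows in one result set that are absent from another, but they behave differently. EXCEPT compares whole rows between two queries that must return the same number of columns, removes duplicates from its output, and treats two NULLs as equal. NOT IN typically compares a single column against a sub-query, keeps duplicate rows, and returns no rows at all if the sub-query yields even one NULL.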
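
The NULL and duplicate behaviour is easiest to see side by side. The sketch below uses two hypothetical single-column tables (first_set and second_set are illustrative names, not from the article) in an in-memory SQLite database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: first_set has a duplicate 2, second_set contains a NULL.
conn.executescript("""
    CREATE TABLE first_set  (val INTEGER);
    CREATE TABLE second_set (val INTEGER);
    INSERT INTO first_set  VALUES (1), (2), (2), (3);
    INSERT INTO second_set VALUES (3), (NULL);
""")

# EXCEPT: set difference, duplicates removed.
except_rows = conn.execute(
    "SELECT val FROM first_set EXCEPT SELECT val FROM second_set ORDER BY val"
).fetchall()

# NOT IN against a sub-query containing NULL: every comparison is
# either false or unknown, so no row passes the filter.
not_in_rows = conn.execute(
    "SELECT val FROM first_set "
    "WHERE val NOT IN (SELECT val FROM second_set)"
).fetchall()

# NOT IN with NULLs filtered out: works, and keeps the duplicate 2.
not_in_safe_rows = conn.execute(
    "SELECT val FROM first_set "
    "WHERE val NOT IN (SELECT val FROM second_set WHERE val IS NOT NULL) "
    "ORDER BY val"
).fetchall()

print(except_rows, not_in_rows, not_in_safe_rows)
```

Filtering NULLs out of the sub-query, or switching to NOT EXISTS, is the usual way to avoid the empty-result surprise.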

6. Self‑Join

A self‑join links a table to itself, useful when hierarchical relationships are stored in a single table.

Example: find employees whose salary exceeds their manager’s salary.

SELECT a.Name AS Employee
FROM Employee a
JOIN Employee b ON a.ManagerID = b.Id
WHERE a.Salary > b.Salary;
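
Here is the same self-join run on a tiny made-up Employee table, using Python's sqlite3 module. The four rows are illustrative assumptions; alias a plays the employee and alias b plays that employee's manager.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows: Joe reports to Sam, Henry reports to Max.
conn.executescript("""
    CREATE TABLE Employee (Id INTEGER, Name TEXT, Salary INTEGER, ManagerID INTEGER);
    INSERT INTO Employee VALUES
        (1, 'Joe',   70000, 3),
        (2, 'Henry', 80000, 4),
        (3, 'Sam',   60000, NULL),
        (4, 'Max',   90000, NULL);
""")

self_join_rows = conn.execute("""
    SELECT a.Name AS Employee
    FROM Employee a
    JOIN Employee b ON a.ManagerID = b.Id   -- b is a's manager
    WHERE a.Salary > b.Salary
    ORDER BY a.Name
""").fetchall()
print(self_join_rows)
```

Only Joe out-earns his manager here. Sam and Max drop out of the inner join entirely because their ManagerID is NULL.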

7. Rank, Dense_Rank, and Row_Number

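These window functions number rows according to an ordering column; they differ only in how they treat ties. ROW_NUMBER assigns a unique sequential number to every row, RANK gives tied rows the same number and then skips ahead (1, 2, 2, 4), and DENSE_RANK gives tied rows the same number without leaving gaps (1, 2, 2, 3).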

SELECT Name,
       GPA,
       ROW_NUMBER() OVER (ORDER BY GPA DESC) AS row_num,
       -- RANK is a reserved word in some dialects (e.g. MySQL 8), so avoid it as an alias
       RANK() OVER (ORDER BY GPA DESC) AS gpa_rank,
       DENSE_RANK() OVER (ORDER BY GPA DESC) AS gpa_dense_rank
FROM student_grades;
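
A tie in the data makes the three functions diverge. This sketch assumes a SQLite build with window-function support (3.25 or later, which recent Python releases bundle) and uses made-up grades. Name is added as a tiebreaker for ROW_NUMBER only so that the output is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical grades; Bo and Cy are tied on GPA.
conn.executescript("""
    CREATE TABLE student_grades (Name TEXT, GPA REAL);
    INSERT INTO student_grades VALUES
        ('Ana', 3.9), ('Bo', 3.7), ('Cy', 3.7), ('Dee', 3.5);
""")

rank_rows = conn.execute("""
    SELECT Name,
           ROW_NUMBER() OVER (ORDER BY GPA DESC, Name) AS row_num,
           RANK()       OVER (ORDER BY GPA DESC)       AS gpa_rank,
           DENSE_RANK() OVER (ORDER BY GPA DESC)       AS gpa_dense_rank
    FROM student_grades
    ORDER BY row_num
""").fetchall()

for row in rank_rows:
    print(row)
```

Bo and Cy share rank 2 under both RANK and DENSE_RANK, but Dee is 4 under RANK and 3 under DENSE_RANK.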

8. Calculating Deltas with LAG/LEAD

LAG and LEAD let you compare a row’s value with a previous or next row, useful for month‑over‑month or year‑over‑year differences.

-- Compare each month's sales to the previous month
-- (assumes one row per month and a month column that sorts chronologically)
SELECT month,
       sales,
       sales - LAG(sales, 1) OVER (ORDER BY month) AS month_delta
FROM monthly_sales;

-- Compare each month's sales to the same month last year
-- (an offset of 12 is only correct if no months are missing)
SELECT month,
       sales,
       sales - LAG(sales, 12) OVER (ORDER BY month) AS year_delta
FROM monthly_sales;
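
The sketch below runs LAG and its mirror image LEAD on three made-up months, using Python's sqlite3 module and a SQLite build with window functions (3.25+). Storing month as 'YYYY-MM' text is an assumption that keeps ORDER BY chronological; next_delta is a hypothetical extra column showing LEAD.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data; 'YYYY-MM' strings sort in calendar order.
conn.executescript("""
    CREATE TABLE monthly_sales (month TEXT, sales INTEGER);
    INSERT INTO monthly_sales VALUES
        ('2023-01', 100), ('2023-02', 120), ('2023-03', 90);
""")

delta_rows = conn.execute("""
    SELECT month,
           sales,
           sales - LAG(sales, 1)  OVER (ORDER BY month) AS month_delta,
           LEAD(sales, 1) OVER (ORDER BY month) - sales AS next_delta
    FROM monthly_sales
    ORDER BY month
""").fetchall()

for row in delta_rows:
    print(row)
```

The first row has no predecessor and the last has no successor, so LAG and LEAD return NULL there and the subtraction yields NULL as well. Both functions accept a third argument as a default value if NULL is not what you want.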

9. Cumulative Totals

Use the SUM window function to compute running totals.

SELECT Month,
       Revenue,
       SUM(Revenue) OVER (ORDER BY Month) AS Cumulative
FROM monthly_revenue;
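
As a quick check of the running total, this sketch uses made-up revenue figures in an in-memory SQLite database (window functions need SQLite 3.25+). It spells out the frame with ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW; that clause is an addition to the query above. When ORDER BY is present without a frame, the default is a RANGE frame, which sums all rows that tie on the ordering column together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data, one row per month.
conn.executescript("""
    CREATE TABLE monthly_revenue (Month TEXT, Revenue INTEGER);
    INSERT INTO monthly_revenue VALUES
        ('2023-01', 100), ('2023-02', 150), ('2023-03', 50);
""")

cumulative_rows = conn.execute("""
    SELECT Month,
           Revenue,
           SUM(Revenue) OVER (
               ORDER BY Month
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS Cumulative
    FROM monthly_revenue
    ORDER BY Month
""").fetchall()

for row in cumulative_rows:
    print(row)
```

With unique Month values the explicit ROWS frame and the default give the same answer; the difference only shows up when the ordering column has duplicates.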

10. Date‑Time Manipulation

Common functions for handling dates include EXTRACT, DATE_ADD, DATE_SUB, and DATE_TRUNC; exact names and argument order vary by dialect.

Example: find days where temperature is higher than the previous day.

-- Two-argument DATEDIFF(later, earlier) is MySQL syntax
SELECT a.Id
FROM Weather a
JOIN Weather b
  ON DATEDIFF(a.RecordDate, b.RecordDate) = 1
WHERE a.Temperature > b.Temperature;
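
Date functions are among the least portable parts of SQL. SQLite, for example, has no two-argument DATEDIFF, so the sketch below swaps in SQLite's date() function with a '-1 day' modifier to express "b is the day before a". The Weather rows are made up, and dates are assumed to be stored as ISO 'YYYY-MM-DD' text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical readings on four consecutive days.
conn.executescript("""
    CREATE TABLE Weather (Id INTEGER, RecordDate TEXT, Temperature INTEGER);
    INSERT INTO Weather VALUES
        (1, '2015-01-01', 10),
        (2, '2015-01-02', 25),
        (3, '2015-01-03', 20),
        (4, '2015-01-04', 30);
""")

warmer_rows = conn.execute("""
    SELECT a.Id
    FROM Weather a
    JOIN Weather b
      ON b.RecordDate = date(a.RecordDate, '-1 day')  -- b is the previous day
    WHERE a.Temperature > b.Temperature
    ORDER BY a.Id
""").fetchall()
print(warmer_rows)
```

Days 2 and 4 are warmer than the day before. Joining on the date itself, not on Id, keeps the query correct even when Ids are not in calendar order.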
Source: towardsdatascience.com/ten-advanced-sql-concepts-you-should-know-for-data-science-interviews-4d7015ec74b0