Databases 14 min read

Ten Advanced SQL Concepts Every Data‑Science Interview Candidate Should Master

This article explains ten advanced SQL techniques—including common table expressions, recursive CTEs, temporary functions, CASE‑WHEN pivots, EXCEPT vs NOT IN, self‑joins, ranking functions, delta calculations, cumulative sums, and date‑time manipulation—providing clear explanations and complete query examples to help readers ace data‑science interview questions.

Top Architect
Top Architect
Top Architect
Ten Advanced SQL Concepts Every Data‑Science Interview Candidate Should Master

Below is a concise guide to ten advanced SQL concepts that are frequently asked in data‑science interviews.

1. Common Table Expressions (CTEs)

CTEs let you define a temporary result set that can be referenced within a SELECT, INSERT, UPDATE, or DELETE statement, making queries more modular and easier to read.

Example:

SELECT name, salary
FROM People
WHERE NAME IN (
    SELECT DISTINCT NAME
    FROM population
    WHERE country = "Canada" AND city = "Toronto"
)
AND salary >= (
    SELECT AVG(salary)
    FROM salaries
    WHERE gender = "Female"
)

The query uses a sub‑query in the WHERE clause; CTEs would simplify this by breaking it into named parts.

with toronto_ppl as (
    SELECT DISTINCT name
    FROM population
    WHERE country = "Canada" AND city = "Toronto"
), avg_female_salary as (
    SELECT AVG(salary) as avgSalary
    FROM salaries
    WHERE gender = "Female"
)
SELECT name, salary
FROM People
WHERE name in (SELECT name FROM toronto_ppl)
AND salary >= (SELECT avgSalary FROM avg_female_salary)

CTEs improve readability and allow you to assign meaningful names such as toronto_ppl and avg_female_salary .

2. Recursive CTEs

A recursive CTE references itself, similar to a recursive function in programming, and is useful for hierarchical data such as organization charts or file‑system trees.

Recursive CTEs consist of three parts: an anchor query, a recursive member, and a termination condition.

Example – retrieving each employee’s manager chain:

with org_structure as (
    SELECT id, manager_id
    FROM staff_members
    WHERE manager_id IS NULL
    UNION ALL
    SELECT sm.id, sm.manager_id
    FROM staff_members sm
    INNER JOIN org_structure os ON os.id = sm.manager_id
)

3. Temporary Functions

Temporary (inline) functions let you encapsulate reusable logic inside a query, improving clarity and avoiding repetition.

Example – mapping tenure to seniority level using a CASE expression:

SELECT name,
       CASE WHEN tenure < 1 THEN "analyst"
            WHEN tenure BETWEEN 1 AND 3 THEN "associate"
            WHEN tenure BETWEEN 3 AND 5 THEN "senior"
            WHEN tenure > 5 THEN "vp"
            ELSE "n/a"
       END AS seniority
FROM employees

Creating a temporary function for the same logic:

CREATE TEMPORARY FUNCTION get_seniority(tenure INT64) AS (
    CASE WHEN tenure < 1 THEN "analyst"
         WHEN tenure BETWEEN 1 AND 3 THEN "associate"
         WHEN tenure BETWEEN 3 AND 5 THEN "senior"
         WHEN tenure > 5 THEN "vp"
         ELSE "n/a"
    END
);
SELECT name, get_seniority(tenure) AS seniority
FROM employees;

4. CASE WHEN Pivoting Data

CASE WHEN can also be used to pivot rows into columns. For example, turning a month column into separate revenue columns per month.

-- Input table
+----+---------+-------+
| id | revenue | month |
+----+---------+-------+
| 1  | 8000    | Jan   |
| 2  | 9000    | Jan   |
| 3  | 10000   | Feb   |
| 1  | 7000    | Feb   |
| 1  | 6000    | Mar   |
+----+---------+-------+

-- Pivot query
SELECT id,
       SUM(CASE WHEN month = 'Jan' THEN revenue END) AS Jan_Revenue,
       SUM(CASE WHEN month = 'Feb' THEN revenue END) AS Feb_Revenue,
       SUM(CASE WHEN month = 'Mar' THEN revenue END) AS Mar_Revenue
FROM sales
GROUP BY id;

5. EXCEPT vs NOT IN

Both operators compare two result sets, but they differ in handling duplicates and NULLs. EXCEPT removes duplicates and returns rows present in the first query but not the second, while NOT IN returns rows where the column value is not present in the sub‑query, with special NULL semantics.

6. Self‑Join

A self‑join joins a table to itself, useful for hierarchical relationships such as employees and their managers.

Example – find employees whose salary exceeds that of their manager:

SELECT a.Name AS Employee
FROM Employee a
JOIN Employee b ON a.ManagerID = b.Id
WHERE a.Salary > b.Salary;

7. ROW_NUMBER, RANK, DENSE_RANK

These window functions assign ranking numbers to rows based on an ORDER BY clause.

ROW_NUMBER() gives a unique sequential number to each row.

RANK() assigns the same rank to ties and leaves gaps.

DENSE_RANK() also assigns the same rank to ties but without gaps.

SELECT Name, GPA,
       ROW_NUMBER() OVER (ORDER BY GPA DESC) AS row_num,
       RANK() OVER (ORDER BY GPA DESC) AS rnk,
       DENSE_RANK() OVER (ORDER BY GPA DESC) AS dense_rnk
FROM student_grades;

8. Calculating Deltas

Use LAG() or LEAD() to compare a value with its previous (or next) row, e.g., month‑over‑month sales change.

# Compare each month’s sales to the previous month
SELECT month,
       sales,
       sales - LAG(sales, 1) OVER (ORDER BY month) AS delta_month
FROM monthly_sales;

# Compare each month’s sales to the same month last year
SELECT month,
       sales,
       sales - LAG(sales, 12) OVER (ORDER BY month) AS delta_year
FROM monthly_sales;

9. Cumulative Totals

Window SUM() can compute running totals.

SELECT Month,
       Revenue,
       SUM(Revenue) OVER (ORDER BY Month) AS Cumulative
FROM monthly_revenue;

10. Date‑Time Manipulation

Typical interview tasks involve extracting parts of dates or comparing rows to the previous day.

SELECT a.Id
FROM Weather a, Weather b
WHERE a.Temperature > b.Temperature
  AND DATEDIFF(a.RecordDate, b.RecordDate) = 1;

Mastering these concepts will greatly improve your ability to solve real‑world SQL interview problems.

SQLDatabasewindow functionscteData Science Interviews
Top Architect
Written by

Top Architect

Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.