Master 10 Essential Advanced SQL Concepts for Data Professionals
This article presents ten crucial intermediate‑to‑advanced SQL techniques—including CTEs, recursive CTEs, temporary functions, CASE‑WHEN pivots, EXCEPT vs NOT IN, self‑joins, ranking window functions, delta calculations, cumulative sums, and date‑time manipulation—to help data professionals ace interview questions and write cleaner, more powerful queries.
As data volumes continue to grow, the demand for qualified data professionals—especially those fluent in SQL—also rises. Nathan Rosidi, founder of Stratascratch, and the author highlight ten of the most important intermediate to advanced SQL concepts.
1. Common Table Expressions (CTEs)
CTEs act like temporary tables, allowing you to modularize and break down complex queries. They are useful for sub‑queries and make the logic easier to read.
SELECT name,
salary
FROM People
WHERE NAME IN (
SELECT DISTINCT NAME FROM population WHERE country = "Canada" AND city = "Toronto"
) AND salary >= (
SELECT AVG(salary) FROM salaries WHERE gender = "Female"
);CTEs simplify queries with multiple sub‑queries by defining reusable result sets.
WITH toronto_ppl AS (
SELECT DISTINCT name FROM population WHERE country = "Canada" AND city = "Toronto"
), avg_female_salary AS (
SELECT AVG(salary) AS avgSalary FROM salaries WHERE gender = "Female"
)
SELECT name, salary
FROM People
WHERE name IN (SELECT name FROM toronto_ppl)
AND salary >= (SELECT avgSalary FROM avg_female_salary);2. Recursive CTEs
Recursive CTEs reference themselves, similar to recursive functions in programming languages, and are ideal for hierarchical data such as organization charts or file‑system trees.
Anchor member – the base query returning the initial result set.
Recursive member – the query that references the CTE.
Termination condition – stops the recursion.
WITH org_structure AS (
SELECT id, manager_id FROM staff_members WHERE manager_id IS NULL
UNION ALL
SELECT sm.id, sm.manager_id
FROM staff_members sm
JOIN org_structure os ON os.id = sm.manager_id
)
SELECT * FROM org_structure;3. Temporary Functions
Temporary functions let you encapsulate reusable logic within a query, improving readability and reducing duplication.
CREATE TEMPORARY FUNCTION get_seniority(tenure INT64) AS (
CASE
WHEN tenure < 1 THEN "analyst"
WHEN tenure BETWEEN 1 AND 3 THEN "associate"
WHEN tenure BETWEEN 3 AND 5 THEN "senior"
WHEN tenure > 5 THEN "vp"
ELSE "n/a"
END
);
SELECT name, get_seniority(tenure) AS seniority FROM employees;4. CASE‑WHEN Pivoting Data
Using CASE‑WHEN you can transform rows into columns, for example turning a month column into separate revenue columns for each month.
-- Input table: id, revenue, month
-- Output table: id, Jan_Revenue, Feb_Revenue, Mar_Revenue, ...
SELECT id,
MAX(CASE WHEN month = 'Jan' THEN revenue END) AS Jan_Revenue,
MAX(CASE WHEN month = 'Feb' THEN revenue END) AS Feb_Revenue,
MAX(CASE WHEN month = 'Mar' THEN revenue END) AS Mar_Revenue
FROM sales
GROUP BY id;5. EXCEPT vs NOT IN
Both operators compare rows between two queries, but EXCEPT removes duplicates and returns distinct rows, while NOT IN works column‑by‑column and retains duplicates.
6. Self‑Join
A self‑join joins a table to itself, useful when hierarchical relationships are stored in a single table.
SELECT a.Name AS Employee
FROM Employee a
JOIN Employee b ON a.ManagerID = b.Id
WHERE a.Salary > b.Salary;7. Rank vs Dense_Rank vs Row_Number
These window functions assign ranking numbers to rows. ROW_NUMBER() gives a unique sequential number, RANK() gives the same rank to ties and leaves gaps, while DENSE_RANK() gives the same rank to ties without gaps.
8. Calculating Delta Values
Use LAG() or LEAD() to compare a value with its previous (or next) row, useful for month‑over‑month or year‑over‑year analyses.
# Compare each month's sales to last month
SELECT month, sales,
sales - LAG(sales, 1) OVER (ORDER BY month) AS month_over_month_delta
FROM monthly_sales;
# Compare each month's sales to the same month last year
SELECT month, sales,
sales - LAG(sales, 12) OVER (ORDER BY month) AS year_over_year_delta
FROM monthly_sales;9. Cumulative Totals
Window SUM() can compute running totals across ordered rows.
SELECT Month,
Revenue,
SUM(Revenue) OVER (ORDER BY Month) AS Cumulative
FROM monthly_revenue;10. Date‑Time Manipulation
Common functions include DATE_TRUNC, DATE_ADD, DATE_SUB, and extracting components to group or reformat dates.
SELECT Id
FROM Weather a
JOIN Weather b ON DATEDIFF(a.RecordDate, b.RecordDate) = 1
WHERE a.Temperature > b.Temperature;Mastering these concepts will greatly improve your ability to solve SQL interview questions and write clean, efficient queries.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
21CTO
21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
