10 Advanced SQL Query Techniques Every Data Professional Should Master
This article presents ten advanced SQL concepts (CTEs, recursive CTEs, temporary functions, CASE WHEN pivots, EXCEPT vs NOT IN, self-joins, ranking window functions, delta calculations, cumulative sums, and date-time manipulation), each explained with concrete examples and code snippets.
1. Common Table Expressions (CTEs)
CTEs create temporary result sets that can be referenced later in the query, making complex sub‑queries easier to read and modularize.
-- Without CTEs: nested sub-queries
SELECT name,
       salary
FROM People
WHERE name IN (
        SELECT DISTINCT name FROM population WHERE country = 'Canada' AND city = 'Toronto'
      )
  AND salary >= (
        SELECT AVG(salary)
        FROM salaries
        WHERE gender = 'Female'
      );
-- With CTEs: each sub-query gets a name and is referenced later
WITH toronto_ppl AS (
    SELECT DISTINCT name FROM population WHERE country = 'Canada' AND city = 'Toronto'
), avg_female_salary AS (
    SELECT AVG(salary) AS avgSalary FROM salaries WHERE gender = 'Female'
)
SELECT name, salary
FROM People
WHERE name IN (SELECT name FROM toronto_ppl)
  AND salary >= (SELECT avgSalary FROM avg_female_salary);
2. Recursive CTEs
A recursive CTE references itself, much like a recursive function, and is useful for hierarchical data such as organization charts. It has three parts:
Anchor member – returns the base rows.
Recursive member – references the CTE itself, joining it back to the source table.
Termination condition – recursion stops once the recursive member returns no new rows.
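The three parts above can be sketched as a runnable example. This is a minimal sketch, assuming SQLite (through Python's built-in sqlite3 module) as the engine; the sample rows and the extra depth column are hypothetical additions for illustration, while the staff_members and org_structure names follow the article.

```python
import sqlite3

# Hypothetical sample data: (id, manager_id); None marks the top of the chart.
SAMPLE_STAFF = [(1, None), (2, 1), (3, 1), (4, 2)]

ORG_QUERY = """
WITH RECURSIVE org_structure(id, manager_id, depth) AS (
    -- Anchor member: staff with no manager
    SELECT id, manager_id, 1 FROM staff_members WHERE manager_id IS NULL
    UNION ALL
    -- Recursive member: direct reports of rows already in the CTE
    SELECT sm.id, sm.manager_id, os.depth + 1
    FROM staff_members sm
    INNER JOIN org_structure os ON os.id = sm.manager_id
    -- Termination: recursion ends when this SELECT yields no new rows
)
SELECT id, manager_id, depth FROM org_structure ORDER BY depth, id
"""


def org_levels(staff):
    """Return (id, manager_id, depth) rows for the whole reporting tree."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE staff_members (id INTEGER, manager_id INTEGER)")
        conn.executemany("INSERT INTO staff_members VALUES (?, ?)", staff)
        return conn.execute(ORG_QUERY).fetchall()
    finally:
        conn.close()


print(org_levels(SAMPLE_STAFF))
```

The depth counter is not required by the technique; it only makes each level of the hierarchy visible in the output.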
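-- PostgreSQL, MySQL, and BigQuery require the RECURSIVE keyword; SQL Server omits it
WITH RECURSIVE org_structure AS (
    -- Anchor member: top-level staff with no manager
    SELECT id, manager_id FROM staff_members WHERE manager_id IS NULL
    UNION ALL
    -- Recursive member: staff who report to rows already found
    SELECT sm.id, sm.manager_id
    FROM staff_members sm
    INNER JOIN org_structure os ON os.id = sm.manager_id
)
SELECT id, manager_id FROM org_structure;
3. Temporary Functions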
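Temporary functions let you encapsulate reusable logic inside a query, improving readability and avoiding repetition. The example in this section uses BigQuery syntax (CREATE TEMPORARY FUNCTION with an INT64 argument); other databases define functions differently.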
Break code into smaller blocks.
Promote clean code.
Reuse logic like a Python function.
-- Inline CASE expression
SELECT name,
       CASE WHEN tenure < 1 THEN 'analyst'
            WHEN tenure BETWEEN 1 AND 3 THEN 'associate'
            WHEN tenure BETWEEN 3 AND 5 THEN 'senior'
            WHEN tenure > 5 THEN 'vp'
            ELSE 'n/a'
       END AS seniority
FROM employees;
-- The same logic wrapped in a temporary function
CREATE TEMPORARY FUNCTION get_seniority(tenure INT64) AS (
    CASE WHEN tenure < 1 THEN 'analyst'
         WHEN tenure BETWEEN 1 AND 3 THEN 'associate'
         WHEN tenure BETWEEN 3 AND 5 THEN 'senior'
         WHEN tenure > 5 THEN 'vp'
         ELSE 'n/a'
    END
);
SELECT name, get_seniority(tenure) AS seniority FROM employees;
4. CASE WHEN for Data Pivot
CASE WHEN can be used to pivot rows into columns, turning a vertical month column into separate monthly revenue columns.
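A pivot of this kind can be sketched as one conditional aggregate per month. This is a minimal sketch, assuming SQLite through Python's sqlite3 module and a hypothetical table name revenue_by_month; the sample rows are the ones from the initial table in this section, and only the first three months are written out.

```python
import sqlite3

# Rows from the article's initial table: (id, revenue, month).
SAMPLE_ROWS = [
    (1, 8000, "Jan"),
    (2, 9000, "Jan"),
    (3, 10000, "Feb"),
    (1, 7000, "Feb"),
    (1, 6000, "Mar"),
]

PIVOT_QUERY = """
SELECT id,
       -- CASE has no ELSE, so non-matching rows contribute NULL and are ignored by SUM
       SUM(CASE WHEN month = 'Jan' THEN revenue END) AS Jan_Revenue,
       SUM(CASE WHEN month = 'Feb' THEN revenue END) AS Feb_Revenue,
       SUM(CASE WHEN month = 'Mar' THEN revenue END) AS Mar_Revenue
FROM revenue_by_month
GROUP BY id
ORDER BY id
"""


def pivot_revenue(rows):
    """Return one row per id with a revenue column for Jan, Feb, and Mar."""
    conn = sqlite3.connect(":memory:")
    try:
        # revenue_by_month is a hypothetical table name; the article does not name it.
        conn.execute("CREATE TABLE revenue_by_month (id INTEGER, revenue INTEGER, month TEXT)")
        conn.executemany("INSERT INTO revenue_by_month VALUES (?, ?, ?)", rows)
        return conn.execute(PIVOT_QUERY).fetchall()
    finally:
        conn.close()


for row in pivot_revenue(SAMPLE_ROWS):
    print(row)
```

Months with no data for an id come back as NULL (None in Python), matching the null cells in the result table; the remaining months would follow the same pattern.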
Initial table:
+------+---------+-------+
| id   | revenue | month |
+------+---------+-------+
| 1    | 8000    | Jan   |
| 2    | 9000    | Jan   |
| 3    | 10000   | Feb   |
| 1    | 7000    | Feb   |
| 1    | 6000    | Mar   |
+------+---------+-------+
Result table:
+------+-------------+-------------+-------------+-----+-------------+
| id   | Jan_Revenue | Feb_Revenue | Mar_Revenue | ... | Dec_Revenue |
+------+-------------+-------------+-------------+-----+-------------+
| 1    | 8000        | 7000        | 6000        | ... | null        |
| 2    | 9000        | null        | null        | ... | null        |
| 3    | null        | 10000       | null        | ... | null        |
+------+-------------+-------------+-------------+-----+-------------+
5. EXCEPT vs NOT IN
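Both operators find rows in one query that are absent from another, but they behave differently. EXCEPT compares entire rows between two queries with the same number of columns and returns the distinct rows of the first that do not appear in the second. NOT IN typically compares a single column against the values returned by a sub-query and keeps duplicates; if the sub-query returns any NULL, NOT IN matches no rows at all.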
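The difference is easiest to see side by side. This is a minimal sketch, assuming SQLite through Python's sqlite3 module; the table names left_t and right_t, the column x, and the sample values are hypothetical.

```python
import sqlite3


def compare_except_not_in(left_values, right_values):
    """Return (EXCEPT result, NOT IN result) as two lists of values."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE left_t (x INTEGER)")
        conn.execute("CREATE TABLE right_t (x INTEGER)")
        conn.executemany("INSERT INTO left_t VALUES (?)", [(v,) for v in left_values])
        conn.executemany("INSERT INTO right_t VALUES (?)", [(v,) for v in right_values])

        # EXCEPT works on whole rows and returns distinct rows only.
        except_rows = conn.execute(
            "SELECT x FROM left_t EXCEPT SELECT x FROM right_t ORDER BY x"
        ).fetchall()

        # NOT IN filters row by row, so duplicates in left_t survive.
        not_in_rows = conn.execute(
            "SELECT x FROM left_t WHERE x NOT IN (SELECT x FROM right_t) ORDER BY x"
        ).fetchall()

        return [r[0] for r in except_rows], [r[0] for r in not_in_rows]
    finally:
        conn.close()


# Duplicates: EXCEPT collapses the two 1s, NOT IN keeps both.
print(compare_except_not_in([1, 1, 2, 3], [3]))

# A NULL in the sub-query makes every NOT IN comparison unknown, so no rows pass.
print(compare_except_not_in([1, 1, 2, 3], [3, None]))
```

When the sub-query column can contain NULL, a NOT EXISTS test or a filter such as WHERE x IS NOT NULL inside the sub-query avoids the empty result.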
6. Self Join
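A self-join joins a table to itself, which is useful when related rows, such as employees and their managers, live in the same table and hierarchical relationships need to be examined. The example finds employees who earn more than their managers; with the sample data it returns Joe.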
+----+-------+--------+-----------+
| Id | Name  | Salary | ManagerId |
+----+-------+--------+-----------+
| 1  | Joe   | 70000  | 3         |
| 2  | Henry | 80000  | 4         |
| 3  | Sam   | 60000  | NULL      |
| 4  | Max   | 90000  | NULL      |
+----+-------+--------+-----------+
SELECT a.Name AS Employee
FROM Employee a
JOIN Employee b ON a.ManagerId = b.Id
WHERE a.Salary > b.Salary;
7. Rank vs Dense Rank vs Row Number
These window functions assign ranking numbers to rows based on an ordering column.
SELECT Name,
       GPA,
       ROW_NUMBER() OVER (ORDER BY GPA DESC) AS row_num,
       RANK() OVER (ORDER BY GPA DESC) AS rnk,
       DENSE_RANK() OVER (ORDER BY GPA DESC) AS dense_rnk
FROM student_grades;
ROW_NUMBER returns a unique sequential number, even for ties; RANK gives tied rows the same number and leaves a gap after them; DENSE_RANK also gives tied rows the same number but leaves no gaps.
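The behaviour on ties can be checked with a small data set. This is a minimal sketch, assuming SQLite 3.25 or later (the first release with window functions) through Python's sqlite3 module; the student names and GPAs are hypothetical, and Name is added as a tie-breaker to ROW_NUMBER only so the output is deterministic.

```python
import sqlite3

# Hypothetical sample data with one tie at GPA 3.7.
SAMPLE_GRADES = [("Ann", 3.9), ("Bob", 3.7), ("Cid", 3.7), ("Dee", 3.5)]

RANK_QUERY = """
SELECT Name,
       GPA,
       ROW_NUMBER() OVER (ORDER BY GPA DESC, Name) AS row_num,
       RANK()       OVER (ORDER BY GPA DESC)       AS rnk,
       DENSE_RANK() OVER (ORDER BY GPA DESC)       AS dense_rnk
FROM student_grades
ORDER BY GPA DESC, Name
"""


def rank_students(grades):
    """Return (Name, GPA, row_num, rnk, dense_rnk) for each student."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE student_grades (Name TEXT, GPA REAL)")
        conn.executemany("INSERT INTO student_grades VALUES (?, ?)", grades)
        return conn.execute(RANK_QUERY).fetchall()
    finally:
        conn.close()


for row in rank_students(SAMPLE_GRADES):
    print(row)
```

Bob and Cid share rank 2 under both RANK and DENSE_RANK; Dee is then ranked 4 by RANK (a gap) and 3 by DENSE_RANK (no gap), while ROW_NUMBER simply counts 1 through 4.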
8. Calculating Delta Values
Use LAG to compare a value with an earlier row (or LEAD to compare it with a later row), which is useful for month-over-month or year-over-year differences.
-- Compare each month's sales to the previous month
SELECT month,
       sales,
       sales - LAG(sales, 1) OVER (ORDER BY month) AS mom_delta
FROM monthly_sales;
-- Compare each month's sales to the same month last year
-- (assumes one row per month with no gaps, and a month column that sorts chronologically)
SELECT month,
       sales,
       sales - LAG(sales, 12) OVER (ORDER BY month) AS yoy_delta
FROM monthly_sales;
9. Running Totals
The cumulative sum window function SUM() OVER (ORDER BY …) computes a running total.
SELECT Month,
       Revenue,
       SUM(Revenue) OVER (ORDER BY Month) AS Cumulative
FROM monthly_revenue;
10. Date-Time Manipulation
Common date‑time functions include extraction, epoch conversion, DATE_ADD, DATE_SUB, and DATE_TRUNC. They help group data by periods or reformat strings.
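EXTRACT – pulls one part, such as the year or month, out of a date or timestamp.
EXTRACT(EPOCH) – converts a timestamp to seconds since 1970-01-01 (PostgreSQL syntax).
DATE_ADD, DATE_SUB – shift a date forward or backward by an interval.
DATE_TRUNC – rounds a date down to the start of a period such as a month.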
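The operations in this list can be tried out locally. This is a rough sketch, assuming SQLite through Python's sqlite3 module; SQLite does not provide EXTRACT, DATE_ADD, DATE_SUB, or DATE_TRUNC, so the query uses its strftime, date, and julianday functions as stand-ins, and the function name date_parts is hypothetical.

```python
import sqlite3

DATE_QUERY = """
SELECT CAST(strftime('%Y', :d) AS INTEGER),  -- stand-in for EXTRACT(YEAR FROM d)
       CAST(strftime('%m', :d) AS INTEGER),  -- stand-in for EXTRACT(MONTH FROM d)
       CAST(strftime('%s', :d) AS INTEGER),  -- stand-in for EXTRACT(EPOCH FROM d)
       date(:d, '+1 day'),                   -- stand-in for DATE_ADD by one day
       date(:d, '-1 day'),                   -- stand-in for DATE_SUB by one day
       date(:d, 'start of month'),           -- stand-in for DATE_TRUNC to month
       julianday(:d) - julianday(date(:d, '-1 day'))  -- day difference, like DATEDIFF
"""


def date_parts(day):
    """Return a dict of date-time calculations for an ISO date string."""
    conn = sqlite3.connect(":memory:")
    try:
        row = conn.execute(DATE_QUERY, {"d": day}).fetchone()
    finally:
        conn.close()
    keys = ["year", "month", "epoch", "next_day", "prev_day", "month_start", "days_since_prev"]
    return dict(zip(keys, row))


print(date_parts("2024-03-15"))
```

The function names and argument order differ between engines, so each line would need to be translated back to the dialect in use.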
-- MySQL: days that were warmer than the day before
SELECT a.Id
FROM Weather a
JOIN Weather b ON DATEDIFF(a.RecordDate, b.RecordDate) = 1
WHERE a.Temperature > b.Temperature;
Linux Tech Enthusiast
