
10 Advanced MySQL Query Techniques Every Data Engineer Should Know

This article presents ten advanced SQL concepts for MySQL users: CTEs, recursive CTEs, temporary functions, CASE WHEN pivots, EXCEPT vs NOT IN, self-joins, ranking window functions, delta calculations, running totals, and date-time manipulation. Each is explained with a short example and a practical SQL snippet.

Programmer XiaoFu

As data volumes grow, demand for professionals fluent in SQL continues to rise, especially at the intermediate and advanced levels. Drawing on insights from StrataScratch founder Nathan Rosidi, the author lists ten MySQL concepts that come up repeatedly in data-science interviews.

1. Common Table Expressions (CTEs)

CTEs let you break complex queries into modular, temporary result sets, similar to dividing an article into sections. They simplify nested subqueries and improve readability.

SELECT name, salary
FROM People
WHERE name IN (
    SELECT DISTINCT name FROM population
    WHERE country = 'Canada' AND city = 'Toronto'
)
AND salary >= (
    SELECT AVG(salary) FROM salaries WHERE gender = 'Female'
);

A CTE version makes the same logic clearer:

WITH toronto_ppl AS (
    SELECT DISTINCT name
    FROM population
    WHERE country = 'Canada' AND city = 'Toronto'
), avg_female_salary AS (
    SELECT AVG(salary) AS avgSalary
    FROM salaries
    WHERE gender = 'Female'
)
SELECT name, salary
FROM People
WHERE name IN (SELECT name FROM toronto_ppl)
AND salary >= (SELECT avgSalary FROM avg_female_salary);

2. Recursive CTEs

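Recursive CTEs reference themselves, enabling hierarchical queries such as organizational charts or file-system trees. They consist of three parts: an anchor query, a recursive member, and a termination condition. In MySQL the CTE must be declared with WITH RECURSIVE, and the recursion stops once the recursive member returns no new rows.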

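WITH RECURSIVE org_structure AS (
    -- Anchor: top-level staff with no manager
    SELECT id, manager_id FROM staff_members WHERE manager_id IS NULL
    UNION ALL
    -- Recursive member: staff who report to someone already in the result
    SELECT sm.id, sm.manager_id
    FROM staff_members sm
    INNER JOIN org_structure os ON os.id = sm.manager_id
)
SELECT id, manager_id FROM org_structure;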
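The termination condition can also be made explicit. The following is a minimal sketch, assuming the same staff_members table; the depth column and the cap of 10 levels are illustrative choices, not part of the original example.

```sql
WITH RECURSIVE org_structure AS (
    -- Anchor: top-level staff start at depth 1
    SELECT id, manager_id, 1 AS depth
    FROM staff_members
    WHERE manager_id IS NULL
    UNION ALL
    -- Recursive member: each report sits one level below its manager
    SELECT sm.id, sm.manager_id, os.depth + 1
    FROM staff_members sm
    INNER JOIN org_structure os ON os.id = sm.manager_id
    WHERE os.depth < 10  -- hypothetical cap: stop expanding below level 10
)
SELECT id, manager_id, depth
FROM org_structure
ORDER BY depth, id;
```

An explicit cap like this guards against cycles in the data, for example a row that is accidentally its own manager. MySQL also aborts a recursive CTE that exceeds the cte_max_recursion_depth system variable.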

3. Temporary Functions

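Temporary functions let you encapsulate reusable logic within a query, keeping code clean and avoiding repetition. MySQL itself has no CREATE TEMPORARY FUNCTION statement; the feature comes from engines such as BigQuery, and the closest MySQL equivalent is a stored function created with CREATE FUNCTION.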

SELECT name,
       CASE WHEN tenure < 1 THEN 'analyst'
            WHEN tenure BETWEEN 1 AND 3 THEN 'associate'
            WHEN tenure BETWEEN 3 AND 5 THEN 'senior'
            WHEN tenure > 5 THEN 'vp'
            ELSE 'n/a'
       END AS seniority
FROM employees;

Using a temporary function (BigQuery syntax shown here), the same logic becomes:

CREATE TEMPORARY FUNCTION get_seniority(tenure INT64) AS (
    CASE WHEN tenure < 1 THEN "analyst"
         WHEN tenure BETWEEN 1 AND 3 THEN "associate"
         WHEN tenure BETWEEN 3 AND 5 THEN "senior"
         WHEN tenure > 5 THEN "vp"
         ELSE "n/a"
    END
);
SELECT name, get_seniority(tenure) AS seniority FROM employees;
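CREATE TEMPORARY FUNCTION and the INT64 type are BigQuery syntax and do not run on MySQL. A rough MySQL counterpart is a stored function, sketched below under the assumption that tenure is an integer number of years. Unlike a temporary function, it persists in the schema until it is dropped.

```sql
-- Sketch of a MySQL stored function mirroring get_seniority.
-- A single RETURN statement needs no BEGIN ... END block or DELIMITER change.
CREATE FUNCTION get_seniority(tenure INT)
RETURNS VARCHAR(10)
DETERMINISTIC
RETURN CASE
           WHEN tenure < 1 THEN 'analyst'
           WHEN tenure BETWEEN 1 AND 3 THEN 'associate'
           WHEN tenure BETWEEN 3 AND 5 THEN 'senior'
           WHEN tenure > 5 THEN 'vp'
           ELSE 'n/a'
       END;

SELECT name, get_seniority(tenure) AS seniority FROM employees;

-- Clean up when the function is no longer needed.
DROP FUNCTION get_seniority;
```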

4. CASE WHEN Pivoting Data

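Beyond conditional logic, CASE WHEN can pivot rows into columns when each CASE expression is wrapped in an aggregate and the query groups by the row key. For the monthly revenue table below, such a query returns one row per id and one column per month.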

-- Input
+----+----------+-------+
| id | revenue  | month |
+----+----------+-------+
| 1  | 8000     | Jan   |
| 2  | 9000     | Jan   |
| 3  | 10000    | Feb   |
| 1  | 7000     | Feb   |
| 1  | 6000     | Mar   |
+----+----------+-------+

-- Output
+----+------------+------------+------------+
| id | Jan_Revenue| Feb_Revenue| Mar_Revenue|
+----+------------+------------+------------+
| 1  | 8000       | 7000       | 6000       |
| 2  | 9000       | NULL       | NULL       |
| 3  | NULL       | 10000      | NULL       |
+----+------------+------------+------------+
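
A query along the following lines would produce that output. It is a sketch: the table name revenue_by_month is an assumption, since the source does not name the input table.

```sql
-- Assumed table: revenue_by_month(id, revenue, month)
SELECT id,
       -- Each CASE keeps revenue only for its month; SUM ignores the NULLs
       SUM(CASE WHEN month = 'Jan' THEN revenue END) AS Jan_Revenue,
       SUM(CASE WHEN month = 'Feb' THEN revenue END) AS Feb_Revenue,
       SUM(CASE WHEN month = 'Mar' THEN revenue END) AS Mar_Revenue
FROM revenue_by_month
GROUP BY id
ORDER BY id;
```

Omitting the ELSE branch matters here: an id with no rows for a month gets NULL rather than 0, which matches the NULL cells in the output table.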

5. EXCEPT vs NOT IN

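Both operators return rows from one query that have no match in another, but they differ in several ways. EXCEPT compares entire rows, removes duplicates from its result, and requires both queries to return the same number of columns. NOT IN compares the expression on its left against the values returned by a subquery and keeps duplicate rows. NULL handling differs too: EXCEPT treats two NULLs as equal when comparing rows, whereas NOT IN returns no rows at all if the subquery yields a NULL. In MySQL, EXCEPT is available from version 8.0.31.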
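
The contrast can be sketched with two hypothetical tables, customers(email) and unsubscribed(email); neither appears in the original article.

```sql
-- EXCEPT (MySQL 8.0.31+): distinct emails in customers
-- that do not appear in unsubscribed.
SELECT email FROM customers
EXCEPT
SELECT email FROM unsubscribed;

-- NOT IN: keeps duplicate customer rows. The IS NOT NULL guard stops a
-- single NULL in the subquery from emptying the whole result.
SELECT email
FROM customers
WHERE email NOT IN (
    SELECT email FROM unsubscribed WHERE email IS NOT NULL
);
```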

6. Self‑Join

A self‑join links a table to itself, useful when hierarchical relationships are stored in a single table. Example: find employees whose salary exceeds that of their manager.

SELECT a.Name AS Employee
FROM Employee AS a
JOIN Employee AS b ON a.ManagerID = b.Id
WHERE a.Salary > b.Salary;

7. RANK vs DENSE_RANK vs ROW_NUMBER

These window functions, available in MySQL 8.0 and later, assign ranking numbers to rows. ROW_NUMBER() gives a unique sequential number, RANK() gives the same number to ties and leaves gaps, and DENSE_RANK() gives the same number to ties without gaps.

SELECT Name, GPA,
       ROW_NUMBER() OVER (ORDER BY GPA DESC) AS row_num,
       RANK() OVER (ORDER BY GPA DESC) AS rnk,
       DENSE_RANK() OVER (ORDER BY GPA DESC) AS dense_rnk
FROM student_grades;

The original article illustrates the difference with an image: Daniel receives rank 3 with DENSE_RANK but rank 4 with RANK, because two of the three students ranked above him are tied.
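
To make the tie behaviour concrete, here is a small sketch with made-up data; the names and GPAs are invented for illustration and are not the article's dataset. The extra Name tie-breaker in ROW_NUMBER() is an added choice to make its numbering deterministic.

```sql
WITH sample_grades AS (
    SELECT 'Ana' AS Name, 3.9 AS GPA
    UNION ALL SELECT 'Ben', 3.7
    UNION ALL SELECT 'Cara', 3.7
    UNION ALL SELECT 'Dev', 3.5
)
SELECT Name, GPA,
       ROW_NUMBER() OVER (ORDER BY GPA DESC, Name) AS row_num,
       RANK()       OVER (ORDER BY GPA DESC)       AS rnk,
       DENSE_RANK() OVER (ORDER BY GPA DESC)       AS dense_rnk
FROM sample_grades
ORDER BY row_num;
-- Name  GPA  row_num  rnk  dense_rnk
-- Ana   3.9  1        1    1
-- Ben   3.7  2        2    2
-- Cara  3.7  3        2    2
-- Dev   3.5  4        4    3
```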

8. Calculating Delta Values

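To compare values across periods, use LAG() to read a value from an earlier row or LEAD() to read one from a later row in the window order. Rows with no such neighbour yield NULL. The examples below assume monthly_sales holds exactly one row per month, with month stored as a sortable date; a gap in the data would make LAG(sales, 12) point at the wrong month.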

# Compare each month's sales to the previous month
SELECT month, sales,
       sales - LAG(sales, 1) OVER (ORDER BY month) AS month_over_month
FROM monthly_sales;

# Compare each month's sales to the same month last year
SELECT month, sales,
       sales - LAG(sales, 12) OVER (ORDER BY month) AS year_over_year
FROM monthly_sales;
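
LEAD() works the same way in the forward direction, and the delta can be turned into a percentage. This is a sketch under the same assumptions about monthly_sales; the output column names are hypothetical.

```sql
SELECT month, sales,
       -- How much sales change going into the following month
       LEAD(sales, 1) OVER w - sales AS change_to_next_month,
       -- Month-over-month change in percent; NULLIF avoids division by zero
       ROUND(100 * (sales - LAG(sales, 1) OVER w)
             / NULLIF(LAG(sales, 1) OVER w, 0), 1) AS mom_pct
FROM monthly_sales
WINDOW w AS (ORDER BY month);
```

The named window w keeps the three window expressions consistent with one another.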

9. Running Totals

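A windowed SUM() with an ORDER BY clause computes a cumulative total: each row's value is the sum of all rows up to and including it in that order.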

SELECT Month,
       Revenue,
       SUM(Revenue) OVER (ORDER BY Month) AS Cumulative
FROM monthly_revenue;
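
When ORDER BY is present and no frame is given, the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which treats rows with the same Month value as peers and gives them the same total. If the table might hold several rows per month, an explicit ROWS frame makes the total advance row by row. The sketch below assumes the same monthly_revenue table.

```sql
SELECT Month,
       Revenue,
       SUM(Revenue) OVER (
           ORDER BY Month
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS Cumulative
FROM monthly_revenue;
```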

10. Date/Time Manipulation

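Common MySQL functions include DATE_ADD, DATE_SUB, and DATEDIFF. DATE_TRUNC, familiar from PostgreSQL and BigQuery, does not exist in MySQL, where DATE_FORMAT is a common substitute. Example: find the records whose temperature is higher than on the previous day.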

SELECT a.Id
FROM Weather AS a
JOIN Weather AS b
  ON DATEDIFF(a.RecordDate, b.RecordDate) = 1
WHERE a.Temperature > b.Temperature;
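
The same comparison can be written with DATE_ADD, and DATE_FORMAT can stand in for the missing DATE_TRUNC. Both queries are sketches that assume RecordDate is a DATE column; the aliases month_start and avg_temp are hypothetical.

```sql
-- Previous-day comparison via DATE_ADD: matching on a computed date
-- lets MySQL use an index on RecordDate for the join.
SELECT a.Id
FROM Weather AS a
JOIN Weather AS b
  ON a.RecordDate = DATE_ADD(b.RecordDate, INTERVAL 1 DAY)
WHERE a.Temperature > b.Temperature;

-- Truncating to the first day of the month without DATE_TRUNC.
SELECT DATE_FORMAT(RecordDate, '%Y-%m-01') AS month_start,
       AVG(Temperature) AS avg_temp
FROM Weather
GROUP BY month_start
ORDER BY month_start;
```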

These ten techniques equip SQL practitioners with the tools needed for complex data‑analysis tasks and interview challenges.
