Understanding the Execution Order of SQL Queries
This article explains why SQL queries do not start with SELECT, outlines the typical logical execution order of clauses such as FROM, WHERE, GROUP BY, HAVING, SELECT, ORDER BY, and LIMIT, and discusses how database engines may reorder operations for optimization, with code examples and comparisons to LINQ and pandas.
Many SQL statements begin with SELECT, but the logical execution order of a query is different: the SELECT clause is evaluated later, typically as the fifth step. The usual order is FROM (together with any JOINs), WHERE, GROUP BY, HAVING, SELECT, ORDER BY, and LIMIT.
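The ordering above can be sketched with Python's built-in sqlite3 module. The cats table, its columns, and its rows are invented for illustration, and the numbered comments mark the logical order only, not what SQLite's planner literally does.

```python
import sqlite3

# Hypothetical data: the cats table and its contents are assumptions for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cats (name TEXT, owner TEXT, age INTEGER);
    INSERT INTO cats VALUES
        ('mr darcy', 'ana', 3), ('biscuit', 'ana', 7),
        ('luna', 'ben', 2),     ('miso', 'ben', 1),
        ('pickle', 'cy', 9),    ('olive', 'cy', 4), ('fig', 'cy', 5);
""")

query = """
    SELECT owner, COUNT(*) AS n_cats   -- 5. SELECT: compute the output columns
    FROM cats                          -- 1. FROM: pick the source rows
    WHERE age >= 2                     -- 2. WHERE: filter individual rows
    GROUP BY owner                     -- 3. GROUP BY: form groups
    HAVING COUNT(*) >= 2               -- 4. HAVING: filter whole groups
    ORDER BY n_cats DESC               -- 6. ORDER BY: sort what is left
    LIMIT 1                            -- 7. LIMIT: cut the sorted result
"""
rows = conn.execute(query).fetchall()
print(rows)
```

In this data ben owns two cats, but WHERE drops miso before the groups are formed, so ben's group holds a single row and fails the HAVING test. That is the visible effect of WHERE running before GROUP BY.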
A diagram (referenced in the original) helps answer questions such as whether WHERE can filter on the result of a GROUP BY (it cannot, because WHERE runs before GROUP BY; HAVING exists for that), whether window function results can be filtered in WHERE or HAVING (they cannot, because window functions are evaluated as part of SELECT, which runs after both), and when ORDER BY and LIMIT are applied (at the very end).
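A minimal sketch of the window function case, assuming a SQLite build with window function support (version 3.25 or later) and an invented cats table. The direct filter is rejected, while wrapping the query in a subquery works because the outer WHERE runs after the inner SELECT has produced the row numbers.

```python
import sqlite3

# Hypothetical data for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cats (name TEXT, owner TEXT, age INTEGER);
    INSERT INTO cats VALUES
        ('mr darcy', 'ana', 3), ('biscuit', 'ana', 7),
        ('luna', 'ben', 2),     ('miso', 'ben', 1);
""")

# Attempt 1: filter on a window function directly in WHERE.
# WHERE is evaluated before SELECT, so SQLite refuses to prepare the statement.
try:
    conn.execute("""
        SELECT name FROM cats
        WHERE ROW_NUMBER() OVER (PARTITION BY owner ORDER BY age DESC) = 1
    """)
    direct_filter_error = None
except sqlite3.OperationalError as exc:
    direct_filter_error = str(exc)

# Attempt 2: compute the window function in an inner SELECT,
# then filter on its result in the outer query.
oldest_per_owner = conn.execute("""
    SELECT name, owner
    FROM (
        SELECT name, owner,
               ROW_NUMBER() OVER (PARTITION BY owner ORDER BY age DESC) AS rn
        FROM cats
    )
    WHERE rn = 1
    ORDER BY owner
""").fetchall()

print(direct_filter_error)
print(oldest_per_owner)
```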
The article notes that database engines may not follow this exact sequence; they perform optimizations that reorder operations without changing the result set.
Column aliases and GROUP BY
Some SQL dialects allow column aliases in GROUP BY. For example:
SELECT CONCAT(first_name, ' ', last_name) AS full_name, COUNT(*)
FROM table
GROUP BY full_name
This looks as if GROUP BY runs after SELECT, since it refers to an alias defined there. The order does not need to change, though: the engine can rewrite the query so that GROUP BY uses the original expression, CONCAT(first_name, ' ', last_name), in place of the alias, and still evaluate GROUP BY first.
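The rewrite can be checked by hand in SQLite, which happens to accept aliases in GROUP BY. This is a sketch under assumptions: the people table and its rows are invented, and the || operator stands in for CONCAT because older SQLite versions do not provide that function.

```python
import sqlite3

# Hypothetical table and rows, used only to compare the two spellings.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (first_name TEXT, last_name TEXT);
    INSERT INTO people VALUES
        ('Ada', 'Lovelace'), ('Ada', 'Lovelace'), ('Alan', 'Turing');
""")

# Version 1: group by the alias defined in SELECT.
by_alias = conn.execute("""
    SELECT first_name || ' ' || last_name AS full_name, COUNT(*)
    FROM people
    GROUP BY full_name
    ORDER BY full_name
""").fetchall()

# Version 2: group by the underlying expression, as a rewrite would.
by_expression = conn.execute("""
    SELECT first_name || ' ' || last_name AS full_name, COUNT(*)
    FROM people
    GROUP BY first_name || ' ' || last_name
    ORDER BY full_name
""").fetchall()

print(by_alias)
print(by_expression)
```

Both queries return the same rows, which is consistent with the alias being shorthand for the expression rather than evidence that SELECT ran first.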
Optimization and reordering
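In practice, engines may reorder joins, filters, and aggregations for performance. For instance, in the query:
SELECT *
FROM owners LEFT JOIN cats ON owners.id = cats.owner
WHERE cats.name = 'mr darcy'
the engine does not have to match up every row of both tables first. It can filter for cats named 'mr darcy' before joining, which avoids unnecessary work and does not change the result here: an owner with no matching cat gets a NULL cats.name from the left join and is discarded by the WHERE condition either way.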
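The equivalence can be illustrated by writing the filter-first plan out by hand and comparing results. The schema follows the query above, but the rows are invented, and the second query is a hand-written stand-in for what an optimizer might do, not SQLite's actual plan.

```python
import sqlite3

# Hypothetical rows; owner 3 has no cats, so the left join gives it NULLs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE owners (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE cats (name TEXT, owner INTEGER);
    INSERT INTO owners VALUES (1, 'ana'), (2, 'ben'), (3, 'cy');
    INSERT INTO cats VALUES ('mr darcy', 1), ('luna', 1), ('miso', 2);
""")

# As written: join everything, then filter.
join_then_filter = conn.execute("""
    SELECT owners.id, owners.name, cats.name
    FROM owners LEFT JOIN cats ON owners.id = cats.owner
    WHERE cats.name = 'mr darcy'
""").fetchall()

# Hand-reordered: filter cats first, then join only the surviving rows.
# A plain JOIN is enough, because the WHERE above rejects NULL cat names anyway.
filter_then_join = conn.execute("""
    SELECT owners.id, owners.name, darcy.name
    FROM (SELECT * FROM cats WHERE name = 'mr darcy') AS darcy
    JOIN owners ON owners.id = darcy.owner
""").fetchall()

print(join_then_filter)
print(filter_then_join)
```

Note that the reordering is only safe because the filter is on the right-hand table and rejects NULLs. Keeping the LEFT JOIN after pre-filtering cats would bring back every owner with NULL columns, which is a different result.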
LINQ and pandas equivalents
LINQ queries are written in FROM … WHERE … SELECT order, which the original article illustrates with a C# example. Pandas code can be written in the same logical sequence (JOIN → WHERE → GROUP BY → HAVING → SELECT → ORDER BY/LIMIT). Pandas does not enforce this order, and developers often move a step, such as an early filter, for performance.
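A minimal pandas sketch of that sequence, with one line per SQL clause. The owners and cats frames, their column names, and the thresholds are assumptions made up for this example.

```python
import pandas as pd

# Hypothetical frames standing in for two SQL tables.
owners = pd.DataFrame({"id": [1, 2, 3], "owner_name": ["ana", "ben", "cy"]})
cats = pd.DataFrame({
    "name": ["mr darcy", "biscuit", "luna", "miso", "pickle", "olive"],
    "owner": [1, 1, 2, 2, 3, 3],
    "age": [3, 7, 2, 1, 9, 4],
})

df = owners.merge(cats, left_on="id", right_on="owner")       # JOIN
df = df[df["age"] >= 2]                                       # WHERE
df = df.groupby("owner_name", as_index=False).agg(
    n_cats=("name", "count")
)                                                             # GROUP BY
df = df[df["n_cats"] >= 2]                                    # HAVING
df = df[["owner_name", "n_cats"]]                             # SELECT
df = df.sort_values(
    ["n_cats", "owner_name"], ascending=[False, True]
).head(1)                                                     # ORDER BY + LIMIT

result = list(df.itertuples(index=False, name=None))
print(result)
```

Because each step is an ordinary assignment, the filter on age could just as well be applied to cats before the merge, which is the kind of manual reordering the paragraph above describes.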
The article also mentions that R's dplyr provides comparable abstractions for SQL‑like queries.
Original source: "SQL queries don’t start with SELECT" – https://jvns.ca/blog/2019/10/03/sql-queries-don-t-start-with-select/