Unveiling the Real Execution Order of SQL Queries (Why SELECT Isn’t First)
Discover the true execution sequence of SQL queries, why SELECT runs fifth, how window functions and GROUP BY interact, and the impact of engine optimizations, illustrated with diagrams and code examples spanning aliases, LINQ, and pandas.
SQL Query Execution Order
Many SQL queries start with SELECT, but the actual execution order is different. SELECT is executed fifth, after FROM, WHERE, GROUP BY, and HAVING. This order determines where window functions can be filtered and why they cannot appear in WHERE.
This diagram answers the following questions:
Can WHERE be used after GROUP BY? (No, WHERE is before GROUP BY.)
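Can the result of a window function be filtered? (Not in the WHERE or HAVING of the same query: window functions are part of SELECT, which runs after WHERE and GROUP BY. To filter on the result, compute it in a subquery and filter in the outer query.)
Can ORDER BY be based on GROUP BY columns? (Yes, ORDER BY runs after SELECT, so it can sort on grouped columns and aggregates.)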
When is LIMIT executed? (At the very end.)
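The window function question can be checked with a minimal sketch using Python's built-in sqlite3 module. The sales table and its rows are invented for illustration, and the sketch assumes a bundled SQLite recent enough to support window functions (3.25 or later). Other engines may word the error differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('north', 10), ('north', 30), ('south', 20), ('south', 5);
""")

# A window function inside WHERE is rejected: WHERE is evaluated before
# the SELECT step in which window functions are computed.
try:
    conn.execute("""
        SELECT region, amount
        FROM sales
        WHERE ROW_NUMBER() OVER (PARTITION BY region ORDER BY amount DESC) = 1
    """)
    where_rejected = False
except sqlite3.Error:
    where_rejected = True

# Workaround: compute the window function in an inner query, then filter
# on its result in the outer query's WHERE.
top_per_region = conn.execute("""
    SELECT region, amount
    FROM (
        SELECT region, amount,
               ROW_NUMBER() OVER (PARTITION BY region ORDER BY amount DESC) AS rn
        FROM sales
    )
    WHERE rn = 1
    ORDER BY region
""").fetchall()

print(where_rejected)
print(top_per_region)
```

The inner query finishes its own SELECT step first, so by the time the outer WHERE runs, rn is an ordinary column.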
Database engines may not follow this strict order because they apply optimizations to execute queries faster.
So:
If you want to know whether a query is valid or what it returns, refer to the diagram.
The diagram is not suitable when dealing with query performance or index-related concerns.
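As a rough illustration of using the logical order to predict what a query returns, the sketch below runs one query through Python's built-in sqlite3 module. The orders table, its columns, and its rows are made up for this example. The numbered SQL comments mark the logical step of each clause, not what the engine physically does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, status TEXT, total INTEGER);
    INSERT INTO orders VALUES
        ('ann', 'paid', 50), ('ann', 'paid', 70), ('ann', 'void', 900),
        ('bob', 'paid', 20), ('cat', 'paid', 200), ('dan', 'paid', 130);
""")

rows = conn.execute("""
    SELECT customer, SUM(total) AS spent   -- 5. compute the output columns
    FROM orders                            -- 1. pick the source rows
    WHERE status = 'paid'                  -- 2. filter individual rows
    GROUP BY customer                      -- 3. collapse rows into groups
    HAVING SUM(total) > 100                -- 4. filter whole groups
    ORDER BY spent DESC                    -- 6. sort the result
    LIMIT 2                                -- 7. keep the first rows
""").fetchall()

print(rows)
```

Ann's voided 900 order never reaches SUM because WHERE removes it before grouping, so her total is 120 rather than 1020. Bob's group is dropped by HAVING, and ORDER BY can sort on the spent alias because it runs after SELECT.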
Confounding Factor: Column Aliases
Many SQL implementations allow syntax like:
SELECT CONCAT(first_name, ' ', last_name) AS full_name, count(*) FROM table GROUP BY full_name
It appears that GROUP BY runs after SELECT because it references the alias, but the engine can rewrite the query as:
SELECT CONCAT(first_name, ' ', last_name) AS full_name, count(*) FROM table GROUP BY CONCAT(first_name, ' ', last_name)
Thus GROUP BY still executes first, and the engine performs checks before generating the execution plan.
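The equivalence of the two forms can be checked with a small sketch using Python's built-in sqlite3 module. Two things here are assumptions made for the example: the table is named people (table itself is a reserved word), and the || operator stands in for CONCAT, which older SQLite versions do not provide. SQLite is one of the engines that accepts an alias in GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (first_name TEXT, last_name TEXT);
    INSERT INTO people VALUES
        ('Ada', 'Lovelace'), ('Ada', 'Lovelace'), ('Alan', 'Turing');
""")

# GROUP BY refers to the alias defined in SELECT.
by_alias = conn.execute("""
    SELECT first_name || ' ' || last_name AS full_name, COUNT(*)
    FROM people
    GROUP BY full_name
    ORDER BY full_name
""").fetchall()

# GROUP BY repeats the full expression, as in the rewritten query.
by_expression = conn.execute("""
    SELECT first_name || ' ' || last_name AS full_name, COUNT(*)
    FROM people
    GROUP BY first_name || ' ' || last_name
    ORDER BY full_name
""").fetchall()

print(by_alias)
print(by_alias == by_expression)
```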
Database May Not Follow This Order (Optimization)
In practice, engines may reorder JOIN, WHERE, and GROUP BY for performance, as long as the result does not change.
Example query:
SELECT * FROM owners LEFT JOIN cats ON owners.id = cats.owner WHERE cats.name = 'mr darcy'
If we only need cats named “mr darcy”, filtering before the join is more efficient and does not alter the result.
The engine performs many other optimizations; I am not an expert, so I will not elaborate further.
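For the “mr darcy” query above, the claim that filtering first does not change the result can be checked with a sketch using Python's built-in sqlite3 module. The sample rows are invented, and the hand-written second query is only one possible rewrite, not necessarily what any engine does internally. It uses an inner JOIN because the WHERE on cats.name already discards the NULL-padded rows that the LEFT JOIN would add for owners without a matching cat.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE owners (id INTEGER, name TEXT);
    CREATE TABLE cats (name TEXT, owner INTEGER);
    INSERT INTO owners VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO cats VALUES ('mr darcy', 1), ('tiger', 1), ('mr darcy', 2);
""")

# Logical order as written: join every owner to their cats, then filter.
join_then_filter = conn.execute("""
    SELECT * FROM owners
    LEFT JOIN cats ON owners.id = cats.owner
    WHERE cats.name = 'mr darcy'
    ORDER BY owners.id
""").fetchall()

# Hand-written rewrite: filter cats first, then join the few rows left.
filter_then_join = conn.execute("""
    SELECT * FROM owners
    JOIN (SELECT * FROM cats WHERE name = 'mr darcy') AS c
      ON owners.id = c.owner
    ORDER BY owners.id
""").fetchall()

print(join_then_filter)
print(join_then_filter == filter_then_join)
```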
LINQ Queries Start with FROM
LINQ in C# and VB.NET follows the FROM…WHERE…SELECT order. Example:
var teenAgerStudent = from s in studentList where s.Age > 12 && s.Age < 20 select s;
Pandas queries are similar, though the order is not mandatory. Example:
df = thing1.join(thing2) # JOIN
df = df[df.created_at > 1000] # WHERE
df = df.groupby('something', as_index=False).agg(num_yes=('yes', 'sum')) # GROUP BY
df = df[df.num_yes > 2] # HAVING
df = df[['num_yes', 'something']] # SELECT (only grouped and aggregated columns remain after GROUP BY)
df = df.sort_values('something', ascending=True) # ORDER BY
df = df[:30] # LIMIT
Writing code in this logical order improves readability and mirrors the order in which SQL logically processes a query.
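The same sequence of steps can also be sketched with only the Python standard library, which makes each stage explicit. The rows below are invented sample data. The column names mirror the pandas example, and the JOIN step is omitted for brevity.

```python
from collections import defaultdict

rows = [
    {"something": "a", "yes": 2, "created_at": 1500},
    {"something": "a", "yes": 1, "created_at": 2000},
    {"something": "b", "yes": 5, "created_at": 500},   # removed by WHERE
    {"something": "b", "yes": 1, "created_at": 3000},
    {"something": "c", "yes": 4, "created_at": 1200},
]

# WHERE: filter individual rows before any grouping happens.
kept = [r for r in rows if r["created_at"] > 1000]

# GROUP BY: sum "yes" for each value of "something".
num_yes = defaultdict(int)
for r in kept:
    num_yes[r["something"]] += r["yes"]

# HAVING: filter whole groups on the aggregate.
groups = {key: total for key, total in num_yes.items() if total > 2}

# SELECT: choose the output columns.
selected = [(total, key) for key, total in groups.items()]

# ORDER BY and LIMIT: sort by "something", then keep at most 30 rows.
result = sorted(selected, key=lambda pair: pair[1])[:30]

print(result)
```

Group b is dropped because its large row was filtered out by the WHERE step before the sum was taken. Swapping the WHERE and GROUP BY steps would give b a total of 6 and change the result.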
Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"
