Databases 6 min read

Understanding the Execution Order of SQL Queries

The article explains that SQL queries are not executed in the order written—SELECT is evaluated fifth after FROM, WHERE, GROUP BY, and HAVING—illustrates this logical sequence with diagrams, discusses alias handling, optimizer reordering, and shows comparable patterns in LINQ, pandas, and dplyr.

Java Captain
Java Captain
Java Captain
Understanding the Execution Order of SQL Queries

Many SQL statements start with SELECT , but the logical execution order is different. The author investigated the order and found that SELECT is actually evaluated fifth, after FROM , WHERE , GROUP BY , and HAVING .

The accompanying diagram clarifies where clauses such as WHERE , GROUP BY , window functions, ORDER BY , and LIMIT fit in the logical sequence, answering common questions like whether you can filter after a window function or use WHERE after GROUP BY .

Database engines may reorder operations for performance, so the strict logical order is not always followed during execution.

Column aliases can be confusing: a query like

SELECT CONCAT(first_name, ' ', last_name) AS full_name, count(*)
FROM table
GROUP BY full_name

appears to use the alias in GROUP BY , but the engine can rewrite it as

SELECT CONCAT(first_name, ' ', last_name) AS full_name, count(*)
FROM table
GROUP BY CONCAT(first_name, ' ', last_name)

showing that GROUP BY is still logically evaluated before SELECT . The engine also performs checks before generating an execution plan.

Optimizations can change the physical order. For example, in

SELECT * FROM owners
LEFT JOIN cats ON owners.id = cats.owner
WHERE cats.name = 'mr darcy'

filtering WHERE cats.name = 'mr darcy' before the left join can avoid processing unnecessary rows, while still producing the same result.

Similar logical ordering appears in other query languages. A LINQ example follows the FROM…WHERE…SELECT pattern:

var teenAgerStudent = from s in studentList
                      where s.Age > 12 && s.Age < 20
                      select s;

Pandas code often mirrors this order, though it is not required:

df = thing1.join(thing2)      # JOIN
df = df[df.created_at > 1000] # WHERE
df = df.groupby('something', num_yes = ('yes', 'sum')) # GROUP BY
df = df[df.num_yes > 2]       # HAVING
df = df[['num_yes', 'something1', 'something']] # SELECT
df.sort_values('sometthing', ascending=True)[:30] # ORDER BY and LIMIT
df[:30]

The article concludes that while the logical order helps understand query semantics, actual execution may differ due to engine optimizations. The original post can be found at the provided link.

optimizationSQLquery executiondatabaseswindow functions
Java Captain
Written by

Java Captain

Focused on Java technologies: SSM, the Spring ecosystem, microservices, MySQL, MyCat, clustering, distributed systems, middleware, Linux, networking, multithreading; occasionally covers DevOps tools like Jenkins, Nexus, Docker, ELK; shares practical tech insights and is dedicated to full‑stack Java development.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.