
Understanding the Execution Order of SQL Queries

This article explains why SQL queries do not start with SELECT, outlines the typical logical execution order of clauses such as FROM, WHERE, GROUP BY, HAVING, SELECT, ORDER BY, and LIMIT, and discusses how database engines may reorder operations for optimization, with code examples and comparisons to LINQ and pandas.

Architecture Digest

Most SQL queries are written with SELECT first, but that is not the order in which they are logically evaluated: the SELECT clause is the fifth step. The usual logical order is FROM (including any JOIN), WHERE, GROUP BY, HAVING, SELECT, ORDER BY, and LIMIT.
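As a minimal sketch of that order, the query below uses every clause once and numbers each in a comment by its logical step. The cats table, its columns, and its rows are invented for illustration, and SQLite (through Python's sqlite3 module) is only one convenient engine to run it on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cats (owner TEXT, name TEXT, age INTEGER);  -- hypothetical table
    INSERT INTO cats VALUES
        ('ann', 'mr darcy', 5),
        ('ann', 'bingley',  2),
        ('ann', 'kitty',    0),
        ('bob', 'wickham',  7),
        ('cy',  'lydia',    3),
        ('cy',  'jane',     4);
""")

rows = conn.execute("""
    SELECT owner, COUNT(*) AS adult_cats   -- 5. compute the output columns
    FROM cats                              -- 1. pick the source rows
    WHERE age >= 1                         -- 2. drop kittens before grouping
    GROUP BY owner                         -- 3. one group per owner
    HAVING COUNT(*) >= 2                   -- 4. keep groups with 2+ adult cats
    ORDER BY owner DESC                    -- 6. sort the surviving groups
    LIMIT 1                                -- 7. keep only the first row
""").fetchall()

print(rows)  # [('cy', 2)]
```

The WHERE step removes 'kitty' before any counting happens, so ann's group has two cats rather than three, and bob's single-cat group is only discarded later, at the HAVING step.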

A diagram (referenced in the original) helps answer questions such as whether WHERE can appear after GROUP BY (it cannot), whether window function results can be filtered in WHERE (they cannot, because window functions are computed in the SELECT step, which runs after both WHERE and GROUP BY), and when ORDER BY and LIMIT are applied (at the very end).
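The window function point can be checked with a small experiment. This is a sketch that assumes an SQLite build with window function support (3.25 or newer) and an invented cats table. The first query tries to filter on the window function in WHERE and is rejected. The second computes it in an inner SELECT and filters in the outer query, which is one common workaround rather than something the original article prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cats (owner TEXT, name TEXT, age INTEGER);  -- hypothetical table
    INSERT INTO cats VALUES
        ('ann', 'mr darcy', 5),
        ('ann', 'bingley',  2),
        ('bob', 'wickham',  7);
""")

# WHERE runs before SELECT, so the window function has no value yet.
try:
    conn.execute("""
        SELECT owner, name FROM cats
        WHERE ROW_NUMBER() OVER (PARTITION BY owner ORDER BY age DESC) = 1
    """).fetchall()
    direct_filter_error = None
except sqlite3.OperationalError as exc:
    direct_filter_error = str(exc)

# Compute the window function in an inner SELECT, then filter outside it.
oldest_per_owner = conn.execute("""
    SELECT owner, name FROM (
        SELECT owner, name,
               ROW_NUMBER() OVER (PARTITION BY owner ORDER BY age DESC) AS rn
        FROM cats
    )
    WHERE rn = 1
    ORDER BY owner
""").fetchall()

print(direct_filter_error)
print(oldest_per_owner)
```

The outer WHERE is legal because, from the outer query's point of view, rn is just a column of its FROM source.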

The article notes that database engines may not follow this exact sequence; they perform optimizations that reorder operations without changing the result set.

Column aliases and GROUP BY

Some SQL dialects allow column aliases in GROUP BY. For example:

-- people is a placeholder table name; TABLE itself is a reserved word
SELECT CONCAT(first_name, ' ', last_name) AS full_name, COUNT(*)
FROM people
GROUP BY full_name

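This looks as if GROUP BY runs after SELECT, since it refers to an alias defined there. It does not have to: the engine can rewrite the query so that GROUP BY uses the original expression, CONCAT(first_name, ' ', last_name), in place of the alias, and then evaluate GROUP BY first as usual.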
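A quick way to see that the alias is only shorthand is to run both spellings and compare. The sketch below assumes SQLite, which accepts aliases in GROUP BY, and uses the || operator instead of CONCAT because older SQLite versions lack that function. The people table and its rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (first_name TEXT, last_name TEXT);  -- hypothetical table
    INSERT INTO people VALUES
        ('Ada',  'Lovelace'),
        ('Ada',  'Lovelace'),
        ('Alan', 'Turing');
""")

# Grouping by the alias defined in SELECT.
by_alias = conn.execute("""
    SELECT first_name || ' ' || last_name AS full_name, COUNT(*)
    FROM people
    GROUP BY full_name
    ORDER BY full_name
""").fetchall()

# Grouping by the expression the alias stands for.
by_expression = conn.execute("""
    SELECT first_name || ' ' || last_name AS full_name, COUNT(*)
    FROM people
    GROUP BY first_name || ' ' || last_name
    ORDER BY full_name
""").fetchall()

print(by_alias)                   # [('Ada Lovelace', 2), ('Alan Turing', 1)]
print(by_alias == by_expression)  # True
```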

Optimization and reordering

In practice, engines may reorder joins, filters, and aggregations for performance. For instance, in the query:

SELECT *
FROM owners LEFT JOIN cats ON owners.id = cats.owner
WHERE cats.name = 'mr darcy'

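the engine can apply the cats.name filter before performing the join, instead of matching up every row of both tables first. The reordering is safe here because the WHERE condition discards owners without a matching cat anyway (their cats.name is NULL), so the result set is unchanged.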
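The equivalence can be checked by hand. The sketch below runs the query as written, then a manual rewrite that filters cats first, and compares the rows. The table contents and the darcy_cats alias are invented, and the rewrite only imitates what an optimizer might do; it is not a claim about how any particular engine plans this query. A third variant moves the filter into the ON clause of the LEFT JOIN to show a rewrite that is not equivalent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE owners (id INTEGER, name TEXT);              -- hypothetical data
    CREATE TABLE cats (id INTEGER, owner INTEGER, name TEXT);
    INSERT INTO owners VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO cats VALUES
        (10, 1, 'mr darcy'),
        (11, 1, 'bingley'),
        (12, 2, 'mr darcy');
""")

# The query as written: join everything, then filter.
as_written = sorted(conn.execute("""
    SELECT *
    FROM owners LEFT JOIN cats ON owners.id = cats.owner
    WHERE cats.name = 'mr darcy'
""").fetchall())

# Manual rewrite: filter cats first, then join only the survivors.
filter_first = sorted(conn.execute("""
    SELECT *
    FROM owners
    JOIN (SELECT * FROM cats WHERE name = 'mr darcy') AS darcy_cats
      ON owners.id = darcy_cats.owner
""").fetchall())

# Not equivalent: with the filter in ON, the LEFT JOIN keeps catless owners.
filter_in_on = conn.execute("""
    SELECT *
    FROM owners LEFT JOIN cats
      ON owners.id = cats.owner AND cats.name = 'mr darcy'
""").fetchall()

print(as_written == filter_first)          # True
print(len(as_written), len(filter_in_on))  # 2 3
```

The extra row in the last variant is owner 'cy' padded with NULLs, which is exactly the row the original WHERE clause would have thrown away.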

LINQ and pandas equivalents

LINQ queries follow a FROM … WHERE … SELECT order, illustrated with a C# example. Pandas code can be written in a similar logical sequence (JOIN → WHERE → GROUP BY → HAVING → SELECT → ORDER BY/LIMIT), though developers often reorder steps for performance.
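To make that sequence concrete without depending on any library, here is a plain-Python sketch (not pandas or LINQ code) that performs each logical step as its own statement, in the same order. The owners and cats records and the thresholds are invented for illustration.

```python
from collections import defaultdict

owners = [{"id": 1, "name": "ann"}, {"id": 2, "name": "bob"}]  # hypothetical data
cats = [
    {"owner": 1, "name": "mr darcy", "age": 5},
    {"owner": 1, "name": "bingley", "age": 2},
    {"owner": 2, "name": "wickham", "age": 7},
    {"owner": 2, "name": "kitty", "age": 0},
]

# FROM / JOIN: pair each owner with their cats.
joined = [(o, c) for o in owners for c in cats if o["id"] == c["owner"]]

# WHERE: filter individual rows.
adults = [(o, c) for o, c in joined if c["age"] >= 1]

# GROUP BY: collect rows per owner name.
groups = defaultdict(list)
for o, c in adults:
    groups[o["name"]].append(c)

# HAVING: filter whole groups.
kept = {owner: cs for owner, cs in groups.items() if len(cs) >= 2}

# SELECT: compute the output columns (owner, cat count, oldest age).
selected = [(owner, len(cs), max(c["age"] for c in cs)) for owner, cs in kept.items()]

# ORDER BY / LIMIT: sort, then truncate.
result = sorted(selected)[:10]

print(result)  # [('ann', 2, 5)]
```

Each step only sees what the previous one produced, which is why a row-level filter cannot refer to a per-group count and why the output columns do not exist until the SELECT step.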

The article also mentions that R's dplyr provides comparable abstractions for SQL‑like queries.

Original source: "SQL queries don’t start with SELECT" – https://jvns.ca/blog/2019/10/03/sql-queries-don-t-start-with-select/