
SQL Query Optimization Techniques

This article explains how to speed up SQL queries while keeping statements concise, covering the query processing workflow, optimization goals, and practical tips such as selecting specific columns, avoiding DISTINCT, using proper indexes, preferring EXISTS over COUNT, limiting result sets, favoring WHERE over HAVING, and replacing correlated subqueries with joins.

Selected Java Interview Questions

Jul 16, 2024

In this article we discuss how to improve SQL query speed while keeping the statements concise.

Before diving into the techniques, we first look at how a database actually processes a query.

1. Query Processing Steps

Query processing is the series of steps a database performs to extract data. The SQL statement is translated into a form the database can work with, an execution strategy is chosen, and the final results are retrieved.

Query processing involves three main steps:

1. Parsing and Translation: The database first parses the SQL. Like a compiler's parser, it checks the query syntax and verifies that the referenced objects exist in the database. Because SQL is a high-level declarative language, the statement is then translated into relational algebra expressions.

2. Optimization: The same query can be evaluated in many ways. The optimizer chooses the most efficient execution plan it can find, based on how the data is stored and which relational expressions are involved.

3. Execution: The chosen execution plan is a series of basic operations performed step by step to retrieve the data. Different plans have different costs, such as disk I/O, CPU time, and, in distributed databases, communication time.
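The three steps can be observed from client code. The following is a minimal sketch, assuming SQLite through Python's standard sqlite3 module; the table layout and the helper name try_prepare are illustrative assumptions, and other databases expose their plans through their own EXPLAIN variants.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Business (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")


def try_prepare(sql):
    """Return the error message if the statement is rejected, else None."""
    try:
        conn.execute(sql)
        return None
    except sqlite3.OperationalError as exc:
        return str(exc)


# Step 1: parsing rejects bad syntax and references to unknown objects.
syntax_error = try_prepare("SELEC name FROM Business")
missing_table_error = try_prepare("SELECT name FROM NoSuchTable")

# Steps 2 and 3: EXPLAIN QUERY PLAN reports the plan the optimizer chose
# instead of running the query. Each row is (id, parent, notused, detail).
plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM Business WHERE id = 42"
).fetchall()
plan_steps = [row[3] for row in plan_rows]

for step in plan_steps:
    print(step)
```

The exact wording of the plan text differs between SQLite versions, so it is best read by a person rather than parsed strictly.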

2. SQL Query Optimization

SQL query optimization is the process of improving query performance by reducing execution time, disk accesses, and other costs, so that data is returned as quickly as possible.

The goals of SQL query optimization are:

1. Reduce response time: Minimize the time between a user request and the returned data to improve experience.

2. Reduce CPU execution time: Decrease the CPU time spent processing the query.

3. Increase throughput: Reduce the resources each query consumes so the database can serve more queries in the same amount of time.

3. Common SQL Query Optimization Tips

3.1 Use specific column names instead of SELECT *

Retrieve only the columns you need rather than every column. For example:

SELECT * FROM Business

A more efficient query is:

SELECT name, age, gender FROM Business

This query reads and transfers only the required columns, which reduces I/O and network traffic.
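The difference is easy to see from a client. Below is a small sketch, assuming SQLite via Python's sqlite3 module; the id and notes columns are hypothetical additions used only to give SELECT * something extra to fetch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Business (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, "
    "gender TEXT, notes TEXT)"  # notes is a hypothetical wide column
)
conn.execute(
    "INSERT INTO Business (name, age, gender, notes) VALUES (?, ?, ?, ?)",
    ("Ada", 36, "F", "x" * 10_000),
)

all_cols = conn.execute("SELECT * FROM Business")
wanted_cols = conn.execute("SELECT name, age, gender FROM Business")

# cursor.description lists the columns each statement returns.
all_names = [d[0] for d in all_cols.description]
wanted_names = [d[0] for d in wanted_cols.description]

# The narrow query never carries the 10,000-character notes value.
wanted_row = wanted_cols.fetchone()
print(all_names, wanted_names, wanted_row)
```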

3.2 Avoid using DISTINCT in SELECT

SELECT DISTINCT removes duplicate rows, but the database typically has to sort or hash the result set to find them, which can be costly. GROUP BY can produce the same result at a similar cost. Use DISTINCT only when duplicates are actually possible and must be removed.
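As an illustration, the sketch below (SQLite via Python's sqlite3 module, with made-up sample rows) shows DISTINCT and GROUP BY returning the same values, and a case where DISTINCT has nothing to remove because the selected columns include the primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Business (id INTEGER PRIMARY KEY, name TEXT, gender TEXT)")
conn.executemany(
    "INSERT INTO Business (name, gender) VALUES (?, ?)",
    [("Ada", "F"), ("Bob", "M"), ("Cy", "M")],
)

# Both statements deduplicate the gender values.
distinct_rows = conn.execute(
    "SELECT DISTINCT gender FROM Business ORDER BY gender"
).fetchall()
grouped_rows = conn.execute(
    "SELECT gender FROM Business GROUP BY gender ORDER BY gender"
).fetchall()

# id is the primary key, so every (id, name) row is already unique and
# DISTINCT cannot remove anything here.
with_distinct = conn.execute(
    "SELECT DISTINCT id, name FROM Business ORDER BY id"
).fetchall()
without_distinct = conn.execute(
    "SELECT id, name FROM Business ORDER BY id"
).fetchall()

print(distinct_rows, with_distinct == without_distinct)
```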

3.3 Proper use of indexes

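An index on the columns used in WHERE, JOIN, and ORDER BY clauses lets the database locate rows without scanning the whole table, which reduces the execution time of common statements. Indexes also add overhead to writes and storage, so create them for columns that frequent queries actually filter or join on.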

For example:

CREATE INDEX index_optimizer ON Business(id);
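Whether an index is used can be checked by comparing the query plan before and after creating it. The sketch below assumes SQLite via Python's sqlite3 module; the index name idx_business_company and the helper plan_for are hypothetical, and the CompanyID column is borrowed from the join examples later in the article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Business (id INTEGER PRIMARY KEY, name TEXT, CompanyID INTEGER)"
)


def plan_for(sql):
    """Return the optimizer's plan steps for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


query = "SELECT name FROM Business WHERE CompanyID = 7"

# Without an index on CompanyID the only option is a full table scan.
plan_before = plan_for(query)

conn.execute("CREATE INDEX idx_business_company ON Business(CompanyID)")

# With the index in place the plan names it as the access path.
plan_after = plan_for(query)

print("before:", plan_before)
print("after: ", plan_after)
```

Checking the plan this way is more reliable than assuming an index helps, since the optimizer may ignore an index that does not match the query's predicates.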

3.4 Check record existence with EXISTS instead of COUNT

EXISTS and COUNT() can both check for the presence of rows. EXISTS is usually more efficient because it can stop at the first matching row, whereas COUNT() has to count every matching row.

Example:

SELECT count(id) FROM Business

A more efficient version, for databases that allow EXISTS in the SELECT list (for example MySQL, PostgreSQL, and SQLite):

SELECT EXISTS (SELECT 1 FROM Business)
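In application code the existence check usually carries a filter. Here is a minimal sketch, assuming SQLite via Python's sqlite3 module; the function names and sample rows are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Business (id INTEGER PRIMARY KEY, name TEXT, CompanyID INTEGER)"
)
conn.executemany(
    "INSERT INTO Business (name, CompanyID) VALUES (?, ?)",
    [("Ada", 1), ("Bob", 1), ("Cy", 2)],
)


def company_has_business(company_id):
    """True if at least one Business row references the company."""
    (found,) = conn.execute(
        "SELECT EXISTS (SELECT 1 FROM Business WHERE CompanyID = ?)",
        (company_id,),
    ).fetchone()
    return bool(found)  # SQLite returns 1 or 0


def count_businesses(company_id):
    """Number of matching rows; use only when the number itself is needed."""
    (total,) = conn.execute(
        "SELECT COUNT(*) FROM Business WHERE CompanyID = ?",
        (company_id,),
    ).fetchone()
    return total


print(company_has_business(1), company_has_business(99), count_businesses(1))
```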

3.5 Use LIMIT to restrict result set size

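When only part of a result is needed, add LIMIT so the query returns just those rows. Returning fewer rows means less data to transfer and, in many cases, less work for the database.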
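A common use is paging through a large table. The sketch below assumes SQLite via Python's sqlite3 module and a hypothetical fetch_page helper. LIMIT and OFFSET are supported by MySQL, PostgreSQL, and SQLite, while some other databases use different syntax such as TOP or FETCH FIRST.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Business (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO Business (name) VALUES (?)",
    [(f"biz{i}",) for i in range(1, 101)],
)


def fetch_page(page, page_size=10):
    """Return one page of rows; ORDER BY keeps the pages stable."""
    offset = (page - 1) * page_size
    return conn.execute(
        "SELECT id, name FROM Business ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()


first_page = fetch_page(1)
second_page = fetch_page(2)
print(len(first_page), second_page[0])
```

Without an ORDER BY clause, which rows LIMIT returns is not guaranteed, so the two are normally used together.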

3.6 Prefer WHERE over HAVING

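The HAVING clause filters groups after GROUP BY has processed the rows, so rows that could have been discarded earlier are still grouped first. WHERE filters rows before grouping, which gives the database less data to aggregate. Reserve HAVING for conditions on aggregate values.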

For example:

SELECT c.ID, c.CompanyName, b.CreatedDate FROM Business b
JOIN Company c ON b.CompanyID = c.ID
GROUP BY c.ID, c.CompanyName, b.CreatedDate
HAVING b.CreatedDate BETWEEN '2020-01-01' AND '2020-12-31'

A more efficient rewrite:

SELECT c.ID, c.CompanyName, b.CreatedDate FROM Business b
JOIN Company c ON b.CompanyID = c.ID
WHERE b.CreatedDate BETWEEN '2020-01-01' AND '2020-12-31'
GROUP BY c.ID, c.CompanyName, b.CreatedDate
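The two forms above return the same rows, which can be confirmed on a small data set. This sketch assumes SQLite via Python's sqlite3 module with made-up sample data, and adds ORDER BY only so the results can be compared directly. The last query shows the case where HAVING is still the right tool: a condition on an aggregate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Company (ID INTEGER PRIMARY KEY, CompanyName TEXT);
    CREATE TABLE Business (ID INTEGER PRIMARY KEY, CompanyID INTEGER, CreatedDate TEXT);
    INSERT INTO Company VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO Business (CompanyID, CreatedDate) VALUES
        (1, '2020-03-01'), (1, '2021-05-01'), (2, '2020-07-15'), (2, '2020-07-15');
""")

having_rows = conn.execute("""
    SELECT c.ID, c.CompanyName, b.CreatedDate FROM Business b
    JOIN Company c ON b.CompanyID = c.ID
    GROUP BY c.ID, c.CompanyName, b.CreatedDate
    HAVING b.CreatedDate BETWEEN '2020-01-01' AND '2020-12-31'
    ORDER BY c.ID, b.CreatedDate
""").fetchall()

where_rows = conn.execute("""
    SELECT c.ID, c.CompanyName, b.CreatedDate FROM Business b
    JOIN Company c ON b.CompanyID = c.ID
    WHERE b.CreatedDate BETWEEN '2020-01-01' AND '2020-12-31'
    GROUP BY c.ID, c.CompanyName, b.CreatedDate
    ORDER BY c.ID, b.CreatedDate
""").fetchall()

# HAVING is still required when the condition depends on an aggregate.
busy_companies = conn.execute("""
    SELECT c.ID, COUNT(*) FROM Business b
    JOIN Company c ON b.CompanyID = c.ID
    WHERE b.CreatedDate BETWEEN '2020-01-01' AND '2020-12-31'
    GROUP BY c.ID
    HAVING COUNT(*) > 1
""").fetchall()

print(where_rows, busy_companies)
```

Some optimizers move such a HAVING predicate ahead of the grouping on their own, so the measured gain depends on the database; writing the filter in WHERE makes the intent explicit either way.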

3.7 Avoid correlated subqueries

A correlated subquery references the outer query and is conceptually evaluated once for each outer row, which can be very slow on large tables. For example:

SELECT b.Name, b.Phone, b.Address, b.Zip,
       (SELECT CompanyName FROM Company WHERE ID = b.CompanyID) AS CompanyName
FROM Business b

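Rewriting the query as a JOIN lets the database process the two tables together instead of row by row. (An inner JOIN drops Business rows that have no matching Company, where the subquery would return NULL; use LEFT JOIN if those rows must be kept.)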

SELECT b.Name, b.Phone, b.Address, b.Zip, c.CompanyName
FROM Business b
JOIN Company c ON b.CompanyID = c.ID
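The difference in results between the subquery and the join shows up as soon as a Business row has no company. The sketch below assumes SQLite via Python's sqlite3 module, trims the column list to Name for brevity, and uses made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Company (ID INTEGER PRIMARY KEY, CompanyName TEXT);
    CREATE TABLE Business (ID INTEGER PRIMARY KEY, Name TEXT, CompanyID INTEGER);
    INSERT INTO Company VALUES (1, 'Acme');
    INSERT INTO Business (Name, CompanyID) VALUES ('Ada', 1), ('Bob', NULL);
""")

# Correlated subquery: every Business row is returned, NULL when unmatched.
subquery_rows = conn.execute("""
    SELECT b.Name,
           (SELECT CompanyName FROM Company WHERE ID = b.CompanyID) AS CompanyName
    FROM Business b
    ORDER BY b.Name
""").fetchall()

# Inner JOIN: rows without a matching Company disappear.
inner_join_rows = conn.execute("""
    SELECT b.Name, c.CompanyName
    FROM Business b
    JOIN Company c ON b.CompanyID = c.ID
    ORDER BY b.Name
""").fetchall()

# LEFT JOIN: same rows as the subquery version.
left_join_rows = conn.execute("""
    SELECT b.Name, c.CompanyName
    FROM Business b
    LEFT JOIN Company c ON b.CompanyID = c.ID
    ORDER BY b.Name
""").fetchall()

print(subquery_rows, inner_join_rows, left_join_rows)
```

If CompanyID is declared NOT NULL with a foreign key to Company, the inner JOIN and the subquery return the same rows and the distinction does not matter.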

4. Summary

This article introduced several techniques for optimizing SQL queries. The factor that usually has the greatest impact on query speed is the proper use of indexes. Hopefully the content helps you improve your database performance.
