SQL Query Optimization: Sorting, GROUP BY, Pagination, and Index Strategies
This article explains how to improve MySQL query performance by optimizing ORDER BY, GROUP BY, and LIMIT operations, covering index usage, file‑sort algorithms, covering indexes, prefix indexes, and practical code examples for large‑scale data sets.
When dealing with massive data volumes, SQL query performance becomes critical for system response time and user experience. As database size grows, developers must master optimization techniques for sorting, grouping, and pagination.
1. Sorting Optimization
1.1 Problem
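If the WHERE column is already indexed, why index the ORDER BY column as well? The WHERE index only narrows down which rows are read; unless an index also returns those rows in the requested order, MySQL still has to sort them in a separate step.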
1.2 Answer
MySQL supports two sorting methods: FileSort (CPU‑intensive, may create temporary files) and Index Sort (uses the index order, much faster).
Index sort guarantees data is already ordered, eliminating the need for an extra sort step.
FileSort performs the sort in memory or on disk, which is slower for large result sets.
1.3 Optimization Suggestions
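Index the columns used in WHERE to avoid full table scans, and the columns used in ORDER BY to avoid FileSort.
When WHERE and ORDER BY use the same column, a single-column index serves both; when they use different columns, use a composite index with the filter column first and the sort column after it.
If an index cannot be used for sorting, tune FileSort instead, for example by raising sort_buffer_size and selecting only the columns the query needs.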
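The suggestions above can be sketched roughly as follows. The index name idx_classid_age, the filter value, and the buffer size are illustrative assumptions, not values from the original article; the student columns are the ones used elsewhere in this article.

```sql
-- Hypothetical composite index: filter column first, sort column second.
CREATE INDEX idx_classid_age ON student (classid, age);

-- Equality on the leading column plus a sort on the next column:
-- the index order can satisfy ORDER BY, so no separate sort should be needed.
EXPLAIN SELECT id, classid, age
FROM student
WHERE classid = 100
ORDER BY age;

-- If FileSort cannot be avoided, inspect the sort buffer and check
-- whether sorts are spilling into multiple merge passes.
SHOW VARIABLES LIKE 'sort_buffer_size';
SHOW SESSION STATUS LIKE 'Sort_merge_passes';

-- Illustrative value only (1 MiB); size it for your own workload.
SET SESSION sort_buffer_size = 1048576;

-- Clean up the hypothetical index.
DROP INDEX idx_classid_age ON student;
```

A steadily growing Sort_merge_passes counter is a common hint that the sort buffer is too small for the result sets being sorted.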
1.4 Test Cases
Drop the existing indexes on the student and class tables:
# Method 1:
DROP INDEX idx_monitor ON class;
DROP INDEX idx_cid ON student;
DROP INDEX idx_age ON student;
DROP INDEX idx_name ON student;
DROP INDEX idx_age_name_classid ON student;
DROP INDEX idx_age_classid_name ON student;
Then run EXPLAIN on queries with different ORDER BY and LIMIT combinations, and check the output for Using filesort, type=index, and which index the optimizer picks.
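As a rough sketch of such a test run, the statements below recreate one of the composite indexes dropped above and compare a few ORDER BY variants. The column names (age, classid, NAME) are inferred from the index names and later examples in this article. The outcomes noted in the comments are what one would typically expect; the optimizer may decide differently depending on table size and statistics.

```sql
CREATE INDEX idx_age_classid_name ON student (age, classid, NAME);

-- Sort columns match the index prefix, but SELECT * without LIMIT needs a
-- table lookup for every row, so the optimizer may still prefer a full
-- scan followed by a filesort.
EXPLAIN SELECT * FROM student ORDER BY age, classid;

-- With a small LIMIT, reading the first entries in index order is
-- usually cheaper, so the index is more likely to be used.
EXPLAIN SELECT * FROM student ORDER BY age, classid LIMIT 10;

-- Skips the leading column (age): the index order cannot be used for
-- sorting, so Using filesort is the typical result.
EXPLAIN SELECT * FROM student ORDER BY classid LIMIT 10;

-- Mixed sort directions on an all-ascending index typically also
-- fall back to a filesort.
EXPLAIN SELECT * FROM student ORDER BY age DESC, classid ASC LIMIT 10;
```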
2. GROUP BY Optimization
GROUP BY follows the same left‑most prefix rule as ORDER BY; a suitable index can satisfy grouping without additional sorting.
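Prefer placing filter conditions in WHERE rather than HAVING, so that rows are discarded before they are grouped; keep HAVING for conditions on aggregate results.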
Avoid unnecessary ORDER BY, GROUP BY, or DISTINCT on large result sets.
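A minimal sketch of these points, assuming the student table has a classid column as in the other examples. The index name idx_classid, the classid range, and the count threshold are made up for illustration.

```sql
-- Hypothetical index so that rows can be read already grouped by classid.
CREATE INDEX idx_classid ON student (classid);

-- Filter written in HAVING: as written, groups are built first
-- and then filtered.
SELECT classid, COUNT(*) AS students
FROM student
GROUP BY classid
HAVING classid BETWEEN 100 AND 200;

-- Same result with the filter in WHERE: only matching rows are grouped.
SELECT classid, COUNT(*) AS students
FROM student
WHERE classid BETWEEN 100 AND 200
GROUP BY classid;

-- Conditions on an aggregate cannot move to WHERE and stay in HAVING.
SELECT classid, COUNT(*) AS students
FROM student
WHERE classid BETWEEN 100 AND 200
GROUP BY classid
HAVING COUNT(*) > 30;

-- Clean up the hypothetical index.
DROP INDEX idx_classid ON student;
```

Writing the row-level filter in WHERE keeps the intent explicit, regardless of how a given MySQL version optimizes HAVING.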
3. Pagination Optimization
For deep offsets such as LIMIT 2000000,10, MySQL must produce the first 2,000,010 rows in sorted order and then discard all but the last 10. Two strategies are recommended:
Use an index to retrieve only the primary keys for the desired page, then join back to the table for the remaining columns.
When the primary key is auto-increment, replace the offset with a range condition (e.g., WHERE id > 2000000 ORDER BY id LIMIT 10). This matches the offset only while the id sequence has no gaps, so it becomes unreliable once rows have been deleted.
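One common way to make the range-condition strategy tolerate deleted rows is to carry the last id of the previous page forward instead of computing the boundary from the page number. This is a sketch of that variation, not something the original article prescribes; the user variable @last_id and its value are placeholders.

```sql
-- First page: plain ordered read on the primary key.
SELECT id, age, NAME
FROM student
ORDER BY id
LIMIT 10;

-- The application remembers the largest id it received.
-- Placeholder value for illustration only.
SET @last_id = 2000000;

-- Next page: seek directly past the last seen id. Gaps left by deleted
-- rows do not matter, because the boundary is an actual id that was
-- returned, not one derived from an offset.
SELECT id, age, NAME
FROM student
WHERE id > @last_id
ORDER BY id
LIMIT 10;
```

The trade-off is that this only supports moving to the next page, not jumping straight to an arbitrary page number.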
EXPLAIN SELECT * FROM student t, (SELECT id FROM student ORDER BY id LIMIT 2000000,10) a WHERE t.id = a.id;
4. Covering Indexes
A covering index contains all columns required by the query, allowing MySQL to satisfy the query using only the index (no table lookup).
Reduces I/O by avoiding the secondary‑index‑to‑primary‑key lookup.
Turns random I/O into sequential I/O.
Improves cache efficiency because the index is smaller than the full row.
However, maintaining extra indexes incurs storage and write‑cost overhead.
4.1 Example
CREATE INDEX idx_age_name ON student(age, NAME);
EXPLAIN SELECT id, age, NAME FROM student WHERE age <> 28;
Selecting a column that is not in the index (e.g., classid) breaks the covering property and forces a table lookup.
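The difference can be checked by running both variants side by side. This sketch assumes the idx_age_name index from the example above exists, that student is an InnoDB table, and that id is its primary key. InnoDB secondary indexes store the primary key, which is why id counts as covered.

```sql
-- All selected columns (id, age, NAME) are available in idx_age_name:
-- look for "Using index" in the Extra column.
EXPLAIN SELECT id, age, NAME
FROM student
WHERE age <> 28;

-- classid is not part of the index, so the query can no longer be
-- answered from the index alone; "Using index" should disappear and the
-- optimizer may even choose a full table scan.
EXPLAIN SELECT id, age, NAME, classid
FROM student
WHERE age <> 28;
```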
4.2 Pros and Cons
Pros: eliminates the extra lookup from the secondary index back into the table, improves I/O patterns, and reduces the volume of data read.
Cons: extra storage and maintenance cost; must balance against write workload.
5. String Indexing
When indexing VARCHAR columns, you can create a full-length index or a prefix index (e.g., email(6), which indexes only the first six characters). Prefix indexes save space but may prevent the index from being covering.
ALTER TABLE teacher ADD INDEX idx_email(email);
-- or
ALTER TABLE teacher ADD INDEX idx_email_prefix(email(6));
With a full-length index, MySQL locates the matching entry in the index and fetches the full row with a single primary-key lookup. With a prefix index, several rows can share the same prefix, so MySQL has to fetch each candidate row and compare the full value.
5.1 Impact on Covering Indexes
Prefix indexes cannot serve as covering indexes for queries that need the full column value, so they sacrifice the performance benefit of covering indexes.
Overall, choose the index length based on selectivity and storage considerations.
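A common way to pick a prefix length is to compare the selectivity of several candidate prefixes against the full column. The query below is a sketch using the teacher.email column from the example above; the candidate lengths 4 to 7 are arbitrary.

```sql
-- Selectivity = distinct values / total rows; closer to the full-column
-- value means fewer rows share a prefix.
SELECT
  COUNT(DISTINCT email) / COUNT(*)          AS sel_full,
  COUNT(DISTINCT LEFT(email, 4)) / COUNT(*) AS sel_4,
  COUNT(DISTINCT LEFT(email, 5)) / COUNT(*) AS sel_5,
  COUNT(DISTINCT LEFT(email, 6)) / COUNT(*) AS sel_6,
  COUNT(DISTINCT LEFT(email, 7)) / COUNT(*) AS sel_7
FROM teacher;
```

The shortest prefix whose selectivity is close to sel_full is usually a reasonable candidate; beyond that point, extra characters mostly add storage.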
Source: juejin.cn/post/7442342414533640233