Why MySQL Subqueries Took Hours and How Indexes Fixed It
A MySQL 5.6 database with three tables took about 30,000 seconds (over eight hours) to find students who scored 100 in a specific subject; analyzing the execution plan, adding indexes on the filter columns, and rewriting the query as a join brought execution time down to well under a second.
The article describes a performance problem in a MySQL 5.6 database containing three tables: Course (100 rows), Student (70,000 rows), and SC (700,000 rows). The goal is to find students who scored 100 in the Chinese subject (c_id = 0).
Initial query using a sub‑query:
SELECT s.*
FROM Student s
WHERE s.s_id IN (
SELECT s_id
FROM SC sc
WHERE sc.c_id = 0 AND sc.score = 100
);
This query took 30248.271 seconds. An EXPLAIN showed that MySQL performed full table scans (type = ALL) because no indexes were used on the SC table.
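To reproduce this kind of diagnosis, the statement can be prefixed with EXPLAIN, which returns the plan without running the query. This is a minimal sketch; the comments describe what to look for, and the exact output will depend on the data and server version.

```sql
-- Show the execution plan for the sub-query version without executing it.
EXPLAIN
SELECT s.*
FROM Student s
WHERE s.s_id IN (
    SELECT s_id
    FROM SC sc
    WHERE sc.c_id = 0 AND sc.score = 100
);
-- In the result, check the `type`, `key`, and `rows` columns for each table:
--   type = ALL  means a full table scan
--   key  = NULL means no index was chosen
```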
First Optimization – Adding Indexes on Filter Columns
CREATE INDEX sc_c_id_index ON SC(c_id);
CREATE INDEX sc_score_index ON SC(score);
After creating these indexes, the same query ran in 1.054 seconds, a speed‑up of roughly 28,700× (30248.271 / 1.054), confirming that proper indexing dramatically improves query performance.
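As a variation not taken from the article: because the filter always combines c_id and score, a single composite index may serve the WHERE clause better than two single-column indexes. The index name below is hypothetical, and whether it actually helps should be checked with EXPLAIN on the real data.

```sql
-- Hypothetical alternative: one composite index covering both filter columns.
CREATE INDEX sc_c_id_score_index ON SC(c_id, score);

-- List the indexes now defined on SC to see what the optimizer can choose from.
SHOW INDEX FROM SC;

-- Re-check the plan for the filter; the `key` column shows which index was picked.
EXPLAIN
SELECT s_id
FROM SC
WHERE c_id = 0 AND score = 100;
```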
Further Investigation – Join Performance
Even with the two indexes, the execution plan still showed inefficiencies. The next step was to add an index on the join column:
CREATE INDEX sc_s_id_index ON SC(s_id);
and to rewrite the query as an explicit join:
SELECT s.*
FROM Student s
INNER JOIN SC sc ON sc.s_id = s.s_id
WHERE sc.c_id = 0 AND sc.score = 100;
This version ran in 0.057 seconds at first, but later runs were slower (≈1.07 s) because MySQL’s optimizer chose a different join order.
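When the chosen join order varies between runs, one option (not used in the article) is MySQL's STRAIGHT_JOIN modifier, which makes the optimizer join tables in the order they are listed in the FROM clause. A sketch, assuming SC should be read first so that the filter is applied before Student is looked up:

```sql
-- STRAIGHT_JOIN forces the tables to be joined in the order listed:
-- SC is filtered first, then each matching row is looked up in Student.
SELECT STRAIGHT_JOIN s.*
FROM SC sc
INNER JOIN Student s ON s.s_id = sc.s_id
WHERE sc.c_id = 0 AND sc.score = 100;
```

This trades optimizer flexibility for predictability, so the plan is worth re-checking with EXPLAIN as the data changes.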
Understanding MySQL’s Optimizer
MySQL may rewrite the IN sub‑query into a correlated EXISTS clause and drive execution from the outer query, so the inner query is evaluated once for each outer row, leading to many unnecessary row scans. The article shows execution‑plan screenshots (kept as images) that illustrate the “type = ALL” scans and the effect of different index configurations.
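The rewritten form is roughly equivalent to the following correlated query. This is an illustration of the transformation, not output captured from the server:

```sql
-- Conceptual equivalent of the IN sub-query after an IN-to-EXISTS rewrite:
-- the inner query references the outer row, so it is evaluated per Student row.
SELECT s.*
FROM Student s
WHERE EXISTS (
    SELECT 1
    FROM SC sc
    WHERE sc.s_id = s.s_id
      AND sc.c_id = 0
      AND sc.score = 100
);
```

With 70,000 Student rows and 700,000 SC rows and no usable index, the worst case is on the order of 70,000 × 700,000 = 49 billion row comparisons, which is consistent with a runtime measured in hours.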
Effective Rewrite – Filter First, Then Join
To force the optimizer to filter the SC table before joining, the author used a derived table:
SELECT s.*
FROM (
SELECT *
FROM SC sc
WHERE sc.c_id = 0 AND sc.score = 100
) t
INNER JOIN Student s ON t.s_id = s.s_id;
This version executed in 0.001 seconds, more than 50× faster than the 0.057‑second join, confirming that filtering first and then joining was the fastest of the approaches tried on this data set.
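A small variation, not from the article: the derived table only needs s_id for the join, so selecting just that column keeps the intermediate result narrower. Prefixing the statement with EXPLAIN shows how the derived table is handled in the plan.

```sql
-- Inspect the plan for the filter-first version, projecting only the join column.
EXPLAIN
SELECT s.*
FROM (
    SELECT sc.s_id
    FROM SC sc
    WHERE sc.c_id = 0 AND sc.score = 100
) t
INNER JOIN Student s ON t.s_id = s.s_id;
-- A row with select_type = DERIVED indicates the inner query is
-- evaluated on its own before the join.
```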
Key Takeaways
Nested sub‑queries in MySQL can be extremely slow when they trigger full table scans.
Creating appropriate indexes on columns used in WHERE clauses and join conditions is essential.
Rewriting sub‑queries as joins (or using derived tables) often yields better performance.
Analyzing the EXPLAIN output is crucial for understanding and guiding optimizer behavior.
Images illustrating the original and optimized execution plans are retained to show the impact of each change.
IT Architects Alliance
Discussion and exchange on system, internet, large‑scale distributed, high‑availability, and high‑performance architectures, as well as big data, machine learning, AI, and architecture adjustments with internet technologies. Includes real‑world large‑scale architecture case studies. Open to architects who have ideas and enjoy sharing.