Optimizing a Slow MySQL Query with Derived Tables and UNION ALL
This article analyzes a slow MySQL aggregation query that joins a large table with a derived table built from many UNION ALL subqueries, explains the execution plan and derived‑table materialization, and presents two optimization approaches—adding taskname indexes and rewriting the query or using a temporary indexed table—to dramatically reduce execution time while preserving result semantics.
The problem is a statistical SQL query that aggregates data from bm_id joined with a derived table composed of dozens of UNION ALL subqueries, which runs for hours without finishing.
Original SQL (simplified): select name, count(name) from bm_id a left JOIN ( select TaskName from up_pro_accept_v3_bdc union all select TaskName from up_pro_accept_v3_hsjs ... union all select TaskName from up_pro_accept_v3_wygl ) t on a.zxi = t.TaskName group by name
The execution plan shows two main steps: (1) a sequential scan of each sub‑table to build the derived table t , each using full table scans; (2) a join between bm_id (driving table) and the derived table, also a full scan, which dominates the runtime.
In MySQL 5.7, derived tables are materialized into temporary tables. The optimizer can merge a derived table into the outer query when optimizer_switch='derived_merge=ON' , but this feature is disabled by many constructs such as UNION ALL , DISTINCT , GROUP BY , etc.
Optimization 1 – Index on taskname : Since every sub‑query selects TaskName , creating an index on the taskname column of each source table allows index scans instead of full scans.
Index creation example: CREATE INDEX idx_taskname ON up_pro_accept_v3_bdc(taskname); CREATE INDEX idx_taskname ON up_pro_accept_v3_hsjs(taskname);
After adding the indexes, the sub‑queries use index scans, but the derived table t is still scanned fully because the derived_merge feature cannot be applied due to the UNION ALL chain.
Optimization 2 – Query rewrite : Attempt to push the join into each sub‑query so the derived table can be merged. The rewritten SQL looks like:
SELECT name, count(name) FROM ( SELECT name FROM bm_id LEFT JOIN up_pro_accept_v3_bdc bdc ON bm_id.zxi = bdc.TaskName UNION ALL SELECT name FROM bm_id LEFT JOIN up_pro_accept_v3_hsjs hsjs ON bm_id.zxi = hsjs.TaskName ... UNION ALL SELECT name FROM bm_id LEFT JOIN up_pro_accept_v3_wygl wygl ON bm_id.zxi = wygl.TaskName ) t GROUP BY t.name;
Testing shows this rewrite reduces execution time to about 13 seconds, but it changes semantics: the original query performed a single LEFT JOIN with the derived table, while the rewritten version performs multiple LEFT JOINs, producing duplicate rows.
To keep semantics unchanged, a temporary table approach is used.
Temporary table creation: CREATE TABLE `tmp_up` ( `taskname` varchar(500) DEFAULT NULL, KEY `idx_taskname` (`taskname`) );
Insert all task names: INSERT INTO tmp_up SELECT taskname FROM up_pro_accept_v3_bdc UNION ALL SELECT taskname FROM up_pro_accept_v3_hsjs ...;
Final query using the temporary table: SELECT name, count(name) FROM bm_id a LEFT JOIN (SELECT taskname FROM tmp_up) t ON a.zxi = t.taskname GROUP BY name;
This version runs in roughly 13 seconds, uses index scans, and returns the same result set as the original query.
Summary of solutions: (1) Replace LEFT JOIN with INNER JOIN where possible for correctness and speed; (2) Use a temporary indexed table to replace the derived sub‑query, which is more cumbersome but preserves result semantics and offers fast execution.
References: MySQL Monthly – Derived Table Optimization CSDN article on derived tables iMySQL discussion MySQL 5.7 Reference Manual – Derived Table Optimization
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.