Doubling SQL Speed: From Execution Plans to Business Logic Overhaul
This real‑world case study follows a seemingly simple intersect query that took about 20 seconds. Analyzing the execution plan and rewriting the SQL cut it to roughly 10 seconds. The team then explored parallel execution and finally proposed changing the business logic, adding a timestamp column so that the check can run incrementally.
1. Examine Execution Plan
The original SQL simply joins TM_TASK_T and TM_TASK_HIS_T to find the rows present in both. Its execution plan shows that TM_TASK_T is accessed only through the index on TASK_ID, via an INDEX FAST FULL SCAN, so no table‑row lookups are needed on that side.
Data volumes are large: the HIS table holds over 5 million rows, while the TASK table approaches 40 million rows, and a HASH JOIN is already the appropriate method.
From the plan, there appears to be little room for further tuning.
2. Equivalent SQL Rewrite
Recognizing that TASK_ID is the primary key in both tables, the query can be rewritten to first intersect the two primary‑key indexes and then fetch the remaining columns from the history table. The rewritten SQL reduces execution time from about 20 seconds to roughly 10 seconds, effectively doubling performance.
The side‑by‑side execution plans in the original article's images show a significant reduction in I/O reads after the rewrite.
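The source does not reproduce either SQL statement, so the following is only a minimal sketch of the rewrite idea, run against an in‑memory SQLite database from Python. The table names and TASK_ID come from the article; the TASK_NAME column, the sample rows, and both query texts are assumptions made for illustration. SQLite will not reproduce the Oracle plans or timings; the sketch only checks that the two forms return the same rows.

```python
import sqlite3

# In-memory stand-in for the two Oracle tables (TASK_NAME is a hypothetical column).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TM_TASK_T     (TASK_ID INTEGER PRIMARY KEY, TASK_NAME TEXT);
CREATE TABLE TM_TASK_HIS_T (TASK_ID INTEGER PRIMARY KEY, TASK_NAME TEXT);
INSERT INTO TM_TASK_T     VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO TM_TASK_HIS_T VALUES (2, 'b'), (3, 'c'), (4, 'd');
""")

# Assumed shape of the original query: a plain join on the primary key.
original_sql = """
SELECT h.TASK_ID, h.TASK_NAME
FROM TM_TASK_HIS_T h
JOIN TM_TASK_T t ON t.TASK_ID = h.TASK_ID
ORDER BY h.TASK_ID
"""

# Assumed shape of the rewrite: intersect the two key sets first,
# then fetch the remaining columns from the history table only.
rewritten_sql = """
SELECT h.TASK_ID, h.TASK_NAME
FROM TM_TASK_HIS_T h
WHERE h.TASK_ID IN (
    SELECT TASK_ID FROM TM_TASK_T
    INTERSECT
    SELECT TASK_ID FROM TM_TASK_HIS_T
)
ORDER BY h.TASK_ID
"""

original_rows = conn.execute(original_sql).fetchall()
rewritten_rows = conn.execute(rewritten_sql).fetchall()
print(original_rows == rewritten_rows)  # True
```

The rewrite is only equivalent because TASK_ID is unique in both tables; with duplicate keys, the join could return more rows than the IN form.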
3. Technical Solution Adjustment
The check consists of ten similar, independent queries, so they could run in parallel. Total runtime would then drop from the sum of ten runs at roughly 12 seconds each (about 120 seconds, which hits the 120‑second timeout) to roughly the duration of the slowest single query, on the order of 10 seconds. However, the developers objected: parallel execution would require manually triggering ten separate tasks, which would hurt both user experience and maintainability.
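The sum‑versus‑maximum arithmetic can be sketched as follows. This is an illustration only: the article does not say how the tasks would be scheduled, so the `run_checks_in_parallel` helper and the stand‑in `make_check` callables are hypothetical, and the 12‑second figure is the approximate per‑query time quoted above.

```python
from concurrent.futures import ThreadPoolExecutor


def run_checks_in_parallel(checks, max_workers=10):
    """Run independent zero-argument callables concurrently; keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(check) for check in checks]
        return [future.result() for future in futures]


def make_check(n):
    """Hypothetical stand-in for one of the ten similar queries."""
    def check():
        return f"check-{n}: done"
    return check


checks = [make_check(n) for n in range(10)]
results = run_checks_in_parallel(checks)

per_query_seconds = 12                                  # approximate figure from the text
sequential_estimate = per_query_seconds * len(checks)   # sum of all runs
parallel_estimate = per_query_seconds                   # slowest single run

print(sequential_estimate, parallel_estimate)  # 120 12
```

The parallel estimate assumes the database can serve all ten queries at once without slowing any of them down, which a real system may not guarantee.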
4. Re‑examine the Original Requirement
The core requirement is to ensure uniqueness between the current and historical tables. The process currently writes data to the history table before cleaning the current table, and a scheduled job periodically checks for duplicates. When duplicates are found, a manual task is launched to clean them.
5. Explore Incremental Detection
Processing only incremental data (rows added to the history table since the last run) could dramatically reduce the work done by each check. The natural candidates for an incremental marker are CREATION_DATE and LAST_UPDATE_DATE in the history table, but neither is reliable: LAST_UPDATE_DATE does not change when a row is moved to history, and CREATION_DATE has the same problem, so neither records when the row actually arrived in the history table.
Consequently, the team proposes adding a new column to the history table that records the exact time a row is inserted, defaulting to SYSDATE. This column would serve both audit purposes and incremental detection.
Developers initially resisted because adding the column would seem to require code changes. A default value on the column mitigates this: provided existing INSERT statements list their columns explicitly, they keep working unchanged and the database fills in the timestamp.
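A minimal sketch of the incremental check, again using an in‑memory SQLite database from Python rather than Oracle. The column name ARCHIVED_AT, the `last_checked` watermark, and the sample rows are assumptions for illustration; SQLite's `CURRENT_TIMESTAMP` default stands in for the SYSDATE default the team proposes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TM_TASK_T (TASK_ID INTEGER PRIMARY KEY);
CREATE TABLE TM_TASK_HIS_T (
    TASK_ID     INTEGER PRIMARY KEY,
    -- Hypothetical new column: filled by the database at insert time.
    ARCHIVED_AT TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
INSERT INTO TM_TASK_T VALUES (1), (2), (3);
-- A row archived before the previous check (timestamp set explicitly).
INSERT INTO TM_TASK_HIS_T (TASK_ID, ARCHIVED_AT) VALUES (2, '2020-01-01 00:00:00');
""")

# Watermark saved by the previous run of the scheduled check (assumed value).
last_checked = "2020-01-02 00:00:00"

# Existing-style inserts that never mention the new column still work;
# the default supplies the insertion time.
conn.executemany("INSERT INTO TM_TASK_HIS_T (TASK_ID) VALUES (?)", [(3,), (9,)])

# Only history rows inserted after the watermark are compared with the current table.
incremental_sql = """
SELECT h.TASK_ID
FROM TM_TASK_HIS_T h
JOIN TM_TASK_T t ON t.TASK_ID = h.TASK_ID
WHERE h.ARCHIVED_AT > ?
ORDER BY h.TASK_ID
"""
new_duplicates = [row[0] for row in conn.execute(incremental_sql, (last_checked,))]
print(new_duplicates)  # [3]
```

TASK_ID 2 is also present in both tables, but it was archived before the watermark, so this run skips it on the assumption that an earlier run already reported it.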
6. Summary
This optimization journey illustrates a full‑path approach: start with execution‑plan analysis, then rewrite equivalent SQL, adjust the technical solution (parallelism), and finally refine the business logic by adding a timestamp column for incremental processing. When SQL‑level tweaks alone cannot meet performance goals, broader architectural and process changes become necessary.
dbaplus Community