How to Optimize Scalar Subqueries in Oracle SQL for Better Performance
This article explains the nature of scalar subqueries, identifies three common performance pitfalls, and provides practical rewrite techniques—including LEFT/INNER JOIN, MERGE, and WITH‑MATERIALIZE—illustrated with three real‑world Oracle SQL cases to dramatically reduce execution time.
1. Scalar Subqueries and Pseudocode
Scalar subqueries need to pass a value from the inner query to the outer query, causing the driven table to be scanned repeatedly—similar to a nested‑loop join. The main result set acts as the driver, while the scalar query is the driven table, executed once for each distinct value of the join column.
select ename, (select dname from dept d where d.deptno = e.deptno) dname
from emp e
where e.job in ('SALESMAN', 'ANALYST');In pseudocode the same logic can be expressed as:
for i in (select distinct deptno from emp e where e.job in ('SALESMAN', 'ANALYST')):
select dname from dept d where d.deptno = i;The number of executions of the scalar part can be calculated with a simple count query:
select count(distinct deptno) from emp e where e.job in ('SALESMAN', 'ANALYST');2. Performance Issues of Scalar Subqueries
Three typical bottlenecks arise when a scalar subquery is present:
Inefficient filtering of the driver table (e.g., full‑table scans or wrong indexes) increases the number of times the scalar part is invoked.
A large result set from the driver table means many distinct join‑column values, causing the scalar query to run many times even if it uses an INDEX UNIQUE SCAN.
Poor access paths for the scalar part itself (missing indexes, bad statistics) further degrade performance.
3. Common Optimization Strategies
When performance problems are traced to scalar subqueries, the following three approaches are most effective:
Rewrite the query using LEFT JOIN (or INNER JOIN when a primary‑foreign key relationship exists) to replace the scalar subquery with a set‑based join.
For UPDATE statements, replace the scalar subquery with a MERGE statement.
If rewriting is impossible, improve the scalar part’s access path by adding appropriate indexes or hints.
Rewrites give the optimizer control over the execution plan, allowing you to force hash joins or other efficient strategies.
4. Case Studies
Case 1 – SELECT with Scalar Subquery
The original reporting query on a data‑warehouse took about two minutes, returning roughly 1.6 million rows. The scalar part (a SELECT COUNT(1) on table dhoe) was executed once per row, becoming the bottleneck.
select t.num_id,
t.create_time,
1 if_3to4,
max(case
when (select count(1) from dhoe tt where tt.order_id = t.order_id and tt.otype = '101') > 0 then 0
when (select count(1) from dhoe tt where tt.order_id = t.order_id and tt.otype = '102') > 0 then 1
else 0 end) if_hk
from order t
where off_rype = '9601'
group by t.num_id, t.create_time;Rewritten with a LEFT JOIN that pre‑aggregates the scalar part:
select num_id,
create_time,
1 if_3to4,
max(case
when b.cnt101 > 0 then 0
when c.cnt102 > 0 then 1
else 0 end) if_hk
from (select t.num_id, t.create_time, 1 if_3to4, order_id
from order t
where off_rype = '9601') a
left outer join (
select order_id,
count(case when otype = '101' then 1 end) cnt101,
count(case when otype = '102' then 1 end) cnt102
from dhoe
group by order_id) b
on a.order_id = b.order_id
group by num_id, create_time;The optimizer now chooses a hash outer join, scanning each table only once and dramatically reducing I/O.
Case 2 – WHERE Clause Scalar Subquery
The original query refreshed a dashboard every 52 seconds, with the scalar subquery executed many times under a FILTER operator.
select count(*)
from wp_info ws
inner join wp_center wa on ...
where ws.status = 'VALID'
and wa.is_del = 'V'
and (select count(1)
from wp_bas wb
left join wp_rep wr on wr.id = wb.id
where wb.WP_STATUS in (2,3,4)
and wr.is_valid = 'VALID'
and wr.created > sysdate - 7) < ws.total_num;Rewritten using a LEFT JOIN with NVL to handle possible nulls:
select count(*)
from wp_info ws
inner join wp_center wa on ...
left join (
select wr.service_no, count(1) cnt
from wp_bas wb
left join wp_rep wr on wr.id = wb.id
where wb.WP_STATUS in (2,3,4)
and wr.is_valid = 'VALID'
and wr.created > sysdate - 7
group by wr.service_no) cc
on ws.num = cc.service_no
where ws.status = 'VALID'
and wa.is_del = 'V'
and nvl(cc.cnt,0) < ws.total_num;An alternative is to materialize the scalar calculation with a WITH ... MATERIALIZE clause, which simplifies the rewrite but gives less control over the execution plan.
Case 3 – UPDATE with Scalar Subquery
The original UPDATE statement took about 20 minutes, with the scalar subquery executed 257 times, as shown by the execution plan.
UPDATE RE_RPT A
SET A.TCNT = (
SELECT NVL(SUM(G.REDCODE),0)
FROM RP_GRANT G,
(SELECT DISTINCT REDCODE, REDCODE_MD5 FROM RP_SCAN) S,
IMT_CODE V,
(SELECT DISTINCT W_ID, NAME FROM MATER WHERE SUBSTR(ORDER,1,2)='OF' AND W_ID IS NOT NULL) M
WHERE G.REDCODE = S.REDCODE
AND S.REDCODE_MD5 = V.N_CODE
AND V.W_CODE = M.W_ID
AND G.RP_CLASS = '有效'
AND G.CREATE_DATE = '2018-05-05'
AND A.NAME = M.NAME)
WHERE A.CHOOSE_TIME = '2018-05-05';Rewritten as a MERGE that pre‑aggregates the scalar part and uses an outer‑join condition ( A.NAME = M.NAME(+)) to avoid repeated scans:
MERGE INTO (select * from RE_RPT A where A.CHOOSE_TIME = '2018-05-05') A
USING (
SELECT M.NAME, SUM(G.REDCODE) SUM_CODE
FROM RP_GRANT G,
(SELECT DISTINCT REDCODE, REDCODE_MD5 FROM RP_SCAN) S,
IMT_CODE V,
(SELECT DISTINCT W_ID, NAME FROM MATER WHERE SUBSTR(ORDER,1,2)='OF' AND W_ID IS NOT NULL) M
WHERE G.REDCODE = S.REDCODE
AND S.REDCODE_MD5 = V.N_CODE
AND V.W_CODE = M.W_ID
AND G.RP_CLASS = '有效'
AND G.CREATE_DATE = '2018-05-05'
GROUP BY M.NAME) M
ON (A.NAME = M.NAME(+))
WHEN MATCHED THEN UPDATE SET A.TCNT = NVL(SUM_CODE,0);5. Conclusion
The article introduced the characteristics of scalar subqueries, demonstrated three real‑world cases, and presented generic rewrite solutions. When a SQL statement’s performance bottleneck lies in a scalar subquery, rewriting it with joins or MERGE, or adding suitable indexes, reduces physical I/O and significantly improves execution speed.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
