Analyzing and Optimizing Slow Scalar Subqueries in OceanBase 3.2.3.3
This article examines why a scalar subquery in an OceanBase 3.2.3.3 SQL statement takes over 1000 seconds, breaks down the execution plan, identifies the costly nested-loop behavior, and presents a rewrite using WITH and LEFT JOIN that reduces the cost from 788 million to 3.6 million and cuts runtime to about 10 seconds.
1 Problem Description
Database version: OceanBase 3.2.3.3. The following SQL statement runs for more than 1000 seconds. This article uses this example to discuss the reasons behind the slowness of scalar subqueries and the corresponding optimization methods.
select
rq.processinstid processinstid,
rq.question_id questionId,
rq.question_no questionNo,
to_char(rq.rev_start_date, 'yyyy-MM-dd') revStartDate,
(
select e.name
from e
where e.category_code = 'REV_SOURCE'
and e.code = rq.rev_source
) revSource,
(
select e.name
from e
where e.category_code = 'QUESTION_TYPE'
and e.code = rq.question_type
) questionType,
rq.question_summary questionSummary,
rq.question_desc questionDesc,
to_char(rq.question_discover_date, 'yyyy-MM-dd') questionDiscoverDate,
rq.aud_project_type audProjectType,
(
select d.dept_name
from d
where d.dept_id = rq.check_dept
) checkDept,
(
select to_char(wm_concat(distinct(k.org_name)))
from o, k
where o.question_id = rq.question_id
and o.ASC_ORG = k.org_id
and o.REFORM_TYPE = '0'
) ascOrg,
(
select to_char(wm_concat(distinct(k.dept_name)))
from o, fnd_dept_t k
where o.question_id = rq.question_id
and o.MAIN_REV_DEPT = k.dept_id
and o.REFORM_TYPE = '0'
) mainRevDept,
(
select e.name
from e
where e.category_code = 'REV_FINISH_STATE'
and e.code = rq.rev_finish_state
) revFinishState,
to_char(rq.compliance_date, 'yyyy-MM-dd') complianceDATE
from rq
left join REM_QUESTION_PLAN_T t on rq.question_id = t.question_id
left join fnd_org_t org on t.ASC_ORG = org.org_id
where 1 = 1
and rq.asc_org is null
and (
t.asc_org in (
select f.org_id
from f
where f.org_type = 'G'
)
or rq.created_by_org in (
select f.org_id
from f
where f.org_type = 'G'
)
)
and rq.company_type = 'G';2 Analysis Process
The execution plan is shown below:
===========================================================
|ID|OPERATOR |NAME |EST. ROWS|COST |
-----------------------------------------------------------
|0 |SUBPLAN FILTER | |6283 |788388847|
|1 | SUBPLAN FILTER | |6283 |1325483 |
|2 | HASH OUTER JOIN | |8377 |210530 |
|3 | TABLE SCAN |RQ |7966 |77932 |
|4 | TABLE SCAN |T |152919 |59150 |
|5 | TABLE SCAN |F |440 |2763 |
|6 | TABLE SCAN |F |440 |2763 |
|7 | TABLE SCAN |E(SYS_C0011218) |1 |92 |
|8 | TABLE SCAN |E(SYS_C0011218) |1 |92 |
|9 | TABLE GET |D |1 |46 |
|10| SCALAR GROUP BY | |1 |62483 |
|11| NESTED-LOOP JOIN| |1 |62483 |
|12| TABLE SCAN |O |1 |62468 |
|13| TABLE GET |K |1 |28 |
|14| SCALAR GROUP BY | |1 |62483 |
|15| NESTED-LOOP JOIN| |1 |62483 |
|16| TABLE SCAN |O |1 |62468 |
|17| TABLE GET |K |1 |27 |
|18| TABLE SCAN |E(SYS_C0011218) |1 |92 |
===========================================================Each sub‑operator has a low individual cost, but the total cost is huge. The SQL can be logically divided into two parts:
Scalar subquery – the sub‑queries in the projection list.
Outer query – the FROM clause with joins and additional sub‑queries.
The execution flow is therefore:
Execute the outer query first, producing result set r (operators 1‑6).
For each row of r , execute the scalar subqueries (operators 7‑18).
Repeat until all rows of r have been processed.
To pinpoint the bottleneck, the outer query (operators 1‑6) was run alone and returned 130,000 rows quickly, indicating that each scalar subquery is executed 130,000 times.
The scalar subqueries corresponding to operators 7, 8, 9 and 18 can use indexes and are fast; keeping only those reduces the total runtime dramatically.
The real problem lies with operators 10 and 14. Their subqueries join tables o and k inside the scalar subquery, causing the join to be performed 130,000 times.
The two problematic scalar subqueries are shown below (the highlighted part is the join that costs ~200 ms per execution):
select
xxx,
(
select to_char(wm_concat(distinct(k.org_name)))
from REM_QUESTION_PLAN_T o, fnd_org_t k
where o.question_id = rq.question_id
and o.ASC_ORG = k.org_id
and o.REFORM_TYPE = '0'
) ascOrg,
(
select to_char(wm_concat(distinct(k.dept_name)))
from REM_QUESTION_PLAN_T o, fnd_dept_t k
where o.question_id = rq.question_id
and o.MAIN_REV_DEPT = k.dept_id
and o.REFORM_TYPE = '0'
) mainRevDept,
xxx
from t -- outer query, 130k rows3 Conclusion
Scalar subqueries are executed as a SUBPLAN FILTER (equivalent to a NESTED‑LOOP JOIN ). Their performance depends on two factors:
The size of the outer query result set.
The efficiency of the subquery itself.
If the outer result set is large and the subquery contains additional joins without proper indexes, the nested‑loop execution becomes extremely slow. In such cases the subquery should be rewritten as a regular LEFT JOIN , allowing the optimizer to choose more efficient join algorithms such as HASH JOIN.
4 Optimization方案 (Optimization Plan)
Because the scalar subqueries contain aggregation, they should be aggregated first (using GROUP BY ) and then joined to the outer query. The rewritten SQL (partial) is:
with t1 as (
select o.question_id,
to_char(wm_concat(distinct(k.org_name))) as org_name
from REM_QUESTION_PLAN_T o, fnd_org_t k
where o.ASC_ORG = k.org_id
and o.REFORM_TYPE = '0'
group by o.question_id
),
t2 as (
select o.question_id,
to_char(wm_concat(distinct(k.dept_name))) as dept_name
from REM_QUESTION_PLAN_T o, fnd_dept_t k
where o.MAIN_REV_DEPT = k.dept_id
and o.REFORM_TYPE = '0'
group by o.question_id
)
select xxx,
t1.org_name as ascOrg,
t2.dept_name as mainRevDept,
xxx
from t -- outer query, 130k rows
left join t1 on t.question_id = t1.question_id
left join t2 on t.question_id = t2.question_id;After the rewrite the execution plan changes to use HASH OUTER JOIN . The total cost drops from 788 million to 3.6 million and the runtime falls from >1000 seconds to about 10 seconds.
=============================================================
|ID|OPERATOR |NAME |EST. ROWS|COST |
-------------------------------------------------------------
|0 |SUBPLAN FILTER | |6318 |3653489|
|1 | MERGE GROUP BY | |6318 |1636701|
|2 | SORT | |6318 |1632074|
|3 | SUBPLAN FILTER | |6318 |1613799|
|4 | HASH OUTER JOIN| |8424 |492531 |
|5 | HASH OUTER JOIN| |8377 |331672 |
|6 | MERGE OUTER JOIN| |7966 |198317 |
|7 | TABLE SCAN |RQ |7966 |77932 |
|8 | SUBPLAN SCAN|T2 |2351 |119098 |
|9 | MERGE GROUP BY| |2351 |119062 |
|10| SORT | |2352 |118658 |
|11| HASH JOIN| |2352 |113818 |
|12| TABLE SCAN|K |22268 |8614 |
|13| TABLE SCAN|O |76460 |60075 |
|14| TABLE SCAN|T |152919 |59150 |
|15| SUBPLAN SCAN|T1 |76415 |118014 |
|16| HASH JOIN | |76415 |116865 |
|17| TABLE SCAN|K |7033 |2721 |
|18| TABLE SCAN|O |76460 |60075 |
|19| TABLE SCAN|F |440 |2763 |
|20| TABLE SCAN|F |440 |2763 |
|21| TABLE SCAN|E(SYS_C0011218) |1 |92 |
|22| TABLE SCAN|E(SYS_C0011218) |1 |92 |
|23| TABLE GET |D |1 |46 |
|24| TABLE SCAN|E(SYS_C0011218) |1 |92 |
=============================================================Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
