Databases 19 min read

What Happens When a Single Bad SQL Slows Down an Entire Data Warehouse? Real Cases & Fixes

This article presents four real‑world Oracle SQL cases—from a Cartesian join that halted nightly reports to a varchar primary key causing full scans, non‑standard date syntax breaking partition pruning, and missing statistics leading to unstable plans—detailing analysis, execution‑plan insights, and concrete remediation steps.

dbaplus Community

Jan 3, 2022

What Happens When a Single Bad SQL Slows Down an Entire Data Warehouse? Real Cases & Fixes

Case 1 – Cartesian Join Causing Massive Runtime

Background : In a large e‑commerce data warehouse nightly jobs generate business reports. At 6 am a monitoring alert reported that reports expected to finish by 8 am were still running at 10 am. The DBA traced the slowdown to a newly added SQL that was being executed repeatedly.

Problem

The query joined two large tables without any join condition, producing a Cartesian product. The optimizer estimated 3505 TB of rows, 127 PB of data, a cost of 9890 G and a runtime of 999:59:59.

SELECT /*+ INDEX (A1 xxxxx) */ SUM(A2.CRKSL), SUM(A2.CRKSL*A2.DJ)
FROM xxxx A2, xxxx A1
WHERE A2.CRKFLAG = xxx
  AND A2.CDATE >= xxx
  AND A2.CDATE < xxx;

Resolution

Rewrite the statement with an explicit join predicate (e.g., JOIN ... ON A2.id = A1.id) or add the missing condition. After the fix the optimizer chose an indexed access path and the nightly jobs completed within the expected window.

Case 2 – VARCHAR2 Primary Key Prevents Index Range Scan

Background : An ERP system with a decade‑old schema stored massive data. A cleanup job that should delete < 10 % of rows based on the primary key used a VARCHAR2 primary key, causing the optimizer to perform full table scans.

Experiment

Two tables were created, each with 3.2 million rows:

CREATE TABLE t1 AS SELECT * FROM dba_objects WHERE 1=0;
ALTER TABLE t1 ADD id INT PRIMARY KEY;

CREATE TABLE t2 AS SELECT * FROM dba_objects WHERE 1=0;
ALTER TABLE t2 ADD id VARCHAR2(10) PRIMARY KEY;

-- Insert 3.2 M rows into each table (omitted for brevity)

Range query on the numeric key: SELECT * FROM t1 WHERE id >= 3199990; Result: 11 rows returned, plan shows INDEX RANGE SCAN.

Range query on the character key: SELECT * FROM t2 WHERE id >= '3199990'; Result: 755 565 rows returned, plan shows FULL TABLE SCAN. The character values are ordered lexicographically, not numerically, inflating the result set and causing a huge clustering factor.

Analysis

SELECT table_name, index_name, leaf_blocks, num_rows, clustering_factor
FROM user_indexes
WHERE table_name IN ('T1','T2');

TABLE_NAME  INDEX_NAME   LEAF_BLOCKS  NUM_ROWS  CLUSTERING_FACTOR
---------- ------------ ------------ --------- -----------------
T1         SYS_C0025294        6275   3200000               31520
T2         SYS_C0025295       13271   3200000              632615

Solution

Use a numeric primary key, or rewrite the predicate to a closed range so that the index can be used efficiently:

SELECT * FROM t2 WHERE id BETWEEN '3199990' AND '3200000';

Plan now shows INDEX RANGE SCAN with only six rows processed.

Case 3 – Non‑Standard Date Literals Prevent Partition Pruning

Background : In the same data warehouse a newly added SQL used to_date(20120208,'yyyy-mm-dd') (numeric literal without quotes) and a UNION ALL to split logic. The optimizer could not recognise the date literals, so it scanned all partitions, leading to an eight‑hour job.

Original Query

SELECT ...
FROM   ...
WHERE (order_creation_date >= to_date(20120208,'yyyy-mm-dd')
       AND order_creation_date <  to_date(20120209,'yyyy-mm-dd'))
   OR (send_date >= to_date(20120208,'yyyy-mm-dd')
       AND send_date <  to_date(20120209,'yyyy-mm-dd'))
  AND NVL(a.bd_id,0) = 1;

The execution plan showed a full scan of all partitions (Pstart=1, Pstop=92).

Fix

Quote the date literals and use a standard format:

SELECT ...
FROM   ...
WHERE (order_creation_date >= TO_DATE('2012-02-08','YYYY-MM-DD')
       AND order_creation_date <  TO_DATE('2012-02-09','YYYY-MM-DD'))
   OR (send_date >= TO_DATE('2012-02-08','YYYY-MM-DD')
       AND send_date <  TO_DATE('2012-02-09','YYYY-MM-DD'))
  AND NVL(a.bd_id,0) = 1;

After the change the optimizer pruned to a single partition (Pstart=57, Pstop=57) and the runtime dropped from >8 h to ~8 min.

Case 4 – Missing Statistics on Partitioned Index Cause Unstable Plan

Background : At month‑end a batch of reports slowed dramatically. An investigation revealed that a key SQL switched between a cheap nested‑loop join and an expensive hash join because statistics for a partitioned index were missing.

Analysis

The optimizer chose a nested‑loop join with cost 0 due to absent statistics, which performed poorly on large partitions.

Resolution

Gather accurate statistics for the affected index (or its partition) using DBMS_STATS:

EXEC dbms_stats.gather_index_stats(
  ownname => 'XXX',
  indname => 'XXX',
  partname => 'PART_XXX',
  estimate_percent => 10);

After refreshing the statistics the optimizer selected the intended hash join and performance returned to normal.

Key Takeaways

A Cartesian join (missing join predicate) can explode row estimates and overload the system; always validate join conditions.

Choosing the correct data type for primary keys is critical. Numeric keys enable efficient range scans; character keys may cause lexical ordering issues and large clustering factors.

Use standard, quoted date literals in Oracle. Improper literals prevent partition pruning and can lead to full‑partition scans.

Up‑to‑date statistics are essential for the optimizer, especially for partitioned tables and indexes. Regularly gather statistics before high‑load windows such as month‑end.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Indexing sql-optimization Oracle Execution Plan

Written by

dbaplus Community

Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.

Case 1 – Cartesian Join Causing Massive Runtime

Problem

Resolution

Case 2 – VARCHAR2 Primary Key Prevents Index Range Scan

Experiment

Analysis

Solution

Case 3 – Non‑Standard Date Literals Prevent Partition Pruning

Original Query

Fix

Case 4 – Missing Statistics on Partitioned Index Cause Unstable Plan

Analysis

Resolution

Key Takeaways

dbaplus Community

How this landed with the community

Was this worth your time?

0 Comments

Case 1 – Cartesian Join Causing Massive Runtime

Case 2 – VARCHAR2 Primary Key Prevents Index Range Scan

Case 3 – Non‑Standard Date Literals Prevent Partition Pruning

Case 4 – Missing Statistics on Partitioned Index Cause Unstable Plan