Databases 7 min read

Cutting Oracle RAC CPU Usage from 100% to 12 ms: A Step‑by‑Step SQL Tuning Case Study

During a morning peak on December 15, 2020, a customer's Oracle RAC cluster hit 100% CPU on both nodes, causing severe response delays, and after detailed session analysis and SQL rewrite the CPU load dropped dramatically, restoring normal service performance.

ITPUB
ITPUB
ITPUB
Cutting Oracle RAC CPU Usage from 100% to 12 ms: A Step‑by‑Step SQL Tuning Case Study

Problem Overview

On the morning of 2020‑12‑15 the zCloud monitoring system reported that both RAC nodes of an Oracle database reached 100 % CPU, causing a large backlog of sessions and slow application response.

Root Cause Analysis

Identify the OS user responsible for high CPU Running top showed the CPU consumption was dominated by user time. The top processes belonged to the oracle OS user, confirming that the database front‑end was the source.

Locate the offending SQL in the database Using the zCloud "Top 5 Wait Event" view, most sessions were waiting on ON CPU and latch:cache buffers chains . Filtering by this wait event isolated a single SQL ID cuk5cguanbaqt , which was executed repeatedly and caused latch contention. The SQL averaged more than two minutes per execution and ran 611 times within one hour, generating a flood of sessions waiting on the latch and driving CPU to its limit.

SQL Performance Bottleneck

Executing the SQL (with bind variables replaced) in sqlplus and examining the SQL Monitor revealed that the bottleneck was a MERGE JOIN. Both branches of the join performed about nine million row accesses, indicating excessive index and table scans.

Optimization Techniques

Reduce repeated index and table accesses Extract the two sub‑queries of the MERGE‑JOIN into a WITH clause. This forces each index and table to be accessed only once and clarifies the logical flow.

Eliminate unnecessary joins The original LEFT JOIN on order_id was redundant because order_id is the primary key of the two joined tables. Removing the join reduced one table association.

Filter the large table C early Table C returned millions of rows without any predicate. Adding an EXISTS condition that links C to O allows early filtering before the join with ORF , dramatically shrinking the intermediate result set. Statistics showed that C produced over 8.4 million rows (close to the full 9 million), while O after filtering produced fewer than 2 000 rows. The rewritten query therefore processes far fewer rows.

Optimized SQL

The final rewritten statement is shown in the image below:

Optimized SQL diagram
Optimized SQL diagram

Results

Before optimization the SQL took about 154 seconds per execution (average reported by zCloud ≈ 2 minutes). After the rewrite the same statement finishes in roughly 12 ms.

Logical reads dropped from over 110 million per execution to about 10 thousand, yielding a performance gain of more than 10 000×.

Performance before optimization
Performance before optimization
Performance after optimization
Performance after optimization

Original article: https://www.modb.pro/db/43085

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

Performance OptimizationOracleSQL TuningRAC
ITPUB
Written by

ITPUB

Official ITPUB account sharing technical insights, community news, and exciting events.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.