
How Oracle CBO Calculates Range Predicate Selectivity and Handles Strings

This article explains how Oracle's Cost‑Based Optimizer (CBO) estimates the selectivity of range predicates when no histogram is available, works through the calculation on sample data, and shows PL/SQL code that converts string values to the internal numbers the optimizer uses. It also covers a pitfall, where distinct strings map to the same internal value and distort cardinality estimates, and two fixes: gathering histograms or storing the data in DATE columns.

dbaplus Community

Background

Oracle's Cost‑Based Optimizer (CBO) needs to estimate the selectivity of predicates to choose efficient execution plans. When a column lacks a histogram, the optimizer falls back to a formula based on the column's high and low values and the non‑null rate.

Formula for Range Predicate Selectivity (No Histogram)

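For a predicate COL >= val the selectivity, in the form used by the worked example in this article, is calculated as:

((high_value - val + 1) / (high_value - low_value)) * A4NULLS

The + 1 in the numerator accounts for the inclusive bound, since rows equal to val also qualify. For a strict COL > val it is dropped.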

where A4NULLS is the non‑null rate:

A4NULLS = (NUM_ROWS - NUM_NULLS) / NUM_ROWS

Experiment Setup

The following script creates a table t1 with 1,000 rows. Every 30th row gets a NULL id, which yields 33 NULLs and simulates missing data:

drop table t1;
create table t1 (id number);
declare
  vid number;
begin
  for i in 1..1000 loop
    -- every 30th row gets a NULL id (33 NULLs in 1,000 rows)
    if mod(i, 30) = 0 then
      vid := null;
    else
      vid := i;
    end if;
    insert into t1 values (vid);
  end loop;
  commit;
end;
/
-- a NULL owner means the current schema
exec dbms_stats.gather_table_stats(null, 'T1');

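After gathering statistics, the data dictionary shows the following (NUM_ROWS comes from user_tables, the rest from user_tab_columns, where LOW_VALUE and HIGH_VALUE are stored as RAW and are shown here decoded):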

NUM_ROWS: 1000

LOW_VALUE: 1

HIGH_VALUE: 1000

NUM_NULLS: 33

HISTOGRAM: NONE
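As a cross-check outside the database, the same data set can be rebuilt in memory. The following is a minimal Python sketch, not Oracle code; it assumes `None` stands in for NULL and simply recomputes the statistics listed above, plus the true number of rows with ID >= 700.

```python
# Rebuild the 1,000-row data set: every 30th id is NULL (None here).
ids = [None if i % 30 == 0 else i for i in range(1, 1001)]

non_null = [v for v in ids if v is not None]
stats = {
    "NUM_ROWS": len(ids),
    "NUM_NULLS": len(ids) - len(non_null),
    "LOW_VALUE": min(non_null),
    "HIGH_VALUE": max(non_null),
}

# Rows that actually satisfy ID >= 700 (a NULL never matches a comparison).
actual_ge_700 = sum(1 for v in non_null if v >= 700)

print(stats)
print("actual rows with ID >= 700:", actual_ge_700)
```

Having the true row count at hand makes it easy to judge how close the optimizer's estimate is.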

Applying the formula to the predicate ID >= 700:

A4NULLS = (1000 - 33) / 1000 = 0.967
Selectivity = (1000 - 700 + 1) / (1000 - 1) * 0.967 ≈ 0.291358358
Cardinality = 1000 * 0.291358358 ≈ 291

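Rounded to the nearest integer, the estimate is 291 rows. In this data set that is also the true row count: the ids 700 to 1000 cover 301 values, 10 of which are multiples of 30 and were stored as NULL.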
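The arithmetic of the worked example can be replayed with a small Python sketch. The function name `range_selectivity_ge` and its parameters are illustrative only; the body follows the calculation shown above, including the + 1 in the numerator for the inclusive bound.

```python
def range_selectivity_ge(val, low, high, num_rows, num_nulls):
    """Selectivity of COL >= val without a histogram, as in the worked example."""
    # Non-null rate (A4NULLS in the article).
    a4nulls = (num_rows - num_nulls) / num_rows
    # Fraction of the [low, high] range at or above val, scaled by the non-null rate.
    return (high - val + 1) / (high - low) * a4nulls


selectivity = range_selectivity_ge(700, low=1, high=1000, num_rows=1000, num_nulls=33)
cardinality = round(1000 * selectivity)

print(f"{selectivity:.9f}", cardinality)  # prints: 0.291358358 291
```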

Converting String Values to Internal Numbers

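Oracle cannot perform arithmetic on VARCHAR2 values directly. It converts them to an internal numeric representation using an algorithm exposed in the SQLT utility as get_internal_value. The following PL/SQL function reproduces that conversion: it treats the first 15 characters of the string (right‑padded with CHR(0)) as base‑256 digits and rounds the result to the nearest 10^21, which leaves about 15 significant digits.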

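CREATE OR REPLACE FUNCTION get_internal_value(p_value IN VARCHAR2)
  RETURN VARCHAR2 IS
  temp_n NUMBER := 0;
BEGIN
  -- Treat the first 15 characters (right-padded with CHR(0)) as base-256 digits
  FOR i IN 1..15 LOOP
    temp_n := temp_n + POWER(256, 15 - i) * ASCII(SUBSTR(RPAD(p_value, 15, CHR(0)), i, 1));
  END LOOP;
  -- Round to the nearest 10^21, keeping only the leading digits
  RETURN TO_CHAR(ROUND(temp_n, -21));
EXCEPTION
  WHEN OTHERS THEN
    RETURN p_value;  -- fall back to the input if the conversion fails
END get_internal_value;
/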

Example:

SELECT get_internal_value('AAAAA') FROM dual;
-- Result: 338822822454670000000000000000000000

The CBO uses these internal values when estimating the selectivity of >, <, and BETWEEN predicates on string columns.
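For experimenting without a database, the conversion can be mirrored in Python. This is a sketch under stated assumptions, not Oracle's implementation: it assumes single-byte ASCII input, and Python's `round` resolves exact ties to the even neighbour, which may differ from Oracle's ROUND on a tie.

```python
def get_internal_value(value: str) -> int:
    """Python port of the PL/SQL get_internal_value function (ASCII input assumed)."""
    # First 15 bytes, right-padded with NUL bytes, like RPAD(p_value, 15, CHR(0)).
    raw = value.encode("ascii")[:15].ljust(15, b"\x00")
    # Big-endian interpretation equals sum(byte_i * 256 ** (15 - i)) for i = 1..15.
    n = int.from_bytes(raw, "big")
    # Round to the nearest 10**21, like ROUND(temp_n, -21).
    return round(n, -21)


print(get_internal_value("AAAAA"))  # prints: 338822822454670000000000000000000000
```

Python integers have arbitrary precision, so the intermediate sum is exact before rounding.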

Impact on Optimizer Estimates

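When string literals representing dates are converted, different timestamps can map to the same internal value. Rounding to the nearest 10^21 keeps only about 15 significant digits, which is roughly the information in the first six characters, so strings that share a prefix such as '2015-0' and differ only further to the right can become indistinguishable. For example: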

'2015-06-23 00:00:00' = 260592297225015000000000000000000000
'2015-09-21 23:59:59' = 260592297225015000000000000000000000

Because the internal values are identical, the optimizer treats the range as an equality, drastically under‑estimating cardinality. In a sample join between tables t1 and t2 with date range predicates, the CBO estimated only 1 row for t1 and 910 rows for t2, leading to a sub‑optimal nested‑loop plan.
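The collision can be reproduced with a short Python sketch. It assumes the base-256 conversion described earlier and ASCII input; the helper name `raw_value` is illustrative. The exact gap between the two strings is smaller than the rounding unit of 10^21, which is why both bounds can round to the same number.

```python
def raw_value(s: str) -> int:
    """Exact base-256 value of the first 15 bytes, before any rounding."""
    return int.from_bytes(s.encode("ascii")[:15].ljust(15, b"\x00"), "big")


low_bound = "2015-06-23 00:00:00"
high_bound = "2015-09-21 23:59:59"

gap = raw_value(high_bound) - raw_value(low_bound)
internal_low = round(raw_value(low_bound), -21)
internal_high = round(raw_value(high_bound), -21)

print("exact gap below rounding unit:", 0 < gap < 10**21)
print("internal values identical:", internal_low == internal_high)
```

With a range width of zero after rounding, a range-width formula has nothing left to work with.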

Solutions

Collect histograms on the VARCHAR2 columns so the optimizer can use endpoint actual values for calibration.

Convert the columns to proper DATE data types, eliminating the need for string‑to‑number conversion.

After gathering histograms, the execution plan reflects the correct cardinalities, and the dba_histograms.endpoint_actual_value column is used to adjust duplicate internal values.
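To illustrate why a DATE column avoids the problem, the sketch below maps the same two bounds to day numbers with a fractional time part. This is an assumption made for illustration, in Python rather than Oracle's internal DATE format; the point is only that genuine date values stay far apart numerically, so the range keeps a non-zero width.

```python
from datetime import datetime


def as_days(dt: datetime) -> float:
    """Day number plus the time of day as a fraction (illustrative encoding)."""
    seconds = dt.hour * 3600 + dt.minute * 60 + dt.second
    return dt.toordinal() + seconds / 86400


low_bound = datetime(2015, 6, 23, 0, 0, 0)
high_bound = datetime(2015, 9, 21, 23, 59, 59)

# Width of the predicate range in days; with strings this width collapsed to zero.
width_days = as_days(high_bound) - as_days(low_bound)
print(f"range width: {width_days:.5f} days")
```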

[Figure: Execution plan after collecting histograms]