How to Shrink Oracle Indexes for Skewed Columns Using Function Indexes

This article explains why conventional indexes waste space and perform poorly on highly skewed columns, introduces a decode‑based function index that excludes high‑frequency values, details the experimental setup with millions of rows, compares index size and query performance, and outlines the method's limitations.

ITPUB

Dec 1, 2017

When selecting columns for indexing, high selectivity and an even spread of values are ideal, but many real-world tables exhibit severe value skew, where a few values account for most rows. In Oracle, the cost-based optimizer (CBO) often bypasses the index for high-frequency values, yet a conventional B-tree index still stores an entry for every non-NULL row, including the rows holding those values. The result is a large index structure and unnecessary I/O.

Proposed Solution

Use a function index that maps the high‑frequency values to NULL so they are omitted from the index. The DECODE function can implement this mapping, e.g.:

DECODE(secondary, 'S', NULL, 'J', NULL, 'T', NULL, secondary)

Oracle B-tree indexes do not store entries whose key is entirely NULL, so only the rows with low-frequency values are indexed, resulting in a much smaller B-tree.
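
To see what the index would store, the expression can be evaluated against a few sample values. This is a minimal sketch: the inline list of sample values is made up for illustration, and only the DECODE expression itself comes from the text above.

```sql
-- Illustrative only: evaluate the indexed expression for a few sample values.
-- 'S', 'J' and 'T' map to NULL (no index entry); any other value is kept as the key.
SELECT v AS secondary,
       DECODE(v, 'S', NULL, 'J', NULL, 'T', NULL, v) AS index_key
FROM (
  SELECT 'S' AS v FROM dual UNION ALL
  SELECT 'J' FROM dual UNION ALL
  SELECT 'T' FROM dual UNION ALL
  SELECT 'W' FROM dual
);
```

Only the row for 'W' returns a non-NULL index_key, so it is the only one of the four that would receive an index entry.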

Experimental Setup

A table t with about 4.8 million rows was created from dba_objects. The secondary column contains values S, T, J (over 99% of rows) and a few other values.

SQL> SELECT secondary, COUNT(*) FROM t GROUP BY secondary;
SECONDARY  COUNT(*)
---------- ---------
W                273
Q                  9
D                273
T              421230
J             1866592
E                 99
S              2470733
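
Before building anything, it can help to check how many rows the DECODE expression would leave in the index. The query below is a sketch that assumes the table t and the secondary column described above; the column aliases are illustrative.

```sql
-- COUNT(expr) counts only non-NULL results, i.e. the rows that would get an index entry.
SELECT COUNT(*) AS total_rows,
       COUNT(DECODE(secondary, 'S', NULL, 'J', NULL, 'T', NULL, secondary)) AS indexed_rows
FROM   t;
```

For the distribution listed above, this works out to 654 indexed rows (273 + 9 + 273 + 99) out of 4,759,209.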

Index Creation

Two indexes were built on secondary:

SQL> CREATE INDEX IND_SEC_NORMAL ON t(secondary);
SQL> CREATE INDEX IND_T_FUN ON t(
  DECODE(secondary,'S',NULL,'J',NULL,'T',NULL,secondary));

The normal index occupied about 75.5 MB (72 MiB; 80 extents, 9216 blocks), while the function index used only its initial allocation of 64 KiB (1 extent, 8 blocks). Both figures are consistent with an 8 KiB block size.
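
Segment sizes like these can be read from the data dictionary. The query below is a sketch that assumes both indexes are owned by the current user; otherwise dba_segments with an owner filter would be the equivalent.

```sql
-- Compare the space allocated to the two indexes.
SELECT segment_name, bytes, extents, blocks
FROM   user_segments
WHERE  segment_name IN ('IND_SEC_NORMAL', 'IND_T_FUN')
ORDER  BY segment_name;
```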

Performance Comparison

Querying rows where secondary='W' using the normal index:

SQL> SELECT * FROM t WHERE secondary='W';
-- Execution time: 00:00:00.37
-- Cost: 11
-- Consistent gets: 272
-- Physical reads: 21

Using the function index:

SQL> SELECT * FROM t WHERE DECODE(secondary,'S',NULL,'J',NULL,'T',NULL,secondary)='W';
-- Execution time: 00:00:00.04
-- Cost: 116
-- Consistent gets: 140
-- Physical reads: 0

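With the function index, elapsed time fell from 0.37 s to 0.04 s (roughly an order of magnitude), consistent gets dropped from 272 to 140, and physical reads fell from 21 to 0. The optimizer's estimated cost was higher (116 versus 11), which is attributed to evaluating the DECODE expression.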
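
The text does not say how these figures were collected. One common way to obtain elapsed time, plan cost, consistent gets and physical reads is SQL*Plus timing plus autotrace, sketched below on the assumption that the session has the privileges autotrace needs (typically the PLUSTRACE role and a plan table).

```sql
-- Assumption: run from SQL*Plus with autotrace privileges.
SET TIMING ON
-- TRACEONLY shows the plan and statistics without printing the result rows.
SET AUTOTRACE TRACEONLY

-- Conventional index on the raw column.
SELECT * FROM t WHERE secondary = 'W';

-- Function index: the predicate repeats the indexed expression.
SELECT * FROM t
WHERE  DECODE(secondary, 'S', NULL, 'J', NULL, 'T', NULL, secondary) = 'W';

SET AUTOTRACE OFF
```

Running each statement more than once helps separate the effect of the index from the effect of blocks already sitting in the buffer cache.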

Statistics Gathering

Statistics were gathered on the table and, through cascade, on its indexes, using a 100% sample:

SQL> EXEC dbms_stats.gather_table_stats(user, 'T', cascade=>TRUE, -
       estimate_percent=>100, method_opt=>'FOR ALL INDEXED COLUMNS');
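
Once statistics are in place, the dictionary views show what each index contains. The queries below are a sketch that assumes the indexes belong to the current user.

```sql
-- Entry counts and leaf blocks per index (populated by the statistics run).
SELECT index_name, index_type, num_rows, distinct_keys, leaf_blocks
FROM   user_indexes
WHERE  table_name = 'T';

-- The expression stored for the function index.
SELECT index_name, column_expression
FROM   user_ind_expressions
WHERE  table_name = 'T';
```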

Conclusions and Limitations

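The function index dramatically shrinks index size and speeds up queries that filter on the low-frequency values it still contains, such as secondary='W' in the test above.

Benefits are noticeable only on large tables with strong value skew; small tables may not see a net gain.

A higher optimizer cost is to be expected for the function-index plan (116 versus 11 in the test), attributed to evaluating the DECODE expression at query time. Queries must also filter on the same DECODE expression that defines the index, as the test query does; a plain predicate on secondary does not match the function index.

The technique helps only for the rare values. The high-frequency values S, J and T map to NULL and are absent from the index, so queries on them cannot use it and will typically fall back to a full table scan.

Proper planning is required to ensure the table is large enough and the skew is significant before adopting this method.

Overall, applying a decode-based function index is an effective way to handle column-value skew in Oracle databases, reducing storage overhead while keeping indexed access for the rare values.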
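
Because the predicate has to repeat the indexed expression, one option is to hide the DECODE behind a view so that application queries filter on a plain column name. This is a sketch, not part of the original test: the view name t_rare_v and the column alias secondary_rare are hypothetical, and whether the optimizer actually chooses IND_T_FUN should be confirmed from the execution plan.

```sql
-- Hypothetical view exposing the indexed expression under a simple name.
CREATE OR REPLACE VIEW t_rare_v AS
SELECT t.*,
       DECODE(secondary, 'S', NULL, 'J', NULL, 'T', NULL, secondary) AS secondary_rare
FROM   t;

-- After view merging, this predicate has the same form as the indexed expression.
SELECT * FROM t_rare_v WHERE secondary_rare = 'W';
```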
