
Mastering Dimension Tables: Natural vs Surrogate Keys and Handling Slowly Changing Dimensions

This article explains the role of dimension tables in data warehouses, compares natural and surrogate keys, and details three strategies for managing slowly changing dimensions (overwrite, insert new rows, and add new attributes), with full‑snapshot tables as an alternative.

Ma Wei Says

In a data warehouse, dimension tables (lookup tables) store descriptive attributes that define the business context of facts, such as date, product, region, or user. Fact tables contain measurable metrics (e.g., sales amount, quantity) and reference dimension rows through foreign keys.

Key Characteristics of Dimension Tables

Each dimension has a single primary key that serves as a foreign key in related fact tables.

Each dimension row describes the business context of the fact rows that reference it.

Dimensions are typically wide, flat, and denormalized, containing many low‑cardinality textual attributes.

Attributes define grouping, filtering, and constraint logic for BI queries.

Report labels are usually the domain values of dimension attributes.
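
The characteristics above can be sketched with a tiny star schema. This is a minimal illustration using Python's built-in sqlite3 module; the table names (dim_product, fact_sales), columns, and sample rows are assumptions made up for the example, not taken from any real system. The fact table references the dimension through product_key, and the query groups by a dimension attribute whose values become the report labels.

```python
import sqlite3

# In-memory database; all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,  -- single primary key of the dimension
    product_name TEXT,
    category     TEXT                  -- low-cardinality attribute used for grouping
);
CREATE TABLE fact_sales (
    product_key  INTEGER REFERENCES dim_product(product_key),  -- foreign key
    quantity     INTEGER,
    sales_amount REAL
);
INSERT INTO dim_product VALUES (1, 'Espresso beans', 'Coffee'),
                               (2, 'Green tea', 'Tea'),
                               (3, 'Cold brew', 'Coffee');
INSERT INTO fact_sales VALUES (1, 2, 30.0), (2, 1, 8.0), (3, 4, 20.0), (1, 1, 15.0);
""")

# Group the facts by a dimension attribute; the attribute values label the report.
sales_by_category = conn.execute("""
    SELECT d.category, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_key = f.product_key
    GROUP BY d.category
    ORDER BY d.category
""").fetchall()

print(sales_by_category)  # [('Coffee', 65.0), ('Tea', 8.0)]
```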

Fact and dimension table relationship

Natural Key vs Surrogate Key

A natural key is a unique identifier that already exists in the source system (e.g., national ID, customer code). It carries business meaning, which helps with integration and traceability.

A surrogate key is an artificial primary key generated during modeling, usually an auto‑incrementing integer or a distributed ID. It does not reflect any real‑world attribute, so it gives each row a stable, compact identifier that improves join performance and stays valid even when the natural key changes.

Example: When merging online and offline customer data, both sources use a customer ID as the natural key, so the same ID value may refer to different customers in each source. To avoid key collisions, a new integer column customer_key can be introduced as a surrogate key for the unified customer dimension.
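
The merge described in this example might look like the following sketch. The sample rows, the source_system column, and the key_map lookup are hypothetical; the point is only that the surrogate customer_key is assigned per (source, natural key) pair, so two sources can reuse the same customer ID without colliding.

```python
from itertools import count

# Hypothetical source extracts: both systems use "1001", but for different people.
online = [{"customer_id": "1001", "name": "Alice"},
          {"customer_id": "1002", "name": "Bob"}]
offline = [{"customer_id": "1001", "name": "Carol"}]

next_key = count(1)   # simple auto-incrementing surrogate key generator
key_map = {}          # (source_system, natural key) -> surrogate key
dim_customer = []     # unified customer dimension rows

for source, rows in (("online", online), ("offline", offline)):
    for row in rows:
        natural = (source, row["customer_id"])
        if natural not in key_map:
            key_map[natural] = next(next_key)
            dim_customer.append({
                "customer_key": key_map[natural],   # surrogate key
                "source_system": source,
                "customer_id": row["customer_id"],  # natural key kept for traceability
                "name": row["name"],
            })

for row in dim_customer:
    print(row)
```

Keeping the natural key and its source alongside the surrogate key preserves the traceability that natural keys provide.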

Handling Slowly Changing Dimensions (SCD)

Dimension attributes may evolve over time. Preserving historical states is essential for accurate analytics. Three common SCD techniques are:

Type 1 – Overwrite

Replace the old attribute value with the new one, keeping only the latest state. This method is simple and fast but discards historical information: facts recorded before the change are reported against the new value, which distorts trend analysis.

Overwrite example
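
As a minimal sketch of the overwrite approach, assuming a hypothetical dim_customer table with a region attribute, a Type 1 change is a plain UPDATE in place. The helper name scd1_update is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dim_customer ("
    "customer_key INTEGER PRIMARY KEY, customer_id TEXT, region TEXT)"
)
conn.execute("INSERT INTO dim_customer VALUES (1, 'C001', 'North')")

def scd1_update(conn, customer_id, new_region):
    """Type 1: overwrite the attribute in place; the old value is lost."""
    conn.execute(
        "UPDATE dim_customer SET region = ? WHERE customer_id = ?",
        (new_region, customer_id),
    )

scd1_update(conn, "C001", "South")
rows_type1 = conn.execute(
    "SELECT customer_key, customer_id, region FROM dim_customer"
).fetchall()
print(rows_type1)  # [(1, 'C001', 'South')]
```

The surrogate key is unchanged, so every existing fact row now appears under 'South', including sales that happened while the customer was in 'North'.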

Type 2 – Insert New Row

Insert a new dimension row for each change and use a surrogate primary key. Add three supporting columns:

effective_start_date (or timestamp): when the row becomes valid.

effective_end_date: when the row is superseded (null for the current row).

is_current flag: indicates the active version.

Fact rows continue to reference the version that was valid at the time of the transaction, preserving history.

Insert new row example
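
The row-versioning logic can be sketched as follows. This is one possible implementation under stated assumptions: dates are ISO-8601 strings, validity intervals are half-open (a row is valid from effective_start_date up to, but not including, effective_end_date), and the helper names scd2_apply and key_as_of are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (
    customer_key         INTEGER PRIMARY KEY AUTOINCREMENT,  -- one surrogate key per version
    customer_id          TEXT NOT NULL,                      -- natural key, repeats across versions
    region               TEXT,
    effective_start_date TEXT NOT NULL,
    effective_end_date   TEXT,                               -- NULL while the row is current
    is_current           INTEGER NOT NULL
);
""")

def scd2_apply(conn, customer_id, region, change_date):
    """Close the current version (if the value changed) and insert a new one."""
    current = conn.execute(
        "SELECT customer_key, region FROM dim_customer "
        "WHERE customer_id = ? AND is_current = 1",
        (customer_id,),
    ).fetchone()
    if current is not None:
        if current[1] == region:
            return current[0]  # no change, keep the existing version
        conn.execute(
            "UPDATE dim_customer SET effective_end_date = ?, is_current = 0 "
            "WHERE customer_key = ?",
            (change_date, current[0]),
        )
    cur = conn.execute(
        "INSERT INTO dim_customer (customer_id, region, effective_start_date, "
        "effective_end_date, is_current) VALUES (?, ?, ?, NULL, 1)",
        (customer_id, region, change_date),
    )
    return cur.lastrowid

def key_as_of(conn, customer_id, txn_date):
    """Return the surrogate key of the version valid on txn_date, or None."""
    row = conn.execute(
        "SELECT customer_key FROM dim_customer WHERE customer_id = ? "
        "AND effective_start_date <= ? "
        "AND (effective_end_date IS NULL OR effective_end_date > ?)",
        (customer_id, txn_date, txn_date),
    ).fetchone()
    return row[0] if row else None

first_key = scd2_apply(conn, "C001", "North", "2023-01-01")
second_key = scd2_apply(conn, "C001", "South", "2024-06-01")

history = conn.execute(
    "SELECT customer_key, region, effective_start_date, effective_end_date, is_current "
    "FROM dim_customer ORDER BY customer_key"
).fetchall()
print(history)
print(key_as_of(conn, "C001", "2023-08-15"), key_as_of(conn, "C001", "2024-06-01"))
```

A fact loader would call something like key_as_of with the transaction date, so a sale from 2023 keeps pointing at the 'North' version after the customer moves.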

Type 3 – Add New Attribute

Add a separate column to store the previous value (e.g., previous_region). Each such column holds only one prior version, so this approach works when changes are infrequent and only the most recent change matters; otherwise the table can become unwieldy due to many added columns.

Add new attribute example
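
A minimal sketch of this technique, again assuming a hypothetical dim_customer table and a made-up helper named scd3_update: the current value is shifted into previous_region before being replaced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dim_customer ("
    "customer_key INTEGER PRIMARY KEY, customer_id TEXT, "
    "region TEXT, previous_region TEXT)"
)
conn.execute("INSERT INTO dim_customer VALUES (1, 'C001', 'North', NULL)")

def scd3_update(conn, customer_id, new_region):
    """Type 3: move the current value to previous_region, then overwrite it.

    The right-hand sides of SET are evaluated against the row's old values,
    so previous_region receives the region from before this update.
    """
    conn.execute(
        "UPDATE dim_customer SET previous_region = region, region = ? "
        "WHERE customer_id = ? AND region <> ?",
        (new_region, customer_id, new_region),
    )

query = "SELECT region, previous_region FROM dim_customer WHERE customer_id = 'C001'"

scd3_update(conn, "C001", "South")
after_first_change = conn.execute(query).fetchone()
print(after_first_change)   # ('South', 'North')

scd3_update(conn, "C001", "East")
after_second_change = conn.execute(query).fetchone()
print(after_second_change)  # ('East', 'South'), 'North' is no longer stored
```

The second change shows the limitation: with a single previous_region column, anything older than one change is gone.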

In practice, combinations of these techniques are often applied to meet specific business requirements.

Full‑Snapshot Dimension Table (Alternative)

An alternative is to take a complete snapshot of the dimension table at regular intervals (e.g., daily). This method does not rely on surrogate keys and is simple to implement, but it consumes more storage, and most of that storage is redundant when changes are rare because unchanged rows are copied into every snapshot.
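
The snapshot approach can be sketched as below, assuming a hypothetical source table src_customer and a snapshot table keyed by (snapshot_date, customer_id). The helper take_snapshot is made up for illustration; it deletes any existing rows for the date first so that a rerun of the same day does not fail or duplicate data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE src_customer (customer_id TEXT PRIMARY KEY, region TEXT);
CREATE TABLE dim_customer_snapshot (
    snapshot_date TEXT NOT NULL,
    customer_id   TEXT NOT NULL,   -- natural key; no surrogate key needed
    region        TEXT,
    PRIMARY KEY (snapshot_date, customer_id)
);
INSERT INTO src_customer VALUES ('C001', 'North'), ('C002', 'West');
""")

def take_snapshot(conn, snapshot_date):
    """Copy the whole source dimension into the partition for snapshot_date."""
    conn.execute(
        "DELETE FROM dim_customer_snapshot WHERE snapshot_date = ?",
        (snapshot_date,),
    )
    conn.execute(
        "INSERT INTO dim_customer_snapshot "
        "SELECT ?, customer_id, region FROM src_customer",
        (snapshot_date,),
    )

take_snapshot(conn, "2024-06-01")
conn.execute("UPDATE src_customer SET region = 'South' WHERE customer_id = 'C001'")
take_snapshot(conn, "2024-06-02")

# History is recovered by filtering on the snapshot date.
region_on_first_day = conn.execute(
    "SELECT region FROM dim_customer_snapshot "
    "WHERE snapshot_date = '2024-06-01' AND customer_id = 'C001'"
).fetchone()[0]
snapshot_row_count = conn.execute(
    "SELECT COUNT(*) FROM dim_customer_snapshot"
).fetchone()[0]
print(region_on_first_day, snapshot_row_count)  # North 4
```

Customer C002 never changed but is stored once per day, which is the storage cost the paragraph above describes.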
