Big Data 10 min read

Implementing Historical Slowly Changing Dimension (Chain) Tables with PL/pgSQL

This article explains the concept of historical chain (slowly changing dimension) tables in data warehousing, demonstrates how to create source and target tables, provides a PL/pgSQL stored procedure to handle inserts, updates, and deletions, and shows step‑by‑step testing with sample SQL scripts.

Big Data Technology & Architecture
Big Data Technology & Architecture
Big Data Technology & Architecture
Implementing Historical Slowly Changing Dimension (Chain) Tables with PL/pgSQL

Historical chain tables (also known as slowly changing dimension tables) are a data model used in data warehouse design to record all changes of a transaction from its start to the current state, avoiding massive storage by not storing a full snapshot for each day.

1. Concept

In a chain table each record has an effective date (sdate) and an expiration date (edate). For example, when two users are added on 2019‑10‑08, their effective dates are that day and their expiration dates are set to a maximum value (e.g., 2999‑12‑31) to indicate they are still active.

On the next day (2019‑10‑09) user 1001 is deleted and user 1002's phone number changes. To preserve history, user 1001's edate is set to 2019‑10‑09, and user 1002 gets two rows: one ending on 2019‑10‑09 and a new row starting on that day with the updated phone.

On the third day (2019‑10‑10) a new user 1003 is added, resulting in the table shown below.

To query the latest data, select rows where edate = '2999‑12‑31'. To query data as of a specific date, filter rows where sdate <= target_date and edate > target_date.

2. Table Creation

Temporary source table T_FIN_ACCTION_SRC receives data pushed from other databases (e.g., Oracle) and has the same structure as the source.

-- source table
create table T_FIN_ACCTION_SRC(
    eNo varchar(6),
    eName varchar(10),
    ePhone varchar(11),
    eData_date date
);

The target chain table T_FIN_ACCTION_TAR replaces the source date column with effective (sdate) and expiration (edate) dates.

-- chain table
create table T_FIN_ACCTION_TAR(
    eNo varchar(6),
    eName varchar(10),
    ePhone varchar(11),
    sdate date,
    edate date
);

3. Stored Procedure Creation

The following PL/pgSQL function processes the source data for a given day (or the previous day) and updates the chain table accordingly.

-- Function skeleton
create or replace function My_FIN_GL_SUBJECT_PRO(IN P_TODAY VARCHAR)
    returns void
as $$
declare
begin
    -- 1. Insert new records (no primary key in target)
    -- 2. Close records that disappeared in source (delete)
    -- 3. Update changed records:
    --    3.1 Close old record (set edate)
    --    3.2 Open new record (insert with new sdate, edate = max)
end;
$$ language plpgsql;

4. Implementation Details

1. Insert new rows when the primary key does not exist in the target.

insert into gplcydb.public.T_FIN_ACCTION_TAR(eNo,eName,ePhone,sdate,edate)
    select s.eNo,s.eName,s.ePhone,s.eData_date,to_date('2999-12-31','yyyy-mm-dd')
    from gplcydb.public.T_FIN_ACCTION_SRC s
    where s.eData_date = (to_date(P_TODAY,'yyyy-mm-dd') - 1)
      and not exists (
          select 1 from gplcydb.public.T_FIN_ACCTION_TAR t
          where s.eNo=t.eNo and s.eName=t.eName and s.ePhone=t.ePhone
      );

2. Close records that are missing in the source (deletion).

update gplcydb.public.T_FIN_ACCTION_TAR a set edate=(to_date(P_TODAY,'yyyy-mm-dd')-1)
    where not exists (
        select 1 from gplcydb.public.T_FIN_ACCTION_SRC s
        where s.eNo=a.eNo
          and a.edate=to_date('2999-12-31','yyyy-mm-dd')
    );

3.1 Close changed records.

update gplcydb.public.T_FIN_ACCTION_TAR b set edate=(to_date(P_TODAY,'yyyy-mm-dd')-1)
    where b.edate=to_date('2999-12-31','yyyy-mm-dd')
      and exists (
          select 1 from gplcydb.public.T_FIN_ACCTION_SRC s
          where s.eNo = b.eNo
            and b.sdate < (to_date(P_TODAY,'yyyy-mm-dd')-1)
            and (s.eName <> b.eName or s.ePhone <> b.ePhone)
      );

3.2 Open a new row for the changed data.

insert into gplcydb.public.T_FIN_ACCTION_TAR(eNo,eName,ePhone,sdate,edate)
    select s.eNo,s.eName,s.ePhone,(to_date(P_TODAY,'yyyy-mm-dd') - 1),to_date('2999-12-31','yyyy-mm-dd')
    from gplcydb.public.T_FIN_ACCTION_SRC s
    where s.eData_date = (to_date(P_TODAY,'yyyy-mm-dd') - 1)
      and exists (
          select 1 from (
              select eNo,sdate,max(edate) end_date
              from gplcydb.public.T_FIN_ACCTION_TAR
              group by eNo,sdate
          ) t
          where t.eNo=s.eNo
            and s.eData_date = t.sdate
            and t.end_date <= to_date(P_TODAY,'yyyy-mm-dd')
      );

5. Testing

Insert a full‑day snapshot into the source table:

insert into T_FIN_ACCTION_SRC values('1001','feiniu','18500000001','2019-10-10');
insert into T_FIN_ACCTION_SRC values('1002','beibei','18400000005','2019-10-10');
insert into T_FIN_ACCTION_SRC values('1003','yuyu','13800000005','2019-10-10');

Run the function for the next day and query the target table:

select My_FIN_GL_SUBJECT_PRO('2019-10-11');
select * from T_FIN_ACCTION_TAR;

Resulting chain table is shown in the following image.

Insert a second day of full data that includes deletions, additions, and updates, then run the function again:

delete from T_FIN_ACCTION_SRC where eno='1003';
insert into T_FIN_ACCTION_SRC values('1004','kongkong','13800000666','2019-10-11');
update T_FIN_ACCTION_SRC set ename='xiaofeifei' where eno='1001';
select * from T_FIN_ACCTION_SRC;

Execute the function:

select My_FIN_GL_SUBJECT_PRO('2019-10-12');
select * from T_FIN_ACCTION_TAR;

Final chain table image:

big dataSQLPL/pgSQLSlowly Changing Dimension
Big Data Technology & Architecture
Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.