Databases 8 min read

Parsing and Normalizing Space-Delimited Valid Periods in PostgreSQL

The article demonstrates how to split a space‑separated "validperiods" column into separate begin, end, and price fields in PostgreSQL using string_to_array, to_timestamp, regexp_replace, and unnest to transform the data into a readable tabular format.

Qunar Tech Salon
Qunar Tech Salon
Qunar Tech Salon
Parsing and Normalizing Space-Delimited Valid Periods in PostgreSQL

Many developers store formatted data in a single text column for convenience, such as a validperiods field that contains three values (start time, end time, price) separated by spaces. The article shows an initial query that retrieves this raw data:

qunar_group=# select id, validperiods from rrs limit 10;
   id    |        validperiods        
---------+----------------------------
 1522764 | 1363996800 1371945600 4512
 1522774 | 1363996800 1371945600 4941
 ... (more rows) ...
(10 rows)

Using string_to_array() the three components can be extracted into separate columns:

qunar_group=# select id,
       (string_to_array(validperiods, ' '))[1] as begin,
       (string_to_array(validperiods, ' '))[2] as end,
       (string_to_array(validperiods, ' '))[3] as price
  from rrs limit 10;
   id    |   begin    |    end     | price 
---------+------------+------------+-------
 1522764 | 1363996800 | 1371945600 | 4512
 ...

Converting the timestamps to readable dates improves clarity:

qunar_group=# select id,
       to_timestamp((string_to_array(validperiods, ' '))[1]::int) as begin,
       to_timestamp((string_to_array(validperiods, ' '))[2]::int) as end,
       (string_to_array(validperiods, ' '))[3] as price
  from rrs limit 10;
   id    |         begin          |          end           | price 
---------+------------------------+------------------------+-------
 1522764 | 2013-03-23 08:00:00+08 | 2013-06-23 08:00:00+08 | 4512
 ...

The problematic design appears when a row contains multiple "start‑end‑price" groups, all separated by spaces, making simple splitting impossible. An example query shows such a record:

qunar_group=# select id, validperiods from rrs where id = '2669032';
   id    |                                                              validperiods                                                               
---------+-----------------------------------------------------------------------------------------------------------------------------------------
 2669032 | 1363824000 1363824000 7080 1364688000 1364688000 7080 1365206400 1365206400 6580 ...

Because both groups and values use spaces as delimiters, the article proposes inserting commas between groups using regexp_replace() :

qunar_group=# select id,
       regexp_replace(validperiods, '([0-9]+[ ][0-9]+[ ][0-9]+)([ ])', '\1,', 'g')
  from rrs where id = '2669032';
   id    |                                                             regexp_replace                                                               
---------+-----------------------------------------------------------------------------------------------------------------------------------------
 2669032 | 1363824000 1363824000 7080,1364688000 1364688000 7080,1365206400 1365206400 6580, ...

After adding commas, the string can be split into individual groups with string_to_array() and expanded into rows using unnest() :

qunar_group=# select id,
       unnest(string_to_array(regexp_replace(validperiods, '([0-9]+[ ][0-9]+[ ][0-9]+)([ ])', '\1,', 'g'), ','))
  from rrs where id = '2669032';
   id    |           unnest           
---------+----------------------------
 2669032 | 1363824000 1363824000 7080
 2669032 | 1364688000 1364688000 7080
 2669032 | 1365206400 1365206400 6580
 ...

Finally, the groups are parsed again with string_to_array() and converted to timestamps for a clean, tabular result:

qunar_group=# select id,
       to_timestamp((string_to_array(validperiods, ' '))[1]::int) as begin,
       to_timestamp((string_to_array(validperiods, ' '))[2]::int) as end,
       (string_to_array(validperiods, ' '))[3] as price
  from (
        select id,
               unnest(string_to_array(regexp_replace(validperiods, '([0-9]+[ ][0-9]+[ ][0-9]+)([ ])', '\1,', 'g'), ',')) as validperiods
        from rrs where id = '2669032') a;
   id    |         begin          |          end           | price 
---------+------------------------+------------------------+-------
 2669032 | 2013-03-21 08:00:00+08 | 2013-03-21 08:00:00+08 | 7080
 2669032 | 2013-03-31 08:00:00+08 | 2013-03-31 08:00:00+08 | 7080
 ...

The author, a search and database researcher at Qunar, provides these examples to illustrate practical PostgreSQL techniques for handling poorly designed, space‑delimited data structures.

SQLPostgreSQLregexdata normalizationstring_to_array
Qunar Tech Salon
Written by

Qunar Tech Salon

Qunar Tech Salon is a learning and exchange platform for Qunar engineers and industry peers. We share cutting-edge technology trends and topics, providing a free platform for mid-to-senior technical professionals to exchange and learn.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.