Databases 10 min read

Row vs. Column Store: When to Use Each and Why?

This article compares row‑based and column‑based storage models, analyzes their write and read performance, presents a practical openGauss experiment showing size and insert‑time differences, and offers concrete guidance on selecting the appropriate storage type for various workloads.

IT Architects Alliance
IT Architects Alliance
IT Architects Alliance
Row vs. Column Store: When to Use Each and Why?

Background

When learning databases, most start with relational databases that store data in rows (row‑based store). Some databases also support column‑based store, which stores each column separately on disk.

Storage Model Comparison

The following diagram illustrates the difference between the two models:

In a row store, the attribute values of a record are stored contiguously; in a column store, all values of a single attribute are stored together, making each column a continuous block of data.

Write‑Side Comparison

Row store writes a whole record in a single operation, guaranteeing atomicity and data integrity.

Column store must split a record into separate columns, resulting in more write operations and additional disk‑head movement, so write latency is higher.

Updates are essentially writes, so row store also has an advantage for modifications.

Read‑Side Comparison

Row store reads the entire row; when only a few columns are needed, extra columns are read and filtered in memory, causing redundancy.

Column store reads only the required columns, eliminating redundancy and making projection queries very efficient.

Because each column contains homogeneous data types, parsing is faster and compression works better.

Different columns can use different compression algorithms, improving I/O performance.

Pros & Cons Summary

Row Store

Advantages: data kept together, INSERT/UPDATE are simple.

Disadvantages: reading always accesses all columns, less suitable for point queries.

Applicable scenarios: point queries, frequent insert/update, tables with few columns where most fields are queried.

Column Store

Advantages: only required columns are read, projection is highly efficient, any column can be indexed, better compression, ideal for analytical (OLAP) workloads and wide tables where only a subset of columns is needed.

Disadvantages: higher write cost, does not support arrays, generated columns, global temporary tables, foreign keys, and supports fewer data types.

Applicable scenarios: analytical queries, large batch inserts, queries that touch a small fraction of columns, wide tables.

Experiment with openGauss

openGauss supports hybrid storage; you can specify the storage orientation when creating a table. The following experiment was performed on a Huawei Cloud server with openGauss Enterprise 3.0.0 and openEuler 20.03.

Two tables were created: a row‑store table custom1 and a column‑store table custom2. Each table was populated with 500,000 rows.

openGauss=# create table custom1 (id integer, name varchar2(20));
openGauss=# create table custom2 (id integer, name varchar2(20)) with (orientation = column);
openGauss=# insert into custom1 select n,'testtt'||n from generate_series(1,500000) n;
INSERT 0 500000
openGauss=# insert into custom2 select * from custom1;
INSERT 0 500000

Storage size comparison:

custom1 (row store)  → 24 MB
custom2 (column store) → 3 104 kB (≈1/7 of row store)

Insert performance comparison (single row):

explain analyze insert into custom1 values(1,'zhang3');
Total runtime: 0.135 ms
explain analyze insert into custom2 values(1,'zhang3');
Total runtime: 0.207 ms

The column‑store table occupies far less space, while its single‑row insert is slightly slower; however, column stores excel in bulk inserts.

After testing, the tables were dropped.

openGauss=# drop table custom1;
openGauss=# drop table custom2;

Selection Recommendations

Update frequency: if data is updated frequently, choose row store.

Insert frequency: frequent small inserts → row store; large batch inserts → column store.

Number of columns: wide tables with many columns but queries touching few columns → column store; tables with few columns where most fields are queried → row store.

Query column count: if each query accesses less than 50 % of columns, column store is preferable.

Compression ratio: column store offers higher compression but consumes more CPU.

Precautions

Column stores have many restrictions: they do not support arrays, generated columns, global temporary tables, foreign keys, and they support fewer data types. Always consult the specific database documentation before using column‑oriented tables.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

performance comparisonDatabase StorageColumn StoreopenGaussRow Store
IT Architects Alliance
Written by

IT Architects Alliance

Discussion and exchange on system, internet, large‑scale distributed, high‑availability, and high‑performance architectures, as well as big data, machine learning, AI, and architecture adjustments with internet technologies. Includes real‑world large‑scale architecture case studies. Open to architects who have ideas and enjoy sharing.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.