Mastering MariaDB ColumnStore: Installation, Migration, Replication, and Performance Insights
This guide explains MariaDB ColumnStore’s columnar architecture, key advantages, step‑by‑step installation, field and SQL limitations, how to migrate InnoDB data, configure multi‑source replication, compare performance with InnoDB, and achieve high availability and dynamic scaling for analytical workloads.
MariaDB ColumnStore Overview
MariaDB ColumnStore extends MariaDB Server with a distributed column‑archive storage engine that provides massive parallel processing (MPP) for complex SQL queries and advanced analytics without requiring indexes.
Key Features and Advantages
Columnar Storage : Stores data by column, offering high compression ratios and fast analytical query performance.
Distributed Architecture : Horizontal scaling across multiple nodes increases processing power and throughput.
Parallel Query Execution : Utilises multi‑core CPUs and distributed resources to accelerate query response.
Data Compression : Efficient compression reduces storage needs and speeds up data loading.
Standard SQL Compatibility : Built on MariaDB/MySQL, it supports familiar SQL syntax and drivers.
Installation & Deployment
Configure a YUM repository for MariaDB 10.6 (e.g., Alibaba Cloud mirror):
shell> vim /etc/yum.repos.d/mariadb.repo
# MariaDB 10.6 CentOS repository list - created 2023-12-29 01:20 UTC
[mariadb]
name = MariaDB
baseurl = https://mirrors.aliyun.com/mariadb/yum/10.6/centos/$releasever/$basearch
gpgkey = https://mirrors.aliyun.com/mariadb/yum/RPM-GPG-KEY-MariaDB
gpgcheck = 1Install required packages and the ColumnStore engine:
shell> yum install epel-release
shell> yum install jemalloc
shell> yum install MariaDB-server MariaDB-backup MariaDB-shared MariaDB-client MariaDB-columnstore-engineKey /etc/my.cnf settings (excerpt):
[client-server]
port = 3306
socket = /tmp/mysql_mariadb.sock
[mysqld]
server-id = 133061
port = 3306
user = mysql
basedir = /usr
datadir = /data/mysql/columnstore/data
tmpdir = /data/mysql/columnstore/tmp
log-error = /data/mysql/columnstore/log/error.log
socket = /tmp/mysql_mariadb.sock
skip-slave-start
skip-external-locking
skip-name-resolve
sql_mode = ''
plugin-load-add = ha_columnstore.so
default-storage-engine = Columnstore
character-set-server = utf8mb4
collation-server = utf8mb4_general_ci
init_connect = 'set global character_set_database = "utf8mb4"'
slow_query_log = 1
slow_query_log_file = /data/mysql/columnstore/log/mysql-slow.log
log-slow-verbosity = query_plan,explain
long_query_time = 5
columnstore_replication_slave = 1
max_allowed_packet = 256MInitialize the data directory and start the server:
shell> mysql_install --defaults-file=/etc/my.cnf --user=mysql
shell> mysqld_safe --defaults-file=/etc/my.cnf --user=mysql &Important notes:
After startup, the SQL syntax behaves like InnoDB; any MySQL client (Navicat, Sqlyog, JDBC, etc.) works without changes.
ColumnStore data resides in /var/lib/columnstore/ and its configuration file is /etc/columnstore/Columnstore.xml.
Engine Usage Notes – Field Restrictions
VARCHAR maximum length 8000 (8000/3 for utf8, 8000/4 for utf8mb4).
BIT type not supported.
Reserved keywords (user, comment, match, key, update, status) cannot be used as identifiers.
ZEROFILL not supported.
ENUM type not supported.
COMMENT cannot contain single quotes.
PRIMARY KEY auto‑increment not supported.
SQL Limitations
All selected columns must appear in the GROUP BY clause.
MariaDB [test]> select id from t1 group by name;
ERROR 1815 (HY000): Internal error: MCS-2021: '`test`.`t1`.`id`' is not in GROUP BY clause.Correct usage:
MariaDB [test]> select name from t1 group by name; ALTER TABLEdoes not support multiple column actions or the AFTER clause.
MariaDB [test]> alter table t1 add age tinyint, add address varchar(100);
ERROR 1178 (42000): The storage engine for the table doesn't support Multiple actions in alter table statement.JOIN fails when column types differ (e.g., INT vs VARCHAR).
MariaDB [test]> select t1.id from t1 join t2 on t1.id=t2.cid;
ERROR 1815 (HY000): Internal error: IDB-1002: 't1' and 't2' have incompatible column type specified for join condition. ALTER TABLE ... CHANGE/MODIFYto change column type is not supported.
MariaDB [test]> alter table t1 change id id bigint;
ERROR 1815 (HY000): Internal error: CAL0001: Alter table Failed: Changing the datatype of a column is not supportedTable Creation Example
CREATE TABLE `sbtest` (
`id` int(10) unsigned NOT NULL,
`k` int(10) unsigned NOT NULL DEFAULT '0',
`c` char(120) DEFAULT '',
`pad` char(60) NOT NULL DEFAULT ''
) ENGINE=Columnstore DEFAULT CHARSET=utf8;Data Migration from InnoDB to ColumnStore
Export table structure only :
shell> /usr/local/mysql-5.7.42/bin/mysqldump -S /tmp/mysql_mysql57_1.sock test sbtest1 --compact -d -q --set-gtid-purged=OFF > ./sbtest.sqlConvert the DDL to ColumnStore syntax (example convert.sh using sed to replace InnoDB‑specific clauses, data types, and keywords). shell> ./convert.sh sbtest.sql sbtest_new.sql Export data with mydumper (multi‑threaded CSV) :
shell> yum install https://github.com/mydumper/mydumper/releases/download/v0.15.1-3/mydumper-0.15.1-3.el7.x86_64.rpm
shell> /usr/bin/mydumper -S /tmp/mysql_mysql57_1.sock --regex 'test.sbtest1' -t 16 --csv -v 3 --rows 10000000 --no-schemas -o ./Load the CSV into ColumnStore (single‑threaded to avoid lock contention):
shell> /usr/bin/myloader -S /tmp/mysql_mariadb.sock -t 1 -v 3 -B test -d ./Multi‑Source Replication Setup (InnoDB → ColumnStore)
Create a replication channel:
mysql> SET @@default_master_connection = ${connect_name};Configure the master connection using binlog position (GTID is not compatible):
mysql> CHANGE MASTER ${connect_name} TO MASTER_HOST='192.168.1.10', MASTER_USER='repl', MASTER_PASSWORD='repl', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=4, MASTER_CONNECT_RETRY=1;Start the slave (or all slaves):
mysql> START SLAVE ${connect_name};
mysql> START ALL SLAVES;Stop the slave when needed:
mysql> STOP SLAVE ${connect_name};
mysql> STOP ALL SLAVES;Check replication status:
mysql> SHOW SLAVE ${connect_name} STATUS;
mysql> SHOW ALL SLAVES STATUS;Reset replication and flush relay logs:
mysql> RESET SLAVE ${connect_name} ALL;
mysql> FLUSH RELAY LOGS ${connect_name};Important notes:
The MySQL replica must use binlog_format = STATEMENT; ROW format triggers a ColumnStore bug.
If column‑type mismatches occur (e.g., InnoDB TEXT not supported), set SET GLOBAL slave_type_conversions = ALL_LOSSY,ALL_NON_LOSSY; to force lossy conversion.
Performance Comparison – InnoDB vs ColumnStore
Sample test data is available at:
https://github.com/mariadb-corporation/mariadb-columnstore-sample-data
Benchmarks show that ColumnStore excels in analytical workloads, delivering higher throughput and lower storage consumption compared with InnoDB, which is optimized for transactional workloads.
High Availability & Dynamic Scaling
High availability can be achieved by deploying multiple ColumnStore nodes (as slaves) and using keepalived or HAProxy for fail‑over.
For dynamic storage expansion, mount the ColumnStore data directory (e.g., /var/lib/columnstore/data1) on a distributed file system such as GlusterFS (GFS).
Conclusion
InnoDB is optimized for transactional workloads with strong durability guarantees, while ColumnStore is built for analytical queries, delivering faster aggregation, higher compression, and lower disk usage. ColumnStore’s compatibility with MySQL clients and its native replication make it a practical choice for OLAP‑ready data warehousing without additional ETL layers.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
