Understanding Composite Data Types in ClickHouse: Array, Tuple, Enum, and Nested

This article explains ClickHouse's four composite data types—Array, Tuple, Enum, and Nested—detailing their definitions, type inference rules, practical SQL examples, and common pitfalls to help developers model complex data structures efficiently.

Big Data Technology & Architecture

ClickHouse is an open-source, column-oriented OLAP database originally developed at Yandex. It is known for high performance on analytical queries, and in some benchmarks it outperforms traditional row-oriented databases by orders of magnitude.

Beyond its basic scalar types, ClickHouse provides four composite types—Array, Tuple, Enum, and Nested—that greatly enhance its data modeling capabilities.

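An Array value can be built with the array(...) function or the shorthand [...]; the corresponding column type is written Array(T). ClickHouse infers the element type automatically, choosing the smallest type that can hold every element, so all elements must share a compatible type. If any element is NULL, the type becomes Array(Nullable(...)). Example queries: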

SELECT array(1, 2) AS a, toTypeName(a);    -- Array(UInt8)
SELECT [1, 2];                             -- shorthand for array(1, 2)
SELECT [1, 2, NULL] AS a, toTypeName(a);   -- Array(Nullable(UInt8))
CREATE TABLE Array_TEST (c1 Array(String)) ENGINE = Memory;
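
The inference rule above can be checked with a few more queries. This is a minimal sketch that reuses the Array_TEST table from the example; the inserted values are made up for illustration, and the statement that is expected to fail is left commented out.

```sql
-- Mixed integer widths: the element type widens to fit the largest value.
SELECT [1, 256] AS a, toTypeName(a);   -- Array(UInt16)

-- Integer and float literals share Float64 as their common type.
SELECT [1, 2.5] AS a, toTypeName(a);   -- Array(Float64)

-- Numbers and strings have no common type, so this is expected to raise an exception:
-- SELECT [1, 'a'];

-- Illustrative data: array indexes start at 1, and length() returns the element count.
INSERT INTO Array_TEST VALUES (['a', 'b', 'c']);
SELECT c1[1] AS first_element, length(c1) AS n FROM Array_TEST;
```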

A Tuple groups one or more elements, each of which may have its own type, and it also supports type inference. It can be created with tuple(...) or the shorthand (...):

SELECT tuple(1, 'a', now()) AS x, toTypeName(x);   -- Tuple(UInt8, String, DateTime)
SELECT (1, 2.0, NULL) AS x, toTypeName(x);         -- Tuple(UInt8, Float64, Nullable(Nothing))
CREATE TABLE Tuple_TEST (c1 Tuple(String, Int8)) ENGINE = Memory;
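
Because each tuple position has a fixed type, inserts are checked element by element. The sketch below reuses the Tuple_TEST table from the example; the inserted values are illustrative, and the statement expected to fail is commented out.

```sql
-- Elements are addressed by 1-based position, via dot syntax or tupleElement().
SELECT (1, 'a') AS x, x.1 AS first_element, tupleElement(x, 2) AS second_element;

-- Matches Tuple(String, Int8), so the insert succeeds.
INSERT INTO Tuple_TEST VALUES (('abc', 103));

-- The second element is a String rather than an Int8, so this is expected to fail:
-- INSERT INTO Tuple_TEST VALUES (('abc', 'efg'));

SELECT c1.1 AS label, c1.2 AS code FROM Tuple_TEST;
```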

Enum stores a fixed set of string constants compactly as integers. ClickHouse offers Enum8 and Enum16, which map each String key to an Int8 or Int16 value respectively. Keys and values must each be unique, neither may be NULL, and inserts normally use the key string:

CREATE TABLE Enum_TEST (
    c1 Enum8('ready' = 1, 'start' = 2, 'success' = 3, 'error' = 4)
) ENGINE = Memory;
INSERT INTO Enum_TEST VALUES('ready');
INSERT INTO Enum_TEST VALUES('stop');  -- throws an exception: 'stop' is not a defined key
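
To see the integer stored behind each key, the column can be cast to its underlying integer type. This is a small sketch against the Enum_TEST table above, assuming the 'ready' row has already been inserted.

```sql
-- Show the key, its stored Int8 value, and the full column type.
SELECT c1, CAST(c1 AS Int8) AS code, toTypeName(c1) AS type FROM Enum_TEST;

-- Filters are written against the key string, just like inserts.
SELECT count() FROM Enum_TEST WHERE c1 = 'ready';
```

Queries stay readable because they use the key strings, while storage uses the compact integer values.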

Nested represents a one-level, table-like structure embedded in a column. Each nested field is stored as an array. Different rows may hold arrays of different lengths, but within a single row all arrays belonging to the same Nested column must have the same length. Example definition and insert:

CREATE TABLE nested_test (
    name String,
    age UInt8,
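    dept Nested(id UInt16, name String)  -- UInt16, since the sample ids exceed the UInt8 maximum of 255
) ENGINE = Memory;
INSERT INTO nested_test VALUES ('bruce', 30, [10000,10001,10002], ['R&D Department','Technical Support Center','Testing Department']);
-- If dept.id and dept.name had different lengths in the same row, the insert would fail with a DB::Exception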
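
The alignment rule can be tried against the nested_test table defined above. In this sketch the extra rows ('alice', 'carol') and their department values are hypothetical, and the ids are kept small so they fit the id column. ARRAY JOIN is shown as one way to flatten the arrays into one output row per department.

```sql
-- A row with a single department: array lengths may differ between rows.
INSERT INTO nested_test VALUES ('alice', 28, [1], ['Sales']);

-- Two ids but one name in the same row: expected to fail with a DB::Exception.
-- INSERT INTO nested_test VALUES ('carol', 41, [1, 2], ['Sales']);

-- Unfold the nested arrays so that each department becomes its own row.
SELECT name, dept.id, dept.name
FROM nested_test
ARRAY JOIN dept;
```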

When querying nested data, use dot notation, e.g., SELECT name, dept.id, dept.name FROM nested_test; each nested field is returned as an array.

The original article concludes with author information: Zhu Kai, a ClickHouse contributor and senior architect with extensive experience in big-data platforms, and it mentions his book, which covers ClickHouse architecture and practice in depth.

Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.