Databases 28 min read

Understanding Apache Doris: Real‑Time Analytical Database Architecture and Data Modeling

This article introduces Apache Doris, a high‑performance, real‑time analytical database built on an MPP architecture, covering its simple FE/BE design, three data‑model types (Aggregate, Unique, Duplicate), partitioning and bucketing strategies, rollup tables, and the limitations and best practices of row‑level updates.

Shepherd Advanced Notes
Shepherd Advanced Notes
Shepherd Advanced Notes
Understanding Apache Doris: Real‑Time Analytical Database Architecture and Data Modeling

Architecture

Doris consists of two process types: Frontend (FE) and Backend (BE). FE stores cluster metadata, parses SQL, generates execution plans and returns results. It runs as a Leader, Followers (metadata HA) and optional Observers (read‑only query scaling). BE stores physical data, executes the physical plan distributedly and guarantees data reliability through configurable replication (default three copies). Doris uses the MySQL protocol, so any MySQL client, ODBC/JDBC driver can connect directly.

Data models

Doris supports three logical table models. Columns without an AggregationType are Key (dimension) columns; columns with an AggregationType are Value (metric) columns.

Aggregate model

CREATE TABLE IF NOT EXISTS aggregate_table (
    `user_id` LARGEINT NOT NULL COMMENT "用户id",
    `date` DATE NOT NULL COMMENT "数据灌入日期时间",
    `city` VARCHAR(20) COMMENT "用户所在城市",
    `age` SMALLINT COMMENT "用户年龄",
    `sex` TINYINT COMMENT "用户性别",
    `last_visit_date` DATETIME REPLACE DEFAULT "1970-01-01 00:00:00" COMMENT "用户最后一次访问时间",
    `cost` BIGINT SUM DEFAULT "0" COMMENT "用户总消费",
    `max_dwell_time` INT MAX DEFAULT "0" COMMENT "用户最大停留时间",
    `min_dwell_time` INT MIN DEFAULT "99999" COMMENT "用户最小停留时间"
) AGGREGATE KEY(`user_id`,`date`,`city`,`age`,`sex`)
DISTRIBUTED BY HASH(`user_id`) BUCKETS 1
PROPERTIES ("replication_allocation" = "tag.location.default: 1");

Rows with identical Key values are merged according to the defined AggregationType (SUM, REPLACE, MAX, MIN). Example insertion shows two rows sharing the same Key become a single aggregated row.

Unique model

CREATE TABLE IF NOT EXISTS unique_table (
    `user_id` LARGEINT NOT NULL COMMENT "用户id",
    `username` VARCHAR(50) NOT NULL COMMENT "用户昵称",
    `city` VARCHAR(20) COMMENT "用户所在城市",
    `age` SMALLINT COMMENT "用户年龄",
    `sex` TINYINT COMMENT "用户性别",
    `phone` LARGEINT COMMENT "用户电话",
    `address` VARCHAR(500) COMMENT "用户地址",
    `register_time` DATETIME COMMENT "用户注册时间"
) UNIQUE KEY(`user_id`,`username`)
DISTRIBUTED BY HASH(`user_id`) BUCKETS 1
PROPERTIES ("replication_allocation" = "tag.location.default: 1");

The Unique model enforces primary‑key uniqueness and is equivalent to an Aggregate model with REPLACE aggregation on all Value columns.

Duplicate model

CREATE TABLE IF NOT EXISTS duplicate_table (
    `timestamp` DATETIME NOT NULL COMMENT "日志时间",
    `type` INT NOT NULL COMMENT "日志类型",
    `error_code` INT COMMENT "错误码",
    `error_msg` VARCHAR(1024) COMMENT "错误详细信息",
    `op_id` BIGINT COMMENT "负责人id",
    `op_time` DATETIME COMMENT "处理时间"
) DUPLICATE KEY(`timestamp`,`type`)
DISTRIBUTED BY HASH(`type`) BUCKETS 1
PROPERTIES ("replication_allocation" = "tag.location.default: 1");

Data is stored exactly as inserted; no aggregation or primary‑key enforcement occurs.

Choosing a data model

Aggregate – best for pre‑aggregated reporting; poor for frequent COUNT(*) queries.

Unique – guarantees primary‑key uniqueness; cannot benefit from rollup pre‑aggregation.

Duplicate – suitable for ad‑hoc queries; no aggregation constraints.

Partitioning and bucketing

Doris stores data in horizontal shards called Tablets. Tablets belong to Partitions; a Partition contains many Tablets. Partitioning can be Range or List , and a Hash Bucket is applied inside each Partition.

Range partition example

CREATE TABLE example_db.example_range_tbl (
    `user_id` LARGEINT NOT NULL COMMENT "用户id",
    `date` DATE NOT NULL COMMENT "数据灌入日期时间",
    `timestamp` DATETIME NOT NULL COMMENT "数据灌入的时间戳",
    `city` VARCHAR(20) COMMENT "用户所在城市",
    `age` SMALLINT COMMENT "用户年龄",
    `sex` TINYINT COMMENT "用户性别",
    `last_visit_date` DATETIME REPLACE DEFAULT "1970-01-01 00:00:00" COMMENT "用户最后一次访问时间",
    `cost` BIGINT SUM DEFAULT "0" COMMENT "用户总消费",
    `max_dwell_time` INT MAX DEFAULT "0" COMMENT "用户最大停留时间",
    `min_dwell_time` INT MIN DEFAULT "99999" COMMENT "用户最小停留时间"
) ENGINE=OLAP
AGGREGATE KEY(`user_id`,`date`,`timestamp`,`city`,`age`,`sex`)
PARTITION BY RANGE(`date`)(
    PARTITION `p201701` VALUES LESS THAN ("2017-02-01"),
    PARTITION `p201702` VALUES LESS THAN ("2017-03-01"),
    PARTITION `p201703` VALUES LESS THAN ("2017-04-01")
)
DISTRIBUTED BY HASH(`user_id`) BUCKETS 16
PROPERTIES (
    "replication_num" = "3",
    "storage_medium" = "SSD",
    "storage_cooldown_time" = "2018-01-01 12:00:00"
);

Range partitions can be multi‑column. Adding or dropping partitions creates or removes left‑closed/right‑open intervals; gaps (holes) reject data that fall into them.

List partition example

CREATE TABLE example_db.example_list_tbl (
    `user_id` LARGEINT NOT NULL COMMENT "用户id",
    `date` DATE NOT NULL COMMENT "数据灌入日期时间",
    `timestamp` DATETIME NOT NULL COMMENT "数据灌入的时间戳",
    `city` VARCHAR(20) COMMENT "用户所在城市",
    `age` SMALLINT COMMENT "用户年龄",
    `sex` TINYINT COMMENT "用户性别",
    `last_visit_date` DATETIME REPLACE DEFAULT "1970-01-01 00:00:00" COMMENT "用户最后一次访问时间",
    `cost` BIGINT SUM DEFAULT "0" COMMENT "用户总消费",
    `max_dwell_time` INT MAX DEFAULT "0" COMMENT "用户最大停留时间",
    `min_dwell_time` INT MIN DEFAULT "99999" COMMENT "用户最小停留时间"
) ENGINE=OLAP
AGGREGATE KEY(`user_id`,`date`,`timestamp`,`city`,`age`,`sex`)
PARTITION BY LIST(`city`)(
    PARTITION `p_cn` VALUES IN ("Beijing","Shanghai","Hong Kong"),
    PARTITION `p_usa` VALUES IN ("New York","San Francisco"),
    PARTITION `p_jp` VALUES IN ("Tokyo")
)
DISTRIBUTED BY HASH(`user_id`) BUCKETS 16
PROPERTIES (
    "replication_num" = "3",
    "storage_medium" = "SSD",
    "storage_cooldown_time" = "2018-01-01 12:00:00"
);

List partitions also support multi‑column definitions. Adding or removing partitions updates the allowed enum values.

Partition & bucket recommendations

Tablet count = Partition count × Bucket count.

Prefer a total Tablet count slightly larger than the number of disks in the cluster.

Target Tablet size: 1 GB – 10 GB for balanced performance.

Bucket count is immutable once set; plan for future scaling when defining it.

Rollup tables

Rollup tables are materialized indexes built on a Base table. They store aggregated data at a coarser granularity and may have a different column order or a subset of columns.

Typical scenarios for creating a Rollup:

Base table has low aggregation degree because of high‑cardinality columns.

Base table’s prefix index cannot satisfy query patterns.

ALTER TABLE site_visit ADD ROLLUP rollup_city(city, pv);

Data updates

Row‑level UPDATE is supported from Doris version 0.15.x onward, but only on tables using the Unique model.

Update mechanism

The engine filters rows using the WHERE clause, replaces the Value columns of matching rows (leveraging Unique’s REPLACE semantics), and re‑inserts the modified rows. UPDATE is synchronous; the data becomes visible once the statement succeeds.

Performance considerations

More rows to update → slower execution.

Efficient WHERE predicates (indexed or partition‑pruned) greatly improve speed. Avoid using Value columns of a Unique model in the condition.

Concurrency control

By default, concurrent UPDATE s on the same table are prohibited to prevent nondeterministic results, because all Value columns are rewritten even if only one is changed. Users can enable unlimited concurrent updates by setting the FE configuration enable_concurrent_update = true.

Risks

If an UPDATE coincides with other DML operations (INSERT/DELETE) on the same rows, the final data may be uncertain. Clients must manage concurrency to avoid such conflicts.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

real-time analyticsdata modelingrolluppartitioningMPPApache Dorisrow update
Shepherd Advanced Notes
Written by

Shepherd Advanced Notes

Dedicated to sharing advanced Java technical insights, daily work snippets, and the power of persistent effort.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.