Understanding Apache Doris: Real‑Time Analytical Database Architecture and Data Modeling
This article introduces Apache Doris, a high‑performance, real‑time analytical database built on an MPP architecture, covering its simple FE/BE design, three data‑model types (Aggregate, Unique, Duplicate), partitioning and bucketing strategies, rollup tables, and the limitations and best practices of row‑level updates.
Architecture
Doris consists of two process types: Frontend (FE) and Backend (BE). FE stores cluster metadata, parses SQL, generates execution plans and returns results. It runs as a Leader, Followers (metadata HA) and optional Observers (read‑only query scaling). BE stores physical data, executes the physical plan distributedly and guarantees data reliability through configurable replication (default three copies). Doris uses the MySQL protocol, so any MySQL client, ODBC/JDBC driver can connect directly.
Data models
Doris supports three logical table models. Columns without an AggregationType are Key (dimension) columns; columns with an AggregationType are Value (metric) columns.
Aggregate model
CREATE TABLE IF NOT EXISTS aggregate_table (
`user_id` LARGEINT NOT NULL COMMENT "用户id",
`date` DATE NOT NULL COMMENT "数据灌入日期时间",
`city` VARCHAR(20) COMMENT "用户所在城市",
`age` SMALLINT COMMENT "用户年龄",
`sex` TINYINT COMMENT "用户性别",
`last_visit_date` DATETIME REPLACE DEFAULT "1970-01-01 00:00:00" COMMENT "用户最后一次访问时间",
`cost` BIGINT SUM DEFAULT "0" COMMENT "用户总消费",
`max_dwell_time` INT MAX DEFAULT "0" COMMENT "用户最大停留时间",
`min_dwell_time` INT MIN DEFAULT "99999" COMMENT "用户最小停留时间"
) AGGREGATE KEY(`user_id`,`date`,`city`,`age`,`sex`)
DISTRIBUTED BY HASH(`user_id`) BUCKETS 1
PROPERTIES ("replication_allocation" = "tag.location.default: 1");Rows with identical Key values are merged according to the defined AggregationType (SUM, REPLACE, MAX, MIN). Example insertion shows two rows sharing the same Key become a single aggregated row.
Unique model
CREATE TABLE IF NOT EXISTS unique_table (
`user_id` LARGEINT NOT NULL COMMENT "用户id",
`username` VARCHAR(50) NOT NULL COMMENT "用户昵称",
`city` VARCHAR(20) COMMENT "用户所在城市",
`age` SMALLINT COMMENT "用户年龄",
`sex` TINYINT COMMENT "用户性别",
`phone` LARGEINT COMMENT "用户电话",
`address` VARCHAR(500) COMMENT "用户地址",
`register_time` DATETIME COMMENT "用户注册时间"
) UNIQUE KEY(`user_id`,`username`)
DISTRIBUTED BY HASH(`user_id`) BUCKETS 1
PROPERTIES ("replication_allocation" = "tag.location.default: 1");The Unique model enforces primary‑key uniqueness and is equivalent to an Aggregate model with REPLACE aggregation on all Value columns.
Duplicate model
CREATE TABLE IF NOT EXISTS duplicate_table (
`timestamp` DATETIME NOT NULL COMMENT "日志时间",
`type` INT NOT NULL COMMENT "日志类型",
`error_code` INT COMMENT "错误码",
`error_msg` VARCHAR(1024) COMMENT "错误详细信息",
`op_id` BIGINT COMMENT "负责人id",
`op_time` DATETIME COMMENT "处理时间"
) DUPLICATE KEY(`timestamp`,`type`)
DISTRIBUTED BY HASH(`type`) BUCKETS 1
PROPERTIES ("replication_allocation" = "tag.location.default: 1");Data is stored exactly as inserted; no aggregation or primary‑key enforcement occurs.
Choosing a data model
Aggregate – best for pre‑aggregated reporting; poor for frequent COUNT(*) queries.
Unique – guarantees primary‑key uniqueness; cannot benefit from rollup pre‑aggregation.
Duplicate – suitable for ad‑hoc queries; no aggregation constraints.
Partitioning and bucketing
Doris stores data in horizontal shards called Tablets. Tablets belong to Partitions; a Partition contains many Tablets. Partitioning can be Range or List , and a Hash Bucket is applied inside each Partition.
Range partition example
CREATE TABLE example_db.example_range_tbl (
`user_id` LARGEINT NOT NULL COMMENT "用户id",
`date` DATE NOT NULL COMMENT "数据灌入日期时间",
`timestamp` DATETIME NOT NULL COMMENT "数据灌入的时间戳",
`city` VARCHAR(20) COMMENT "用户所在城市",
`age` SMALLINT COMMENT "用户年龄",
`sex` TINYINT COMMENT "用户性别",
`last_visit_date` DATETIME REPLACE DEFAULT "1970-01-01 00:00:00" COMMENT "用户最后一次访问时间",
`cost` BIGINT SUM DEFAULT "0" COMMENT "用户总消费",
`max_dwell_time` INT MAX DEFAULT "0" COMMENT "用户最大停留时间",
`min_dwell_time` INT MIN DEFAULT "99999" COMMENT "用户最小停留时间"
) ENGINE=OLAP
AGGREGATE KEY(`user_id`,`date`,`timestamp`,`city`,`age`,`sex`)
PARTITION BY RANGE(`date`)(
PARTITION `p201701` VALUES LESS THAN ("2017-02-01"),
PARTITION `p201702` VALUES LESS THAN ("2017-03-01"),
PARTITION `p201703` VALUES LESS THAN ("2017-04-01")
)
DISTRIBUTED BY HASH(`user_id`) BUCKETS 16
PROPERTIES (
"replication_num" = "3",
"storage_medium" = "SSD",
"storage_cooldown_time" = "2018-01-01 12:00:00"
);Range partitions can be multi‑column. Adding or dropping partitions creates or removes left‑closed/right‑open intervals; gaps (holes) reject data that fall into them.
List partition example
CREATE TABLE example_db.example_list_tbl (
`user_id` LARGEINT NOT NULL COMMENT "用户id",
`date` DATE NOT NULL COMMENT "数据灌入日期时间",
`timestamp` DATETIME NOT NULL COMMENT "数据灌入的时间戳",
`city` VARCHAR(20) COMMENT "用户所在城市",
`age` SMALLINT COMMENT "用户年龄",
`sex` TINYINT COMMENT "用户性别",
`last_visit_date` DATETIME REPLACE DEFAULT "1970-01-01 00:00:00" COMMENT "用户最后一次访问时间",
`cost` BIGINT SUM DEFAULT "0" COMMENT "用户总消费",
`max_dwell_time` INT MAX DEFAULT "0" COMMENT "用户最大停留时间",
`min_dwell_time` INT MIN DEFAULT "99999" COMMENT "用户最小停留时间"
) ENGINE=OLAP
AGGREGATE KEY(`user_id`,`date`,`timestamp`,`city`,`age`,`sex`)
PARTITION BY LIST(`city`)(
PARTITION `p_cn` VALUES IN ("Beijing","Shanghai","Hong Kong"),
PARTITION `p_usa` VALUES IN ("New York","San Francisco"),
PARTITION `p_jp` VALUES IN ("Tokyo")
)
DISTRIBUTED BY HASH(`user_id`) BUCKETS 16
PROPERTIES (
"replication_num" = "3",
"storage_medium" = "SSD",
"storage_cooldown_time" = "2018-01-01 12:00:00"
);List partitions also support multi‑column definitions. Adding or removing partitions updates the allowed enum values.
Partition & bucket recommendations
Tablet count = Partition count × Bucket count.
Prefer a total Tablet count slightly larger than the number of disks in the cluster.
Target Tablet size: 1 GB – 10 GB for balanced performance.
Bucket count is immutable once set; plan for future scaling when defining it.
Rollup tables
Rollup tables are materialized indexes built on a Base table. They store aggregated data at a coarser granularity and may have a different column order or a subset of columns.
Typical scenarios for creating a Rollup:
Base table has low aggregation degree because of high‑cardinality columns.
Base table’s prefix index cannot satisfy query patterns.
ALTER TABLE site_visit ADD ROLLUP rollup_city(city, pv);Data updates
Row‑level UPDATE is supported from Doris version 0.15.x onward, but only on tables using the Unique model.
Update mechanism
The engine filters rows using the WHERE clause, replaces the Value columns of matching rows (leveraging Unique’s REPLACE semantics), and re‑inserts the modified rows. UPDATE is synchronous; the data becomes visible once the statement succeeds.
Performance considerations
More rows to update → slower execution.
Efficient WHERE predicates (indexed or partition‑pruned) greatly improve speed. Avoid using Value columns of a Unique model in the condition.
Concurrency control
By default, concurrent UPDATE s on the same table are prohibited to prevent nondeterministic results, because all Value columns are rewritten even if only one is changed. Users can enable unlimited concurrent updates by setting the FE configuration enable_concurrent_update = true.
Risks
If an UPDATE coincides with other DML operations (INSERT/DELETE) on the same rows, the final data may be uncertain. Clients must manage concurrency to avoid such conflicts.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Shepherd Advanced Notes
Dedicated to sharing advanced Java technical insights, daily work snippets, and the power of persistent effort.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
