Understanding StarRocks Materialized View Refreshes and New Optimization Parameters
This article walks through StarRocks' materialized view refresh mechanisms, explains the various refresh triggers, details the refresh workflow, and introduces new parameters that allow selective refreshing of only changed data to avoid costly full‑partition refreshes.
Materialized View Basics
A materialized view in StarRocks pre‑computes the result of an expensive query and stores it in a separate table, allowing subsequent queries to read the cached data instead of re‑executing the original query.
Refresh Triggers
Manual refresh: REFRESH MATERIALIZED VIEW order_mv; Activate view: ALTER MATERIALIZED VIEW order_mv ACTIVE; Any change to a base table triggers a refresh.
Truncating a base table (e.g., TRUNCATE TABLE trunc_db.t1;) triggers a refresh.
Dropping a partition (e.g., ALTER TABLE table_name DROP PARTITION(p0, p1, ...);) triggers a refresh.
Earlier versions had a bug where TRUNCATE and DROP PARTITION did not trigger a refresh when the base table and the materialized view were in different databases; this has been fixed (see PR #52295 and #52618).
Refresh Workflow
When a refresh is triggered, StarRocks first determines which partitions need to be refreshed. The initial refresh does not contain an explicit time range; the system derives the range from query filters and, by default, processes only the first partition. The number of partitions processed per run can be adjusted with the partition_refresh_number parameter.
If additional partitions remain, the system recursively creates new refresh tasks for the remaining ranges until all required partitions are processed.
Problem Scenario: Mixed Partitioned and Non‑Partitioned Base Tables
Three base tables were created: two partitioned tables ( par_tbl1, par_tbl2) and one non‑partitioned table ( par_tbl3). A materialized view joining them was defined as follows:
CREATE MATERIALIZED VIEW test.mv_test
REFRESH ASYNC
PARTITION BY a_time
PROPERTIES ("excluded_trigger_tables" = "par_tbl3")
AS
SELECT date_trunc('day', a.datekey) AS a_time,
date_trunc('day', b.datekey) AS b_time,
date_trunc('day', c.datekey) AS c_time
FROM test.par_tbl1 a
LEFT JOIN test.par_tbl2 b ON a.datekey = b.datekey AND a.k1 = b.k1
LEFT JOIN test.par_tbl3 c ON a.k1 = c.k1;When both a partitioned table and the non‑partitioned table were updated, the view performed a full refresh of all partitions instead of only refreshing the changed data, causing unnecessary resource consumption.
Asynchronous refresh task status can be inspected with:
SELECT * FROM information_schema.task_runs ORDER BY create_time DESC;Solution: Adding excluded_refresh_tables
A new property excluded_refresh_tables was introduced to complement excluded_trigger_tables. The view definition becomes:
CREATE MATERIALIZED VIEW test.mv_test
REFRESH ASYNC
PARTITION BY a_time
PROPERTIES (
"excluded_trigger_tables" = "par_tbl3",
"excluded_refresh_tables" = "par_tbl3"
)
AS
SELECT date_trunc('day', a.datekey) AS a_time,
date_trunc('day', b.datekey) AS b_time,
date_trunc('day', c.datekey) AS c_time
FROM test.par_tbl1 a
LEFT JOIN test.par_tbl2 b ON a.datekey = b.datekey AND a.k1 = b.k1
LEFT JOIN test.par_tbl3 c ON a.k1 = c.k1;During a refresh, StarRocks checks whether any table listed in excluded_refresh_tables has changed. If a change is detected, the system computes the intersection of affected partitions instead of refreshing the entire partition set, avoiding full‑table scans for large base tables. This is especially useful when a joined table is a small dictionary table that does not require partitioning.
Configuration Parameter
The partition_refresh_number parameter controls how many partitions are processed in a single refresh task, allowing fine‑tuning of refresh granularity.
References
https://docs.starrocks.io/zh/docs/using_starrocks/async_mv/Materialized_view/#%E7%90%86%E8%A7%A3-starrocks-%E7%89%A9%E5%8C%96%E8%A7%86%E5%9B%BE
https://docs.starrocks.io/zh/docs/using_starrocks/async_mv/use_cases/data_modeling_with_materialized_views/#%E5%88%86%E5%8C%BA%E5%BB%BA%E6%A8%A1
https://github.com/StarRocks/starrocks/pull/52295
https://github.com/StarRocks/starrocks/pull/52618
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Sohu Tech Products
A knowledge-sharing platform for Sohu's technology products. As a leading Chinese internet brand with media, video, search, and gaming services and over 700 million users, Sohu continuously drives tech innovation and practice. We’ll share practical insights and tech news here.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
