Databases 8 min read

Understanding StarRocks Materialized View Refreshes and New Optimization Parameters

This article walks through StarRocks' materialized view refresh mechanisms, explains the various refresh triggers, details the refresh workflow, and introduces new parameters that allow selective refreshing of only changed data to avoid costly full‑partition refreshes.

Sohu Tech Products
Sohu Tech Products
Sohu Tech Products
Understanding StarRocks Materialized View Refreshes and New Optimization Parameters

Materialized View Basics

A materialized view in StarRocks pre‑computes the result of an expensive query and stores it in a separate table, allowing subsequent queries to read the cached data instead of re‑executing the original query.

Refresh Triggers

Manual refresh: REFRESH MATERIALIZED VIEW order_mv; Activate view: ALTER MATERIALIZED VIEW order_mv ACTIVE; Any change to a base table triggers a refresh.

Truncating a base table (e.g., TRUNCATE TABLE trunc_db.t1;) triggers a refresh.

Dropping a partition (e.g., ALTER TABLE table_name DROP PARTITION(p0, p1, ...);) triggers a refresh.

Earlier versions had a bug where TRUNCATE and DROP PARTITION did not trigger a refresh when the base table and the materialized view were in different databases; this has been fixed (see PR #52295 and #52618).

Refresh Workflow

When a refresh is triggered, StarRocks first determines which partitions need to be refreshed. The initial refresh does not contain an explicit time range; the system derives the range from query filters and, by default, processes only the first partition. The number of partitions processed per run can be adjusted with the partition_refresh_number parameter.

If additional partitions remain, the system recursively creates new refresh tasks for the remaining ranges until all required partitions are processed.

Problem Scenario: Mixed Partitioned and Non‑Partitioned Base Tables

Three base tables were created: two partitioned tables ( par_tbl1, par_tbl2) and one non‑partitioned table ( par_tbl3). A materialized view joining them was defined as follows:

CREATE MATERIALIZED VIEW test.mv_test
REFRESH ASYNC
PARTITION BY a_time
PROPERTIES ("excluded_trigger_tables" = "par_tbl3")
AS
SELECT date_trunc('day', a.datekey) AS a_time,
       date_trunc('day', b.datekey) AS b_time,
       date_trunc('day', c.datekey) AS c_time
FROM test.par_tbl1 a
LEFT JOIN test.par_tbl2 b ON a.datekey = b.datekey AND a.k1 = b.k1
LEFT JOIN test.par_tbl3 c ON a.k1 = c.k1;

When both a partitioned table and the non‑partitioned table were updated, the view performed a full refresh of all partitions instead of only refreshing the changed data, causing unnecessary resource consumption.

Asynchronous refresh task status can be inspected with:

SELECT * FROM information_schema.task_runs ORDER BY create_time DESC;

Solution: Adding excluded_refresh_tables

A new property excluded_refresh_tables was introduced to complement excluded_trigger_tables. The view definition becomes:

CREATE MATERIALIZED VIEW test.mv_test
REFRESH ASYNC
PARTITION BY a_time
PROPERTIES (
  "excluded_trigger_tables" = "par_tbl3",
  "excluded_refresh_tables" = "par_tbl3"
)
AS
SELECT date_trunc('day', a.datekey) AS a_time,
       date_trunc('day', b.datekey) AS b_time,
       date_trunc('day', c.datekey) AS c_time
FROM test.par_tbl1 a
LEFT JOIN test.par_tbl2 b ON a.datekey = b.datekey AND a.k1 = b.k1
LEFT JOIN test.par_tbl3 c ON a.k1 = c.k1;

During a refresh, StarRocks checks whether any table listed in excluded_refresh_tables has changed. If a change is detected, the system computes the intersection of affected partitions instead of refreshing the entire partition set, avoiding full‑table scans for large base tables. This is especially useful when a joined table is a small dictionary table that does not require partitioning.

Configuration Parameter

The partition_refresh_number parameter controls how many partitions are processed in a single refresh task, allowing fine‑tuning of refresh granularity.

References

https://docs.starrocks.io/zh/docs/using_starrocks/async_mv/Materialized_view/#%E7%90%86%E8%A7%A3-starrocks-%E7%89%A9%E5%8C%96%E8%A7%86%E5%9B%BE

https://docs.starrocks.io/zh/docs/using_starrocks/async_mv/use_cases/data_modeling_with_materialized_views/#%E5%88%86%E5%8C%BA%E5%BB%BA%E6%A8%A1

https://github.com/StarRocks/starrocks/pull/52295

https://github.com/StarRocks/starrocks/pull/52618

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

optimizationSQLStarRocksrefresh()Partitionmaterialized view
Sohu Tech Products
Written by

Sohu Tech Products

A knowledge-sharing platform for Sohu's technology products. As a leading Chinese internet brand with media, video, search, and gaming services and over 700 million users, Sohu continuously drives tech innovation and practice. We’ll share practical insights and tech news here.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.