Master StarRocks Deployment and Optimization: From Capacity Planning to Query Tuning
This comprehensive guide walks you through StarRocks production deployment, covering capacity planning, hardware specs, environment setup, table modeling, partitioning, bucketing, index selection, data import best practices, query optimization, monitoring, and resource‑group configuration, all backed by concrete SQL examples and configuration commands.
StarRocks is an open‑source MPP database designed for fast, unified lake‑warehouse analytics. The article provides a step‑by‑step checklist for building a production‑grade StarRocks cluster, from hardware sizing to ongoing monitoring.
1. Capacity Planning
Reference the official StarRocks cluster configuration recommendations. Suggested specs include:
FE node: 8 CPU × 32 GB RAM, data disk ≥200 GB (SSD preferred)
BE node: CPU‑to‑memory ratio 1:4, minimum 8 CPU × 32 GB RAM; disk capacity ≈10 TB per node, each disk ≤2 TB, SSD/NVMe recommended (HDD must provide >150 MB/s throughput and >500 IOPS)
All nodes should be homogeneous to avoid the barrel effect, where the weakest node caps overall cluster throughput.
2. Deployment Scheme
Minimum production cluster: 3 FE + 3 BE (FE and BE preferably deployed separately). If mixed, set mem_limit in be.conf to the remaining memory after subtracting FE usage (e.g., 34 GB on a 40 GB machine).
FE high‑availability: 1 leader + 2 followers; add Observer nodes for read scaling if needed.
All client connections must go through a load balancer (e.g., Nginx, HAProxy, F5).
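As a sketch of the load‑balancer setup, the Nginx stream module can proxy the FE MySQL protocol port (9030 by default); the upstream hostnames below are placeholders for your three FE nodes:

```nginx
# Illustrative Nginx TCP proxy in front of the FE query port.
# fe1/fe2/fe3 are placeholder hostnames; adjust port if you changed query_port.
stream {
    upstream starrocks_fe {
        server fe1:9030;
        server fe2:9030;
        server fe3:9030;
    }
    server {
        listen 9030;
        proxy_pass starrocks_fe;
    }
}
```

Clients then connect to the load balancer's address instead of any single FE, so a follower failure does not break connectivity.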
3. Modeling Guidelines
Only UTF‑8 encoding is supported.
Column rename is not supported (future support planned).
VARCHAR max length: 1,048,576.
KEY columns cannot be FLOAT or DOUBLE.
Names (database, table, view, user, role) are case‑sensitive; column and partition names are case‑insensitive.
Primary‑key model: key length ≤128 bytes.
Model Selection
Detail model – keep raw rows.
Primary‑key model – for non‑null primary keys, write‑light, read‑heavy workloads.
Update model – when primary key may be null, write‑heavy.
Aggregate model – for pre‑aggregated data.
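To make the model choice concrete, here is a hypothetical primary‑key table (table and column names are illustrative) for a non‑null key, read‑heavy workload:

```sql
-- Primary key model: order_id must be NOT NULL; updates replace the whole row
CREATE TABLE orders (
    order_id   BIGINT NOT NULL,
    user_id    BIGINT,
    amount     DECIMAL(18, 2),
    updated_at DATETIME
) PRIMARY KEY (order_id)
DISTRIBUTED BY HASH(order_id);
```

Swapping `PRIMARY KEY` for `DUPLICATE KEY`, `UNIQUE KEY`, or `AGGREGATE KEY` (with per‑column aggregate functions) selects the other three models.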
4. Sorting Columns and Prefix Indexes
In the duplicate, aggregate, and unique key models, the sort columns are the key columns declared in the table definition; from 3.0 onward the primary key model can declare sort columns separately via ORDER BY. Prefix indexes are sparse indexes built on the sort columns and must fit entirely in memory.
Choose 3‑5 frequently filtered columns as sort columns.
Prefix index length ≤36 bytes and ≤3 columns; cannot include FLOAT/DOUBLE.
```sql
CREATE TABLE site_access (
    site_id   BIGINT DEFAULT '10',
    city_code INT,
    site_name VARCHAR(50),
    pv        BIGINT DEFAULT '0'
) DUPLICATE KEY(site_id, city_code, site_name)
DISTRIBUTED BY HASH(site_id);
```

A query filtering only on site_id and city_code can leverage the prefix index to scan far fewer rows.
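For example, the first query below filters on a prefix of the sort columns and benefits from the prefix index, while the second does not:

```sql
-- Hits the prefix index: site_id and city_code are the leading sort columns
SELECT SUM(pv) FROM site_access
WHERE site_id = 123 AND city_code = 1;

-- Skips the leading column, so the prefix index cannot prune rows
SELECT SUM(pv) FROM site_access
WHERE city_code = 1;
```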
5. Partitioning Strategy
Time columns used frequently in WHERE should be partitioned.
Dynamic partitions for data‑expiration scenarios.
Each partition ≤100 GB; tables >50 GB or >5 M rows should be partitioned.
Supported partition types: range (time, numeric), list (string).
Maximum 1,024 partitions (adjustable via parameters).
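The points above can be combined in a daily range‑partitioned table with dynamic partitions handling expiration; the table name and retention values are illustrative, and the dynamic_partition property names should be verified against your StarRocks version:

```sql
CREATE TABLE site_access_log (
    event_day DATE,
    site_id   BIGINT,
    pv        BIGINT
) DUPLICATE KEY(event_day, site_id)
PARTITION BY RANGE (event_day) ()
DISTRIBUTED BY HASH(site_id)
PROPERTIES (
    "dynamic_partition.enable"    = "true",
    "dynamic_partition.time_unit" = "DAY",
    "dynamic_partition.start"     = "-30",  -- drop partitions older than 30 days
    "dynamic_partition.end"       = "3",    -- pre-create 3 days ahead
    "dynamic_partition.prefix"    = "p"
);
```

Partitions are then created and expired automatically, keeping the time column in WHERE clauses effective for partition pruning.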
6. Bucketing Choices
Production requires 3 replicas.
Estimate roughly 1 GB per bucket; adjust based on expected data size (e.g., 10 GB of raw data → ~10 buckets), then round up to a multiple of the number of BE nodes.
Static bucketing for non‑partitioned tables; avoid dynamic bucketing for partitioned tables with skewed data.
For joins between large tables (tens of millions of rows or more), enable colocate join.
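A colocate join is enabled by placing both tables in the same colocation group via the "colocate_with" table property; they must share the same bucket column type, bucket count, and replica count. Table names here are illustrative:

```sql
CREATE TABLE fact_a (
    id BIGINT,
    v1 BIGINT
) DUPLICATE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS 8
PROPERTIES ("colocate_with" = "group1");

CREATE TABLE fact_b (
    id BIGINT,
    v2 BIGINT
) DUPLICATE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS 8
PROPERTIES ("colocate_with" = "group1");
```

Joins between fact_a and fact_b on id can then be executed locally on each BE without shuffling data across the network.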
7. Field Types and Index Selection
Avoid NULL columns.
Use appropriate numeric or date types; store timestamps as DATE/DATETIME instead of VARCHAR for better Zonemap indexing.
Bitmap index: suitable for cardinality 10k‑100k, equality or IN queries; not supported for FLOAT/DOUBLE/BOOLEAN/DECIMAL.
Bloom filter index: suitable for very high cardinality (>100k) with low duplication; not supported for TINYINT/FLOAT/DOUBLE/DECIMAL.
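Both index types are created with standard DDL; the statements below assume the site_access table from the earlier example:

```sql
-- Bitmap index on a moderate-cardinality column used in equality/IN filters
CREATE INDEX idx_city_code ON site_access (city_code) USING BITMAP;

-- Bloom filter index is configured as a table property on high-cardinality columns
ALTER TABLE site_access SET ("bloom_filter_columns" = "site_name");
```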
8. Data Import Recommendations
Never use INSERT INTO VALUES() in production.
For real‑time scenarios, batch writes rather than row‑by‑row inserts, and space the batches at least 5 seconds apart.
Enable index spill for primary‑key updates; enable general spill for heavy ETL (INSERT INTO SELECT).
Prefer TRUNCATE over DELETE for bulk deletions.
Full‑row UPDATE is only allowed on versions ≥3.0 and should be throttled (minutes between updates).
Deletes must include a WHERE clause and should be executed in batches (e.g., DELETE FROM tbl WHERE id IN (1,2,3,…)).
Drop operations go to the FE recycle bin for 86,400 seconds (configurable via catalog_trash_expire_second); BE trash retention defaults to 259,200 seconds.
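The deletion guidance above can be summarized in two illustrative statements (table and ID values are placeholders):

```sql
-- Full-table cleanup: TRUNCATE is metadata-only and far cheaper than DELETE
TRUNCATE TABLE site_access;

-- Targeted deletion: always include WHERE, and delete in small ID batches
DELETE FROM site_access WHERE site_id IN (1, 2, 3);
```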
9. Query Optimization
Increase max_user_connections (e.g., to 1,000) for high‑concurrency workloads.
Enable Page Cache and Query Cache; adjust Page Cache limit (default 20 % of mem_limit).
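On the BE side, the Page Cache is governed by be.conf settings; the names and values below are illustrative and should be checked against your StarRocks version's documentation:

```properties
# be.conf -- illustrative values
disable_storage_page_cache = false   # keep the Page Cache enabled
storage_page_cache_limit = 20%       # default cap: 20% of mem_limit
```

The Query Cache can typically be turned on per session with a variable such as `SET enable_query_cache = true;`; verify the variable name for your version.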
Avoid SELECT *; explicitly list required columns.
Push down predicates to reduce full‑table scans (e.g., WHERE id=123 or WHERE dt>'2024-01-01').
Use pagination with ORDER BY to limit result size.
Avoid unnecessary functions in predicates; replace casts with direct comparisons.
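As an illustration, assuming a table with a native DATE column event_day, the rewritten predicate below allows partition pruning and Zonemap filtering that the cast defeats:

```sql
-- Bad: function on the column blocks predicate pushdown and pruning
SELECT COUNT(*) FROM site_access_log
WHERE CAST(event_day AS VARCHAR) = '2024-01-01';

-- Good: direct comparison on the DATE column prunes partitions and segments
SELECT COUNT(*) FROM site_access_log
WHERE event_day = '2024-01-01';
```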
Join best practices: matching column types, no functions on join keys, use colocate join for large joins, avoid Cartesian products.
Materialized view example for UV counting:
```sql
CREATE MATERIALIZED VIEW advertiser_uv AS
SELECT advertiser, channel, bitmap_union(to_bitmap(user_id))
FROM advertiser_view_record
GROUP BY advertiser, channel;
```

Queries containing COUNT(DISTINCT user_id) are automatically rewritten to bitmap_union_count(to_bitmap(user_id)).
Asynchronous materialized views support up to three nesting levels.
10. Monitoring and Resource Isolation
Enable the audit plugin to write fe.audit.log into a table for slow‑query analysis.
Deploy Prometheus + Grafana for metrics collection (see official docs).
Define resource groups to isolate short and big queries:
```sql
-- short query group
CREATE RESOURCE GROUP shortquery_group
TO (
    user='rg1_user1', role='rg1_role1', db='db1',
    query_type IN ('select'), source_ip='192.168.x.x/24'
) WITH (
    'type' = 'short_query',
    'cpu_core_limit' = '10',
    'mem_limit' = '20%'
);

-- big query group
CREATE RESOURCE GROUP bigquery_group
TO (
    user='rg1_user2', role='rg1_role1', query_type IN ('select')
) WITH (
    'type' = 'normal',
    'cpu_core_limit' = '10',
    'mem_limit' = '20%',
    'big_query_cpu_second_limit' = '100',
    'big_query_scan_rows_limit' = '100000',
    'big_query_mem_limit' = '1073741824'
);
```

Inspect running queries with SHOW PROC '/current_queries' and per‑host usage with SHOW PROC '/current_queries/${query_id}/hosts'. Important columns include QueryId, ScanBytes, ProcessRows, CPUCostSeconds, MemoryUsageBytes, ExecTime.
References
https://forum.mirrorship.cn/t/topic/567
https://docs.starrocks.io/zh/docs/deployment/environment_configurations/
https://docs.starrocks.io/zh/docs/using_starrocks/Colocate_join/
https://docs.starrocks.io/zh/docs/administration/management/audit_loader/