
Introduction to StarRocks: Architecture, Storage, Use Cases, and Troubleshooting

StarRocks is a high-performance MPP database. Its simplified FE/BE architecture, fully vectorized engine, and CBO optimizer enable fast multi-table joins, while its partition-bucket-tablet storage model powers real-time metric services and dashboard migrations. The article closes with practical troubleshooting guidance and upcoming enhancements.

DeWu Technology

StarRocks is a next-generation, high-performance MPP database designed for a wide range of analytical scenarios. It features a simplified architecture, a fully vectorized execution engine, and a newly designed cost-based optimizer (CBO), delivering fast query speeds, especially for multi-table joins.

The system consists of two core processes: Frontend (FE) and Backend (BE). FE manages metadata, client connections, query planning, and scheduling; it uses BDBJE, a Paxos-like replication protocol, to elect a Leader among Follower nodes, while Observers replicate logs without taking part in leader election. BE handles data storage and SQL execution, storing data as tablets with multiple replicas.

Data is partitioned into tablets, each replicated across BE nodes. StarRocks supports hash distribution and a recommended range‑hash hybrid distribution, allowing dynamic range partitions and fixed hash buckets. Proper selection of partition and bucket columns is crucial for performance.

Storage units include:

Partition: logical division for efficient pruning.

Bucket: hash-based distribution to avoid hot spots; keep the bucket count moderate (a common rule of thumb is BE count × CPU cores / 2) and tablet size around 1 GB.

Tablet: smallest logical data unit, enabling parallel processing across machines and cores.

Rowset: versioned set of files generated on each data change.

Segment: split of large Rowsets for on‑disk storage.
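To make these units concrete, here is a hypothetical table definition (the table, columns, and values are illustrative, not from the article) combining monthly range partitions with hash buckets. `BUCKETS 16` matches the rule of thumb above for, say, 8 BE nodes with 4 cores each (8 × 4 / 2 = 16):

```sql
-- Illustrative StarRocks DDL: range partitions prune by date, and hash
-- buckets spread each partition across BE nodes as tablets.
CREATE TABLE metrics_detail (
    event_date   DATE   NOT NULL,
    metric_id    INT    NOT NULL,
    metric_value DOUBLE
) ENGINE = OLAP
DUPLICATE KEY (event_date, metric_id)
PARTITION BY RANGE (event_date) (
    -- batch-create one partition per month for Q1 2022
    START ("2022-01-01") END ("2022-04-01") EVERY (INTERVAL 1 MONTH)
)
DISTRIBUTED BY HASH (metric_id) BUCKETS 16;
```

Queries filtering on `event_date` touch only the matching partitions, and each partition's rows hash on `metric_id` into 16 tablets that BE nodes scan in parallel.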

Case 1 – Metric Factory Service

The service collects real‑time business metrics for monitoring and alerting. Requirements include full‑log detail, real‑time updates, hierarchical aggregation, high write throughput, flexible data retention, and multi‑source integration.

StarRocks meets these needs with four data models (detail, aggregate, update, primary‑key) and a star schema for multi‑table joins. Real‑time ingestion is achieved via Kafka → Flink → StarRocks using the Flink connector.

Tips:

When write pressure is high, increase the batch size, but balance throughput against latency.

Avoid too many concurrent Flink sink connections, which can overload StarRocks with load transactions.
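The Kafka → Flink → StarRocks path can be sketched in Flink SQL with the flink-connector-starrocks sink; the hostnames, credentials, and table names below are placeholders, and the buffer-flush options are the knobs the tips above refer to (larger batches mean fewer load transactions but higher latency):

```sql
-- Illustrative Flink SQL sink definition for flink-connector-starrocks.
-- All connection details are placeholders.
CREATE TABLE metrics_sink (
    event_date   DATE,
    metric_id    INT,
    metric_value DOUBLE
) WITH (
    'connector' = 'starrocks',
    'jdbc-url' = 'jdbc:mysql://fe-host:9030',
    'load-url' = 'fe-host:8030',
    'database-name' = 'metrics_db',
    'table-name' = 'metrics_detail',
    'username' = 'user',
    'password' = '***',
    -- batch tuning: flush when the buffer reaches 64 MB or every 5 s
    'sink.buffer-flush.max-bytes' = '67108864',
    'sink.buffer-flush.interval-ms' = '5000'
);
```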

Case 2 – Internal Dashboard

Originally built on MongoDB, the dashboard suffered from slow multi‑dimensional reports. After evaluating ClickHouse and StarRocks, StarRocks was chosen for its strong support of external dimension tables and complex SQL.

The migration split a MongoDB collection into three StarRocks tables using the detail model with daily partitions. Dimension tables were kept external to reduce synchronization complexity.

Performance comparison:

StarRocks: single SQL aggregation query completed in 295 ms.

MongoDB: required multiple queries and calculations, total time ~12 s.

Experience Sharing – Common Errors and Solutions

a. Stream Load error "current running txns on db 13003 is 100, larger than limit 100" – increase max_running_txn_num_per_db or batch the load submissions.

b. FE error "Too many open files" – raise the file‑descriptor limit (e.g., ulimit -n 65535).

c. Java UDF initialization failure – ensure JAVA_HOME is set for BE processes.

d. Delete statement limitation – only simple predicates (=, >, >=, <, <=, !=, IN, NOT IN) are supported; BETWEEN is not allowed.

e. Routine Load generating random group IDs – specify a fixed group name when creating the load.

f. Connection timeout with "no scanNode Backend" – a failing Kafka routine load can exhaust BRPC workers; pause the problematic task to recover.
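A sketch of the fixes for items (a) and (d), assuming a hypothetical metrics_detail table; the limit value is illustrative:

```sql
-- (a) Raise the per-database transaction limit on the FE (value is illustrative).
ADMIN SET FRONTEND CONFIG ("max_running_txn_num_per_db" = "200");

-- (d) DELETE does not accept BETWEEN; split it into two simple predicates.
DELETE FROM metrics_detail
WHERE event_date >= '2022-01-01' AND event_date <= '2022-01-31';
```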

Future Plans

The team intends to onboard more business workloads to StarRocks, replace existing OLAP engines, and contribute to the community. Planned improvements include reducing memory usage of primary‑key models, supporting partial column updates, enhancing bitmap query performance, and better multi‑tenant resource isolation.

For reference, here are the two queries behind the Case 2 comparison. First, the original MongoDB aggregation (the department and item IDs are placeholders):

db.time_note_new.aggregate(
    [
       {'$unwind': '$depart'},
       {'$match': {
           'depart': {'$in': ['department_id']},
           'workday': {'$gte': 1609430400, '$lt': 1646064000},
           'content.id': {'$in': ['item_id']},
           'vacate_state': {'$in': [0, 1]}}
       },
       {'$group': {
           '_id': '$depart',
           'write_hour': {'$sum': '$write_hour'},
           'code_count': {'$sum': '$code_count'},
           'all_hour': {'$sum': '$all_hour'},
           'count_day_user': {'$sum': {'$cond': [{'$eq': ['$vacate_state', 0]}, 1, 0]}},
           'vacate_hour': {'$sum': {'$cond': [{'$eq': ['$vacate_state', 0]}, '$all_hour', 0]}},
           'vacate_write_hour': {'$sum': {'$cond': [{'$eq': ['$vacate_state', 0]}, '$write_hour', 0]}}
           // ... more fields
       }},
       {'$project': {
           '_id': 1,
           'write_hour': {'$cond': [{'$eq': ['$count_day_user', 0]}, 0, {'$divide': ['$vacate_write_hour', '$count_day_user']}]},
           'count_day_user': 1,
           'vacate_hour': 1,
           'vacate_write_hour': 1,
           'code_count': {'$cond': [{'$eq': ['$count_day_user', 0]}, 0, {'$divide': ['$code_count', '$count_day_user']}]},
           'all_hour': {'$cond': [{'$eq': ['$count_day_user', 0]}, 0, {'$divide': ['$vacate_hour', '$count_day_user']}]}
           // ... more fields
       }}
    ]
)
And the equivalent StarRocks version, which produces the same aggregates in a single SQL statement:

WITH cont_time as (
    SELECT b.depart_id, a.user_id, a.workday, a.content_id, a.vacate_state,
        min(a.content_second)/3600 AS content_hour,
        min(a.write_second)/3600 AS write_hour,
        min(a.all_second)/3600 AS all_hour
    FROM time_note_report AS a
    JOIN user_department AS b ON a.user_id = b.user_id
    -- more dimension table joins
    WHERE b.depart_id IN (?) AND a.content_id IN (?)
      AND a.workday >= '2021-01-01' AND a.workday < '2022-03-31'
      AND a.vacate_state IN (0, 1)
    GROUP BY b.depart_id, a.user_id, a.workday, a.content_id, a.vacate_state
)
SELECT M.*, N.*
FROM (
    SELECT t.depart_id,
         SUM(IF(t.content_id = 14, t.content_hour, 0))   AS content_hour_14,
         SUM(IF(t.content_id = 46, t.content_hour, 0))   AS content_hour_46,
         -- ...more
    FROM cont_time t
    GROUP BY t.depart_id
) M
JOIN (
    SELECT depart_id AS join_depart_id,
      SUM(write_hour) AS write_hour,
      SUM(all_hour)   AS all_hour
    FROM cont_time
    GROUP BY depart_id
) N ON M.depart_id = N.join_depart_id
ORDER BY depart_id ASC