How StarRocks Outperformed ClickHouse in Real‑Time Insurance Data Analytics
This article presents a technical case study of ZhongAn's Jizhi analytics platform. It details how switching from ClickHouse to StarRocks, an MPP OLAP engine, resolved high‑concurrency and join performance bottlenecks, improved real‑time query speed, and let the platform handle tables of nearly 100 million rows for insurance business operations.
Industry Background
Insurance data pipelines generate massive heterogeneous datasets from underwriting, sales, claims, health, e‑commerce and credit sources. Turning this data into actionable insights requires high‑performance, low‑latency analytical engines that can handle both batch and real‑time workloads.
Jizhi Platform Overview
Jizhi is a visual intelligence platform that provides zero‑code drag‑and‑drop analytics over billions of rows. It integrates AI, BI and a data‑warehouse engine to serve thousands of internal users.
Business Requirements
Rich function library for chart calculations.
Fast interactive response for multi‑chart dashboards.
Support for multi‑dimensional drill‑down on large detail tables.
Real‑time ingestion and query capabilities.
Existing Architecture with ClickHouse
Offline data is collected via DataX into MaxCompute/Hive, then loaded into ClickHouse. Real‑time data flows from binlog to Kafka, is processed by Flink, and finally lands in ClickHouse tables that use the ReplicatedReplacingMergeTree engine.
Key limitations observed:
Performance degrades sharply with only 6‑8 concurrent dashboard queries (response time rises from ~2 s to 8‑10 s).
Multi‑table joins often exceed 10 s or time out.
No transaction support; heavy reliance on ZooKeeper leads to metadata inconsistencies.
Missing automatic resharding for horizontal scaling.
The Replacing engine relies on merge‑on‑read, scans in a single thread, cannot push predicates down, does not support deletes, and deduplicates only within a partition.
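The deduplication limitation in the last item can be pictured with a toy model. The sketch below is plain Python, not ClickHouse code, and every row, key and partition name in it is made up. It assumes that background merges keep only the newest version of a key inside one partition, so a key that appears in two partitions survives twice until a query deduplicates it at read time.

```python
# Each row: (partition, key, version, value). All values are hypothetical.
rows = [
    ("2022-01", "policy-1", 1, "pending"),
    ("2022-01", "policy-1", 2, "paid"),
    ("2022-02", "policy-1", 3, "refunded"),  # same key, different partition
    ("2022-02", "policy-2", 1, "pending"),
]

def merge_within_partitions(rows):
    """Keep the highest version per (partition, key), like a partition-local merge."""
    latest = {}
    for row in rows:
        partition, key, version, _ = row
        slot = (partition, key)
        if slot not in latest or version > latest[slot][2]:
            latest[slot] = row
    return sorted(latest.values())

def dedup_at_read(rows):
    """Keep the highest version per key across all partitions, at query time."""
    latest = {}
    for row in rows:
        key, version = row[1], row[2]
        if key not in latest or version > latest[key][2]:
            latest[key] = row
    return sorted(latest.values(), key=lambda r: r[1])

merged = merge_within_partitions(rows)   # policy-1 still appears twice
final = dedup_at_read(merged)            # one row per key
print(len(merged), len(final))
```

After the partition-local merge, policy-1 is still present in both partitions; only the read-time pass leaves one row per key, which is the extra work every query has to pay for.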
StarRocks Evaluation
Supports >10 k QPS with high concurrency.
Provides Shuffle Join, Colocate Join and other distributed join strategies for superior multi‑table performance.
Transactional DDL/DML, with MySQL protocol compatibility.
Simplified FE/BE architecture without external dependencies (no ZooKeeper).
Automatic data balancing and easy horizontal scaling.
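The difference between the two join strategies named above can be sketched in a few lines. This is a simplified Python model, not StarRocks internals: the node count, the modulo hash, and the policy and claim tables are all assumptions made for illustration. A shuffle join first re-hashes both sides on the join key, moving rows between nodes. A colocate join assumes both tables are already bucketed on that key, so each node can join its own data.

```python
NUM_NODES = 4  # hypothetical cluster size

def bucket(key, n=NUM_NODES):
    # Toy hash; real engines use their own hash functions.
    return key % n

def place(rows, key_of):
    """Distribute rows across nodes by hashing the chosen column."""
    nodes = [[] for _ in range(NUM_NODES)]
    for row in rows:
        nodes[bucket(key_of(row))].append(row)
    return nodes

def shuffle(nodes, key_of):
    """Re-hash every row on the join key and count rows that change node."""
    out = [[] for _ in range(NUM_NODES)]
    moved = 0
    for src, rows in enumerate(nodes):
        for row in rows:
            dst = bucket(key_of(row))
            moved += dst != src
            out[dst].append(row)
    return out, moved

def local_join(left_nodes, right_nodes, lkey, rkey):
    """Hash join node by node; valid only if both sides are bucketed on the join key."""
    result = []
    for left, right in zip(left_nodes, right_nodes):
        index = {}
        for r in right:
            index.setdefault(rkey(r), []).append(r)
        for l in left:
            result.extend((l, r) for r in index.get(lkey(l), []))
    return result

policies = [(pid, pid % 3) for pid in range(8)]        # (policy_id, channel_id)
claims = [(cid, (cid * 3) % 8) for cid in range(12)]   # (claim_id, policy_id)

# Shuffle join: tables are bucketed on unrelated columns, so rows move first.
p_nodes, p_moved = shuffle(place(policies, lambda p: p[1]), lambda p: p[0])
c_nodes, c_moved = shuffle(place(claims, lambda c: c[0]), lambda c: c[1])
shuffle_result = local_join(p_nodes, c_nodes, lambda p: p[0], lambda c: c[1])
shuffle_moved = p_moved + c_moved

# Colocate join: both tables are already bucketed on policy_id, nothing moves.
colocate_result = local_join(
    place(policies, lambda p: p[0]), place(claims, lambda c: c[1]),
    lambda p: p[0], lambda c: c[1],
)
print(shuffle_moved, len(colocate_result))
```

Both paths produce the same joined rows. The colocated layout skips the network transfer, which is why it tends to suit tables that are joined on the same key again and again.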
Performance Benchmark (SSB)
Test environment: four 8‑core machines, StarRocks 2.1.0 vs ClickHouse 21.9.5. Cache was cleared before each run. The results below describe StarRocks relative to ClickHouse.
Single‑table, no concurrency: comparable latency, with StarRocks using 25‑50 % of the CPU that ClickHouse used.
Single‑table, high concurrency: ~1.8× faster.
Multi‑table, no concurrency: ~1.8× faster.
Multi‑table, high concurrency: ~8× faster.
Real‑time primary‑key model vs ClickHouse Replacing engine: 3‑10× faster with stable latency.
Batch write throughput: 20‑30 % slower than ClickHouse (acceptable trade‑off).
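The article does not describe the harness behind these numbers. As a rough illustration of how the "no concurrency" and "high concurrency" cases could be measured, here is a minimal Python sketch. The function run_query_stub is a placeholder that only sleeps; in a real test it would be replaced by a call through an actual database client. The query strings and concurrency levels are likewise assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean

def run_query_stub(sql):
    """Placeholder for a real client call; it only sleeps briefly."""
    time.sleep(0.01)
    return sql

def timed(run_query, sql):
    """Return the wall-clock latency of one query in seconds."""
    start = time.perf_counter()
    run_query(sql)
    return time.perf_counter() - start

def measure(run_query, queries, concurrency):
    """Run all queries on a fixed-size thread pool; return per-query latencies."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(lambda q: timed(run_query, q), queries))

queries = [f"SELECT {i}" for i in range(16)]  # hypothetical workload
for level in (1, 8):
    latencies = measure(run_query_stub, queries, level)
    print(f"concurrency={level} mean={mean(latencies):.4f}s max={max(latencies):.4f}s")
```

Running the same query set at each concurrency level against both engines, and comparing mean and tail latency, gives ratios of the kind listed above.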
Integration of StarRocks into Jizhi
Create a data model in Jizhi and select StarRocks as the engine. Configure index fields, distribution keys, time partitions and retention period.
Obtain the generated StarRocks table connection information and the auto‑generated Flink SQL SINK statement.
Deploy a real‑time Flink job that writes source data into the StarRocks table.
Adjust field formats, add derived fields, and design chart visualizations on the model.
Publish the dashboard to obtain a permanent link for business users.
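The second step mentions an auto‑generated Flink SQL SINK statement, but the article does not show what Jizhi actually emits. The Python sketch below is one guess at what such a generator could look like. The table name, columns, hosts and credentials are hypothetical; the option keys (connector, jdbc-url, load-url, database-name, table-name, username, password) are those of the open‑source StarRocks Flink connector.

```python
def flink_sink_ddl(table, columns, primary_key, options):
    """Render a Flink SQL CREATE TABLE statement for a sink table."""
    cols = ",\n".join(f"  `{name}` {ftype}" for name, ftype in columns)
    pk = ", ".join(f"`{c}`" for c in primary_key)
    opts = ",\n".join(f"  '{k}' = '{v}'" for k, v in options.items())
    return (
        f"CREATE TABLE `{table}` (\n{cols},\n"
        f"  PRIMARY KEY ({pk}) NOT ENFORCED\n"
        f") WITH (\n{opts}\n);"
    )

ddl = flink_sink_ddl(
    table="policy_orders_sink",            # hypothetical Flink table name
    columns=[
        ("order_id", "BIGINT"),
        ("channel", "STRING"),
        ("premium", "DECIMAL(18, 2)"),
        ("updated_at", "TIMESTAMP(3)"),
    ],
    primary_key=["order_id"],
    options={
        "connector": "starrocks",
        "jdbc-url": "jdbc:mysql://fe-host:9030",  # hypothetical FE query address
        "load-url": "fe-host:8030",               # hypothetical FE HTTP address
        "database-name": "jizhi_demo",
        "table-name": "policy_orders",
        "username": "demo_user",
        "password": "demo_password",
    },
)
print(ddl)
```

The Flink job in the third step would then run an INSERT INTO this sink table from its Kafka source table.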
Business Impact
In an online insurance channel‑placement scenario, dashboard load time dropped from >10 s to ~3 s, enabling timely strategy adjustments. The supported data volume grew from millions of rows to nearly 100 million, which improved stability and user trust while reducing CPU pressure.
Conclusion and Future Plans
StarRocks matches ClickHouse on single‑table queries but outperforms it on high‑concurrency and multi‑table join workloads, especially for real‑time warehouses. Its transactional DDL/DML, MySQL compatibility and simpler operations make it attractive for both development and operations teams. Planned extensions include:
Unified batch‑stream analytics using StarRocks for offline workloads.
Exploration of StarRocks as a lightweight data‑warehouse and unified query engine.
Application of StarRocks to user‑behavior analytics and profiling.