How 58.com Achieved 20× Faster Real‑Time Queries by Migrating to StarRocks
58.com integrated the StarRocks analytical engine into its data‑exploration platform as a replacement for Spark/Hive, aiming to overcome minute‑level query latency. After a year‑long migration the team achieved a query speedup of more than 20×, a success rate above 98%, and resolved numerous Spark‑StarRocks compatibility issues — all while moving the service to the cloud.
Background
58.com’s Data Exploration Platform processes more than 100,000 SQL statements per day on data stored in HDFS. The original execution engine combined Spark and Hive, resulting in minute‑level query latency that could not satisfy the growing demand for fast ad‑hoc analysis.
Why StarRocks
Provides a unified data‑lake analysis capability through a simple Hive catalog.
Massively Parallel Processing (MPP) with vectorized execution, delivering >10× speedup in internal POC tests.
Straightforward architecture and low operational overhead.
Integrated Architecture
After integration, Kyuubi routes each query based on a rule set: if the query meets StarRocks’ execution requirements it is sent to StarRocks; otherwise it falls back to Spark. This routing is transparent to end users.
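The routing decision can be sketched as a simple predicate. The two inputs below are hypothetical stand‑ins for illustration; 58.com's actual rule set is internal and not published.

```python
# Illustrative routing decision in the Kyuubi layer. The two predicates are
# hypothetical stand-ins for 58.com's real rule set.
def route(uses_unsupported_syntax: bool, reads_unsupported_format: bool) -> str:
    """Pick the engine for one query: StarRocks when it qualifies, else Spark."""
    if uses_unsupported_syntax or reads_unsupported_format:
        return "spark"      # transparent fallback for queries StarRocks cannot run
    return "starrocks"      # default path: the faster MPP engine
```

Because the check runs inside the gateway, users submit the same SQL either way and never need to know which engine executed it.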
Compatibility Challenges and Solutions
1. Syntax Parsing
StarRocks treats table aliases as case‑sensitive, unlike Spark, so the team patched the StarRocks source to relax this. They also added a SQLGlot plugin that automatically quotes reserved keywords (e.g., key, show, system, group) so that queries using them as identifiers run unchanged on StarRocks.
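The effect of such a plugin can be sketched as below. This is a toy, not the team's actual plugin: a real implementation walks the SQLGlot AST so that keywords in keyword position (e.g., the GROUP in GROUP BY) are left alone, which the naive regex here cannot distinguish.

```python
# Toy sketch of quoting reserved words used as identifiers before a query
# reaches StarRocks. RESERVED lists the example keywords from the article;
# a real plugin would use SQLGlot's parser rather than a regex, so that
# keywords in keyword position (e.g. GROUP BY) are not quoted by mistake.
import re

RESERVED = {"key", "show", "system", "group"}

def quote_reserved(sql: str) -> str:
    """Wrap bare reserved words in backticks; leave quoted identifiers alone."""
    def repl(m: re.Match) -> str:
        word = m.group(0)
        return f"`{word}`" if word.lower() in RESERVED else word
    return re.sub(r"(?<!`)\b\w+\b(?!`)", repl, sql)

print(quote_reserved("SELECT key, value FROM t"))
# → SELECT `key`, value FROM t
```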
2. Unsupported Syntax
Community‑edition StarRocks lacked support for LATERAL VIEW, GROUP BY … WITH CUBE, and GROUPING SETS. The engine was extended to parse and execute these constructs.
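For context on what the extended parser has to produce: GROUP BY … WITH CUBE is shorthand for one grouping per subset of the listed columns. A quick sketch of that expansion:

```python
# GROUP BY a, b WITH CUBE aggregates once per subset of {a, b}. This helper
# enumerates those grouping sets, which is what a parser extension must
# lower the syntax into.
from itertools import combinations

def cube(columns):
    """Enumerate the grouping sets implied by WITH CUBE, largest first."""
    sets = []
    for r in range(len(columns), -1, -1):
        sets.extend(combinations(columns, r))
    return sets

print(cube(["a", "b"]))
# → [('a', 'b'), ('a',), ('b',), ()]
```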
3. Metadata Binding
Hive catalog metadata caching caused stale results after a partition was re‑processed. The fix was to disable all Hive metadata caches; this did not increase MetaStore load, since Spark likewise queries the MetaStore directly rather than caching.
4. Query Optimization
Implicit type‑conversion rules differed between Spark and StarRocks, leading to inconsistent results. StarRocks’ conversion logic was aligned with Spark’s rules, eliminating mismatches.
5. Execution Stage
Compatibility issues were found with text+LZO Hive tables, Map‑type fields, custom field delimiters, temporary files, and empty‑file handling. Patches were contributed to StarRocks to support these scenarios.
6. Function Compatibility
More than 40 functions (date, string, regex, aggregation, etc.) were made compatible by:
Mapping Spark functions to existing StarRocks equivalents.
Implementing Java UDFs for functions not present in StarRocks.
Modifying StarRocks source for complex functions to match Spark semantics.
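The first tier amounts to a name‑rewrite table applied before submission. A minimal sketch — the mapping entries below are illustrative examples, not the migration's actual 40‑plus function table:

```python
# Sketch of the rename tier: map a Spark function name onto an existing
# StarRocks equivalent. The entries are hypothetical examples only.
SPARK_TO_STARROCKS = {
    "nvl": "ifnull",        # example of a 1:1 rename (assumed, for illustration)
}

def rewrite_call(fn_name: str) -> str:
    """Return the StarRocks name for a Spark function, or pass it through."""
    return SPARK_TO_STARROCKS.get(fn_name.lower(), fn_name.lower())
```

Functions with no equivalent fall through unchanged and are handled by the other two tiers (a Java UDF, or a source‑level change in StarRocks itself).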
Practical Experience
Performance
Data cache was enabled, and the HDFS client JAR was replaced with a custom build that avoids slow DataNode reads. After these changes, P99 query latency improved by roughly 25%.
Stability
Upgrading from StarRocks 3.0 to 3.2 and fixing CBO statistics generation for very wide tables reduced Frontend (FE) memory pressure and eliminated frequent Backend (BE) crashes. A specific optimization skipped unnecessary statistics queries for a 3,565‑column table, further lowering FE memory usage.
Usability
Java UDFs can be downloaded directly from HDFS, simplifying deployment.
SQL blacklist state is persisted across nodes, reducing operational cost.
StarRocks on Cloud
To meet cost‑reduction goals, the cluster was containerized on the internal 58 Cloud platform. FE nodes remain on physical machines because they require large memory for metadata; CN (compute) nodes run in containers limited to 5 CPU and 15 GB RAM.
Key mitigations for OOM in containers:
Resource‑group isolation and query queue configuration to control concurrency.
Enabling intermediate‑result spill‑to‑disk.
Manually reducing CN thread‑count parameters to match container CPU limits.
Setting JAVA_OPTS to cap JVM heap size for CN processes.
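Two of the mitigations above map onto StarRocks‑level settings. A hedged sketch follows: the variable and property names track StarRocks 3.x documentation but differ between versions, so verify against your release, and all values and the classifier are illustrative only.

```sql
-- Spill intermediate results to disk instead of holding them in CN memory.
SET GLOBAL enable_spill = true;

-- Resource-group isolation: cap what ad-hoc queries may consume inside a
-- 5-CPU / 15 GB container. Property names vary across StarRocks versions;
-- check your version's CREATE RESOURCE GROUP reference.
CREATE RESOURCE GROUP adhoc
TO (user = 'exploration_platform')        -- hypothetical classifier
WITH (
    'cpu_core_limit'    = '5',
    'mem_limit'         = '80%',
    'concurrency_limit' = '10'            -- queue further queries
);
```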
Overall Gains
~65 k SQL statements migrated daily with >98 % success rate.
Average query time reduced to 3.3 s (P90 ≈ 5 s, P99 ≈ 52 s), >20× faster than the pre‑migration baseline.
Continuous iteration is ongoing to resolve remaining Spark‑StarRocks incompatibilities.
Future Directions
Research is underway to automatically extract frequently used sub‑queries and materialize them as StarRocks materialized views. This would eliminate redundant computation and further support cost‑reduction objectives.
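The core of that idea is workload mining: fingerprint the sub‑queries seen across the daily workload and flag the hottest ones as materialization candidates. A toy sketch — the normalization here is deliberately crude (lowercase plus whitespace collapse); a real extractor would hash normalized ASTs:

```python
# Toy sketch of mining frequent sub-queries as materialized-view candidates.
# Real extraction would normalize and hash query ASTs, not raw text.
import re
from collections import Counter

def fingerprint(subquery: str) -> str:
    """Crude normalization: lowercase and collapse whitespace."""
    return re.sub(r"\s+", " ", subquery.strip().lower())

def candidates(subqueries, min_count=2):
    """Return normalized sub-queries seen at least min_count times."""
    freq = Counter(fingerprint(q) for q in subqueries)
    return [q for q, n in freq.most_common() if n >= min_count]

print(candidates([
    "SELECT city, COUNT(*) FROM ads GROUP BY city",
    "select city,  count(*) from ads group by city",
    "SELECT * FROM jobs",
]))
# → ['select city, count(*) from ads group by city']
```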
StarRocks
StarRocks is an open‑source project under the Linux Foundation, focused on building a high‑performance, scalable analytical database that enables enterprises to create an efficient, unified lake‑house paradigm. It is widely used across many industries worldwide, helping numerous companies enhance their data analytics capabilities.