How 58.com Achieved 20× Faster Real‑Time Queries by Migrating to StarRocks
58.com integrated the StarRocks analytical engine into its data‑exploration platform as a replacement for Spark/Hive, aiming to overcome minute‑level query latency. After a year‑long migration the team achieved a query speedup of more than 20×, a success rate above 98%, and resolved numerous Spark‑StarRocks compatibility issues — all while moving the service to the cloud.
Background
58.com’s Data Exploration Platform processes more than 100,000 SQL statements per day on data stored in HDFS. The original execution engine combined Spark and Hive, resulting in minute‑level query latency that could not satisfy the growing demand for fast ad‑hoc analysis.
Why StarRocks
Provides a unified data‑lake analysis capability through a simple Hive catalog.
Massively Parallel Processing (MPP) with vectorized execution, delivering >10× speedup in internal POC tests.
Straightforward architecture and low operational overhead.
Integrated Architecture
After integration, Kyuubi routes each query based on a rule set: if the query meets StarRocks’ execution requirements it is sent to StarRocks; otherwise it falls back to Spark. This routing is transparent to end users.
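The routing decision can be sketched as a simple predicate. The two inputs below are hypothetical stand‑ins for illustration; 58.com's actual rule set is internal and not published.

```python
# Illustrative routing decision in the Kyuubi layer. The two predicates are
# hypothetical stand-ins for 58.com's real rule set.
def route(uses_unsupported_syntax: bool, reads_unsupported_format: bool) -> str:
    """Pick the engine for one query: StarRocks when it qualifies, else Spark."""
    if uses_unsupported_syntax or reads_unsupported_format:
        return "spark"      # transparent fallback for queries StarRocks cannot run
    return "starrocks"      # default path: the faster MPP engine
```

Because the check runs inside the gateway, users submit the same SQL either way and never need to know which engine executed it.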
Compatibility Challenges and Solutions
1. Syntax Parsing
StarRocks treats table aliases as case‑sensitive, unlike Spark, so the team patched the StarRocks source to relax this. They also added a SQLGlot plugin that automatically quotes reserved keywords (e.g., key, show, system, group) so that queries using them as identifiers run unchanged on StarRocks.
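The effect of such a plugin can be sketched as below. This is a toy, not the team's actual plugin: a real implementation walks the SQLGlot AST so that keywords in keyword position (e.g., the GROUP in GROUP BY) are left alone, which the naive regex here cannot distinguish.

```python
# Toy sketch of quoting reserved words used as identifiers before a query
# reaches StarRocks. RESERVED lists the example keywords from the article;
# a real plugin would use SQLGlot's parser rather than a regex, so that
# keywords in keyword position (e.g. GROUP BY) are not quoted by mistake.
import re

RESERVED = {"key", "show", "system", "group"}

def quote_reserved(sql: str) -> str:
    """Wrap bare reserved words in backticks; leave quoted identifiers alone."""
    def repl(m: re.Match) -> str:
        word = m.group(0)
        return f"`{word}`" if word.lower() in RESERVED else word
    return re.sub(r"(?<!`)\b\w+\b(?!`)", repl, sql)

print(quote_reserved("SELECT key, value FROM t"))
# → SELECT `key`, value FROM t
```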
2. Unsupported Syntax
Community‑edition StarRocks lacked support for LATERAL VIEW, GROUP BY … WITH CUBE, and GROUPING SETS. The engine was extended to parse and execute these constructs.
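For context on what the extended parser has to produce: GROUP BY … WITH CUBE is shorthand for one grouping per subset of the listed columns. A quick sketch of that expansion:

```python
# GROUP BY a, b WITH CUBE aggregates once per subset of {a, b}. This helper
# enumerates those grouping sets, which is what a parser extension must
# lower the syntax into.
from itertools import combinations

def cube(columns):
    """Enumerate the grouping sets implied by WITH CUBE, largest first."""
    sets = []
    for r in range(len(columns), -1, -1):
        sets.extend(combinations(columns, r))
    return sets

print(cube(["a", "b"]))
# → [('a', 'b'), ('a',), ('b',), ()]
```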
3. Metadata Binding
Hive catalog metadata caching caused stale results after a partition was re‑processed. The fix was to disable all Hive metadata caches; this did not increase MetaStore load, since Spark likewise queries the MetaStore directly rather than caching.
4. Query Optimization
Implicit type‑conversion rules differed between Spark and StarRocks, leading to inconsistent results. StarRocks’ conversion logic was aligned with Spark’s rules, eliminating mismatches.
5. Execution Stage
Compatibility issues were found with text+LZO Hive tables, Map‑type fields, custom field delimiters, temporary files, and empty‑file handling. Patches were contributed to StarRocks to support these scenarios.
6. Function Compatibility
More than 40 functions (date, string, regex, aggregation, etc.) were made compatible by:
Mapping Spark functions to existing StarRocks equivalents.
Implementing Java UDFs for functions not present in StarRocks.
Modifying StarRocks source for complex functions to match Spark semantics.
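The first tier amounts to a name‑rewrite table applied before submission. A minimal sketch — the mapping entries below are illustrative examples, not the migration's actual 40‑plus function table:

```python
# Sketch of the rename tier: map a Spark function name onto an existing
# StarRocks equivalent. The entries are hypothetical examples only.
SPARK_TO_STARROCKS = {
    "nvl": "ifnull",        # example of a 1:1 rename (assumed, for illustration)
}

def rewrite_call(fn_name: str) -> str:
    """Return the StarRocks name for a Spark function, or pass it through."""
    return SPARK_TO_STARROCKS.get(fn_name.lower(), fn_name.lower())
```

Functions with no equivalent fall through unchanged and are handled by the other two tiers (a Java UDF, or a source‑level change in StarRocks itself).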
Practical Experience
Performance
Data cache was enabled, and the HDFS client JAR was replaced with a custom build that avoids slow DataNode reads. After these changes, P99 query latency improved by roughly 25%.
Stability
Upgrading from StarRocks 3.0 to 3.2 and fixing CBO statistics generation for very wide tables reduced Frontend (FE) memory pressure and eliminated frequent Backend (BE) crashes. A specific optimization skipped unnecessary statistics queries for a 3,565‑column table, further lowering FE memory usage.
Usability
Java UDFs can be downloaded directly from HDFS, simplifying deployment.
SQL blacklist state is persisted across nodes, reducing operational cost.
StarRocks on Cloud
To meet cost‑reduction goals, the cluster was containerized on the internal 58 Cloud platform. FE nodes remain on physical machines because they require large memory for metadata; CN (compute) nodes run in containers limited to 5 CPU and 15 GB RAM.
Key mitigations for OOM in containers:
Resource‑group isolation and query queue configuration to control concurrency.
Enabling intermediate‑result spill‑to‑disk.
Manually reducing CN thread‑count parameters to match container CPU limits.
Setting JAVA_OPTS to cap JVM heap size for CN processes.
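Two of the mitigations above map onto StarRocks‑level settings. A hedged sketch follows: the variable and property names track StarRocks 3.x documentation but differ between versions, so verify against your release, and all values and the classifier are illustrative only.

```sql
-- Spill intermediate results to disk instead of holding them in CN memory.
SET GLOBAL enable_spill = true;

-- Resource-group isolation: cap what ad-hoc queries may consume inside a
-- 5-CPU / 15 GB container. Property names vary across StarRocks versions;
-- check your version's CREATE RESOURCE GROUP reference.
CREATE RESOURCE GROUP adhoc
TO (user = 'exploration_platform')        -- hypothetical classifier
WITH (
    'cpu_core_limit'    = '5',
    'mem_limit'         = '80%',
    'concurrency_limit' = '10'            -- queue further queries
);
```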
Overall Gains
~65 k SQL statements migrated daily with >98 % success rate.
Average query time reduced to 3.3 s (P90 ≈ 5 s, P99 ≈ 52 s), >20× faster than the pre‑migration baseline.
Continuous iteration is ongoing to resolve remaining Spark‑StarRocks incompatibilities.
Future Directions
Research is underway to automatically extract frequently used sub‑queries and materialize them as StarRocks materialized views. This would eliminate redundant computation and further support cost‑reduction objectives.
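The core of that idea is workload mining: fingerprint the sub‑queries seen across the daily workload and flag the hottest ones as materialization candidates. A toy sketch — the normalization here is deliberately crude (lowercase plus whitespace collapse); a real extractor would hash normalized ASTs:

```python
# Toy sketch of mining frequent sub-queries as materialized-view candidates.
# Real extraction would normalize and hash query ASTs, not raw text.
import re
from collections import Counter

def fingerprint(subquery: str) -> str:
    """Crude normalization: lowercase and collapse whitespace."""
    return re.sub(r"\s+", " ", subquery.strip().lower())

def candidates(subqueries, min_count=2):
    """Return normalized sub-queries seen at least min_count times."""
    freq = Counter(fingerprint(q) for q in subqueries)
    return [q for q, n in freq.most_common() if n >= min_count]

print(candidates([
    "SELECT city, COUNT(*) FROM ads GROUP BY city",
    "select city,  count(*) from ads group by city",
    "SELECT * FROM jobs",
]))
# → ['select city, count(*) from ads group by city']
```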
StarRocks
StarRocks is an open‑source project under the Linux Foundation, focused on building a high‑performance, scalable analytical database that enables enterprises to create an efficient, unified lake‑house paradigm. It is widely used across many industries worldwide, helping numerous companies enhance their data analytics capabilities.