Big Data 12 min read

How HuoLala Accelerated Ad‑hoc Queries with a Hybrid Offline Engine

This article describes how HuoLala identified slow ad‑hoc query performance in its Hive‑on‑Tez stack, surveyed comparable industry solutions, and built a multi‑engine hybrid offline service that dramatically improves query latency, outlines its architecture, key design decisions, production impact, and future roadmap.

Huolala Tech
Huolala Tech
Huolala Tech
How HuoLala Accelerated Ad‑hoc Queries with a Hybrid Offline Engine

1. Background

HuoLala, a data‑intelligent logistics platform, found that its analysts and developers frequently performed ad‑hoc queries using Hive on Tez, which users reported as slow. To improve SQL query efficiency, the company introduced a hybrid offline engine service that combines multiple query engines.

2. Industry Survey

Similar products in the market include Tencent SuperSQL, Alibaba DLA, Huawei HetuEngine, 360 QuickSQL, and NetEase Kyuubi. They differ in engine composition, open‑source status, SQL standard support, data source connectivity, federation capability, and cloud‑native design.

3. Design and Implementation

3.1 Functional Architecture

The system aims to provide multi‑engine intelligent routing, high stability, scalability, ease of use, and correctness. A Master/Worker architecture is adopted, with HA Master for stability, elastic Workers for scalability, and SDK/JDBC/CLI interfaces for easy client integration. A custom data‑reconciliation tool ensures result correctness across engines.

3.2 System Architecture

Roles include Client, Master, and Worker, communicating via gRPC. The Master uses Zookeeper for high‑availability, persisting service state and handling resource lifecycle, load balancing, and node registration.

Clients (SDK/JDBC/CLI) obtain the Master address, request resources, submit SQL to Workers, and receive results.

Workers manage query execution through several components:

QueryManager tracks query metadata and status.

Dispatcher performs feature analysis (e.g., query type, data volume, operator count) to select the most suitable engine.

SQL Converter rewrites the original SQL to the target engine’s dialect.

QueryExecutor runs the query on the chosen engine, collects logs, reports status, and returns results.

If execution fails, QueryManager falls back to a secondary engine (typically Hive) to ensure completion.

3.2.4 Query Lifecycle

The query state machine includes six states: INITIALIZED, RUNNING, FINISHED, CLOSED, TIMEOUT, and ERROR, governing the progression and termination of each query.

3.3 Other Observations

Client behavior is driven by query state, requiring careful handling to avoid performance overhead.

The HBO service caches routing decisions for frequently repeated queries, reducing dispatch latency.

Engine compatibility challenges (e.g., Hive’s batch output vs. Presto’s pipeline streaming) necessitate fallback mechanisms and dynamic result handling.

4. Production Impact

Since its launch in January 2022, the hybrid engine has delivered significant latency reductions: the 65th percentile query time improved by 82% (from 50 s to 8 s) and the 75th percentile by 64.1% (from 78 s to 28 s).

5. Future Plans

Increase query efficiency by refining routing rules to direct more queries to Presto, especially complex ones.

Add Spark engine support to handle ETL workloads within the same service.

Enable multi‑data‑source federation to break data silos across offline warehouses, OLAP, and online stores.

distributed systemsBig DataSQL Routingad-hoc queryhybrid engine
Huolala Tech
Written by

Huolala Tech

Technology reshapes logistics

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.