Design and Practice of 360’s Multi‑Data‑Source Unified SQL Query Engine

The article presents 360’s challenges with heterogeneous, high‑volume data sources, explains the design of a unified federated SQL engine called QuickSQL that leverages Apache Calcite, Spark, Flink and other back‑ends, and evaluates its performance and future development directions.

360 Tech Engineering

Aug 27, 2019

With the rapid growth of business lines at 360, data is generated at petabyte scale by many heterogeneous sources (MySQL, Hive, Elasticsearch, etc.). Analysis becomes slow and costly because analysts must manually join and transform data across isolated storage systems.

Two analyst personas emerge: a technically‑oriented group proficient in Spark/Flink and a business‑oriented group skilled in statistical modeling, both facing difficulty when the underlying platform cannot keep pace with evolving data‑processing needs.

To address these pain points, 360 built a unified SQL façade named QuickSQL. The engine uses Apache Calcite as its top-level parser to capture user intent. It then divides the query at split points, such as cross-source joins or functions that a source does not support, and dynamically routes each sub-query to the most suitable execution engine (Hive-Spark, Flink, MySQL, Elasticsearch).
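The routing idea can be illustrated with a minimal sketch. It is not QuickSQL's actual code: the catalog, the capability table, the function `choose_engine`, and the choice of Spark as the fallback engine are all assumptions made for illustration. A query that touches a single source and uses only functions that source supports is pushed down to it. A cross-source join or an unsupported function acts as a split point and sends the query to the general-purpose engine.

```python
# Hypothetical catalog: logical table name -> storage source (illustrative only).
CATALOG = {
    "orders": "mysql",
    "clicks": "hive",
    "logs": "elasticsearch",
}

# Hypothetical capability table: functions each source is assumed to run natively.
NATIVE_FUNCTIONS = {
    "mysql": {"count", "sum", "avg"},
    "hive": {"count", "sum", "avg", "percentile"},
    "elasticsearch": {"count", "sum", "avg"},
}

# Assumed general-purpose engine used when a query cannot run on one source.
FALLBACK_ENGINE = "spark"


def choose_engine(tables, functions=()):
    """Pick an execution engine for a query touching `tables` and calling `functions`."""
    if not tables:
        raise ValueError("a query must reference at least one table")

    sources = {CATALOG[t] for t in tables}

    # Split point 1: tables live in different sources (cross-source join).
    if len(sources) > 1:
        return FALLBACK_ENGINE

    (source,) = sources

    # Split point 2: the query uses a function the source cannot evaluate.
    unsupported = {f.lower() for f in functions} - NATIVE_FUNCTIONS[source]
    if unsupported:
        return FALLBACK_ENGINE

    # Otherwise the whole query can be pushed down to the source itself.
    return source
```

For example, `choose_engine(["orders"])` selects the MySQL source directly, while `choose_engine(["orders", "clicks"])` falls back to the general-purpose engine because the tables live in different systems.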

QuickSQL’s architecture consists of three layers: a parsing layer that interacts with a metadata store and performs permission checks, an interpretation layer that translates logical plans into engine‑specific dialects, and a runtime layer that pushes down aggregations and extracts results. The system supports both CLI and programmatic interfaces.
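A rough sketch of how the first two layers might fit together follows. Everything in it is hypothetical: the `LogicalPlan` type, the in-memory `METADATA` and `GRANTS` tables, and the functions `parse`, `interpret`, and `compile_query` are stand-ins, not QuickSQL's API. The parsing step resolves a table in the metadata store and checks permissions. The interpretation step renders the same logical plan as a MySQL statement or as an Elasticsearch search request.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalPlan:
    """A deliberately tiny logical plan: project columns from one table with a limit."""
    table: str
    columns: tuple
    limit: int


# Hypothetical metadata store: logical table -> source type and physical name.
METADATA = {
    "orders": {"source": "mysql", "physical": "shop.orders"},
    "logs": {"source": "elasticsearch", "physical": "app-logs"},
}

# Hypothetical permission table: user -> tables the user may read.
GRANTS = {"alice": {"orders", "logs"}, "bob": {"orders"}}


def parse(user, plan):
    """Parsing layer: resolve the table in the metadata store and check permissions."""
    if plan.table not in METADATA:
        raise KeyError(f"unknown table: {plan.table}")
    if plan.table not in GRANTS.get(user, set()):
        raise PermissionError(f"{user} may not read {plan.table}")
    return METADATA[plan.table]


def interpret(plan, meta):
    """Interpretation layer: render the logical plan in the target engine's dialect."""
    if meta["source"] == "mysql":
        cols = ", ".join(plan.columns)
        return f"SELECT {cols} FROM {meta['physical']} LIMIT {plan.limit}"
    if meta["source"] == "elasticsearch":
        # Elasticsearch takes a JSON search body rather than a SQL string.
        return {
            "index": meta["physical"],
            "body": {
                "_source": list(plan.columns),
                "size": plan.limit,
                "query": {"match_all": {}},
            },
        }
    raise ValueError(f"no dialect for source: {meta['source']}")


def compile_query(user, plan):
    """Run the parsing and interpretation layers; the runtime layer would execute the result."""
    meta = parse(user, plan)
    return meta["source"], interpret(plan, meta)
```

Keeping the logical plan independent of any dialect is what lets one user query target several back-ends; only `interpret` needs to know engine-specific syntax.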

In practice, the engine was applied to the QNote interactive analysis platform, which provides a unified query service for Hive, MySQL, Elasticsearch and other sources, enabling cross‑source joins, CSV imports, and a three‑tier service architecture (query service, hybrid compute layer, runtime layer).
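A cross-source join of this kind can be sketched with two in-memory SQLite databases standing in for separate systems such as MySQL and Hive. This is an illustration under stated assumptions, not the QNote implementation: the tables, the data, and the function `federated_totals` are invented. Each source receives only the part of the query it can evaluate itself (a filter on one side, an aggregation on the other), and the small intermediate results are joined in a separate step that plays the role of the hybrid compute layer.

```python
import sqlite3


def make_source(ddl, insert_sql, rows):
    """Create an isolated in-memory database that stands in for one data source."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    return conn


# Stand-in for a relational source holding user profiles (invented data).
users_db = make_source(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, region TEXT)",
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "ann", "north"), (2, "bo", "south"), (3, "cy", "north")],
)

# Stand-in for a warehouse source holding event records (invented data).
events_db = make_source(
    "CREATE TABLE events (user_id INTEGER, amount REAL)",
    "INSERT INTO events VALUES (?, ?)",
    [(1, 10.0), (1, 5.0), (2, 7.5), (3, 2.0)],
)


def federated_totals(region):
    """Join users and events across the two sources, pushing work down to each."""
    # Push-down 1: the region filter runs inside the users source.
    users = users_db.execute(
        "SELECT id, name FROM users WHERE region = ?", (region,)
    ).fetchall()

    # Push-down 2: the aggregation runs inside the events source.
    totals = dict(
        events_db.execute(
            "SELECT user_id, SUM(amount) FROM events GROUP BY user_id"
        ).fetchall()
    )

    # Join step: a hash join over the two reduced result sets.
    return sorted((name, totals[uid]) for uid, name in users if uid in totals)


print(federated_totals("north"))  # [('ann', 15.0), ('cy', 2.0)]
```

Because the filter and the aggregation run at the sources, only a handful of rows cross the boundary between systems, which is the effect that push-down execution aims for.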

Performance tests show that QuickSQL incurs a modest parsing overhead (≈0.5 s) but achieves comparable or better query latency than native Hive, MySQL, or Elasticsearch queries, especially in mixed‑source scenarios where push‑down execution leverages source‑side indexes.

Future work includes adding streaming SQL support, extending the SQL grammar with UDFs, integrating additional sources such as MongoDB and Druid, and providing richer SDKs for application integration.

QuickSQL is open source; the code is available at https://github.com/Qihoo360/Quicksql.