Big Data 12 min read

QuickSQL: 360’s Unified Multi-Source Query Engine Explained

This article outlines how 360’s data center built QuickSQL, a federated SQL engine that unifies queries across heterogeneous sources such as Hive, MySQL, and Elasticsearch, detailing the business challenges, architectural design, performance benchmarks, and future roadmap for multi‑source data analysis.

360 Zhihui Cloud Developer

Sep 3, 2019

QuickSQL: 360’s Unified Multi-Source Query Engine Explained

1. 360 Data Analysis Scenario

360 has many business lines beyond the well‑known PC security guard, browser, search, and mobile apps, covering search, security, video feed, games, finance, and IoT. These generate over 300 TB of new data points daily, amounting to nearly 10 billion records and covering more than 1 billion users. Data is stored across various media, requiring extraction and merging for analysis.

For example, a business line stores small channel activity data in MySQL while real‑time download data lands in HDFS; joining them is cumbersome.

2. Problems and Challenges

Data analysts split into two tracks: a technical track (Spark, Flink, Python, R) that excels at fast data cleaning, and a business track (regression, ML algorithms) that extracts insights but may lack deep technical knowledge. Both face high learning and analysis costs when data sources are heterogeneous and isolated.

Without an evolving data‑processing platform, business slows down, and product iteration suffers.

Therefore a unified query paradigm is needed to hide underlying source details and reduce cost and complexity.

3. Solution and Approach

We adopt a dynamic scheduling idea, using Apache Calcite as the top‑level parser to understand user intent, then route the query to the appropriate engine.

QuickSQL is our open‑source federated query façade that provides a standard SQL interface, logical‑tree optimization, and runtime engine selection, fully decoupled from underlying engines.

The architecture consists of three layers: parsing, interpretation, and runtime.

Parsing layer validates statements, interacts with the metadata store, and performs table/field‑level permission checks.

Interpretation layer translates dialects and routes queries; for single‑source queries it uses JDBC, while mixed queries leverage Spark or Flink as distributed compute engines.

Runtime layer pushes down statements for pre‑aggregation and extraction.

During planning, the logical plan is split at cut points (cross‑source joins/sub‑queries or unsupported operations) into a plan forest, each subtree converted for its target engine and executed via temporary tables.

4. Engine Practice Case

QNote is an interactive analysis platform built on QuickSQL, allowing users to query Hive, MySQL, Elasticsearch, etc., perform cross‑source joins, and import CSV for further analysis.

QNote architecture includes three layers: query service, hybrid compute, and runtime.

Query Service Layer : provides user‑side functions such as query management, monitoring dashboard, editor, subscription, and scheduling.

Hybrid Compute Layer : consists of an SDK for the query service and an executor that handles resource scheduling, unified parsing, and state synchronization.

Runtime Layer : connects to all federated data sources, handling push‑down execution and result extraction.

The initial design followed a Zeppelin‑style Thrift RPC model with query service and executor proxy balanced by Zookeeper.

Issues such as deployment difficulty, resource leakage, low utilization, poor parallelism, and limited scalability prompted a redesign.

After refactoring, the system decouples service and compute via a message queue, enabling fully parallel queries, high availability, and better resource scheduling. Both service and compute register with Eureka for health checks.

5. Query Performance Comparison

Benchmarks show QuickSQL adds about 0.5 s parsing overhead for MySQL and Elasticsearch queries, but outperforms native Hive queries because it uses Spark‑Hive as the execution engine.

In mixed‑source join scenarios, QuickSQL leverages source indexes for push‑down execution, achieving significantly better performance than pure Hive joins on the same data scale.

6. Future Directions

Integrate streaming SQL to unify batch and stream processing.

Enrich SQL syntax with user‑defined functions (UDFs).

Add support for MongoDB, Druid, and other common data sources.

Provide richer application integration methods to accelerate adoption.

QuickSQL is open‑source at https://github.com/Qihoo360/Quicksql.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Big Data Data Integration SQL Engine performance benchmarking Federated Query

Written by

360 Zhihui Cloud Developer

360 Zhihui Cloud is an enterprise open service platform that aims to "aggregate data value and empower an intelligent future," leveraging 360's extensive product and technology resources to deliver platform services to customers.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.