
Integrating Apache Kylin with MLSQL for In‑Place ETL and Analytics

The article explains how Apache Kylin and MLSQL complement each other, covering Kylin’s OLAP strengths and MLSQL’s data‑processing and AI capabilities. It then walks through a low‑code integration that lets users run ETL directly within Kylin’s interface, and outlines future deep‑link scenarios.

Big Data Technology Architecture

At a recent Apache Kylin + MLSQL meetup, speaker William Zhu demonstrated how to run large‑scale data‑processing pipelines inside Kylin without leaving the platform. The talk highlighted how Kylin’s high‑concurrency OLAP engine and MLSQL’s ETL‑ and AI‑oriented language complement each other.

Kylin excels at BI workloads with fast, sub‑second queries and a rich ecosystem of BI tools, but lacks strong ETL capabilities. MLSQL, built on Spark, fills this gap by offering powerful data‑processing and machine‑learning functions.

The integration is achieved by adding a simple annotation such as --%mlsql followed by an MLSQL engine address, allowing Kylin to dispatch ETL scripts to MLSQL while preserving its native query flow.
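The dispatch decision described above can be sketched in a few lines. This is only an illustration, not Kylin's actual implementation: it assumes the annotation sits on the first non-blank line and that the engine address follows the `--%mlsql` marker on that same line. The names `Dispatch` and `parse_annotation` are hypothetical.

```python
from typing import NamedTuple, Optional


class Dispatch(NamedTuple):
    engine_url: str  # address read from the annotation line
    script: str      # remaining script body to forward to MLSQL


# Marker named in the talk; the exact layout after it is an assumption.
ANNOTATION = "--%mlsql"


def parse_annotation(text: str) -> Optional[Dispatch]:
    """Return dispatch info if the first non-blank line is an MLSQL annotation."""
    lines = text.strip().splitlines()
    if not lines:
        return None
    head = lines[0].strip()
    if not head.startswith(ANNOTATION):
        # No annotation: the statement stays on Kylin's native query path.
        return None
    engine_url = head[len(ANNOTATION):].strip()
    if not engine_url:
        raise ValueError("annotation is missing the MLSQL engine address")
    return Dispatch(engine_url, "\n".join(lines[1:]).strip())
```

A script without the annotation yields `None`, so ordinary queries are untouched; only annotated scripts would be forwarded to the address given in the annotation.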

A demo showed a CSV file being loaded into Hive via MLSQL so that Kylin can then build cubes, with only a few code modifications. This illustrates a shallow pre‑link scenario, in which MLSQL handles the ETL stage before Kylin’s cube construction.
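The demo's script is not reproduced here, so the following is a hedged sketch of what the ETL step might look like. It builds a two-statement MLSQL script using the `load ... as` and `save overwrite ... as` statement forms; the helper name `csv_to_hive_script`, the view name, and the example path and table are all hypothetical.

```python
def csv_to_hive_script(csv_path: str, hive_table: str, view: str = "raw_csv") -> str:
    """Build an MLSQL script that loads a CSV file and saves it as a Hive table."""
    # Reject characters that would break out of the backtick-quoted identifiers.
    for name, value in (("csv_path", csv_path), ("hive_table", hive_table), ("view", view)):
        if "`" in value or ";" in value:
            raise ValueError(f"{name} must not contain backticks or semicolons")
    return (
        # Statement 1: read the CSV (with a header row) into a temporary view.
        f'load csv.`{csv_path}` where header="true" as {view};\n'
        # Statement 2: persist that view as a Hive table for Kylin to model.
        f"save overwrite {view} as hive.`{hive_table}`;"
    )


if __name__ == "__main__":
    print(csv_to_hive_script("/tmp/sales.csv", "default.sales"))
```

Once the table exists in Hive, cube design and building proceed in Kylin as usual, which is what makes this a shallow integration.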

The article also describes deeper integration possibilities, including pushing cube‑building tasks to MLSQL (deep pre‑link) and post‑link scenarios where MLSQL consumes Kylin’s results for further joins or AI processing.
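To keep the three scenarios apart, it can help to write down which engine owns each stage. The mapping below is a reading of the description above, not a published specification; the stage labels and the `engine_for` helper are chosen here for illustration.

```python
from enum import Enum


class Engine(Enum):
    KYLIN = "Kylin"
    MLSQL = "MLSQL"


# Ordered (stage, engine) pairs per scenario; stage names are illustrative labels.
SCENARIOS = {
    "shallow pre-link": [
        ("etl", Engine.MLSQL),
        ("cube build", Engine.KYLIN),
        ("olap query", Engine.KYLIN),
    ],
    "deep pre-link": [
        ("etl", Engine.MLSQL),
        ("cube build", Engine.MLSQL),  # cube construction pushed to MLSQL
        ("olap query", Engine.KYLIN),
    ],
    "post-link": [
        ("olap query", Engine.KYLIN),
        ("join / AI processing", Engine.MLSQL),  # MLSQL consumes Kylin's results
    ],
}


def engine_for(scenario: str, stage: str) -> Engine:
    """Look up which engine runs a given stage in a given scenario."""
    for name, engine in SCENARIOS[scenario]:
        if name == stage:
            return engine
    raise KeyError(f"{stage!r} is not a stage of {scenario!r}")
```

The only difference between the two pre-link variants is who builds the cube, while post-link reverses the direction of data flow.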

Future plans involve extending MLSQL to support cube construction, tighter API coupling, and offering plug‑in mechanisms so users can execute full ETL pipelines within Kylin without managing separate infrastructure.
