Introduction to Apache Kylin: A Fast Big Data OLAP Engine

Apache Kylin is an open-source, Hadoop-based OLAP engine that answers multi-dimensional SQL queries on massive datasets with sub-second latency. Its features include cube pre-computation, real-time analytics, and BI tool integration. The latest release, v2.6.4, adds numerous fixes and improvements.

Big Data Technology Architecture

Oct 15, 2019

Apache Kylin v2.6.4 has just been released with many bug fixes and improvements. The project has evolved rapidly since version 1.5.3 three years ago and has become a widely used OLAP engine in the Hadoop ecosystem.

Apache Kylin, originally open‑sourced by eBay, provides a SQL query interface and multi‑dimensional analysis on top of Hadoop/Spark, delivering sub‑second query latency for massive datasets through pre‑computation and cube building.
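To make the idea of pre-computation concrete, here is a minimal Python sketch of a cube. It is a toy illustration only, not Kylin's implementation: the rows, dimension names, and the helpers `build_cube` and `query` are hypothetical. It aggregates a measure once for every combination of dimensions (each combination is one cuboid), so that a later query becomes a dictionary lookup instead of a scan over the raw rows.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (country, category, year, sales).
ROWS = [
    ("US", "books", 2018, 10),
    ("US", "books", 2019, 20),
    ("US", "toys", 2019, 5),
    ("CN", "books", 2019, 7),
    ("CN", "toys", 2018, 3),
]
DIMENSIONS = ("country", "category", "year")


def build_cube(rows, dimensions):
    """Pre-aggregate SUM(sales) for every subset of dimensions (every cuboid)."""
    cube = {}
    indexes = range(len(dimensions))
    for size in range(len(dimensions) + 1):
        for combo in combinations(indexes, size):
            cuboid = defaultdict(int)
            for row in rows:
                # Group by the values of the dimensions in this cuboid.
                key = tuple(row[i] for i in combo)
                cuboid[key] += row[-1]
            cube[tuple(dimensions[i] for i in combo)] = dict(cuboid)
    return cube


def query(cube, dimensions, **filters):
    """Answer SUM(sales) for equality filters by looking up one cuboid."""
    group = tuple(d for d in dimensions if d in filters)
    key = tuple(filters[d] for d in group)
    return cube[group].get(key, 0)


if __name__ == "__main__":
    cube = build_cube(ROWS, DIMENSIONS)
    print(len(cube))                                          # 8 cuboids for 3 dimensions
    print(query(cube, DIMENSIONS, country="US", year=2019))   # 25
```

The trade-off is visible even at this scale: three dimensions already produce eight cuboids, so build time and storage grow with the number of dimension combinations in exchange for constant-time lookups at query time.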

Pre-computed cube data is stored in HBase. Source data for cube building can come from Hive, Kafka, or JDBC sources (JDBC has been available since v2.3.0).

Key features and characteristics include:

Ultra‑fast OLAP engine that reduces query latency on hundred‑billion‑row datasets.

ANSI‑SQL query support with a comprehensive SQL interface.

Interactive query capability with sub‑second response times.

Multi-dimensional cubes: users define a data model, and Kylin pre-builds cubes over datasets exceeding a hundred billion rows.

Real‑time OLAP: data can be processed as it arrives, enabling multi‑dimensional analysis with second‑level latency.

Seamless integration with BI tools such as Tableau and Power BI.
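The SQL interface in the list above can also be reached over HTTP. The sketch below builds a request for Kylin's REST query endpoint using only the Python standard library. Several details are assumptions, recalled from the Kylin 2.x documentation rather than stated in this article, and should be checked against the official docs: the `/kylin/api/query` path, port 7070, the default `ADMIN`/`KYLIN` credentials, and the `learn_kylin` sample project with its `kylin_sales` table. The helper names `build_query_request` and `run_query` are hypothetical.

```python
import base64
import json
import urllib.request


def build_query_request(base_url, user, password, project, sql, limit=100):
    """Build (but do not send) a POST request for the assumed query endpoint."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    body = json.dumps({"sql": sql, "project": project, "offset": 0, "limit": limit})
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/kylin/api/query",  # assumed endpoint path
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",  # HTTP Basic authentication
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_query(request, timeout=30):
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    request = build_query_request(
        "http://localhost:7070",  # assumed default host and port
        "ADMIN",                  # assumed default user
        "KYLIN",                  # assumed default password
        "learn_kylin",            # assumed sample project
        "select part_dt, sum(price) from kylin_sales group by part_dt",
    )
    print(request.get_method(), request.full_url)
    # run_query(request) would send it; left out so the sketch runs offline.
```

Building the request separately from sending it keeps the example runnable without a cluster and makes the payload easy to inspect before pointing it at a real deployment.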

For further learning, the official documentation (including installation, cube building tutorials, and tool integration) is recommended:

http://kylin.apache.org/docs/

A Chinese version of the site is also available: http://kylin.apache.org/cn/docs/

The source code is hosted on GitHub: https://github.com/apache/kylin

Mailing lists for developers and users are [email protected] and [email protected]; subscriptions can be made by emailing [email protected] or [email protected].
