Optimizing Apache Kylin for Meituan's Sales OLAP: From MapReduce to Spark and Resource Tuning
This article presents a detailed case study of how Meituan's in‑store dining sales team identified severe efficiency issues in their Apache Kylin‑based OLAP system, dissected the construction process, and applied a step‑by‑step optimization roadmap—including engine migration, dimension pruning, resource configuration, and Spark‑based layered building—to boost query performance and achieve near‑perfect SLA.
Background
Since 2016 Meituan's in‑store dining sales platform ("Qingtian") has used Apache Kylin as its OLAP engine. Rapid business growth by 2020 caused severe construction and query inefficiencies, threatening data‑driven decision making.
Problem & Goals
The sales system required multi‑level organization views, precise (exact‑count) deduplication for over one‑third of its metrics, and peak loads of tens of thousands of queries. Kylin's 2^N dimension‑combination explosion and its reliance on MapReduce led to long build times, high resource consumption, and missed SLAs.
Optimization Principles – Understanding the Fundamentals
Kylin’s pre‑computation creates Cuboids for every dimension combination; queries read the appropriate Cuboid. The By‑layer algorithm computes Cuboids layer‑by‑layer, reusing results from lower layers to avoid redundant work.
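To make the layer structure concrete, here is a minimal Python sketch of the cuboid lattice the By‑layer algorithm walks. The dimension names are hypothetical, and real Kylin prunes combinations via aggregation groups; this only illustrates why N dimensions yield 2^N cuboids arranged in N+1 layers.

```python
from itertools import combinations

def by_layer_cuboids(dimensions):
    """Enumerate cuboids layer by layer, starting from the base cuboid.

    Layer 0 holds the single base cuboid (all dimensions); each subsequent
    layer drops one more dimension, so a layer-(k+1) cuboid can always be
    aggregated from some layer-k parent instead of from the raw data.
    """
    n = len(dimensions)
    layers = []
    for size in range(n, -1, -1):
        layers.append([frozenset(c) for c in combinations(dimensions, size)])
    return layers

# Hypothetical dimensions for illustration only
layers = by_layer_cuboids(["city", "org", "product"])
total = sum(len(layer) for layer in layers)  # 2^3 = 8 cuboids over 4 layers
```

With 3 dimensions this yields 8 cuboids; at sales-platform scale the exponent is why dimension pruning matters so much.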
Process Analysis – Layered Decomposition
The team broke the build pipeline into five key stages: engine selection, data reading, dictionary building, layered construction, and file conversion. Detailed analysis of each stage revealed specific bottlenecks.
Engine Selection
Switching the build engine from MapReduce to Spark (supported by Kylin since 2017) sped up builds by roughly 1–3×. The migration was rolled out gradually: existing MapReduce jobs were preserved while the Spark jobs' parameters were tuned.
Data Reading
Kylin reads source data from Hive external tables stored in HDFS. Small‑file issues were mitigated by adjusting MapReduce split size and merging Hive partitions where appropriate.
Dictionary Building
Dimension dictionaries map raw values to encoded IDs, reducing HBase storage. Global dictionary dependencies were configured to avoid redundant computation for deduplication‑heavy metrics.
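The encoding idea can be sketched in a few lines of Python. This is a deliberate simplification: Kylin's actual dictionaries use trie-based and bitmap structures, and global dictionaries coordinate IDs across segments; the sketch only shows the value‑to‑ID mapping that shrinks storage.

```python
def build_dictionary(values):
    """Map each distinct dimension value to a compact integer ID.

    Sorting the distinct values makes the encoding deterministic,
    so rebuilding over the same data yields the same IDs.
    """
    return {v: i for i, v in enumerate(sorted(set(values)))}

# Hypothetical city values for illustration
raw = ["Beijing", "Shanghai", "Beijing", "Shenzhen"]
dictionary = build_dictionary(raw)
encoded = [dictionary[v] for v in raw]  # small ints stored instead of strings
```

Sharing one such dictionary across deduplication‑heavy metrics is exactly what avoids recomputing the mapping per metric.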
Layered Build
With Spark, the By‑layer algorithm is used exclusively. Each Cuboid layer becomes a Spark job, and intermediate results are cached in memory so each layer is computed from its cached parent rather than re‑read from disk. The number of jobs equals the number of layers, and each job runs two stages: reading the cached parent layer and writing the new layer's cache.
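The per-layer roll-up can be sketched in plain Python (in the real pipeline this is a distributed Spark aggregation over cached RDDs, not an in-memory dict; dimension names and measures here are hypothetical):

```python
from collections import defaultdict

def roll_up(parent, dims, drop_dim):
    """Build a child cuboid from a cached parent by summing out one dimension.

    parent: dict mapping tuples of dimension values -> additive measure.
    Mirrors how each by-layer Spark job reads the parent layer from cache
    and writes the next layer, instead of rescanning the source table.
    """
    keep = [i for i, d in enumerate(dims) if d != drop_dim]
    child = defaultdict(int)
    for key, measure in parent.items():
        child[tuple(key[i] for i in keep)] += measure
    return dict(child), [dims[i] for i in keep]

# Base cuboid over (city, category), measure = sales count
base = {("bj", "food"): 10, ("bj", "drink"): 5, ("sh", "food"): 7}
by_city, city_dims = roll_up(base, ["city", "category"], "category")
# by_city aggregates out "category": {("bj",): 15, ("sh",): 7}
```

Only additive (or otherwise decomposable) measures can be rolled up this way; precise deduplication is why the global dictionaries above are needed.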
Resource Configuration
Dynamic resource allocation was tuned so that each executor provides 1 CPU core, 6 GB of heap, and 1 GB of off‑heap (overhead) memory. Total parallelism and memory were then sized as:
CPU = kylin.engine.spark-conf.spark.executor.cores × number_of_executors
Memory = (executor_memory + memory_overhead) × number_of_executors
For example, with 1,000 executors: CPU = 1 × 1,000 = 1,000 cores; Memory = (6 GB + 1 GB) × 1,000 = 7,000 GB.
File Conversion
After build, Cuboid files are bulk‑loaded into HBase as HFiles via a MapReduce job. The number of map tasks equals the number of output files from the layered build stage, so resource requests were aligned accordingly.
Implementation Roadmap – From Point to Plane
A pilot on the core sales transaction task demonstrated that the combined optimizations reduced daily build time from over two hours to under ten minutes and raised SLA achievement from 90 % to 99.99 %.
Results
Resource consumption per task decreased dramatically, and overall cluster CU usage fell while maintaining throughput. By June 2020 the SLA hit 100 %.
Outlook
Kylin graduated to an Apache top‑level project in 2015 and continues to evolve; Meituan now runs a stable V2.0 deployment and has begun testing the V3.1 release, which introduces Flink as an alternative build engine and promises further performance gains.
Author Bio
Yue Qing, Engineer at Meituan’s in‑store dining R&D center since 2019.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.