
Presto High‑Performance Engine Practice at Meitu: Technical Selection, HA Design, and Cross‑Cluster Scheduling

This article details Meitu's adoption of the Presto ad‑hoc ROLAP engine, comparing it with Hive on Spark and Impala, describing enhancements for coordinator high‑availability, and explaining a cross‑cluster scheduling strategy that leverages idle Presto resources to improve overall big‑data workload efficiency.

DataFunTalk

The presentation introduces Meitu's use of Presto as a high‑performance, ad‑hoc ROLAP solution, explaining why Presto was chosen over Hive on Spark and Impala for its flexibility, low latency, and strong aggregation capabilities.

It outlines the advantages of ROLAP (flexible queries, high performance, real‑time data support) and its drawbacks (high memory consumption, performance degradation on very large time‑range queries, and lack of built‑in HA).

A comparative evaluation of three ad‑hoc engines—Hive on Spark, Impala, and Presto—is provided, highlighting each tool's strengths and weaknesses and showing that Presto scores highest (39 points) for Meitu's workload.

The article then addresses Presto's single‑point‑of‑failure coordinator issue and proposes two HA solutions: (1) a dual‑cluster deployment with session state persisted in a database, and (2) a master‑slave setup using KeepAlived and a virtual IP to enable seamless failover. Implementation steps for the second solution are described.
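The second HA solution can be sketched as a standard KeepAlived configuration. The fragment below is illustrative only: the interface name, virtual IP, script path, and priorities are placeholder assumptions, not Meitu's actual values.

```conf
# Illustrative keepalived.conf on the active coordinator node.
vrrp_script check_presto {
    script "/etc/keepalived/check_presto_coordinator.sh"  # hypothetical health check; exits non-zero if the coordinator is down
    interval 5    # run the check every 5 seconds
    fall 2        # mark the node as failed after 2 consecutive failures
}

vrrp_instance presto_coordinator {
    state MASTER             # set to BACKUP on the standby coordinator
    interface eth0
    virtual_router_id 51
    priority 100             # use a lower priority (e.g. 90) on the standby
    advert_int 1
    virtual_ipaddress {
        192.168.1.100        # clients always connect via this virtual IP
    }
    track_script {
        check_presto         # failover the VIP when the health check fails
    }
}
```

When the master's health check fails, KeepAlived moves the virtual IP to the standby coordinator, so clients reconnect without any configuration change.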

Next, the cross‑cluster scheduling concept is introduced: offline Hive‑on‑Spark clusters and the online Presto cluster share cloud‑based storage, allowing idle Presto resources (0:00‑9:00) to be used for offline tasks, reducing offline cluster load by 10% and boosting Presto performance by 19%.
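The time-window gate described above can be sketched in a few lines. This is a minimal illustration, assuming (as the article states) that the Presto cluster is largely idle between 0:00 and 9:00; the function name and window constants are hypothetical, not from the talk.

```python
from datetime import datetime, time

# Hypothetical idle window during which offline tasks may be
# forwarded to the Presto cluster (per the article: 0:00-9:00).
IDLE_START = time(0, 0)
IDLE_END = time(9, 0)

def can_offload_to_presto(now: datetime) -> bool:
    """Return True if `now` falls inside Presto's idle window."""
    return IDLE_START <= now.time() < IDLE_END
```

A scheduler could call this check before submitting each offline task, falling back to the Hive-on-Spark cluster outside the window.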

The architecture evolution, forwarding strategy, and detailed implementation steps are explained, including deploying a Hive Server on the Presto cluster, integrating third-party packages, developing the intelligent engine, and rolling the change out through a gradual gray release (canary rollout).

Future outlook discusses task classification (large‑shuffle, medium‑large, small) and recommends using Hive on Spark for massive tasks, Spark SQL or Presto for medium tasks, and Presto for small tasks, while noting challenges such as UDF compatibility, syntax differences, and unified permission checks.
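The task-classification rule above can be sketched as a simple routing function. The thresholds (in GB of shuffled data) are illustrative assumptions for the sketch, not figures given in the talk.

```python
def pick_engine(shuffle_gb: float) -> str:
    """Route a task to an engine by its estimated shuffle volume.

    Thresholds are hypothetical: the talk only names the three
    classes (large-shuffle, medium-large, small) without exact cutoffs.
    """
    if shuffle_gb >= 500:       # large-shuffle tasks: most stable on Hive on Spark
        return "hive-on-spark"
    if shuffle_gb >= 10:        # medium-large tasks: Spark SQL or Presto
        return "spark-sql-or-presto"
    return "presto"             # small tasks: lowest latency on Presto
```

In practice such a router would also have to handle the challenges the article notes, such as UDF compatibility, syntax differences between engines, and unified permission checks.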

A short Q&A covers considerations of Doris vs. ClickHouse, Presto vs. Hive‑on‑MapReduce performance, and the operation of the intelligent engine that routes tasks to the most suitable engine based on historical metrics.

The session concludes with thanks and community promotion.

Tags: Data Engineering · Big Data · High Performance · Presto · HA · Cross-Cluster Scheduling
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
