
ByteDance's Internal Presto OLAP Engine: Deployment, Performance Boosts, and Operational Practices

The article details ByteDance's large‑scale deployment of the Presto OLAP engine for ad‑hoc, BI, and near‑real‑time analytics, describing its architecture, multi‑coordinator high‑availability design, routing gateway, adaptive cancel, history server, materialized‑view support, Hudi connector integration, and how these innovations improve performance, stability, and operational efficiency.

Big Data Technology & Architecture

Within ByteDance, Presto powers ad‑hoc queries, BI visual analysis, and near‑real‑time analytics, handling close to one million queries per day. The internal build is fully compatible with SparkSQL, enabling seamless migration, and delivers an 80.5% performance gain on the TPC‑DS benchmark over the community version.

The platform adopts a multi‑Coordinator architecture, which removes the single point of failure of a lone Coordinator and reduces failover time to under three seconds. Queries are routed by a unified Gateway that applies static routing rules together with a dynamic load‑balancing strategy based on real‑time Coordinator metrics.
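
The two routing stages described above can be sketched as follows. This is a minimal illustration, not ByteDance's Gateway: the rule table, the metric fields (`running_queries`, `queued_queries`), the cluster names, and the scoring weights are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class CoordinatorStatus:
    """Snapshot of one Coordinator's load (hypothetical metric fields)."""
    address: str
    running_queries: int
    queued_queries: int
    healthy: bool = True


# Hypothetical static rules: a client-supplied source tag selects a cluster.
STATIC_RULES: Dict[str, str] = {
    "bi-dashboard": "bi-cluster",
    "adhoc-notebook": "adhoc-cluster",
}
DEFAULT_CLUSTER = "adhoc-cluster"


def pick_cluster(source: Optional[str]) -> str:
    """Static stage: look up the cluster for a source tag, else use the default."""
    return STATIC_RULES.get(source or "", DEFAULT_CLUSTER)


def pick_coordinator(candidates: List[CoordinatorStatus]) -> CoordinatorStatus:
    """Dynamic stage: choose the least-loaded healthy Coordinator."""
    healthy = [c for c in candidates if c.healthy]
    if not healthy:
        raise RuntimeError("no healthy coordinator available")
    # Assumed scoring: a queued query counts twice as much as a running one.
    return min(healthy, key=lambda c: c.running_queries + 2 * c.queued_queries)


def route(source: Optional[str],
          clusters: Dict[str, List[CoordinatorStatus]]) -> Tuple[str, str]:
    """Return (cluster name, Coordinator address) for an incoming query."""
    cluster = pick_cluster(source)
    return cluster, pick_coordinator(clusters[cluster]).address
```

Keeping the static and dynamic stages as separate functions means a routing rule can change without touching the load-balancing logic, and unhealthy Coordinators are skipped before any scoring happens.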

To enhance stability, Presto integrates a History Server that persists query execution details, allowing users to review past queries and providing data for monitoring dashboards.
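
As a rough picture of what persisting query execution details involves, the sketch below stores finished queries in SQLite and serves both a per-query lookup and an aggregate for dashboards. The table layout, the class name `QueryHistoryStore`, and the choice of SQLite are assumptions for illustration; the article does not describe the History Server's actual storage.

```python
import json
import sqlite3
import time
from typing import Any, Dict, Optional


class QueryHistoryStore:
    """Toy history store: one row per finished query (hypothetical schema)."""

    def __init__(self, path: str = ":memory:") -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS query_history ("
            "query_id TEXT PRIMARY KEY, state TEXT, finished_at REAL, detail TEXT)"
        )

    def record(self, query_id: str, state: str, detail: Dict[str, Any]) -> None:
        """Persist the final state and execution details of a query."""
        self._db.execute(
            "INSERT OR REPLACE INTO query_history VALUES (?, ?, ?, ?)",
            (query_id, state, time.time(), json.dumps(detail)),
        )
        self._db.commit()

    def lookup(self, query_id: str) -> Optional[Dict[str, Any]]:
        """Return the stored details for one query, or None if unknown."""
        row = self._db.execute(
            "SELECT state, detail FROM query_history WHERE query_id = ?",
            (query_id,),
        ).fetchone()
        if row is None:
            return None
        return {"state": row[0], **json.loads(row[1])}

    def count_by_state(self) -> Dict[str, int]:
        """Aggregate used by a monitoring dashboard: queries per final state."""
        rows = self._db.execute(
            "SELECT state, COUNT(*) FROM query_history GROUP BY state"
        ).fetchall()
        return dict(rows)
```

Because the records live outside the Coordinator's memory, they remain available after a Coordinator restarts or the query is evicted from its in-memory list.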

Presto Cluster Stability and Performance Improvements

Multiple independent Presto clusters are deployed for different business scenarios, each with several Coordinators and Workers. ZooKeeper is used for service discovery, enabling active‑active Coordinator failover, with load reporting handled via RESTful APIs.
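
The discovery behaviour can be illustrated with an in-memory stand-in for ZooKeeper's ephemeral nodes: a Coordinator stays listed only while it keeps its session alive. This is not a ZooKeeper client, and the class name, the explicit `now` parameter, and the timeout value are assumptions chosen to keep the example deterministic.

```python
from typing import Dict, List


class EphemeralRegistry:
    """In-memory imitation of ephemeral-node semantics (not a ZooKeeper client)."""

    def __init__(self, session_timeout_s: float = 3.0) -> None:
        # Illustrative timeout; the real session settings are not given in the article.
        self._timeout = session_timeout_s
        self._last_seen: Dict[str, float] = {}

    def heartbeat(self, address: str, now: float) -> None:
        """A Coordinator renews its registration."""
        self._last_seen[address] = now

    def live_coordinators(self, now: float) -> List[str]:
        """Coordinators whose last heartbeat is within the session timeout."""
        return sorted(
            addr for addr, seen in self._last_seen.items()
            if now - seen <= self._timeout
        )
```

With several Coordinators registered at once, a client that re-reads the live list simply stops seeing a failed Coordinator once its session lapses, which is the essence of active‑active failover.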

Adaptive Cancel predicts query runtime using rule‑based and cost‑based models; queries projected to exceed thresholds are cancelled early, preventing resource waste.
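
A minimal sketch of the idea, assuming a hard scan-size cap as the rule-based check and a linear extrapolation from split progress as the runtime projection. The article does not describe the actual models, so the fields, thresholds, and the extrapolation itself are illustrative stand-ins.

```python
from dataclasses import dataclass


@dataclass
class QueryProgress:
    """Hypothetical progress snapshot for a running query."""
    elapsed_s: float
    completed_splits: int
    total_splits: int
    estimated_total_bytes: int  # assumed to come from the optimizer's estimate


def should_cancel(p: QueryProgress,
                  max_runtime_s: float = 300.0,
                  max_scan_bytes: int = 10 * 1024 ** 4,
                  warmup_s: float = 10.0) -> bool:
    """Decide whether to cancel a query early (assumed thresholds)."""
    # Rule-based check: the estimated scan size exceeds a hard cap.
    if p.estimated_total_bytes > max_scan_bytes:
        return True
    # Do not extrapolate from too little evidence.
    if p.elapsed_s < warmup_s or p.completed_splits == 0 or p.total_splits == 0:
        return False
    # Projection: assume the remaining splits progress at the observed rate.
    progress = p.completed_splits / p.total_splits
    projected_total_s = p.elapsed_s / progress
    return projected_total_s > max_runtime_s
```

The warm-up guard matters in practice: early in a query the completed fraction is tiny and noisy, so a projection made in the first seconds would cancel many queries that would have finished comfortably.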

Optimizations for Specific Workloads

1. Ad‑hoc Query Analysis

Presto’s MPP architecture eliminates Spark context startup overhead, offering lower latency. Compatibility layers rewrite SparkSQL to Presto syntax, and Hive UDFs are supported, with contributions back to the Presto community.

2. BI Visual Analytics

Materialized views are introduced to accelerate repetitive BI queries, featuring automatic view discovery, lifecycle management, and query rewrite capabilities.
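
The discovery and rewrite steps can be sketched in a simplified form. The version below matches queries by whitespace-normalized text only, whereas a production rewriter would match on query plans; the class name, method names, and the hit-count heuristic for discovery are assumptions for illustration.

```python
import re
from collections import Counter
from typing import Dict, List


def normalize(sql: str) -> str:
    """Collapse whitespace and drop a trailing semicolon.

    Simplification: whitespace inside string literals is collapsed too.
    """
    return re.sub(r"\s+", " ", sql).strip().rstrip(";").strip()


class MaterializedViewRegistry:
    """Toy registry: text-level matching only (hypothetical design)."""

    def __init__(self) -> None:
        self._views: Dict[str, str] = {}   # normalized query -> backing table
        self._hits: Counter = Counter()    # normalized query -> times observed

    def observe(self, sql: str) -> None:
        """Record one execution of a query, for view discovery."""
        self._hits[normalize(sql)] += 1

    def candidates(self, min_hits: int) -> List[str]:
        """Frequently repeated queries that are not yet materialized."""
        return sorted(
            q for q, n in self._hits.items()
            if n >= min_hits and q not in self._views
        )

    def register(self, defining_query: str, table: str) -> None:
        """Declare that `table` holds the precomputed result of the query."""
        self._views[normalize(defining_query)] = table

    def rewrite(self, sql: str) -> str:
        """Redirect a matching query to its backing table, else pass it through."""
        table = self._views.get(normalize(sql))
        return f"SELECT * FROM {table}" if table is not None else sql
```

Lifecycle management is not modelled here; a fuller version would also need to track when the backing table was last refreshed and fall back to the original query when it is stale.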

3. Near‑Real‑Time Query Analysis

By integrating Hudi as a dedicated connector, Presto can efficiently read incrementally updated Hudi tables, avoiding OOM issues and simplifying version upgrades.

All these enhancements are packaged into ByteDance’s Lakehouse Analytics Service (LAS), a serverless data‑processing offering compatible with Spark, Presto, and Flink ecosystems.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
