ByteDance's Internal Presto OLAP Engine: Deployment, Performance Boosts, and Operational Practices
The article details ByteDance's large‑scale deployment of the Presto OLAP engine for ad‑hoc, BI, and near‑real‑time analytics, describing its architecture, multi‑coordinator high‑availability design, routing gateway, adaptive cancel, history server, materialized‑view support, Hudi connector integration, and how these innovations improve performance, stability, and operational efficiency.
Within ByteDance, Presto powers ad‑hoc queries, BI visual analysis, and near‑real‑time analytics, handling close to one million daily queries. It is fully compatible with SparkSQL, enabling seamless migration, and delivers an 80.5% performance gain on the TPC‑DS benchmark compared to community versions.
The platform adopts a multi‑Coordinator architecture, eliminating the single‑point‑of‑failure of a lone coordinator and reducing failover time to under three seconds. Routing is managed by a unified Gateway that applies static routing rules and a dynamic load‑balancing strategy based on real‑time coordinator metrics.
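The two-stage routing described above can be sketched as follows. This is a minimal illustration, not ByteDance's Gateway: the `CoordinatorStats` fields, the rule table, the cluster names, and the "fewest queued, then fewest running" policy are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class CoordinatorStats:
    """Hypothetical load snapshot reported by one coordinator."""
    address: str
    running_queries: int
    queued_queries: int


# Hypothetical static rules: query source -> cluster name.
STATIC_RULES: Dict[str, str] = {"bi-dashboard": "bi", "notebook": "adhoc"}
DEFAULT_CLUSTER = "adhoc"


def pick_cluster(source: str) -> str:
    """Stage 1: static routing chooses the cluster from the query source."""
    return STATIC_RULES.get(source, DEFAULT_CLUSTER)


def pick_coordinator(stats: List[CoordinatorStats]) -> Optional[CoordinatorStats]:
    """Stage 2: dynamic balancing picks the least-loaded coordinator."""
    if not stats:
        return None
    # Assumed policy: prefer the shortest queue, break ties on running queries.
    return min(stats, key=lambda s: (s.queued_queries, s.running_queries))


def route(source: str, clusters: Dict[str, List[CoordinatorStats]]) -> Optional[str]:
    """Return the coordinator address for a query, or None if none is available."""
    chosen = pick_coordinator(clusters.get(pick_cluster(source), []))
    return chosen.address if chosen else None
```

Keeping the static stage separate from the dynamic one means a rule change never touches the balancing logic, and a coordinator that disappears simply drops out of the candidate list.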
To enhance stability, Presto integrates a History Server that persists query execution details, allowing users to review past queries and providing data for monitoring dashboards.
Presto Cluster Stability and Performance Improvements
Multiple independent Presto clusters are deployed for different business scenarios, each with several Coordinators and Workers. ZooKeeper is used for service discovery, which enables active‑active coordinator failover, and coordinators report their load through RESTful APIs.
Adaptive Cancel predicts query runtime using rule‑based and cost‑based models; queries projected to exceed thresholds are cancelled early, preventing resource waste.
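As a rough illustration of combining the two kinds of check, the sketch below applies rule-based tests first and a simple cost-based estimate second. The `QueryPlanInfo` fields, the specific rules, the throughput figure, and every threshold are assumptions chosen for the example; the article does not describe the actual models.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class QueryPlanInfo:
    """Hypothetical plan summary available before or early in execution."""
    scanned_partitions: int
    has_cross_join: bool
    estimated_input_bytes: int


# All limits below are illustrative assumptions.
MAX_PARTITIONS = 10_000
ASSUMED_BYTES_PER_SECOND = 1_000_000_000  # assumed cluster scan throughput
MAX_RUNTIME_SECONDS = 600


def rule_based_violation(plan: QueryPlanInfo) -> Optional[str]:
    """Rule-based check: flag plan shapes that are assumed to be too expensive."""
    if plan.has_cross_join:
        return "cross join"
    if plan.scanned_partitions > MAX_PARTITIONS:
        return "too many partitions"
    return None


def estimated_runtime_seconds(plan: QueryPlanInfo) -> float:
    """Cost-based check: crude runtime estimate from input size and throughput."""
    return plan.estimated_input_bytes / ASSUMED_BYTES_PER_SECOND


def should_cancel(plan: QueryPlanInfo) -> Tuple[bool, str]:
    """Return (cancel?, reason); the reason is empty when the query may run."""
    reason = rule_based_violation(plan)
    if reason is not None:
        return True, reason
    if estimated_runtime_seconds(plan) > MAX_RUNTIME_SECONDS:
        return True, "estimated runtime exceeds limit"
    return False, ""
```

Returning a reason alongside the decision lets the caller tell the user why the query was stopped, which matters when cancellation happens before any results are produced.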
Optimizations for Specific Workloads
1. Ad‑hoc Query Analysis
Presto’s MPP architecture eliminates Spark context startup overhead, offering lower latency. Compatibility layers rewrite SparkSQL to Presto syntax, and Hive UDFs are supported, with contributions back to the Presto community.
2. BI Visual Analytics
Materialized views are introduced to accelerate repetitive BI queries, featuring automatic view discovery, lifecycle management, and query rewrite capabilities.
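The query-rewrite idea can be illustrated with a deliberately simple sketch that matches whole queries by normalized text. Real rewriters work on the query plan and can match sub-expressions; the `MaterializedViewRegistry` class, its methods, and the table names here are hypothetical and exist only for this example.

```python
import re
from typing import Dict


def normalize(sql: str) -> str:
    """Strip a trailing semicolon and collapse whitespace runs.

    Simplification: whitespace inside string literals is collapsed too,
    so this is only safe for queries without such literals.
    """
    trimmed = sql.strip().rstrip(";").strip()
    return re.sub(r"\s+", " ", trimmed)


class MaterializedViewRegistry:
    """Hypothetical registry mapping a defining query to its precomputed table."""

    def __init__(self) -> None:
        self._views: Dict[str, str] = {}

    def register(self, defining_sql: str, table_name: str) -> None:
        """Record that table_name holds the result of defining_sql."""
        self._views[normalize(defining_sql)] = table_name

    def rewrite(self, sql: str) -> str:
        """Redirect an exactly matching query to its view; otherwise pass it through."""
        table = self._views.get(normalize(sql))
        return f"SELECT * FROM {table}" if table else sql
```

For example, after registering a view for `SELECT region, SUM(cost) FROM ads GROUP BY region`, the same query with different spacing is redirected to the precomputed table, while unrelated queries are returned unchanged.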
3. Near‑Real‑Time Query Analysis
By integrating Hudi as a dedicated connector, Presto can efficiently read incrementally updated Hudi tables, avoiding OOM issues and simplifying version upgrades.
All these enhancements are packaged into ByteDance’s Lakehouse Analytics Service (LAS), a serverless data‑processing offering compatible with Spark, Presto, and Flink ecosystems.
Big Data Technology & Architecture
Wang Zhiwu, a big data expert, dedicated to sharing big data technology.