Big Data 13 min read

Understanding Presto: Architecture, Query Execution, and Youzan’s Practical Experience

This article explains Presto’s core architecture and low‑latency query execution process, describes how Youzan adopts Presto for various data‑platform scenarios, discusses the evolution of its deployment, and outlines the performance challenges and future enhancements such as Alluxio integration and session property management.

DataFunTalk
DataFunTalk
DataFunTalk
Understanding Presto: Architecture, Query Execution, and Youzan’s Practical Experience

1. Presto Introduction

Presto is an open‑source, distributed, high‑performance SQL query engine originally developed by Facebook to address the high latency of Hive’s MapReduce‑based batch processing and to provide interactive, sub‑second query responses.

1.1 Presto Architecture

The architecture consists of a Coordinator, multiple Workers, and a client layer; a diagram is shown in the original article.

1.2 Presto Query Execution Process

Client sends a request to the Coordinator.

SQL is parsed by ANTLR to generate an AST.

The AST undergoes semantic analysis using metadata.

A logical execution plan is created and optimized by rule‑based transformations.

The logical plan is split into stages, and Workers are scheduled to create tasks.

Each task generates a physical execution plan.

The Coordinator stitches stages together after scheduling.

Workers execute their physical plans.

The Client repeatedly pulls results from the Coordinator, which aggregates results from the Workers.

1.3 Why Presto Is High‑Performance

Pipeline execution with full in‑memory computation.

SQL plan rule optimizations.

Dynamic code generation.

Data‑local scheduling, memory‑efficient data structures, caching, and approximate queries.

2. Presto Use Cases at Youzan

Data Platform (DP) temporary queries for exploratory analysis, with data masking and auditing.

BI reporting engine delivering various analytical dashboards.

Metadata quality checks via Presto.

Data products such as CRM analytics and user‑profile calculations.

3. Evolution of Presto at Youzan

Stage 1: Mixed Deployment with Hadoop

Presto was initially co‑deployed with an offline Hadoop cluster, leading to unstable performance because Hadoop’s disk I/O sometimes saturated the network, causing long task elapsed times.

Stage 2: Fully Independent Presto Cluster

Youzan separated Presto into its own cluster with a dedicated HDFS environment, avoiding interference from Hadoop jobs. Data tables were shared via a Hive catalog that pointed to the Presto HDFS NameService.

Stage 3: Low‑Latency Dedicated Clusters

For business units requiring sub‑second response times (often <3 s, sometimes <1 s), Youzan provisioned dedicated Presto clusters with local HDFS, tuned task concurrency, and performed deep performance testing.

4. Problems Encountered in Production

4.1 HDFS Small‑File Issue

Many small files caused slow queries; two configuration parameters were increased to allow more splits per node and per task.

node-scheduler.max-splits-per-node=100
node-scheduler.max-pending-splits-per-task=10

Additionally, Spark and Hive ETL pipelines introduced adaptive Spark and file‑merge tools to reduce small‑file proliferation.

4.2 Exponential‑Backtracking Regex Issue

A user’s long‑running query was traced to a pathological regular expression in Presto’s Joni engine. The team limited query runtime and considered switching to Google RE2J, which sacrifices some regex features.

4.3 Multiple‑Column DISTINCT Problem

Queries with several COUNT(DISTINCT …) columns performed poorly. Workarounds included rewriting queries using grouping sets or approximate distinct functions, and contributing patches to the Presto community.

4.4 HDFS NameNode Latency

Occasional 1‑second delays occurred when the NameNode was rolling its edit log and held a write lock, blocking read requests. Mitigations discussed include observer NameNode, Alluxio caching, and SSD‑based NameNode storage.

5. Future Outlook

5.1 Presto + Alluxio

Alluxio’s fine‑grained memory control can provide more stable I/O performance than OS page cache, especially for workloads where disk I/O becomes the bottleneck.

5.2 Presto Session Property Managers

Newer Presto versions support session‑level property managers that automatically adjust configurations based on workload characteristics, relieving users from manual tuning.

5.3 Multi‑Tenant Isolation

While Presto does not yet integrate with Apache Ranger for tenant isolation, Youzan plans to develop a SQL‑parser service to enforce data masking, auditing, and eventually Ranger‑based row‑level security.

Thank you for reading.

Performance Optimizationbig dataSQLprestoYouzanDistributed Query Engine
DataFunTalk
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.