How ByteDance’s AQETuner Cuts Query Latency by 23% and Boosts Reliability
ByteDance Data Platform's recent database research spans query-level Bayesian tuning, adaptive stream-processing parallelism, and learned cardinality estimation. Two of the papers were accepted at VLDB 2025 and ICDE 2025, and the work reports significant performance gains alongside real-world deployments.
Amid the rapid convergence of massive data and AI, Data+AI continues to unlock commercial potential across industries.
Recently, ByteDance Data Platform achieved notable success in the database field, with two research papers accepted at premier international conferences.
The paper "AQETuner: Reliable Query-level Configuration Tuning for Analytical Query Engines" was selected for the VLDB 2025 Research Track.
The AQETuner system employs Bayesian optimization backed by a neural‑process surrogate model to deliver precise query‑level parameter tuning for analytical engines such as ByteHouse and Presto. It uses a joint parameter‑plan encoder to capture complex interactions and a gating network to uncover hidden links between performance and reliability, achieving up to 23.7% latency reduction and a 51.2% decrease in query failures. The technology will be released through Volcano Engine’s ByteHouse.
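To make the idea of surrogate-guided, reliability-aware tuning concrete, here is a minimal sketch in Python. It is not AQETuner's implementation: the neural-process surrogate is swapped for a simple kernel-weighted average, the gating network is replaced by a plain failure-probability penalty, and the two knobs (parallelism and memory), the `run_query` cost model, and every constant are hypothetical.

```python
import math

# Hypothetical per-query search space: (parallelism, memory_gb).
GRID = [(p, m) for p in (1, 2, 4, 8, 16) for m in (2, 4, 8, 16)]

def run_query(cfg):
    """Toy stand-in for executing one query; returns (latency_s, failed)."""
    parallelism, mem_gb = cfg
    failed = mem_gb < 4  # toy rule: too little memory makes the query fail
    latency = 16.0 / parallelism + 0.25 * parallelism + 0.05 * mem_gb
    return latency, failed

def kernel(a, b, scale=4.0):
    """Gaussian similarity between two configurations."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / (2 * scale ** 2))

def surrogate(cfg, history):
    """Return (predicted latency, predicted failure probability, uncertainty)."""
    weights = [kernel(cfg, seen) for seen, _, _ in history]
    total = sum(weights)
    ok_weight = sum(w for w, h in zip(weights, history) if not h[2])
    ok_latency = sum(w * h[1] for w, h in zip(weights, history) if not h[2])
    mean_latency = ok_latency / ok_weight   # only successful runs inform latency
    p_fail = 1.0 - ok_weight / total        # share of nearby runs that failed
    uncertainty = 1.0 / (1.0 + total)       # shrinks as nearby runs accumulate
    return mean_latency, p_fail, uncertainty

def tune(default_cfg, budget=8, kappa=2.0, fail_penalty=50.0):
    """Try `budget` new configs, each chosen by an optimistic, failure-averse score."""
    history = [(default_cfg, *run_query(default_cfg))]
    if history[0][2]:
        raise ValueError("this sketch assumes the default config succeeds")
    for _ in range(budget):
        tried = {cfg for cfg, _, _ in history}
        candidates = [c for c in GRID if c not in tried]
        if not candidates:
            break

        def score(cfg):
            mean, p_fail, unc = surrogate(cfg, history)
            # Lower is better: optimistic latency plus a penalty for likely failures.
            return mean - kappa * unc + fail_penalty * p_fail

        chosen = min(candidates, key=score)
        history.append((chosen, *run_query(chosen)))
    best_cfg, best_latency, _ = min((h for h in history if not h[2]), key=lambda h: h[1])
    return best_cfg, best_latency, history

best_cfg, best_latency, history = tune((2, 8))
print("best config:", best_cfg, "latency:", round(best_latency, 2))
print("failed trials:", sum(1 for h in history if h[2]))
```

The point of the sketch is the shape of the loop: each trial updates a model of both latency and failure risk, and the next configuration is picked by trading the two off. Because the default configuration stays in the history, the returned result is never slower than the default.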
The paper "Learning from the Past: Adaptive Parallelism Tuning for Stream Processing Systems" was accepted for the ICDE 2025 Research Track.
This work introduces a pre-training and fine-tuning framework that uses graph-edit-distance DAG clustering and a monotonic-constrained GNN encoder to predict operator-level bottlenecks, enabling efficient and accurate parallelism optimization. Experiments on an internal Flink cluster show up to 29.6% fewer reconfigurations and total parallelism reduced to 69.2% of the original, balancing resource usage against performance. The associated Flink AutoScaling system now serves over 200k cores and has saved more than 30k cores.
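One way to see why a monotonic constraint is useful: if the predicted bottleneck likelihood never increases as parallelism grows, the smallest safe parallelism for an operator can be found by binary search instead of trial-and-error reconfiguration. The Python sketch below illustrates that under stated assumptions. The `bottleneck_prob` function is a hypothetical logistic stand-in for the learned model, and the capacity figure, threshold, and operator names are invented for illustration; the paper's actual procedure may differ.

```python
import math

def bottleneck_prob(input_rate, parallelism, per_task_capacity=1000.0):
    """Hypothetical predictor: probability that an operator is a bottleneck.
    By construction it never increases as parallelism grows."""
    utilization = input_rate / (parallelism * per_task_capacity)
    return 1.0 / (1.0 + math.exp(-10.0 * (utilization - 1.0)))

def min_parallelism(input_rate, max_parallelism=128, threshold=0.5):
    """Smallest parallelism whose predicted bottleneck probability is below threshold.
    Binary search is valid only because the predictor is monotonic."""
    lo, hi = 1, max_parallelism
    if bottleneck_prob(input_rate, hi) >= threshold:
        return hi  # even the cap is predicted to bottleneck; return the cap
    while lo < hi:
        mid = (lo + hi) // 2
        if bottleneck_prob(input_rate, mid) < threshold:
            hi = mid        # mid is enough; try fewer tasks
        else:
            lo = mid + 1    # mid is too few; need more tasks
    return lo

# Hypothetical job: operator name -> records per second arriving at it.
job = {"source": 900.0, "join": 4500.0, "sink": 2100.0}
plan = {op: min_parallelism(rate) for op, rate in job.items()}
print(plan)
```

With a non-monotonic predictor the search could skip over a valid setting, so each candidate would have to be evaluated separately; the constraint is what makes a single pass per operator sufficient in this sketch.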
Earlier, the team presented "ByteCard: Enhancing ByteDance's Data Warehouse with Learned Cardinality Estimation" at SIGMOD 2024, addressing classic cardinality-estimation challenges. ByteCard integrates a learned model into ByteHouse, markedly improving query performance, cutting I/O costs, and reducing hash-table rebuild frequency, with up to a 30% reduction in 99th-percentile latency across multiple datasets.
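The link between cardinality estimates and hash-table rebuilds can be illustrated with a small Python sketch. It assumes, as a simplification not spelled out in the article, that an estimate of the number of distinct keys is used to presize the hash table; the doubling growth policy, the 0.75 load factor, and both function names are hypothetical.

```python
import math

def count_rebuilds(num_keys, initial_capacity, max_load=0.75):
    """Count rebuilds of a doubling hash table while inserting num_keys distinct keys."""
    capacity, rebuilds = initial_capacity, 0
    for size in range(1, num_keys + 1):
        while size > capacity * max_load:
            capacity *= 2   # grow and rehash every existing entry
            rebuilds += 1
    return rebuilds

def presized_capacity(estimated_keys, max_load=0.75):
    """Capacity that holds the estimated number of keys without exceeding max_load."""
    return math.ceil(estimated_keys / max_load)

actual_keys = 100_000
print("no estimate:   ", count_rebuilds(actual_keys, 16))
print("10% over:      ", count_rebuilds(actual_keys, presized_capacity(110_000)))
print("2x under:      ", count_rebuilds(actual_keys, presized_capacity(50_000)))
```

In this toy model even a rough estimate removes most rebuilds, since each rebuild only has to cover the gap between the estimate and the true key count.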
From parameter tuning to stream‑processing optimization, ByteDance Data Platform’s research is steadily transitioning into enterprise‑grade solutions powered by Volcano Engine, empowering diverse industries.
ByteDance Data Platform
The ByteDance Data Platform team empowers all ByteDance business lines by lowering data‑application barriers, aiming to build data‑driven intelligent enterprises, enable digital transformation across industries, and create greater social value. Internally it supports most ByteDance units; externally it delivers data‑intelligence products under the Volcano Engine brand to enterprise customers.