Trino at Xiaomi: Architecture, Practices, and Future Plans
This article presents Xiaomi's production experience with the Trino query engine: its historical background and reasons for adoption, its architectural role, core and extended capabilities, performance comparisons with Spark, integration with Iceberg, operational enhancements, multi-cluster and ad-hoc query scenarios, future cloud-storage plans, and a closing Q&A session.
Trino uses a coordinator–worker architecture: the Coordinator parses queries, generates and schedules tasks, while Workers execute them. It supports multiple clients such as CLI, JDBC, and HTTP, and can query a wide range of data sources through connectors, enabling federated queries without relying on the Hadoop ecosystem.
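To illustrate federation, a single Trino query can join tables from two different catalogs; the catalog, schema, and table names below are purely illustrative, not Xiaomi's:

```sql
-- Join a Hive catalog against a MySQL catalog in one query
-- (names are placeholders for illustration)
SELECT o.order_id, c.customer_name
FROM hive.sales.orders AS o
JOIN mysql.crm.customers AS c
  ON o.customer_id = c.id
WHERE o.order_date >= DATE '2024-01-01';
```

Each catalog maps to a connector configured on the cluster, which is what lets one SQL statement span heterogeneous storage systems.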
Performance benchmarks against Spark using the TPC‑DS 1 TB dataset show Trino achieving up to five times faster query execution, though some queries fail due to memory constraints.
The advantages of Trino include a clear architecture, high speed from in‑memory processing and pipeline execution, and strong extensibility through connectors. Drawbacks are high memory requirements, limited fault tolerance, and constrained concurrency due to a single Coordinator.
In Xiaomi’s OLAP stack, Trino works alongside Spark behind the Kyuubi SQL proxy. Kyuubi routes queries to either engine, with Trino handling fast, read‑only analytics while Spark manages ETL and writes. Metacat provides unified metadata management.
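Kyuubi selects the backing engine per connection. As a sketch of how a client could request Trino for interactive reads versus Spark for ETL, using Kyuubi's standard `kyuubi.engine.type` session configuration (host and port are placeholders):

```properties
# Interactive, read-only analytics session routed to Trino
jdbc:hive2://kyuubi-host:10009/default;#kyuubi.engine.type=TRINO

# ETL/write session routed to Spark SQL (Kyuubi's default engine)
jdbc:hive2://kyuubi-host:10009/default;#kyuubi.engine.type=SPARK_SQL
```

In practice a proxy layer can also set this automatically based on query classification, so users need not choose an engine themselves.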
Trino’s positioning at Xiaomi emphasizes three goals: handling large, diverse data sources, delivering faster query response, and improving data visibility for users through ad‑hoc queries, dashboards, and reports.
Since its introduction in Q4 2021, Xiaomi has upgraded Trino several times (versions 352, 386, 421), contributing patches to the community for features such as row‑level updates on Iceberg V2 tables and memory optimizations.
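Row-level updates on Iceberg V2 tables can be expressed directly in Trino SQL; the table and column names below are illustrative:

```sql
-- Requires an Iceberg V2 table (format-version = 2), so the engine can
-- write positional/equality delete files instead of rewriting whole data files
UPDATE iceberg.warehouse.user_profile
SET status = 'inactive'
WHERE last_login < DATE '2023-01-01';
```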
Key development work includes:
Ensuring Spark SQL compatibility by rewriting Spark SQL ASTs into Trino syntax and implementing implicit type conversion, achieving over 80% compatibility with existing Spark SQL workloads.
Optimizing Iceberg usage, adding support for dynamic catalog loading and dynamic UDF loading, allowing thousands of catalogs to be registered without restarting the cluster.
Improving operational capabilities with audit‑log streaming to Iceberg via Flink, automated integration testing, and CI/CD pipelines for seamless deployment.
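Dynamic catalog loading means catalogs are registered at runtime rather than through properties files plus a cluster restart. Upstream Trino later gained comparable functionality via `CREATE CATALOG` (behind `catalog.management=dynamic`); shown here with placeholder connection details:

```sql
-- Register a new Iceberg catalog without restarting the cluster
-- (metastore URI is a placeholder; requires catalog.management=dynamic)
CREATE CATALOG iceberg_ads USING iceberg
WITH (
  "hive.metastore.uri" = 'thrift://metastore-host:9083'
);
```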
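The article does not detail Xiaomi's rewrite layer, but purely as a sketch of the idea, a simplified rewriter might map Spark-specific function names and identifier quoting onto Trino equivalents. The function table and regex approach here are illustrative; a real implementation rewrites a parsed AST, not strings:

```python
import re

# Spark function names with direct Trino equivalents (illustrative subset)
FUNCTION_MAP = {
    "nvl": "coalesce",
    "instr": "strpos",
}

def rewrite_spark_sql(sql: str) -> str:
    """Toy Spark SQL -> Trino SQL rewrite. A production version would
    operate on the AST and also handle implicit type conversion."""
    # Spark quotes identifiers with backticks; Trino uses double quotes
    sql = re.sub(r"`([^`]*)`", r'"\1"', sql)
    # Map Spark-only function names to their Trino equivalents
    for spark_fn, trino_fn in FUNCTION_MAP.items():
        sql = re.sub(rf"\b{spark_fn}\s*\(", f"{trino_fn}(", sql, flags=re.IGNORECASE)
    return sql

print(rewrite_spark_sql("SELECT nvl(`name`, 'n/a') FROM `db`.`t`"))
# -> SELECT coalesce("name", 'n/a') FROM "db"."t"
```

Implicit type conversion (e.g. comparing a VARCHAR column to an integer literal, which Spark allows but Trino rejects) would similarly be handled by inserting explicit CASTs during the AST rewrite.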
Future plans focus on migrating storage to the cloud and accelerating queries with Alluxio caching.
Application scenarios discussed include multi‑cluster deployment for high availability, ad‑hoc queries with strict latency limits, and BI analytics requiring fast report generation.
The article concludes with a Q&A session addressing engine selection between Trino and Doris, and the open‑source status of the Spark‑to‑Trino SQL rewrite layer.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.