Tag

Trino


360 Smart Cloud
May 23, 2024 · Big Data

Archer Engine: Integrating Inverted Index with Iceberg for Scalable Big Data Log Analytics

The article introduces Archer, a new big‑data warehouse engine built on Iceberg that adds an inverted‑index mechanism using Tantivy to provide full‑text and JSON search, storage‑compute separation, and significant performance gains over traditional Elasticsearch and Iceberg connectors.

Archer Engine · Big Data · Inverted Index
0 likes · 9 min read
DataFunSummit
Apr 22, 2024 · Big Data

Intelligent Optimization of Bilibili’s Iceberg‑Based Lakehouse for Query Acceleration

This article describes Bilibili’s intelligent optimization project that automatically analyzes historical query workloads to configure multi‑dimensional sorting, various indexes, and pre‑aggregation on Iceberg tables, thereby reducing scan volume by 28% across dozens of tables and improving OLAP query latency.

Big Data · Data Warehouse · Iceberg
0 likes · 15 min read
DataFunSummit
Feb 29, 2024 · Big Data

Trino at Xiaomi: Architecture, Practices, and Future Plans

This article details Xiaomi’s practical deployment of Trino, covering its architectural role, core and extended capabilities, performance comparisons, integration with Iceberg and Spark, operational enhancements, multi‑cluster and ad‑hoc query scenarios, future cloud‑storage plans, and a Q&A session.

Big Data · Iceberg · OLAP
0 likes · 20 min read
Sohu Tech Products
Dec 13, 2023 · Big Data

Alluxio Edge: Edge Caching Solution for Trino and PrestoDB

Alluxio Edge is a library that runs inside Trino or PrestoDB workers and uses local SSD or memory to cache data from cloud storage. This restores data locality, cuts storage egress, and delivers up to ten‑fold gains in both I/O speed and query performance in real deployments.

Alluxio Edge · Big Data · Data Locality
0 likes · 14 min read
DataFunSummit
Sep 25, 2023 · Big Data

Trino in Bilibili Lakehouse: Compute Engine, Stability, and Containerization Practices

This article presents Bilibili's practical implementation of Trino within a lakehouse architecture, focusing on the compute engine placement, stability enhancements, and containerized deployment, while detailing indexing strategies, pre‑computation techniques, Iceberg metadata optimizations, and performance gains for large‑scale analytical queries.

Containerization · Iceberg · Indexing
0 likes · 14 min read
iQIYI Technical Product Team
Aug 25, 2023 · Big Data

Venus Log Platform Architecture Evolution: From ELK to Data Lake

The Venus log platform at iQIYI migrated from an Elasticsearch‑Kibana architecture to an Iceberg‑based data lake with Trino, cutting storage and compute costs by over 70%, boosting stability by 85%, and efficiently handling billions of daily logs in a write‑heavy, low‑query workload.

Big Data · Elasticsearch · Iceberg
0 likes · 22 min read
政采云技术 (Zhengcaiyun Technology)
Jul 27, 2023 · Big Data

Developing and Deploying Custom Trino Plugins (Access Control Example)

This article explains how to develop, package, and deploy custom Trino plugins—illustrated with an access‑control plugin—by using Java SPI, Maven dependencies, implementing the Plugin and SystemAccessControl interfaces, and configuring the plugin in Trino’s configuration files.

Access Control · Big Data · Java SPI
0 likes · 11 min read
Inke Technology
Jun 28, 2023 · Big Data

Extending Apache Seatunnel for Trino and Kyuubi Integration: A Practical Guide

This article outlines the challenges of scaling data integration platforms, proposes a comprehensive solution using Apache SeaTunnel and Dinky, details the implementation of Trino and Kyuubi JDBC support, and describes the platform's architecture, task publishing workflow, logging, monitoring, resource management, and future enhancements.

Apache Seatunnel · Big Data · Kyuubi
0 likes · 16 min read
Bilibili Tech
Jun 20, 2023 · Big Data

Design and Evolution of Bilibili's Billions 3.0 Log Platform: A Lakehouse Architecture with ClickHouse, Iceberg, and Trino

Bilibili evolved its log platform from the ClickHouse‑based Billions 2.0 to the Billions 3.0 lakehouse, which is built on Iceberg, HDFS, and Trino and retains ClickHouse for acceleration. The new design reduces storage cost by over 20%, improves observability, solves the compute‑storage mismatch, adds flexible indexing, and supports complex ETL while staying open‑source.

Big Data · ClickHouse · Iceberg
0 likes · 36 min read
DataFunTalk
Jun 5, 2023 · Cloud Computing

Comcast Hybrid Cloud Data Platform Case Study: Seamless and Secure Data Access with Alluxio

Comcast’s hybrid‑cloud data platform, built on Trino and Amazon S3, faced challenges such as fragmented data access, costly data copies, and latency, leading the DX team to adopt Alluxio as a unified, cache‑enabled, secure middle‑layer that bridges storage and compute.

Alluxio · Amazon S3 · Data Access
0 likes · 3 min read
DataFunTalk
Jun 2, 2023 · Big Data

Iceberg Data Lake Implementation and Optimization at iQIYI

This article details iQIYI's adoption of the Iceberg data lake, covering its OLAP architecture, reasons for a lake, Iceberg table format advantages over Hive, platform construction, extensive performance optimizations, and real‑world business use cases such as ad‑flow unification, log analysis, audit, and CDC pipelines.

Big Data · Flink · Iceberg
0 likes · 18 min read
DataFunTalk
Mar 12, 2023 · Big Data

Apache Kyuubi 1.6.0 Feature Overview and Enhancements

The article provides a comprehensive walkthrough of Apache Kyuubi 1.6.0, detailing server‑side enhancements such as batch (JAR) task submission, metadata store and unified API/authentication, client‑side improvements to the built‑in JDBC driver and Beeline, as well as engine plugins for Spark, Flink, Trino and Hive, and concludes with the community’s roadmap and statistics.

Apache Kyuubi · Big Data · Flink
0 likes · 12 min read
DataFunSummit
Oct 12, 2022 · Big Data

Practical Application of Kyuubi in Xiaomi’s Big Data Platform

This article details how Xiaomi integrated the open‑source Kyuubi SQL gateway into its evolving big‑data platform, describing the challenges of multiple SQL services, the architectural redesign for a unified, high‑availability service, performance gains, new features such as engine pooling and Z‑ordering, and future roadmap plans.

Big Data · High Availability · Kyuubi
0 likes · 15 min read
Bilibili Tech
Sep 30, 2022 · Big Data

Bilibili's Efficient Lakehouse Platform Built on Trino and Iceberg

Bilibili’s new lake‑house platform, built on Trino and Iceberg, replaces Hive‑based pipelines by ingesting logs and DB data into Iceberg tables, applying advanced sorting, Z‑order/Hilbert clustering, bitmap and bloom indexes, virtual join columns and pre‑aggregation, enabling 70 000 daily queries on 2 PB with average scans of 2 GB and sub‑2‑second response times.

Big Data · Data Skipping · Iceberg
0 likes · 15 min read
DataFunTalk
Aug 4, 2022 · Big Data

Kyuubi Application Practice on Xiaomi's Big Data Platform

This talk presents the end‑to‑end deployment of Kyuubi as a unified, high‑availability SQL gateway on Xiaomi’s big‑data platform, covering its integration, architecture upgrades, multi‑engine support, performance gains, operational improvements, and future roadmap.

Big Data · Kyuubi · SQL Gateway
0 likes · 16 min read