Alluxio Edge: Edge Caching Solution for Trino and PrestoDB
Alluxio Edge is a library that runs inside Trino or PrestoDB workers and uses local SSD or memory to cache data from cloud storage. This restores data locality, cuts storage egress, and in reported tests delivers 10x to 50x IO speed gains on IO-only queries and 1.5x to 10x end-to-end query performance improvements.
This article introduces Alluxio Edge, an edge caching solution for Trino and PrestoDB. It starts with the background: modern data technology stacks decouple compute from storage, which sacrifices data locality and raises cloud storage egress costs.
Alluxio Edge is a library that runs within PrestoDB or Trino processes and uses local storage (SSD or memory) for data caching. It addresses three main challenges: IO is the primary performance bottleneck; performance fluctuations in storage systems such as HDFS carry over into query engine IO; and distributed computing operations consume network resources.
The reference architecture shows a one-to-one mapping between Trino workers and Alluxio Edge instances. When Trino accesses data from S3 or other storage systems, Alluxio Edge automatically caches data locally. Testing showed 1.5x to 10x end-to-end query performance improvement, and 10x to 50x IO speed improvement on IO-only queries. Cloud storage API calls were reduced by 50% to 90%.
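The read path described above can be illustrated with a toy read-through cache. This is only a sketch under stated assumptions, not Alluxio Edge's implementation: the class `ReadThroughPageCache`, the `fetch_page` callback, the 4-byte `PAGE_SIZE`, and the in-memory `OBJECTS` store standing in for S3 are all hypothetical names chosen for illustration. It shows why a repeated read issues no further storage calls once the pages are cached locally.

```python
from collections import OrderedDict
from typing import Callable

PAGE_SIZE = 4  # bytes; deliberately tiny so the example is easy to trace


class ReadThroughPageCache:
    """Toy read-through cache: fixed-size pages, LRU eviction, call counting."""

    def __init__(self, fetch_page: Callable[[str, int], bytes], max_pages: int):
        self._fetch_page = fetch_page  # hypothetical remote read (e.g. a range GET)
        self._max_pages = max_pages
        self._pages = OrderedDict()    # (path, page index) -> page bytes
        self.remote_calls = 0          # how many times remote storage was hit

    def _get_page(self, path: str, index: int) -> bytes:
        key = (path, index)
        if key in self._pages:
            self._pages.move_to_end(key)  # mark as most recently used
            return self._pages[key]
        self.remote_calls += 1
        data = self._fetch_page(path, index)
        self._pages[key] = data
        if len(self._pages) > self._max_pages:
            self._pages.popitem(last=False)  # evict the least recently used page
        return data

    def read(self, path: str, offset: int, length: int) -> bytes:
        if length <= 0:
            return b""
        first = offset // PAGE_SIZE
        last = (offset + length - 1) // PAGE_SIZE
        buf = b"".join(self._get_page(path, i) for i in range(first, last + 1))
        start = offset - first * PAGE_SIZE
        return buf[start:start + length]


# In-memory stand-in for cloud storage (assumption for the example).
OBJECTS = {"s3://bucket/table/part-0": b"abcdefghijklmnop"}


def fetch_page(path: str, index: int) -> bytes:
    blob = OBJECTS[path]
    return blob[index * PAGE_SIZE:(index + 1) * PAGE_SIZE]


cache = ReadThroughPageCache(fetch_page, max_pages=8)
first_read = cache.read("s3://bucket/table/part-0", 2, 6)   # touches pages 0 and 1
calls_after_first = cache.remote_calls
second_read = cache.read("s3://bucket/table/part-0", 2, 6)  # served from local pages
calls_after_second = cache.remote_calls
```

In this sketch the first read fetches two pages from the stand-in store and the second read fetches none, which is the mechanism behind the reduction in storage API calls.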
Key features include local SSD/memory caching, support for multiple data lake connectors (Iceberg, Hudi, Delta Lake, Hive), flexible cache eviction policies (LRU, FIFO, TTL), and data quota functionality. The technical challenges addressed include data consistency (handled with page versioning), data locality (soft affinity and consistent hashing), and cache utilization (filtering strategies).
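To make the data locality point concrete, here is a minimal sketch of consistent hashing combined with soft affinity. It assumes a simplified scheduler and is not the Trino, PrestoDB, or Alluxio code: `ConsistentHashRing`, `pick_worker`, the MD5-based hash, the virtual-node count, and the `busy` set are hypothetical choices for illustration. The idea is that a given file is routed to the same worker whenever possible, so its cached pages get reused, while a busy worker can be skipped instead of blocking the query.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Stable 64-bit hash (Python's built-in hash() is randomized per process)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class ConsistentHashRing:
    def __init__(self, workers, virtual_nodes: int = 100):
        if not workers:
            raise ValueError("at least one worker is required")
        self.workers = list(workers)
        # Each worker owns many points on the ring to even out the distribution.
        self._ring = sorted(
            (_hash(f"{w}#{i}"), w) for w in self.workers for i in range(virtual_nodes)
        )
        self._points = [point for point, _ in self._ring]

    def candidates(self, key: str, count: int = 2):
        """Return up to `count` distinct workers, walking clockwise from the key."""
        start = bisect.bisect_right(self._points, _hash(key))
        found = []
        for step in range(len(self._ring)):
            worker = self._ring[(start + step) % len(self._ring)][1]
            if worker not in found:
                found.append(worker)
                if len(found) == count:
                    break
        return found


def pick_worker(ring: ConsistentHashRing, path: str, busy=frozenset(), count: int = 2):
    """Soft affinity: prefer the hashed workers, but do not insist on them."""
    preferred = ring.candidates(path, count)
    for worker in preferred:
        if worker not in busy:
            return worker
    for worker in ring.workers:      # every preferred worker is busy: take any free one
        if worker not in busy:
            return worker
    return preferred[0]              # everything is busy: queue on the primary


ring = ConsistentHashRing(["worker-1", "worker-2", "worker-3", "worker-4"])
path = "s3://bucket/table/part-0"
primary, secondary = ring.candidates(path, 2)
choice_when_idle = pick_worker(ring, path)
choice_when_primary_busy = pick_worker(ring, path, busy={primary})
```

A design note on the sketch: because only the removed worker's points leave the ring, files whose primary worker is still present keep the same primary after a node is removed, which is what keeps most cached data useful when cluster membership changes.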
Real-world deployments include Uber's, which spans 15,000 nodes in three clusters and achieved a 50% end-to-end performance improvement, a 10% reduction in HDFS read traffic, avoidance of 80% of GCS read requests, and a drop in P90 latency from 228 seconds to 50 seconds.
Sohu Tech Products
A knowledge-sharing platform for Sohu's technology products. As a leading Chinese internet brand with media, video, search, and gaming services and over 700 million users, Sohu continuously drives tech innovation and practice. We’ll share practical insights and tech news here.