Tag

Alluxio

1 view collected around this technical thread.

iQIYI Technical Product Team
Nov 21, 2024 · Big Data

Alluxio Integration and Optimization for Multi‑AZ Big Data Analytics at iQIYI

iQIYI integrates Alluxio with its QBFS multi‑AZ unified scheduling system, automatically caching hot tables and applying table‑level policies, page‑level storage, and AZ‑aware worker selection. Together these cut cross‑zone traffic, halve query latency, and deliver up to a 20× I/O speedup and a three‑fold overall performance boost.

Alluxio · Big Data · Cache Optimization
23 min read
DataFunSummit
Jul 23, 2024 · Big Data

Multi-Cloud Unified Data Acceleration Layer at Xiaohongshu: Challenges, Alluxio Solution, and Performance Gains

This article presents Xiaohongshu's multi‑cloud unified data acceleration layer built with Alluxio, detailing the challenges of multi‑cloud architectures, the design goals, Alluxio's architecture and features, real‑world case studies in AI training and recommendation indexing, performance improvements, and future plans.

AI training · Alluxio · Big Data
22 min read
DataFunSummit
May 24, 2024 · Big Data

Ctrip's Experience with Alluxio in Its Big Data Platform: Architecture, Transparent Access, Custom Authentication, CallerContext, and Dynamic Configuration

This article details how Ctrip, a leading travel company, leverages Alluxio as a distributed cache within its extensive big‑data infrastructure to improve data access speed, implement transparent storage access, support custom authentication and multi‑tenant features, enhance audit logging with CallerContext, and dynamically distribute client configurations via Kyuubi.

Alluxio · Big Data · CallerContext
14 min read
DataFunTalk
May 21, 2024 · Big Data

Applying Alluxio to Autonomous Driving Model Training: Deployment, Performance, and Operational Insights

This article details how Alluxio was adopted to replace NAS in autonomous driving model training, describing the data closed‑loop workflow, the challenges of the previous system, Alluxio's architectural benefits, deployment strategies across single and multiple data centers, functional and performance testing, operational tuning, and the resulting cost and efficiency gains.

Alluxio · autonomous driving · data pipeline
15 min read
DataFunTalk
May 14, 2024 · Cloud Computing

Hybrid Cloud Architecture and AI Storage Evolution at Zhihu: From UnionStore to Alluxio

This article describes Zhihu's hybrid cloud architecture—including offline, online, and GPU data centers—its self‑built UnionStore cache, the performance and latency challenges faced during large‑scale AI model training, and the subsequent evaluation and migration to Alluxio community and enterprise editions to achieve higher throughput, stability, and lower operational overhead.

AI Storage · Alluxio · Big Data
14 min read
DataFunSummit
May 5, 2024 · Big Data

Alluxio in Lakehouse Architecture: Benefits, Challenges, and Real‑World Use Cases

This article explains how Alluxio enables a unified lake‑warehouse architecture by decoupling compute and storage, outlines its core capabilities, evaluates the cost‑saving and performance benefits, discusses the technical challenges, and presents several practical deployment scenarios in finance and AI workloads.

Alluxio · Big Data · Cloud Native
15 min read
DataFunTalk
Feb 18, 2024 · Cloud Computing

Research on the Unified Storage Platform for the Supercomputing Internet

This article presents a comprehensive overview of the challenges, key technologies, and future applications of a unified storage platform built on Alluxio for China's national supercomputing internet, detailing its architecture, data flow strategies, deployment status, and industry use cases across multiple sectors.

Alluxio · High Performance Computing · Supercomputing
13 min read
DataFunTalk
Feb 9, 2024 · Big Data

Alluxio’s Role in Lakehouse Architecture: Benefits, Challenges, and Real‑World Use Cases

This article explains how Alluxio enables lake‑warehouse integration by providing a data orchestration layer that caches data near compute, reduces storage‑compute separation costs, improves performance, and addresses challenges such as security, scalability, and multi‑cloud deployment, illustrated with several industry case studies.

AI · Alluxio · Big Data
16 min read
DataFunTalk
Feb 3, 2024 · Big Data

Alluxio: Introduction, Architecture, and Practical Experience for Big Data Construction

This article introduces Alluxio as an open‑source data orchestration layer, explains its architecture and core features such as unified namespace, caching strategies, and cloud‑native deployment, and shares practical experiences on using Alluxio to simplify data lakehouse construction, migration, and hot‑cold data separation in complex big‑data environments.

Alluxio · Big Data · Data Lakehouse
13 min read
DataFunTalk
Jan 14, 2024 · Big Data

Optimizing Object Storage and Impala Engine in NetEase NDH: Performance Enhancements and Feature Additions

This presentation outlines NetEase's NDH big‑data platform, detailing its background, object‑storage upload and rename optimizations, Impala engine adaptations—including file‑handle caching, transparent URI handling, and getFileBlockLocations improvements—and a suite of operational enhancements such as dynamic proxy user configuration and audit‑log extensions.

Alluxio · Big Data · Impala
14 min read
DataFunTalk
Nov 26, 2023 · Big Data

Data Orchestration in Hybrid Storage Architectures with Alluxio

This article explains how Alluxio, an open‑source data orchestration system, improves data access efficiency in hybrid multi‑cloud and multi‑storage environments by providing caching, a unified namespace, interface translation, automated data management, and federation capabilities for modern big‑data workloads.

Alluxio · Data Federation · Data Orchestration
18 min read
DataFunSummit
Nov 18, 2023 · Artificial Intelligence

PyTorch Model Training Performance Tuning Guide with Alluxio

This guide explains how Ant Group uses Alluxio to overcome storage I/O, capacity, and latency challenges, delivering stability, performance, and scalability improvements for large‑scale PyTorch model training while reducing infrastructure costs and providing practical optimization techniques and code examples.

AI · Alluxio · Performance Tuning
4 min read
DataFunTalk
Sep 9, 2023 · Big Data

Presto + Tencent DOP (Alluxio) Architecture and Optimization Practices for Financial OLAP

This article presents the practical implementation of Presto combined with Tencent DOP (Alluxio) in a financial OLAP scenario, detailing background and architectural evolution, the Presto‑Alluxio design, optimization techniques for caching, storage scalability, ORC handling, and performance results, followed by conclusions and future directions.

Alluxio · Big Data · OLAP
15 min read
DataFunTalk
Jun 25, 2023 · Big Data

Multi‑Cloud Cache Evolution at Zhihu: From Multi‑HDFS to UnionStore to Alluxio

This technical presentation details Zhihu's journey in multi‑cloud caching, covering the motivations for a multi‑cloud architecture, the design and limitations of the self‑built UnionStore component, and the adoption of Alluxio to achieve significant performance, stability, and cost improvements across model serving and training workloads.

Alluxio · Big Data · Multi-Cloud
24 min read
DataFunTalk
Jun 16, 2023 · Cloud Native

Kubernetes Operator Deployment Challenges and Alluxio Operator Case Study

This article reviews the challenges of deploying applications on Kubernetes, introduces the operator concept as a mainstream solution, explains how to design and implement custom operators for services, and demonstrates these ideas with a detailed Alluxio Operator case study, including maturity levels and future enhancements.

Alluxio · Cloud Native · Deployment
17 min read