Tagged articles
58 articles
DataFunTalk
Feb 2, 2026 · Artificial Intelligence

How Alluxio Boosts GPU Utilization to 99.57% for Embodied AI – Inside the MLPerf Success

This article explains how Alluxio’s distributed caching architecture tackles the massive, multimodal data challenges of embodied AI, delivers near‑zero‑millisecond access, achieves 99.57% GPU utilization in MLPerf Storage v2.0, and validates its value through real‑world enterprise deployments.

AI Data Platform · Alluxio · Data Infrastructure
21 min read

DataFunTalk
Sep 3, 2025 · Artificial Intelligence

How Alluxio’s Distributed Cache Boosts AI Training to 99.57% GPU Utilization

Alluxio’s distributed caching dramatically accelerates AI training and checkpointing workloads, achieving up to 99.57% GPU utilization and linear scaling across clusters in the MLPerf Storage v2.0 benchmark, while using cost‑effective commodity hardware to eliminate I/O bottlenecks.

AI training · Alluxio · GPU utilization
11 min read

Bilibili Tech
Aug 12, 2025 · Artificial Intelligence

How Bilibili Scaled AI Model Training with Alluxio Cache Acceleration

This article details Bilibili's multi-layer storage architecture and Alluxio‑based cache acceleration for large‑scale AI model training, covering challenges of high‑throughput, low‑latency file access, metadata scalability, fault tolerance, and the engineering solutions that boosted I/O performance up to ten‑fold.

AI · Alluxio · Model Training
24 min read

iQIYI Technical Product Team
Nov 21, 2024 · Big Data

Alluxio Integration and Optimization for Multi‑AZ Big Data Analytics at iQIYI

iQIYI integrates Alluxio with its QBFS multi‑AZ unified scheduling system, automatically caching hot tables, applying table‑level policies, page‑level storage and AZ‑aware worker selection, which together cut cross‑zone traffic, halve query latency, achieve up to 20× I/O speedup and a three‑fold overall performance boost.

Alluxio · Cache Optimization · Data Lake
23 min read

DataFunSummit
Jul 23, 2024 · Big Data

Multi-Cloud Unified Data Acceleration Layer at Xiaohongshu: Challenges, Alluxio Solution, and Performance Gains

This article presents Xiaohongshu's multi‑cloud unified data acceleration layer built with Alluxio, detailing the challenges of multi‑cloud architectures, the design goals, Alluxio's architecture and features, real‑world case studies in AI training and recommendation indexing, performance improvements, and future plans.

AI training · Alluxio · Big Data
22 min read

DataFunSummit
May 24, 2024 · Big Data

Ctrip's Experience with Alluxio in Its Big Data Platform: Architecture, Transparent Access, Custom Authentication, CallerContext, and Dynamic Configuration

This article details how Ctrip, a leading travel company, leverages Alluxio as a distributed cache within its extensive big‑data infrastructure to improve data access speed, implement transparent storage access, support custom authentication and multi‑tenant features, enhance audit logging with CallerContext, and dynamically distribute client configurations via Kyuubi.

Alluxio · Big Data · CallerContext
14 min read

DataFunTalk
May 21, 2024 · Big Data

Applying Alluxio to Autonomous Driving Model Training: Deployment, Performance, and Operational Insights

This article details how Alluxio was adopted to replace NAS in autonomous driving model training, describing the data closed‑loop workflow, the challenges of the previous system, Alluxio's architectural benefits, deployment strategies across single and multiple data centers, functional and performance testing, operational tuning, and the resulting cost and efficiency gains.

Alluxio · Model Training · autonomous driving
15 min read

DataFunTalk
May 14, 2024 · Cloud Computing

Hybrid Cloud Architecture and AI Storage Evolution at Zhihu: From UnionStore to Alluxio

This article describes Zhihu's hybrid cloud architecture—including offline, online, and GPU data centers—its self‑built UnionStore cache, the performance and latency challenges faced during large‑scale AI model training, and the subsequent evaluation and migration to Alluxio community and enterprise editions to achieve higher throughput, stability, and lower operational overhead.

AI storage · Alluxio · Big Data
14 min read

DataFunSummit
May 5, 2024 · Big Data

Alluxio in Lakehouse Architecture: Benefits, Challenges, and Real‑World Use Cases

This article explains how Alluxio enables a unified lakehouse architecture by decoupling compute and storage, outlines its core capabilities, evaluates the cost‑saving and performance benefits, discusses the technical challenges, and presents several practical deployment scenarios in finance and AI workloads.

Alluxio · Big Data · Data Orchestration
15 min read

DataFunTalk
Feb 18, 2024 · Cloud Computing

Research on the Unified Storage Platform for the Supercomputing Internet

This article presents a comprehensive overview of the challenges, key technologies, and future applications of a unified storage platform built on Alluxio for China's national supercomputing internet, detailing its architecture, data flow strategies, deployment status, and industry use cases across multiple sectors.

Alluxio · Data Flow · High‑performance computing
13 min read

DataFunTalk
Feb 9, 2024 · Big Data

Alluxio’s Role in Lakehouse Architecture: Benefits, Challenges, and Real‑World Use Cases

This article explains how Alluxio supports lakehouse architectures by providing a data orchestration layer that caches data near compute, reduces the cost of separating storage from compute, improves performance, and addresses challenges such as security, scalability, and multi‑cloud deployment, illustrated with several industry case studies.

AI · Alluxio · Big Data
16 min read

DataFunTalk
Feb 3, 2024 · Big Data

Alluxio: Introduction, Architecture, and Practical Experience for Big Data Construction

This article introduces Alluxio as an open‑source data orchestration layer, explains its architecture and core features such as unified namespace, caching strategies, and cloud‑native deployment, and shares practical experiences on using Alluxio to simplify data lakehouse construction, migration, and hot‑cold data separation in complex big‑data environments.

Alluxio · Big Data · Data Lakehouse
13 min read

DataFunTalk
Jan 14, 2024 · Big Data

Optimizing Object Storage and Impala Engine in NetEase NDH: Performance Enhancements and Feature Additions

This presentation outlines NetEase's NDH big‑data platform, detailing its background, object‑storage upload and rename optimizations, Impala engine adaptations—including file‑handle caching, transparent URI handling, and getFileBlockLocations improvements—and a suite of operational enhancements such as dynamic proxy user configuration and audit‑log extensions.

Alluxio · Big Data · Impala
14 min read

DataFunTalk
Nov 26, 2023 · Big Data

Data Orchestration in Hybrid Storage Architectures with Alluxio

This article explains how Alluxio, an open‑source data orchestration system, improves data access efficiency in hybrid multi‑cloud and multi‑storage environments by providing caching, a unified namespace, interface translation, automated data management, and federation capabilities for modern big‑data workloads.

Alluxio · Data Orchestration · Hybrid storage
18 min read

DataFunSummit
Nov 18, 2023 · Artificial Intelligence

PyTorch Model Training Performance Tuning Guide with Alluxio

This guide explains how Ant Group uses Alluxio to overcome storage I/O, capacity, and latency challenges, delivering stability, performance, and scalability improvements for large‑scale PyTorch model training while reducing infrastructure costs and providing practical optimization techniques and code examples.

AI · Alluxio · PyTorch
4 min read

Programmer DD
Sep 15, 2023 · Big Data

How Alluxio Manages Massive Metadata: Inode, Block, MountTable, and Worker Insights

This article examines Alluxio's open-source distributed file system, detailing the core types of metadata—inode, block, mount table, and worker—along with the mechanisms for their storage, management, and optimization in both HEAP and ROCKS modes, and provides practical configuration guidance for scaling large-scale data environments.

Alluxio · Big Data · Distributed File System
15 min read

DataFunTalk
Sep 9, 2023 · Big Data

Presto + Tencent DOP (Alluxio) Architecture and Optimization Practices for Financial OLAP

This article presents the practical implementation of Presto combined with Tencent DOP (Alluxio) in a financial OLAP scenario, detailing background and architectural evolution, the Presto‑Alluxio design, optimization techniques for caching, storage scalability, ORC handling, and performance results, followed by conclusions and future directions.

Alluxio · Big Data · OLAP
15 min read

DataFunTalk
Jun 25, 2023 · Big Data

Multi‑Cloud Cache Evolution at Zhihu: From Multi‑HDFS to UnionStore to Alluxio

This technical presentation details Zhihu's journey in multi‑cloud caching, covering the motivations for a multi‑cloud architecture, the design and limitations of the self‑built UnionStore component, and the adoption of Alluxio to achieve significant performance, stability, and cost improvements across model serving and training workloads.

Alluxio · Big Data · caching
24 min read

DataFunTalk
Jun 16, 2023 · Cloud Native

Kubernetes Operator Deployment Challenges and Alluxio Operator Case Study

This article reviews the challenges of deploying applications on Kubernetes, introduces the operator concept as a mainstream solution, explains how to design and implement custom operators for services, and demonstrates these ideas with a detailed Alluxio Operator case study, including maturity levels and future enhancements.

Alluxio · Cloud Native · Deployment
17 min read

DataFunTalk
May 25, 2023 · Artificial Intelligence

Optimizing Distributed Cache for Large-Scale Deep Learning Training with Alluxio and SiloD

This article examines the storage bottlenecks in large‑scale AI training, evaluates local‑disk and Alluxio‑based distributed caching strategies, proposes uniform cache eviction and replica‑aware global policies, and introduces the SiloD framework for coordinated compute‑storage scheduling to dramatically improve GPU utilization and overall cluster throughput.

AI training · Alluxio · Cache Eviction
16 min read

DataFunTalk
Feb 17, 2023 · Big Data

Tencent Alluxio (DOP) Deployment and Optimization in Financial Data Analytics

This article describes how Tencent's Alluxio-based Data Orchestration Platform (DOP) was applied to financial analytics, detailing the business background, challenges of large‑scale OLAP workloads, the Alluxio architecture and usage modes, performance results, and the series of optimizations and tuning performed to achieve significant speedups.

Alluxio · Big Data · Data Orchestration
15 min read

DataFunTalk
Feb 15, 2023 · Big Data

Alluxio Deployment at Ant Group: Stability Building, Performance Optimization, and Scale‑up for Large‑Scale Model Training

This article summarizes how Ant Group introduced Alluxio to address storage I/O, capacity, and latency challenges in large‑scale model training, detailing stability improvements through worker‑register follower and master migration, performance gains via follower‑only reads, and horizontal scaling using metadata sharding and multi‑cluster deployment.

Alluxio · Big Data · Model Training
15 min read

DataFunTalk
Feb 12, 2023 · Big Data

Optimizing Bilibili Presto Cluster Query Performance with Alluxio and Local Cache

This article presents a comprehensive technical overview of Bilibili's Presto cluster architecture, the challenges of query performance on Hadoop, and the systematic optimizations—including Alluxio integration, local cache mechanisms, multi‑active coordinators, label‑based scheduling, and real‑time penalties—that together improve availability, stability, and latency for large‑scale analytics workloads.

Alluxio · Big Data · Cache
23 min read

DataFunTalk
Jan 19, 2023 · Big Data

Tencent Alluxio: Accelerating the Next Generation of Big Data and AI

This article presents a comprehensive overview of Tencent's Alluxio project, covering the evolution of big‑data architecture, recent Alluxio research progress, typical deployment cases, and future work, while highlighting performance improvements, integration with cloud and AI workloads, and community contributions.

AI · Alluxio · Big Data
21 min read

DataFunTalk
Jan 18, 2023 · Big Data

Five Major Trends Shaping Big Data, AI, and Cloud Industries in 2023

The article forecasts five key trends for 2023—including cloud cost optimization, multi‑cloud freedom, rapid AI model adoption, expanding data‑sharing ecosystems, and the convergence of data warehouses and lakes—highlighting how they will reshape the big data, artificial intelligence, and cloud landscapes.

Alluxio · Kubernetes · data sharing
6 min read

DataFunSummit
Jan 1, 2023 · Big Data

Shopee Data Infra Presentation: Storage Status, Acceleration, Serviceization, and Future Plans

The Shopee Data Infra talk details the current storage architecture, Presto‑based acceleration with Alluxio caching, service‑oriented storage solutions using Alluxio Fuse and S3 APIs, and outlines future enhancements for Spark/Hive integration and CSI/Fuse optimizations, providing a comprehensive view of large‑scale big data storage engineering.

Alluxio · Cache Manager · Data Infrastructure
16 min read

DataFunTalk
Nov 19, 2022 · Big Data

Improving Bilibili Offline Cluster Performance with Presto and Alluxio

This technical presentation explains how Bilibili reduced database pressure and query latency in its production environment by integrating Presto with Alluxio, detailing the offline cluster architecture, challenges of compute‑storage separation, caching strategies, consistency mechanisms, performance gains, and future work.

Alluxio · Cache · Presto
17 min read

DataFunTalk
Sep 4, 2022 · Big Data

Alluxio 2.8 New Features Overview

This article summarizes the Alluxio 2.8 release, detailing enhancements in API support, enterprise‑grade security features, and data‑movement capabilities, while also covering new encryption options, master‑proxy S3 token handling, OPA integration, and various performance and observability optimizations.

API · Alluxio · Data Orchestration
9 min read

DataFunTalk
Aug 31, 2022 · Big Data

Alluxio Data Orchestration and Cache Acceleration in China Unicom: Use Cases and Performance Gains

This article presents Zhang Ce's detailed overview of Alluxio's deployment at China Unicom, covering cache acceleration, compute‑storage separation, mixed‑load workloads, and lightweight analysis, and demonstrates how these strategies dramatically improve performance, scalability, and cost efficiency for big data processing.

Alluxio · Cache Acceleration · Data Orchestration
19 min read

DataFunSummit
Aug 21, 2022 · Big Data

Alluxio Stress Testing Methods and Practices

This article explains the purpose of stress testing Alluxio, the sources and symptoms of load on the system, and its built‑in stress testing framework, outlines how to run and configure the stress tools, and provides guidance on result calculation, reporting, common issues, and debugging for effective performance evaluation.

Alluxio · Big Data · Performance Evaluation
11 min read

DataFunTalk
Aug 20, 2022 · Artificial Intelligence

Atlas Supercomputing Platform: Architecture, Alluxio‑Fluid Integration, and Performance Improvements for AI Workloads

The article presents Unisound's Atlas supercomputing platform, detailing its AI‑focused architecture, early storage and bandwidth challenges, the integration of Alluxio and Fluid for distributed caching, various business adaptations, and experimental results showing significant performance gains across speech denoising, image classification, large‑file processing, and speech recognition workloads.

AI · Alluxio · Fluid
16 min read

DataFunTalk
Aug 8, 2022 · Artificial Intelligence

Accelerating Cloud Deep Learning Training with Alluxio: Overview, Usage Levels, and POSIX API Development

This article explains how Alluxio, an open‑source data abstraction layer, can accelerate cloud‑based deep‑learning training by providing POSIX‑compatible caching, simplifying data source integration, and offering three usage levels—from basic read‑through caching to full data‑as‑a‑service abstraction—backed by real‑world case studies and performance results.

AI · Alluxio · Cloud Training
10 min read

DataFunTalk
Aug 1, 2022 · Big Data

Bilibili Lakehouse Integration: Iceberg and Alluxio Optimization Practices

This article details Bilibili's lakehouse implementation using Apache Iceberg and Alluxio, covering background challenges, architectural components, data organization techniques like Z‑order and bitmap indexes, performance benchmarks, and future optimization plans for large‑scale analytics.

Alluxio · Bitmap Index · Iceberg
21 min read

DataFunTalk
Jun 25, 2022 · Big Data

Alluxio Metadata and Data Synchronization: Design, Implementation, and Optimization

This article provides a comprehensive overview of Alluxio's metadata and data synchronization mechanisms, covering its unified namespace, mounting strategies, consistency models, various write modes, read workflows, metadata sync techniques, performance optimizations, and recommended configurations for different deployment scenarios.

Alluxio · Data Consistency · metadata synchronization
26 min read

DataFunTalk
Jan 19, 2022 · Artificial Intelligence

Alluxio for AI and Machine Learning: Architecture, Optimizations, and Performance Evaluation

This article presents a comprehensive technical overview of Alluxio, covering its role as a distributed data orchestration layer for AI workloads, core features such as caching and unified namespace, performance challenges in large‑scale machine‑learning pipelines, and the extensive optimizations and testing performed at Tencent to achieve high throughput and scalability.

AI · Alluxio · CephFS
23 min read

Big Data Technology Architecture
Jul 20, 2021 · Big Data

PB‑Level Ad‑hoc Query Practice with Flink: Threat Hunting Platform Architecture and IO‑Reducing Optimizations

This article details 360's Threat Hunting platform built on Flink, covering its evolution, architecture, block‑index design, Hilbert‑curve data ordering, like‑pushdown, join optimizations, Alluxio caching, and future plans for BI and multi‑user concurrency, all aimed at efficient PB‑scale data querying.

Alluxio · Block Index · Flink
18 min read

Alibaba Cloud Native
Apr 2, 2021 · Cloud Native

How Fluid Turns Kubernetes into a High‑Performance Data Logistics System

This article explains how the open‑source Fluid project addresses the inefficiencies of data‑intensive AI and big‑data workloads in cloud‑native Kubernetes environments by introducing a data‑centric abstraction, dual orchestration mechanisms, and seamless integration with Alluxio to achieve faster, secure, and scalable data access.

Alluxio · Big Data · Cloud Native
19 min read

Alibaba Cloud Native
Mar 5, 2021 · Artificial Intelligence

How Alluxio Supercharges Cloud Deep Learning: Benchmarks, Architecture, and Tuning

This article examines why accelerating cloud‑based deep learning is essential, presents benchmark results comparing GPU generations and distributed training, introduces Alluxio as a distributed memory‑level cache, details its architecture on Kubernetes, and offers concrete tuning strategies to overcome I/O bottlenecks and boost training performance.

AI · Alluxio · Deep Learning
16 min read

Tencent Cloud Developer
Dec 30, 2020 · Big Data

How Alluxio Boosts Tencent Cloud EMR: Cutting Bandwidth by 50% and Accelerating IO‑Intensive Workloads

This article analyzes the challenges of traditional monolithic big‑data architectures, explains how Tencent Cloud EMR integrates Alluxio for compute‑storage separation, presents detailed performance benchmarks showing 20‑50% bandwidth reduction and 5‑40% query speedup, and outlines the specific tuning measures applied.

Alluxio · Big Data · Compute-Storage Separation
10 min read

Alibaba Cloud Native
Nov 16, 2020 · Cloud Native

What’s New in Fluid 0.4? DataLoad, Small‑File Boost, HDFS Support & Multi‑Dataset Deployment

Fluid 0.4 introduces a DataLoad custom resource for declarative data pre‑warming, enhances support for massive small‑file datasets, adds HDFS‑compatible access for Spark and other big‑data frameworks, and enables mixed‑deployment of multiple datasets on a single node, all backed by significant performance gains.

AI · Alluxio · Big Data
8 min read

Big Data Technology Architecture
Aug 15, 2020 · Big Data

Alluxio: Open‑Source Data Orchestration Platform – Overview, Benefits, Innovations, and Getting‑Started Resources

Alluxio is an open‑source, memory‑centric data orchestration layer that bridges compute frameworks such as Spark, Presto, and TensorFlow with diverse storage systems, offering high‑speed I/O, unified namespace, multi‑level caching, and easy deployment, while providing extensive documentation, download links, and community resources for rapid adoption.

Alluxio · Analytics · Data Orchestration
7 min read

Big Data Technology & Architecture
Jul 11, 2020 · Big Data

Alluxio Tiered Metadata Management and Asynchronous Cache Eviction Implementation

The article explains Alluxio's tiered metadata management architecture, describing how the system separates hot and cold metadata into cached and persisted layers, and details the custom asynchronous eviction thread and cache implementation that replace Guava cache for efficient large‑scale metadata handling.

Alluxio · Cache · distributed storage
15 min read

Alibaba Cloud Native
May 12, 2020 · Artificial Intelligence

Boosting Cloud‑Native AI Training with Alluxio: Performance Tuning on Kubernetes

This article examines the challenges of large‑scale deep‑learning model training on Kubernetes, analyzes performance bottlenecks caused by Alluxio‑FUSE integration, and presents a series of configuration and system‑level optimizations that dramatically improve data‑access speed and overall training throughput.

AI training · Alluxio · Cloud Native
22 min read

Architects' Tech Alliance
Jul 28, 2019 · Big Data

Alluxio: A Virtual Distributed File System for Unified Big Data Access and Cost‑Effective Storage

The article explains how Alluxio, a memory‑speed virtual distributed file system, acts as a virtual data lake to unify access to structured and unstructured big‑data across heterogeneous storage systems, offering on‑demand fast local access, intelligent caching, reduced storage costs, and enterprise‑grade security and fault tolerance.

Alluxio · Big Data · Data Lake
15 min read

NetEase Game Operations Platform
Dec 5, 2018 · Big Data

Presto + Alluxio Architecture for Interactive Ad‑hoc Queries in NetEase Game Data Warehouse

This article describes how NetEase Games built a Presto‑based interactive ad‑hoc query platform backed by Alluxio caching to achieve sub‑10‑second query latency, outlines the architectural design, performance comparisons with other Hadoop‑based solutions, encountered issues, and future improvement plans.

Alluxio · Big Data · Presto
10 min read

Architects' Tech Alliance
Nov 5, 2018 · Big Data

Alluxio as a Virtual Distributed File System for Data Lake Solutions

The article explains how Alluxio provides a virtual distributed file system that acts as a "virtual data lake," enabling unified, high‑performance access to structured and unstructured data across heterogeneous storage back‑ends while reducing storage costs through intelligent caching and eliminating the need for permanent data copies.

Alluxio · Big Data · Data Lake
16 min read