Alluxio 2.8 New Features Overview

This article summarizes the Alluxio 2.8 release, detailing enhancements in API support, enterprise‑grade security features, and data‑movement capabilities, while also covering new encryption options, master‑proxy S3 token handling, OPA integration, and various performance and observability optimizations.

DataFunTalk

This content comes from Yang Yong's talk "Alluxio 2.8 - New Features", given at the online Alluxio Meetup on June 26.

Key highlights of Alluxio 2.8:

Continuous improvement of Alluxio's API support.

Addition of enterprise‑level security features to meet corporate security requirements.

Enhanced data‑movement capabilities with new troubleshooting tools and asynchronous data‑move options.

1. Overview of Alluxio 2.8 features

The talk covers six areas: a feature overview, API support improvements, new enterprise security features, data‑experience enhancements, other optimizations, and a reference architecture.

2. Improved API support

Enhanced S3 API with metadata tagging (up to 10 tags per object) and support for operations such as CopyObject, DeleteObjectTagging, PutObjectTagging, etc.

POSIX API optimizations for AI/ML workloads, targeting the non‑functional requirements of read‑only caching: performance, capacity, and stability.

Support for the libfuse3 API, beginning the transition from libfuse2 and laying the groundwork for future performance and scalability improvements.

Optimized mount/unmount mechanisms, available via the CLI or the site configuration (alluxio-site.properties), with better handling of abnormal or residual mounts.
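
The tagging limit mentioned in the S3 item of this list can be illustrated with a short sketch. It builds the XML body that the S3 PutObjectTagging operation expects and rejects tag sets with more than 10 entries. The helper name build_tagging_xml and the MAX_TAGS constant are invented for illustration; this is not Alluxio's own implementation, only a client-side sketch under that assumption.

```python
import xml.etree.ElementTree as ET

MAX_TAGS = 10  # per-object tag limit stated in the article


def build_tagging_xml(tags: dict) -> bytes:
    """Return a PutObjectTagging request body for the given key/value tags.

    Hypothetical helper: it only mirrors the 10-tag limit on the client side.
    """
    if len(tags) > MAX_TAGS:
        raise ValueError(f"at most {MAX_TAGS} tags per object, got {len(tags)}")
    root = ET.Element("Tagging")
    tag_set = ET.SubElement(root, "TagSet")
    for key, value in tags.items():
        tag = ET.SubElement(tag_set, "Tag")
        ET.SubElement(tag, "Key").text = key
        ET.SubElement(tag, "Value").text = value
    return ET.tostring(root, encoding="utf-8")
```

A caller would send the returned bytes as the body of a tagging request against the S3-compatible endpoint; an eleventh tag fails before any request is made.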

3. New enterprise‑grade security features

Server‑side data encryption that encrypts data on Alluxio Workers and decrypts on read, supporting directory‑level encryption zones, multiple keys, and nested zones.

Support for multiple key stores (HashiCorp Vault, JournalStore) and algorithms (AES/CBC, AES/GCM, AES/CTR, and AES/ECB, with or without padding).

Encryption is enabled in two steps: configure the encryption policy, then set the encryption zones via the command line.
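
To make directory‑level and nested zones concrete, here is a minimal sketch of how a path could be mapped to its encryption key. It assumes that the innermost zone containing a path takes precedence; the article does not state the resolution rule, so treat that as an assumption. The zone table, key names, and the function zone_key_for are all hypothetical.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical zone table: directory -> key name.
# Real zones are defined through the Alluxio command line, not in code.
ZONES = {
    "/finance": "key-a",
    "/finance/payroll": "key-b",  # nested zone with its own key
}


def zone_key_for(path: str, zones: dict = ZONES) -> Optional[str]:
    """Return the key of the innermost zone containing path, or None.

    Assumption: the innermost (longest) matching zone wins.
    """
    current = PurePosixPath(path)
    # Walk from the path itself up to the root; the first hit is the innermost zone.
    for candidate in [current, *current.parents]:
        key = zones.get(str(candidate))
        if key is not None:
            return key
    return None
```

Matching on whole path components rather than string prefixes keeps a directory such as /financeX from being mistaken for part of the /finance zone.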

4. Master‑proxy S3 token

All workers obtain S3 tokens through the Master, reducing duplicate token requests and enabling centralized token refresh.
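
The idea of a single component fetching and refreshing the token on behalf of many workers can be sketched as a small cache. This is an illustration of the pattern only, not Alluxio's code: the class TokenBroker, the fetch callback, and the refresh margin are assumptions made for the example.

```python
import threading
import time
from typing import Callable, Optional, Tuple


class TokenBroker:
    """Caches one token and refreshes it shortly before it expires.

    Hypothetical stand-in for the Master's role: workers call get_token()
    instead of requesting their own token from the token service.
    """

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],  # returns (token, expiry timestamp)
        refresh_margin: float = 60.0,            # seconds before expiry to refresh
        clock: Callable[[], float] = time.time,
    ):
        self._fetch = fetch
        self._margin = refresh_margin
        self._clock = clock
        self._lock = threading.Lock()
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get_token(self) -> str:
        # The lock ensures concurrent callers trigger at most one fetch.
        with self._lock:
            if self._token is None or self._clock() >= self._expires_at - self._margin:
                self._token, self._expires_at = self._fetch()
            return self._token
```

With this shape, the number of upstream token requests depends on the token lifetime rather than on the number of workers.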

5. OPA authorization integration

Alluxio can delegate mount authorization to Open Policy Agent (OPA) agents, typically deployed as sidecars. OPA is a general‑purpose policy engine that can also handle access checks for other systems such as Kubernetes, CI/CD, service mesh, and SSH.
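
A rough sketch of what a delegated check could look like from the caller's side: OPA exposes a REST Data API where a client POSTs an input document to /v1/data/<policy path> and reads the decision from the result field. The policy path alluxio/mount/allow and the input fields below are hypothetical, chosen only for illustration; the article does not describe the actual request Alluxio sends.

```python
import json
import urllib.request

OPA_URL = "http://localhost:8181"    # OPA's default listen address; sidecar assumed
POLICY_PATH = "alluxio/mount/allow"  # hypothetical policy path


def build_opa_request(user: str, ufs_uri: str, alluxio_path: str) -> urllib.request.Request:
    """Build a Data API query asking whether this mount is allowed."""
    # The input field names are assumptions for this example.
    payload = {"input": {"user": user, "ufs_uri": ufs_uri, "alluxio_path": alluxio_path}}
    return urllib.request.Request(
        f"{OPA_URL}/v1/data/{POLICY_PATH}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def is_allowed(response_body: bytes) -> bool:
    """Interpret an OPA response; a missing or non-true result is a denial."""
    # OPA omits "result" when the rule is undefined for the given input.
    return json.loads(response_body).get("result") is True
```

Sending the request with urllib.request.urlopen and passing the response body to is_allowed completes the check. Denying on a missing result keeps the caller fail‑closed when no policy matches.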

6. Data‑experience improvements

Asynchronous execution for distributedCp and distributedMv commands with CLI status querying.

Observability enhancements for Enterprise Edition (EE) scenarios, including policy execution status.
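
The asynchronous model described for distributedCp and distributedMv, where a command returns immediately and its status is queried later, follows a common submit‑and‑poll pattern. The sketch below shows that pattern in isolation. JobTracker, its methods, and the status strings are hypothetical and do not reflect the actual Alluxio CLI or job service.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict


class JobTracker:
    """Runs tasks in the background and reports their status by job id."""

    def __init__(self, max_workers: int = 4):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs: Dict[str, Future] = {}

    def submit(self, task: Callable[[], None]) -> str:
        """Start the task and return immediately with a job id."""
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = self._pool.submit(task)
        return job_id

    def status(self, job_id: str) -> str:
        """Return IN_PROGRESS, COMPLETED, or FAILED for a submitted job."""
        future = self._jobs[job_id]
        if not future.done():
            return "IN_PROGRESS"  # queued or still running
        return "FAILED" if future.exception() is not None else "COMPLETED"
```

The caller keeps only the job id, so a long copy or move no longer blocks the session that started it.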

7. Other optimizations

Metadata sync metrics, improved exception messages for Data I/O errors, distributed command optimizations, technical debt reduction, system stability and capacity improvements, enhanced error handling, and a Stressbench tool.

The article also provides reference architecture diagrams and links to the Community and Enterprise release notes for Alluxio 2.8.
