Fluid 1.0 Release: Cloud‑Native Data Orchestration for AI and Big Data

Fluid 1.0 introduces a cloud‑native data orchestration platform that abstracts dataset management, affinity scheduling, custom data processing, and data flow pipelines for AI and big‑data workloads on Kubernetes, backed by extensive production testing, open‑source contributions, and a roadmap for future enhancements.

Alibaba Cloud Infrastructure

Jun 3, 2024

Cloud‑native environments offer efficient resource use, easy deployment, and elastic compute, so more enterprises now run data‑intensive AI and big‑data applications in them. Because compute and storage are separated, however, these workloads face data‑access latency and bandwidth challenges.

Kubernetes provides only the traditional storage interface (CSI) and does not define how applications should efficiently use and manage data inside containers. The Fluid open‑source project fills this gap by introducing a cloud‑native elastic data abstraction, the Dataset, as a first‑class citizen.

Fluid was initiated by Nanjing University, Alibaba Cloud Container Service, and the Alluxio community. It offers Dataset CRUD operations, permission control, and access acceleration. After three years of development as a CNCF sandbox project, version 1.0 is now stable.

Key v1.0 features:

1. Multi‑level data‑affinity scheduling – users can schedule tasks based on dataset locality (node, rack, zone, region) without deep knowledge of cache placement, with configurable label‑based policies.
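The idea of ranking nodes by how close they are to cached data can be illustrated with a minimal sketch. This is not Fluid's scheduler code: the `Location` class, the label values, and the scoring rule are assumptions made for illustration only, with the four levels taken from the list above.

```python
from dataclasses import dataclass

# Locality levels from finest to coarsest, as named in the text.
LEVELS = ("node", "rack", "zone", "region")


@dataclass(frozen=True)
class Location:
    """Hypothetical topology labels attached to a Kubernetes node."""
    node: str
    rack: str
    zone: str
    region: str


def locality_level(candidate, cache):
    """Return the finest level shared with the cache location, or None.

    A level counts as shared only if every coarser level matches too,
    so rack "r1" in two different zones is not treated as the same rack.
    """
    for i, level in enumerate(LEVELS):
        if all(getattr(candidate, l) == getattr(cache, l) for l in LEVELS[i:]):
            return level
    return None


def rank_candidates(candidates, cache_locations):
    """Sort candidates so that those closest to any cached copy come first."""
    def score(candidate):
        best = len(LEVELS)  # worse than any real level
        for cache in cache_locations:
            level = locality_level(candidate, cache)
            if level is not None:
                best = min(best, LEVELS.index(level))
        return best
    return sorted(candidates, key=score)


cache = [Location("n1", "r1", "z1", "cn-east")]
candidates = [
    Location("n9", "r7", "z3", "cn-west"),  # shares nothing with the cache
    Location("n4", "r2", "z1", "cn-east"),  # same zone
    Location("n1", "r1", "z1", "cn-east"),  # same node
    Location("n2", "r1", "z1", "cn-east"),  # same rack
]
ranked = rank_candidates(candidates, cache)
print([c.node for c in ranked])  # ['n1', 'n2', 'n4', 'n9']
```

In a real cluster the scheduler would express such preferences through node labels and affinity rules rather than an explicit sort; the sketch only shows the ordering a user can expect without knowing where the cache lives.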

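2. Custom DataProcess and trigger strategies – a new DataProcess operation lets data scientists define custom processing logic, and data operations can be triggered by policy: Once, OnEvent, or Cron. The manifest below, for example, schedules a DataLoad with the Cron policy to run every 2 minutes.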

apiVersion: data.fluid.io/v1alpha1
kind: DataLoad
metadata:
  name: cron-dataload
spec:
  dataset:
    name: demo
    namespace: default
  policy: Cron
  schedule: "*/2 * * * *" # Run every 2 min
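To make the `*/2` step syntax in the schedule concrete, here is a small sketch that expands the minute field of a cron expression into the minutes on which it fires. It is a simplified illustration, not the parser Fluid or Kubernetes uses, and `expand_minute_field` is a hypothetical helper that handles only `*`, single values, ranges, steps, and comma lists.

```python
def expand_minute_field(field: str) -> list[int]:
    """Expand a cron minute field such as "*/2", "5", or "0-10/5"."""
    minutes = set()
    for part in field.split(","):
        rng, _, step = part.partition("/")
        step_n = int(step) if step else 1
        if rng == "*":
            lo, hi = 0, 59
        elif "-" in rng:
            lo_s, hi_s = rng.split("-")
            lo, hi = int(lo_s), int(hi_s)
        else:
            lo = hi = int(rng)
        minutes.update(range(lo, hi + 1, step_n))
    return sorted(minutes)


schedule = "*/2 * * * *"              # same expression as in the manifest
minute_field = schedule.split()[0]    # first of the five cron fields
fires_at = expand_minute_field(minute_field)
print(fires_at[:5], len(fires_at))    # [0, 2, 4, 6, 8] 30
```

With the remaining four fields set to `*`, the job fires on every even minute of every hour, which is 30 runs per hour.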

3. DataFlow – an automated pipeline that chains data operations (DataLoad, DataMigrate, DataBackup, DataProcess) and runs them in sequence.

Example workflow (Python SDK):

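# `dataset`, `constants`, `create_processor`, and `train` are not shown in this
# excerpt; they must be defined or imported before this snippet runs.
flow = dataset.migrate(path="/data/", migrate_direction=constants.DATA_MIGRATE_DIRECTION_FROM)\
    .load("/data/1.txt")\
    .process(processor=create_processor(train))\
    .migrate(path="/data/", migrate_direction=constants.DATA_MIGRATE_DIRECTION_TO)
# Submit the chained operations; they execute in the order they were added.
run = flow.run()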
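The chaining pattern used in the workflow can be sketched in a few lines. The `Flow` class below is a hypothetical stand-in, not the Fluid SDK: it only records steps and runs them in order, and the stop-on-first-failure behavior is an assumption about how a sequential pipeline would reasonably behave.

```python
class Flow:
    """Hypothetical stand-in for a chained pipeline; not the Fluid SDK."""

    def __init__(self):
        self._steps = []

    def _add(self, name, **params):
        self._steps.append((name, params))
        return self  # returning self is what makes method chaining possible

    def migrate(self, path, direction):
        return self._add("DataMigrate", path=path, direction=direction)

    def load(self, path):
        return self._add("DataLoad", path=path)

    def process(self, processor):
        return self._add("DataProcess", processor=processor)

    def run(self, execute):
        """Run steps in the order added; stop at the first step that fails."""
        completed = []
        for name, params in self._steps:
            if not execute(name, params):
                break
            completed.append(name)
        return completed


flow = (
    Flow()
    .migrate("/data/", direction="from")
    .load("/data/1.txt")
    .process(processor="train")
    .migrate("/data/", direction="to")
)
completed = flow.run(lambda name, params: True)  # pretend every step succeeds
print(completed)  # ['DataMigrate', 'DataLoad', 'DataProcess', 'DataMigrate']
```

Each builder method appends a step and returns the same object, so the pipeline is described declaratively first and nothing executes until `run` is called.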

4. Vineyard object cache engine support – integrates the distributed memory engine Vineyard for efficient intermediate data sharing on Kubernetes.

Fluid’s test suite covers unit, functional, compatibility, security, and real‑world scenario tests. Fluid runs on thousands of Kubernetes clusters, at scales of up to ten thousand nodes, and handles thousands of datasets and AI workloads daily.

Production benchmarks show the webhook handling 125 QPS with 90th‑percentile latency under 25 ms, and the controllers managing more than 500 datasets per minute.
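One way to read the 90 % latency figure is as a bound on the 90th percentile of request latencies. The sketch below computes a nearest‑rank percentile; the sample values are invented for illustration and are not measurements from Fluid.

```python
import math


def percentile_nearest_rank(samples, p):
    """Return the smallest sample such that at least p% of samples are <= it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based nearest rank
    return ordered[rank - 1]


# Hypothetical webhook latencies in milliseconds.
latencies_ms = [8, 9, 10, 11, 12, 13, 14, 15, 20, 40]
p90 = percentile_nearest_rank(latencies_ms, 90)
print(p90, p90 < 25)  # 20 True
```

Here one slow request at 40 ms does not break the bound, because nine of the ten samples finish within 20 ms.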

Additional improvements in v1.0 cover large‑scale Kubernetes stability, automatic FUSE mount recovery, and tightened RBAC permissions.

The roadmap focuses on large‑model inference optimizations, adaptive selection of the data‑access mode based on scheduler decisions, and a better developer experience when data sources change.

Thanks are extended to all contributors, community users, and adopters listed in the release notes.
