Accelerate AI and Big Data Workloads on Kubernetes with Fluid’s JindoRuntime
Fluid is an open-source, Kubernetes-native engine that orchestrates and accelerates distributed datasets for AI and big-data workloads. This guide explains its core concepts, the JindoRuntime implementation, and its performance benefits, then walks through deploying and testing JindoRuntime on a Kubernetes cluster.
What is Fluid?
Fluid is an open‑source, Kubernetes‑native distributed dataset orchestration and acceleration engine for cloud‑native, data‑intensive applications such as big‑data and AI workloads. It abstracts the data layer so that data can flow between storage systems (e.g., HDFS, OSS, Ceph) and compute workloads on Kubernetes, handling caching, replication, eviction, and transformation transparently.
Core concepts: Dataset and Runtime
A Dataset is a logical collection of related data that can be consumed by engines such as Spark or TensorFlow. Managing a Dataset involves security, versioning, and acceleration concerns.
Fluid introduces a Runtime abstraction to provide these capabilities. Currently Fluid supports two runtimes: AlluxioRuntime and JindoRuntime. The runtime defines lifecycle interfaces for security, version control, and data acceleration.
Key benefits
Data‑affinity scheduling and distributed caching accelerate data access for compute.
Namespace‑based isolation provides secure multi‑tenant data management.
Cross‑storage data federation reduces data‑island effects.
JindoRuntime
JindoRuntime is built on JindoFS, a proprietary Alibaba Cloud storage‑optimization engine for OSS that is fully compatible with the Hadoop FileSystem API. JindoFS offers two modes:
Block mode: Stores file blocks on OSS and optionally caches them locally, using a local namespace service for metadata.
Cache mode: Keeps the original OSS directory structure while providing client‑side caching and metadata acceleration; no data migration is required.
In Fluid, JindoRuntime uses JindoFS's cache mode to access and cache remote OSS files. It can be deployed with a single Helm chart and supports STS credential‑free access, checksum verification, and client‑side encryption.
Advantages
Performance: Optimized OSS read/write paths and native‑layer enhancements deliver high throughput, especially for small files.
Rich distributed caching: Supports multi‑TB file caches and metadata caching, showing strong results in large‑scale AI training and data‑lake scenarios.
Security: STS token‑less access, Kubernetes secret integration, and checksum‑based data integrity.
Lightweight: Implemented in C++, adding minimal overhead to OSS access.
Performance snapshot
Using the ImageNet dataset on a Kubernetes cluster with the Arena benchmark, training ResNet‑50 with JindoRuntime (cache enabled) reduced training time by 76% compared with the open‑source OSSFS driver.
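A percentage reduction in training time can be restated as a speedup factor, which is often easier to compare across benchmarks. The snippet below is a small sketch of that arithmetic only; the function name reduction_to_speedup is hypothetical and it says nothing about how the benchmark itself was run.

```shell
# Convert a percentage reduction in run time into a speedup factor:
#   speedup = 1 / (1 - reduction / 100)
# reduction_to_speedup is a hypothetical helper name used for illustration.
reduction_to_speedup() {
  awk -v r="$1" 'BEGIN { printf "%.2f\n", 1 / (1 - r / 100) }'
}
```

Under this formula, reduction_to_speedup 76 prints 4.17, so a 76% reduction corresponds to training finishing a little over four times faster.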
Quick start: Deploying JindoRuntime
The following steps assume a functional Kubernetes cluster with access to an Alibaba Cloud OSS bucket.
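Since the steps below rely on kubectl and helm, it may be worth confirming that both are installed first. This is a minimal sketch: missing_tools is a hypothetical helper name, and it only checks that the commands are on PATH, not that the cluster or the OSS bucket is reachable.

```shell
# Print the name of every given command that cannot be found on PATH.
# missing_tools is a hypothetical helper, not part of Fluid or kubectl.
missing_tools() {
  local tool
  for tool in "$@"; do
    # command -v succeeds only when the shell can resolve the name.
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}
```

Running missing_tools kubectl helm prints nothing when both tools are available and lists whichever one is absent otherwise.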
Create a namespace for Fluid:
    kubectl create ns fluid-system
Download the Fluid release package (e.g., fluid-0.5.0.tgz).
Install Fluid with Helm, enabling JindoRuntime:
    helm install --set runtime.jindo.enabled=true fluid fluid-0.5.0.tgz
Verify the Fluid pods are running:
    $ kubectl get pod -n fluid-system
    NAME                                       READY   STATUS    RESTARTS   AGE
    csi-nodeplugin-fluid-2mfcr                 2/2     Running   0          108s
    csi-nodeplugin-fluid-l7lv6                 2/2     Running   0          108s
    dataset-controller-5465c4bbf-5ds5p         1/1     Running   0          108s
    jindoruntime-controller-654fb74447-cldsv   1/1     Running   0          108s
The number of csi-nodeplugin-fluid-xx pods should match the number of cluster nodes.
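The pod-versus-node comparison can be scripted rather than counted by eye. The sketch below assumes the column layout shown in the output above (pod name first, status third); count_running_csi is a hypothetical helper name, and the live-cluster commands are left as comments so the function can be tried on saved output.

```shell
# Count csi-nodeplugin-fluid pods that are in the Running state.
# Reads `kubectl get pod` output from stdin; count_running_csi is a
# hypothetical helper name, not a Fluid or kubectl command.
count_running_csi() {
  awk '$1 ~ /^csi-nodeplugin-fluid-/ && $3 == "Running" { n++ } END { print n + 0 }'
}

# Example use against a live cluster (commented out so the block can be sourced safely):
#   pods=$(kubectl get pod -n fluid-system --no-headers | count_running_csi)
#   nodes=$(kubectl get nodes --no-headers | wc -l)
#   [ "$pods" -eq "$nodes" ] && echo "CSI node plugin is running on every node"
```

Fed the sample output above, the function reports 2, which would be expected to match a two-node cluster.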
Create a Kubernetes Secret to store OSS credentials (replace xxx with your actual keys):
    apiVersion: v1
    kind: Secret
    metadata:
      name: mysecret
    stringData:
      fs.oss.accessKeyId: xxx
      fs.oss.accessKeySecret: xxx
Apply the secret:
    kubectl create -f mysecret.yaml
Define a Dataset resource and a corresponding JindoRuntime resource (replace the placeholders with your OSS bucket information):
    apiVersion: data.fluid.io/v1alpha1
    kind: Dataset
    metadata:
      name: hadoop
    spec:
      mounts:
        - mountPoint: oss://oss_bucket/bucket_dir
          options:
            fs.oss.endpoint: oss_endpoint
          name: hadoop
          encryptOptions:
            - name: fs.oss.accessKeyId
              valueFrom:
                secretKeyRef:
                  name: mysecret
                  key: fs.oss.accessKeyId
            - name: fs.oss.accessKeySecret
              valueFrom:
                secretKeyRef:
                  name: mysecret
                  key: fs.oss.accessKeySecret
    ---
    apiVersion: data.fluid.io/v1alpha1
    kind: JindoRuntime
    metadata:
      name: hadoop
    spec:
      replicas: 2
      tieredstore:
        levels:
          - mediumtype: HDD
            path: /mnt/disk1
            quota: 100Gi
            high: "0.99"
            low: "0.8"
Apply the resources:
    kubectl create -f resource.yaml
Check the Dataset status to confirm that it is bound and that cache capacity has been allocated (nothing is cached yet):
    $ kubectl get dataset hadoop
    NAME     UFS TOTAL SIZE   CACHED   CACHE CAPACITY   CACHED PERCENTAGE   PHASE   AGE
    hadoop   210MiB           0B       180GiB           0.0%                Bound   1h
Create a simple application pod that mounts the Dataset to observe acceleration:
    apiVersion: v1
    kind: Pod
    metadata:
      name: demo-app
    spec:
      containers:
        - name: demo
          image: nginx
          volumeMounts:
            - mountPath: /data
              name: hadoop
      volumes:
        - name: hadoop
          persistentVolumeClaim:
            claimName: hadoop
Deploy the pod:
    kubectl create -f app.yaml
Inside the pod, copy a 210 MiB file and measure time:
    $ kubectl exec -it demo-app -- bash
    $ time cp /data/hadoop/spark-3.0.1-bin-hadoop2.7.tgz /dev/null
    real 0m18.386s
After the first run, the file is cached locally. Re-run the copy after recreating the pod:
    $ time cp /data/hadoop/spark-3.0.1-bin-hadoop2.7.tgz /dev/null
    real 0m0.048s
The second copy is roughly 380× faster (18.386 s versus 0.048 s), demonstrating JindoRuntime's caching effect.
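The speedup can be computed directly from the two "real" values that time prints. The sketch below assumes the XmY.YYYs format shown above; real_to_seconds and speedup are hypothetical helper names introduced for illustration.

```shell
# Convert a "real" value as printed by bash's time (e.g. 0m18.386s) to seconds.
# real_to_seconds and speedup are hypothetical helper names.
real_to_seconds() {
  # Split on the "m" and "s" markers: field 1 is minutes, field 2 is seconds.
  printf '%s\n' "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Ratio of the cold-read time to the warm-read time, rounded to a whole number.
speedup() {
  awk -v cold="$(real_to_seconds "$1")" -v warm="$(real_to_seconds "$2")" \
    'BEGIN { printf "%.0f\n", cold / warm }'
}
```

For the two timings above, speedup 0m18.386s 0m0.048s prints 383. A single pair of measurements is only indicative, so repeating the test a few times gives a more reliable figure.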
Clean up the environment:
    kubectl delete jindoruntime hadoop
    kubectl delete dataset hadoop
    kubectl delete -f app.yaml
    kubectl delete secret mysecret
Further resources
Fluid project GitHub: https://github.com/fluid-cloudnative/fluid
Alibaba Cloud Native
