How Fluid + JuiceFSRuntime Powers Scalable Cloud‑Native Quantitative Research

This article explains how Metabit Trading built a cloud‑native quantitative research platform using Fluid and JuiceFSRuntime to achieve elastic compute, high‑throughput data caching, and cost‑effective scaling for AI‑driven trading strategies.

Alibaba Cloud Native

Feb 17, 2023

Background

Advances in machine learning, cloud computing, and cloud‑native technologies enable quantitative finance teams to ingest both structured market data and low‑signal unstructured data (research reports, news, social media) for AI‑driven strategy research.

Challenges for Machine‑Learning‑Based Quant Research

Traditional quant pipelines handle only price, volume, and return series. Adding unstructured data introduces noise and makes workloads bursty and highly concurrent, which strains limited compute resources and calls for elastic data caching and fine-grained access control.

Platform Requirements

Elastic handling of sudden high‑volume tasks.

Elastic data‑cache throughput for hot market data (hundreds of Gbps).

Linear scalability of capacity and throughput.

Data‑affinity scheduling to reuse local caches.

IP protection with isolated data access.

Intermediate‑result caching for feature pipelines.

Support for multiple file systems (OSS, CPFS, NAS, JuiceFS).

Solution Overview

Metabit adopted Fluid (a CNCF sandbox project) together with JuiceFSRuntime. Fluid abstracts data usage as a Dataset rather than a generic PersistentVolumeClaim, which allows features tuned per access pattern (read-only, read-write, small-file) as well as lifecycle management. JuiceFSRuntime provides a distributed, POSIX-compatible cache that integrates with Fluid's autoscaling, portability, observability, and affinity scheduling.

Architecture

Fluid creates a Dataset that describes the data access pattern. JuiceFSRuntime implements the caching layer, exposing a POSIX interface while using JuiceFS as the cloud‑storage backend. The whole stack runs on Kubernetes, leveraging native scheduling and resource management.
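
As a sketch of how a workload would consume such a Dataset: once a Dataset is bound to its runtime, Fluid exposes it as a PersistentVolumeClaim with the same name in the Dataset's namespace, so a research job can mount it like any other volume. The Pod name, image, command, and mount path below are illustrative assumptions rather than details of Metabit's setup; only the claim name follows the Dataset name used in this article.

```yaml
# Hypothetical research Pod; name, image, command, and paths are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: feature-job-example
spec:
  restartPolicy: Never
  containers:
    - name: worker
      image: python:3.11-slim
      # List the cached data through the POSIX mount.
      command: ["ls", "-l", "/data"]
      volumeMounts:
        - name: research-data
          mountPath: /data
  volumes:
    - name: research-data
      persistentVolumeClaim:
        # Fluid creates a PVC named after the Dataset once it is bound.
        claimName: metabit-juice-research
```

Because the application only sees a volume, the same Pod spec works whether the data is served from warm cache workers or fetched from the backing storage.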

Key Features of the Dataset Abstraction

Performance tuning per access pattern: read‑only for model training, read‑write for feature generation.

Data isolation: Kubernetes namespaces map each Dataset to a distinct JuiceFS sub‑directory.

Cache sharing: Public datasets are cached once and reused across teams.
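
A minimal sketch of the isolation pattern, assuming a hypothetical team namespace `team-alpha` and sub‑directory `/team-alpha`. In Fluid's JuiceFS integration, the path after `juicefs://` in `mountPoint` selects a sub‑directory of the JuiceFS file system, and `accessModes` can mark a Dataset as read‑only. The namespace, Dataset name, directory, and replica count are illustrative; the mount name is assumed to be the JuiceFS file system name, as in the article's own manifest, and the Secret is assumed to exist in the team namespace.

```yaml
apiVersion: data.fluid.io/v1alpha1
kind: Dataset
metadata:
  name: team-alpha-features        # hypothetical Dataset name
  namespace: team-alpha            # hypothetical team namespace
spec:
  accessModes:
    - ReadOnlyMany                 # read-only pattern, e.g. model training
  mounts:
    - name: metabit-juice-research # assumed JuiceFS file system name
      mountPoint: juicefs:///team-alpha  # this team's sub-directory only
      encryptOptions:
        # access-key and secret-key entries would follow the same shape.
        - name: token
          valueFrom:
            secretKeyRef:
              name: juicefs-secret
              key: token
---
apiVersion: data.fluid.io/v1alpha1
kind: JuiceFSRuntime
metadata:
  name: team-alpha-features        # must match the Dataset name to bind
  namespace: team-alpha
spec:
  replicas: 1                      # illustrative cache worker count
```

Pods in other namespaces cannot reference this Dataset's volume claim, so each team only sees its own sub‑directory.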

Runtime Configuration and Elastic Scaling

Resources for JuiceFSRuntime (CPU, memory, network, worker count) are tuned per Dataset. The runtime supports manual, automatic, and scheduled scaling policies. The following Dataset and JuiceFSRuntime manifests illustrate such a configuration:

apiVersion: data.fluid.io/v1alpha1
kind: Dataset
metadata:
  name: metabit-juice-research
spec:
  mounts:
    - name: metabit-juice-research
      mountPoint: juicefs:///
      options:
        metacache: ""
        cache-group: "research-groups"
      encryptOptions:
        - name: token
          valueFrom:
            secretKeyRef:
              name: juicefs-secret
              key: token
        - name: access-key
          valueFrom:
            secretKeyRef:
              name: juicefs-secret
              key: access-key
        - name: secret-key
          valueFrom:
            secretKeyRef:
              name: juicefs-secret
              key: secret-key
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: node.kubernetes.io/instance-type
              operator: In
              values:
                - ecs.g7.8xlarge
                - ecs.g7.16xlarge
  tolerations:
    - key: jfs_transmittion
      operator: Exists
      effect: NoSchedule
---
apiVersion: data.fluid.io/v1alpha1
kind: JuiceFSRuntime
metadata:
  name: metabit-juice-research
spec:
  replicas: 5
  tieredstore:
    levels:
      - mediumtype: MEM
        path: /dev/shm
        quota: 40960
        low: "0.1"
  worker:
    nodeSelector:
      nodeType: cacheNode
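    options:
      # Option values are strings, so numeric values are quoted.
      cache-size: "409600"      # JuiceFS local cache size, in MiB
      free-space-ratio: "0.15"  # minimum free-space ratio kept on the cache medium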

Scheduled scaling is expressed with a CronHorizontalPodAutoscaler that targets the JuiceFSRuntime:

apiVersion: autoscaling.alibabacloud.com/v1beta1
kind: CronHorizontalPodAutoscaler
metadata:
  name: research-weekly
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: data.fluid.io/v1alpha1
    kind: JuiceFSRuntime
    name: metabit-juice-research
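  jobs:
    # Schedules use six cron fields:
    # second minute hour day-of-month month day-of-week (assuming 0 = Sunday).
    - name: "scale-down"
      schedule: "0 0 7 ? * 1"     # 07:00 on Monday
      targetSize: 10
    - name: "scale-up"
      schedule: "0 0 18 ? * 5-6"  # 18:00 on Friday and Saturday
      targetSize: 20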

Performance Evaluation

Using 20 ecs.g7.8xlarge nodes (25 Gbps each, about 500 Gbps in nominal aggregate) as cache workers, the team measured latency under varying pod concurrency. With few pods, Fluid showed little benefit; with 100 concurrent pods, Fluid reduced average latency by more than 40% compared with a traditional distributed storage setup, yielding faster task completion and lower ECS cost.

Conclusion

Production use of Fluid + JuiceFSRuntime demonstrates that cloud‑native elastic data caching satisfies the high‑throughput, elastic, and data‑affinity requirements of AI‑driven quantitative research. The approach delivers higher performance, cost savings, and a flexible, observable platform that scales with workload demand.

References

Fluid project repository: https://github.com/fluid-cloudnative/fluid
