Alibaba Cloud Infrastructure
Author


For uninterrupted computing services

353 articles · 0 likes · 936 views · 0 comments
Recent Articles

Latest from Alibaba Cloud Infrastructure

Alibaba Cloud Infrastructure
Mar 16, 2026 · Artificial Intelligence

Scaling Agentic Reinforcement Learning with a Decoupled T‑Architecture Using Verl and Argo Workflows

Agentic reinforcement learning is evolving from simple text generation toward complex, scalable agents, but large‑scale deployment runs into two problems: scheduling massive parallel rollouts and keeping environments reproducible. This article presents a decoupled T‑architecture that separates high‑level RL logic (Verl) from execution orchestration (Argo Workflows) to address both.

Agentic RL · Argo Workflows · Scalable Reinforcement Learning
0 likes · 10 min read
Alibaba Cloud Infrastructure
Mar 13, 2026 · Cloud Native

Boosting Autonomous Driving Data Pipelines with Koordinator’s ElasticQuota and GPU Sharing

This article details how a leading autonomous‑driving company tackled multi‑tenant resource contention, low GPU utilization, and deadlocks among distributed tasks on a heterogeneous Kubernetes cluster. By adopting Koordinator’s ElasticQuota, Reservation, Gang, and Device‑Share features, it achieved higher allocation rates, better fairness, and significantly improved GPU efficiency.

ElasticQuota · GPU Sharing · Koordinator
0 likes · 20 min read
Alibaba Cloud Infrastructure
Feb 12, 2026 · Cloud Native

How to Seamlessly Move AI Data Between OSS and CPFS with Kubernetes VolumePopulator

This article explains how Kubernetes VolumePopulator can automatically transfer AI training data from low‑cost OSS storage to high‑performance CPFS volumes, enabling on‑demand model loading, cost‑effective hot‑cold data management, and fully automated lifecycle handling in cloud‑native AI workloads.

AI training · CPFS · Cloud Native Storage
0 likes · 9 min read
Alibaba Cloud Infrastructure
Feb 9, 2026 · Cloud Native

Eliminate Data Bottlenecks in Large‑Scale Argo Workflows with VolumePopulator

By integrating Alibaba Cloud ACK’s Kubernetes VolumePopulator with Argo Workflows, this guide shows how to pre‑populate independent high‑performance volumes for each parallel task, eliminating I/O contention, ensuring data isolation, and enabling scalable, serverless‑accelerated pipelines for large‑scale data processing.

Alibaba Cloud ACK · Argo Workflows · Kubernetes
0 likes · 11 min read