Tagged articles
6 articles
Huawei Cloud Developer Alliance
Apr 2, 2026 · Cloud Native

How Kthena Enables Production‑Grade LLM Inference on Kubernetes

This article analyzes the cloud‑native challenges of deploying large‑model inference on Kubernetes and presents Kthena’s architecture (ModelServing, Router, Autoscaler, and ModelBooster). It also covers Volcano integration, vLLM‑Ascend setup, and a real‑world Qwen3‑235B deployment case, highlighting performance gains and future directions.

Cloud Native · Kthena · Kubernetes
0 likes · 13 min read
dbaplus Community
Feb 9, 2026 · Artificial Intelligence

How EffectiveGPU Cuts GPU Costs with Fine‑Grained Partitioning and Volcano Scheduling

This article details how SF Tech's EffectiveGPU (EGPU) platform redesigns GPU resource management on Kubernetes, introducing fine‑grained memory and compute partitioning, priority‑based scheduling, Volcano integration, and monitoring pipelines to dramatically improve utilization and reduce hardware costs for AI workloads.

AI Platform · GPU · GPU partitioning
0 likes · 23 min read
360 Zhihui Cloud Developer
May 15, 2025 · Cloud Native

How 360’s AI Platform Boosted GPU Utilization with Volcano Scheduler

360’s AI platform migrated its GPU clusters to a cloud‑native architecture and adopted the Volcano scheduler, achieving over 45% GPU utilization, less than 7% fragmentation, and more than 1,000,000 scheduled Pods, while leveraging flexible plugins, hierarchical queues, and resource pooling to optimize AI and big‑data workloads.

AI Platform · GPU scheduling · Kubernetes
0 likes · 13 min read
Efficient Ops
May 22, 2022 · Cloud Native

How to Run Multiple Containers Sequentially in a Single Kubernetes Pod

This article explains how to execute several containers one after another within a single Kubernetes pod by leveraging initContainers and native Job mechanisms, compares alternative solutions such as Volcano and Argo, provides complete YAML examples, and discusses practical considerations like volume sharing, security contexts, and timeout settings.

Argo · Job · Kubernetes
0 likes · 9 min read
MaGe Linux Operations
Aug 25, 2020 · Cloud Native

How to Run Multiple Containers Sequentially in a Single Kubernetes Pod

This article explains why native Kubernetes Jobs run containers concurrently, then shows how to achieve true sequential execution within a single pod using initContainers. It compares three approaches (native Job, Volcano, and Argo), detailing configurations, code samples, and practical trade‑offs.

Argo · Job · Kubernetes
0 likes · 9 min read