Tagged articles
4 articles
Alibaba Cloud Infrastructure
Jan 17, 2025 · Artificial Intelligence

Elastic Scaling of Large Language Model Inference on Alibaba Cloud ACK with Knative, ResourcePolicy, and Fluid

This article explains how to reduce inference cost and improve performance for large language models on Alibaba Cloud ACK by using Knative's request‑based autoscaling, custom ResourcePolicy priority scheduling, and Fluid data‑caching to achieve elastic scaling, resource pre‑emption, and faster model loading.

Fluid · Inference · Knative
22 min read
Alibaba Cloud Infrastructure
Dec 27, 2024 · Cloud Native

ElasticWorkload, WorkloadSpread, UnitedDeployment, and ResourcePolicy: Configurable Plugins for Serverless Elasticity in Alibaba Cloud Container Service

This article explains how Serverless elasticity is achieved in Alibaba Cloud Container Service by introducing four configurable plugins—ElasticWorkload, WorkloadSpread, UnitedDeployment, and ResourcePolicy—detailing their core capabilities, technical principles, advantages, real‑world use cases, and guidance for selecting the appropriate solution.

Cloud Native · ElasticWorkload · Kubernetes
30 min read
Alibaba Cloud Native
Jan 12, 2024 · Cloud Native

Unlock Second-Scale Elastic Scheduling with ACK Virtual Nodes

This article explains how to use Alibaba Cloud Container Service (ACK) virtual nodes and Elastic Container Instances (ECI) to achieve second‑scale elasticity, covering installation, ResourcePolicy configuration, zone‑aware scheduling, high‑availability setups, and performance results with concrete YAML examples.

ECI · Kubernetes · ResourcePolicy
12 min read