Alibaba Cloud Native
Alibaba Cloud Native
Oct 20, 2023 · Cloud Native

How Knative Cuts AI Service Costs by 60% and Halves Deployment Time

This article explains how Shuhe Tech combined Knative with AI workloads to achieve 60% resource cost savings and reduce model deployment cycles from one day to half a day, detailing Knative's architecture, request‑based autoscaling, multi‑version releases, and advanced scaling features.

AIKPAKnative
0 likes · 19 min read
How Knative Cuts AI Service Costs by 60% and Halves Deployment Time
Alibaba Cloud Native
Alibaba Cloud Native
Sep 3, 2023 · Cloud Native

Master Knative’s Request‑Based Autoscaling: KPA, Scale‑to‑Zero, and Advanced Strategies

This article explains how Knative implements request‑based autoscaling with KPA, details the scale‑to‑zero mechanism, shows how to handle burst traffic using stable and panic windows, and demonstrates advanced extensions such as resource pools, precise MPA scaling, and predictive AHPA configurations with concrete YAML examples.

KPAKnativeKubernetes
0 likes · 18 min read
Master Knative’s Request‑Based Autoscaling: KPA, Scale‑to‑Zero, and Advanced Strategies
Alibaba Cloud Native
Alibaba Cloud Native
Apr 5, 2021 · Cloud Native

How Knative Enables Traffic‑Based Autoscaling and Gray Deployments

This article explains Knative’s traffic‑driven autoscaling and gray‑release capabilities, detailing the request flow architecture, the roles of Service, Configuration, Route and Revision, and walks through built‑in scaling strategies such as KPA, HPA, scheduled‑HPA, event‑gateway and custom plugins, with practical examples.

Gray DeploymentHPAKPA
0 likes · 10 min read
How Knative Enables Traffic‑Based Autoscaling and Gray Deployments