Elastic Scaling in Serverless Cloud‑Native Applications

Elastic scaling is a cornerstone of Xianyu’s shift to a serverless cloud-native architecture. It uses Kubernetes autoscaling components (Cluster‑Autoscaler, HPA, VPA) to adjust resources dynamically, driven by reactive thresholds or predictive models. Open challenges include cold starts, the lack of scale‑to‑zero, and sizing the pod‑pool buffer, all of which motivate ongoing work on faster, smarter, safer scaling.

Xianyu Technology

Dec 17, 2020

Introduction: Xianyu's backend architecture is evolving toward cloud‑native/Serverless, offering automation, on‑demand loading, elastic scaling, strong isolation, and agile deployment, which reduces labor, risk, infrastructure costs, and delivery time. Elastic scaling is a key highlight of Serverless.

Basic concept: Elastic scaling addresses the mismatch between capacity planning and actual cluster load. When resources are insufficient, the cluster size or resource allocation is increased to maintain stability; when load is low, resources are released to avoid waste. Scaling can be horizontal (adding or removing instances, such as nodes or pods) or vertical (changing the resources allocated to each instance), with horizontal scaling preferred for most Serverless scenarios.

Scaling algorithms: Two main types exist—reactive algorithms based on resource‑threshold metrics (CPU, memory, etc.) and predictive algorithms that analyze historical performance data to forecast capacity needs. Most open‑source solutions rely on threshold‑based methods.
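
The difference between the two families can be sketched in a few lines. This is an illustration only, not the algorithm used by Xianyu or by any specific open-source project: the function names, the 0.8 / 0.3 thresholds, the moving-average forecast, and the 1.25 headroom factor are all assumptions chosen for the example.

```python
import math
from statistics import mean


def reactive_decision(cpu_utilization, scale_out_above=0.8, scale_in_below=0.3):
    """Reactive: compare the current reading with fixed thresholds.

    Returns +1 (scale out), -1 (scale in), or 0 (hold).
    Thresholds are hypothetical defaults for illustration.
    """
    if cpu_utilization > scale_out_above:
        return 1
    if cpu_utilization < scale_in_below:
        return -1
    return 0


def predictive_replicas(history_qps, per_replica_qps, window=3, headroom=1.25):
    """Predictive: forecast the next period from history, then size capacity.

    The forecast here is a plain moving average over the last `window`
    samples; a real system would likely use a richer model.
    """
    forecast = mean(history_qps[-window:])
    return max(1, math.ceil(forecast * headroom / per_replica_qps))


if __name__ == "__main__":
    print(reactive_decision(0.9))
    print(predictive_replicas([100, 200, 300, 400], per_replica_qps=100))
```

The reactive function can only respond after load has already crossed a threshold, while the predictive one can provision ahead of the load at the cost of depending on forecast quality.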

Kubernetes elastic components: Kubernetes provides standard autoscaling components, including Cluster‑Autoscaler (node scaling), Horizontal Pod Autoscaler (HPA) & Cluster‑Proportional‑Autoscaler (pod scaling), and Vertical Pod Autoscaler (VPA). HPA automatically adjusts pod replica counts based on CPU utilization or custom metrics via the Metrics API. The scaling formula is: desired replicas = ceil[current replicas × (current metric / target metric)].
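
The HPA formula above can be checked with a small sketch. The function name and arguments are hypothetical. The tolerance check is not part of the article; it reflects the Kubernetes documentation, which states that scaling is skipped when the metric ratio is within a configurable tolerance of 1.0 (0.1 by default). The real controller also handles missing metrics, pod readiness, and stabilization windows, which are omitted here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """desired = ceil(current_replicas * current_metric / target_metric).

    If the ratio is within `tolerance` of 1.0, keep the current count
    (assumed default of 0.1, per the Kubernetes documentation).
    """
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    # 4 replicas at 90% CPU against a 60% target
    print(desired_replicas(4, 90, 60))
    # 4 replicas at 30% CPU against a 60% target
    print(desired_replicas(4, 30, 60))
```

With 4 replicas averaging 90% CPU against a 60% target, the ratio is 1.5 and the result is 6 replicas; at 30% the ratio is 0.5 and the result is 2.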

Challenges in Kubernetes scaling: limited support for application‑level metrics, lack of Scale‑to‑Zero capability, difficulty in capacity planning, resource fragmentation across heterogeneous node types, and trade‑offs between high utilization and resource waste.

Xianyu Serverless challenges: runtime cold‑start latency (especially for Java/Spring applications), upstream dependency capacity assessment, peak traffic resource allocation, and the need for adequate pod‑pool buffering to avoid resource contention during traffic spikes.

Practice examples:

Cainiao elastic scheduling system: a three‑layer decision model (strategy, aggregation, execution) with policies for resource safety, optimization, timing, and service security.

Weibo autoscaling: built from atomic API tasks, with capacity decisions triggered either on a schedule or by real‑time business metrics.

Alibaba Cloud container resource on‑demand solution: a policy engine that dynamically adjusts cgroup parameters, decouples Kubelet, and supports fine‑grained resource limits.

Alibaba CSE Serverless practices: includes cold‑start compression, hot‑copy startup (Fork2, CRIU‑based), and a comparison with AWS Lambda’s function‑centric model.

Conclusion: Current research and implementations provide mature solutions for resource pooling, elastic scheduling, capacity decision‑making, and automated operations based on Kubernetes. Nevertheless, achieving faster, smarter, and safer pod scaling—especially from zero to one—remains an open challenge.
