Tagged articles
74 articles
MaGe Linux Operations
Jan 18, 2026 · Artificial Intelligence

How to Deploy Scalable LLM Inference on Kubernetes with GPU Autoscaling

This guide walks through building a production‑grade Kubernetes GPU cluster for large language model inference, covering hardware sizing, GPU resource scheduling, model storage options, automated scaling with HPA, health checks, monitoring, troubleshooting, and multi‑model deployment strategies.

Docker · GPU · Inference
0 likes · 49 min read
dbaplus Community
Dec 22, 2025 · Cloud Computing

How We Cut Kubernetes Costs by 40% Without Switching Platforms

By rethinking resource requests, eliminating unused workloads, downsizing node types, fine‑tuning autoscaling, and trimming log storage, a team reduced their Kubernetes bill by 40% while keeping the same cloud provider, demonstrating that most cost overruns stem from misconfiguration rather than the platform itself.

Cost Optimization · Kubernetes · Prometheus
0 likes · 6 min read
Baidu Intelligent Cloud Tech Hub
Dec 15, 2025 · Artificial Intelligence

Baidu Baige’s Breakthrough: Orchestrating Giant LLM Inference with Silent Instances

The article details Baidu Baige’s next‑generation distributed inference platform for trillion‑parameter LLMs, explaining how automated orchestration, the FedDeployment abstraction, the SplitService unified view, Adaptive HPA predictive scaling, Silent Instances that activate within seconds, and the Staggered Batched Scheduler eliminate scaling limits, reduce TTFT by 30‑40%, boost throughput by up to 20%, and achieve cost‑effective, elastic AI compute.

Distributed inference · Kubernetes · LLM
0 likes · 23 min read
Ray's Galactic Tech
Nov 21, 2025 · Cloud Native

Mastering Kubernetes HPA: How It Works, Real‑World Setup, and Troubleshooting

Horizontal Pod Autoscaler (HPA) in Kubernetes automatically scales pod replicas based on metrics like CPU, memory, or custom indicators, and this guide explains its core principles, configuration pitfalls, step‑by‑step troubleshooting commands, and advanced considerations such as API versions, stabilization windows, and integration with Cluster Autoscaler.

HPA · Kubernetes · autoscaling
0 likes · 9 min read
MaGe Linux Operations
Nov 6, 2025 · Cloud Native

Master Kubernetes Node Autoscaling with Custom Prometheus Metrics in 30 Minutes

This guide walks you through a complete, 30‑minute implementation of Kubernetes node autoscaling using Horizontal Pod Autoscaler (HPA) with custom Prometheus metrics, covering prerequisites, anti‑pattern warnings, environment matrix, step‑by‑step deployment, core principles, observability, troubleshooting, best practices, and FAQ.

HPA · Kubernetes · Prometheus
0 likes · 50 min read
IT Architects Alliance
Oct 19, 2025 · Cloud Native

Mastering Cloud‑Native Autoscaling: HPA, VPA, CA, and Cost‑Aware Strategies

This article explores the challenges and best practices of cloud‑native scaling, covering Horizontal and Vertical Pod Autoscalers, Cluster Autoscaler cost optimization, event‑driven scaling with KEDA, traffic‑aware scaling in service meshes, and intelligent cost‑aware strategies backed by monitoring and future AI‑driven trends.

Cost Optimization · Kubernetes · Service Mesh
0 likes · 11 min read
Ops Community
Oct 8, 2025 · Cloud Native

How I Cut My Kubernetes Cloud Bill by 60% in 3 Months – Proven Strategies

Facing a 35‑million‑yuan monthly Kubernetes bill, the author analyzed hidden cost components, implemented five optimization campaigns—including resource request tuning, autoscaling, spot instances, storage tiering, and network consolidation—and reduced monthly expenses by 60% while boosting performance, delivering a detailed, reproducible methodology.

Cloud Native · Cost Optimization · FinOps
0 likes · 33 min read
IT Architects Alliance
Sep 7, 2025 · Cloud Native

Mastering Elastic Scaling on Kubernetes: Cut Costs While Handling Traffic Peaks

This article explains how to design elastic scaling architectures on cloud platforms—combining horizontal, vertical, and functional scaling, leveraging Kubernetes autoscaling features, predictive scaling, mixed instance strategies, and cost‑monitoring practices—to handle traffic spikes while minimizing expenses.

Cloud Cost Optimization · DevOps · autoscaling
0 likes · 9 min read
MaGe Linux Operations
Aug 16, 2025 · Cloud Native

Master Container Deployment: Docker & Kubernetes Best Practices for Production

This comprehensive guide walks you through containerizing applications, optimizing Docker images, securing containers, designing Kubernetes high‑availability clusters, implementing observability with Prometheus and ELK, automating CI/CD pipelines, applying RBAC and network policies, and cutting costs with autoscaling and resource tuning, all backed by real‑world code examples.

Docker · Kubernetes · autoscaling
0 likes · 20 min read
Cloud Native Technology Community
Jul 31, 2025 · Cloud Native

Cut Kubernetes Costs by 30%: Six Proven Automation Strategies

An analysis of recent Kubernetes cost benchmarks reveals chronic over‑provisioning, with up to 40% idle CPU and 57% idle memory, and offers six community‑validated, actionable automation techniques (flexible instance selection, Arm migration, custom autoscaling, bin‑packing, VPA, and safe Spot usage) to dramatically reduce cloud spend.

Cost Optimization · Kubernetes · autoscaling
0 likes · 8 min read
Efficient Ops
Oct 9, 2024 · Cloud Computing

How One Engineer Runs a Full SaaS on Kubernetes with Minimal Effort

This article details how a solo engineer built and operated a SaaS platform on AWS using Kubernetes, covering infrastructure overview, automatic DNS, TLS, load balancing, CI/CD rollouts, autoscaling, caching, secret management, monitoring, logging, error tracking, and cost‑effective operations.

AWS · Infrastructure as Code · Kubernetes
0 likes · 21 min read
Alibaba Cloud Infrastructure
Sep 5, 2024 · Artificial Intelligence

Deploying NVIDIA NIM on Alibaba Cloud ACK with Cloud‑Native AI Suite: A Step‑by‑Step Guide

This guide explains how to quickly build a high‑performance, observable, and elastically scalable LLM inference service by deploying NVIDIA NIM on an Alibaba Cloud ACK cluster using the Cloud‑Native AI Suite, KServe, Prometheus, Grafana, and custom autoscaling based on request‑queue metrics.

Alibaba Cloud ACK · Grafana · KServe
0 likes · 15 min read
Liangxu Linux
Jul 28, 2024 · Cloud Native

Avoid These 10 Common Kubernetes Mistakes to Boost Reliability

This article outlines the most frequent Kubernetes pitfalls—such as missing resource requests, omitted health checks, using the :latest tag, over‑privileged containers, insufficient monitoring, default namespace misuse, weak security settings, absent PodDisruptionBudgets, lack of pod anti‑affinity, and improper load‑balancing—and provides concrete commands, YAML examples, and best‑practice recommendations to prevent them.

Kubernetes · Resource Management · Security
0 likes · 13 min read
dbaplus Community
Jul 2, 2024 · Cloud Native

How Xiaohongshu Cut Kafka Storage Costs by 60% with a Cloud‑Native Tiered Architecture

Facing exploding Kafka scale, Xiaohongshu’s data‑storage team adopted a cloud‑native design that introduces tiered hot‑cold storage, containerization, and a custom load‑balancing service, achieving dramatic storage‑cost reductions, minute‑level cluster migrations, high‑performance data access, and automated resource scheduling.

autoscaling · cloud-native · containerization
0 likes · 20 min read
Alibaba Cloud Infrastructure
May 31, 2024 · Cloud Native

Best Practices for Deploying AI Model Inference on Knative

This guide explains how to efficiently deploy AI model inference services on Knative by externalizing model data, using Fluid for accelerated loading, configuring secrets, ImageCache, graceful shutdown, probes, autoscaling parameters, mixed ECS/ECI resources, shared GPU scheduling, and observability features to achieve fast scaling, low cost, and high elasticity.

AI Model Inference · Cloud Native · GPU
0 likes · 19 min read
DeWu Technology
Dec 27, 2023 · Cloud Native

DeWu's Cloud-Native Container Management Practices

Since August 2021, DeWu App has built a cloud‑native, multi‑cluster Kubernetes platform that uses an OAM‑style CloneSet model, Helm‑generated resources, Karmada‑based federation, custom scheduler plugins for reservation and node‑balance, offline mixing for Flink, a unified KubeAutoScaler, and a self‑built KubeAI stack, achieving significant cost cuts and improved stability while planning further middleware containerization and multi‑cloud expansion.

AI · Cost Management · Kubernetes
0 likes · 22 min read
Alibaba Cloud Native
Dec 13, 2023 · Cloud Native

Mastering Traffic Management in Knative: Blue‑Green Deployments, Autoscaling, and Monitoring

This article explains how Knative leverages request‑driven traffic management to simplify blue‑green releases, configure multi‑gateway ingress, apply revision garbage‑collection policies, enable custom domains, support multiple protocols, and provide automatic scaling and observability through Prometheus and Grafana.

Blue‑Green deployment · Knative · autoscaling
0 likes · 15 min read
Alibaba Cloud Native
Sep 3, 2023 · Cloud Native

Master Knative’s Request‑Based Autoscaling: KPA, Scale‑to‑Zero, and Advanced Strategies

This article explains how Knative implements request‑based autoscaling with KPA, details the scale‑to‑zero mechanism, shows how to handle burst traffic using stable and panic windows, and demonstrates advanced extensions such as resource pools, precise MPA scaling, and predictive AHPA configurations with concrete YAML examples.

Cloud Native · KPA · Knative
0 likes · 18 min read
MaGe Linux Operations
Aug 31, 2023 · Cloud Native

How to Achieve Zero‑Downtime Deployments with Kubernetes

Learn how to configure Kubernetes for zero‑downtime applications by syncing container images, ensuring multiple pod replicas, using PodDisruptionBudgets, selecting appropriate deployment strategies, setting up liveness/readiness probes, handling graceful termination, applying pod anti‑affinity, and enabling autoscaling and proper resource limits.

Kubernetes · Probes · Zero Downtime
0 likes · 12 min read
DevOps Cloud Academy
Aug 29, 2023 · Cloud Native

Achieving Zero‑Downtime Applications with Kubernetes

This article explains why and how to use Kubernetes features such as multiple pod replicas, PodDisruptionBudgets, deployment strategies, health probes, graceful termination, anti‑affinity, resource limits, and autoscaling to build zero‑downtime, highly available applications.

Deployment Strategies · Health probes · Kubernetes
0 likes · 12 min read
Alibaba Cloud Big Data AI Platform
Aug 29, 2023 · Cloud Computing

MagicScaler: Achieving High QoS and Low Cost with Uncertainty‑Aware Autoscaling

The MagicScaler framework, introduced by Alibaba Cloud’s big‑data engineering team and collaborators, combines a multi‑scale attention Gaussian process predictor with an uncertainty‑aware elastic scaling decision engine, delivering significantly higher quality‑of‑service and lower operational costs than traditional autoscaling methods, as demonstrated on real MaxCompute workloads.

Gaussian Process · Predictive Modeling · Resource Management
0 likes · 7 min read
AntTech
Jul 14, 2023 · Cloud Native

KapacityStack: Open‑Source Cloud‑Native Intelligent Capacity Management and IHPA

KapacityStack is an open‑source, cloud‑native capacity platform from Ant Group that introduces the Intelligent Horizontal Pod Autoscaler (IHPA) to provide predictive, multi‑level, and stable autoscaling, reducing resource waste, carbon emissions, and operational costs while supporting extensible, modular integration with Kubernetes workloads.

autoscaling · capacity management · cloud-native
0 likes · 11 min read
Java Architect Essentials
Jun 13, 2023 · Cloud Native

Zero‑Downtime Deployment with K8s and SpringBoot: Health Checks, Rolling Updates, Graceful Shutdown, Autoscaling, Prometheus Integration, and Config Separation

This article demonstrates how to achieve zero‑downtime releases for SpringBoot applications on Kubernetes by configuring readiness/liveness probes, rolling update strategies, graceful shutdown hooks, horizontal pod autoscaling, Prometheus monitoring, and externalized configuration via ConfigMaps.

ConfigMap · Healthcheck · Kubernetes
0 likes · 13 min read
Programmer DD
May 23, 2023 · Cloud Native

Achieve Zero‑Downtime Deployments with K8s and Spring Boot: Health Checks, Rolling Updates, and Autoscaling

This guide explains how to combine Kubernetes and Spring Boot to implement zero‑downtime releases by configuring readiness and liveness probes, defining graceful shutdown, applying rolling update strategies, setting up horizontal pod autoscaling, integrating Prometheus monitoring, and separating configuration via ConfigMaps for reusable images.

Prometheus · Rolling Update · Spring Boot
0 likes · 13 min read
Open Source Linux
May 11, 2023 · Cloud Native

When Kubernetes CPU Limits Fail: Better Alternatives and Best Practices

This article explains how Kubernetes CPU requests and limits work, why limits can throttle performance, compares language‑specific behaviors, and presents alternative strategies such as relying on requests with Horizontal Pod Autoscaling for more efficient and cost‑effective scaling.

Kubernetes · autoscaling · container orchestration
0 likes · 12 min read
MaGe Linux Operations
Apr 30, 2023 · Cloud Native

Master Kubernetes Essentials: kube-proxy, Pods, Deployments, Services & More

This article explains key Kubernetes concepts—including the kube-proxy IPVS mode versus iptables, static Pods, Pod lifecycle states, Pod creation workflow, restart policies, health probes, scheduling strategies, init containers, Deployment upgrade processes and strategies, DaemonSet characteristics, Horizontal Pod Autoscaling, Service types and load‑balancing, headless Services, external access methods, Ingress routing, image pull policies, and load‑balancer options—providing a comprehensive overview for cloud‑native practitioners.

Cloud Native · Deployment · Ingress
0 likes · 16 min read
Cloud Native Technology Community
Feb 7, 2023 · Cloud Native

Machine Learning‑Based Optimization of Kubernetes Resources

This article explains how machine learning can be applied to automatically optimize CPU and memory settings in Kubernetes clusters, covering both experiment‑driven and observation‑driven approaches, step‑by‑step procedures, best‑practice recommendations, and the benefits of combining both methods for efficient, scalable cloud‑native operations.

Kubernetes · Resource Optimization · autoscaling
0 likes · 11 min read
ITPUB
Dec 26, 2022 · Cloud Native

What Really Happens When You Deploy an App on Kubernetes?

This article walks through the complete lifecycle of a Kubernetes deployment, explaining how a manual upgrade request triggers API calls, creates Deployments, ReplicaSets, Pods, and how the scheduler, kubelet, and Docker work together, while also covering concepts like containers, labels, replication controllers, deployments, and autoscaling mechanisms.

Cloud Native · Containers · Deployment
0 likes · 23 min read
HelloTech
Dec 23, 2022 · Cloud Native

Design Principles and Implementation Details of Kubernetes Horizontal Pod Autoscaler and Custom Water Pod Autoscaler

The article explains Kubernetes’ built‑in Horizontal Pod Autoscaler, then details the custom Water Pod Autoscaler (WPA) that extends HPA with dual‑signal (load and SOA registration) detection, dual‑threshold scaling, noise filtering, configurable cooldown, frequency limits, tolerance buffers, and integrated alerting for reliable elastic scaling.

Cloud Native · HPA · Kubernetes
0 likes · 13 min read
Tencent Cloud Developer
Nov 24, 2022 · Cloud Native

Large‑Scale Cost Optimization for Kubernetes/TKE: Data Collection, Measures, and Implementation

The article details a Tencent‑led, end‑to‑end cost‑optimization project for large‑scale Kubernetes/TKE clusters that collected extensive workload metrics, applied VPA/HPA enhancements, custom scheduling and node‑downscaling via the open‑source Crane platform, ultimately delivering up to 70% CPU and 50% memory savings with zero‑fault deployments.

HPA · Kubernetes · Resource Management
0 likes · 29 min read
AntTech
Nov 10, 2022 · Cloud Computing

DeepScaling: An Automated Capacity Evaluation System for Stable CPU Utilization in Large‑Scale Cloud Services

DeepScaling is a deep‑learning‑driven autoscaling framework that predicts workload, estimates CPU usage, and makes reinforcement‑learning‑based scaling decisions to keep microservice CPU utilization at a target level, thereby reducing resource waste while meeting SLOs in large‑scale cloud environments.

Resource Management · autoscaling · cloud computing
0 likes · 21 min read
Efficient Ops
Nov 2, 2022 · Cloud Native

Why Your HPA Isn’t Scaling: 3 Common Misconceptions and How to Fix Them

This article explains three frequent misunderstandings about Kubernetes Horizontal Pod Autoscaler—dead zones, misuse of utilization calculations, and perceived lag in scaling—while detailing HPA’s inner workings, metric sources, calculation methods, and behavior configuration to help you avoid scaling pitfalls.

HPA · Kubernetes · autoscaling
0 likes · 12 min read
Huolala Tech
Oct 20, 2022 · Cloud Native

How Huolala Cuts Cloud Costs with Kubernetes: Spot Instances, Smart Autoscaling, and Predictive Scaling

This presentation details Huolala's end‑to‑end cloud‑native cost‑optimization strategy, covering the company's infrastructure basics, Kubernetes‑based server cost‑saving techniques, a tailored optimization roadmap, practical Spot Instance usage, and a custom CronHPA‑driven scheduled scaling solution to boost resource utilization.

Cloud Native · Cost Optimization · HPA
0 likes · 23 min read
Alibaba Cloud Native
Oct 3, 2022 · Cloud Native

Can AHPA Predict Kubernetes Scaling Before Load Spikes?

This article introduces the Advanced Horizontal Pod Autoscaler (AHPA), explains its three‑stage architecture of data collection, prediction, and scaling, details the RobustScaler forecasting algorithm and CRD‑based deployment, and evaluates its ability to proactively and reactively adjust pod counts with high robustness.

CRD · Cloud Native · Kubernetes
0 likes · 13 min read
Practical DevOps Architecture
Sep 20, 2022 · Cloud Native

Kubernetes Advantages, Use Cases, Features, Drawbacks, and Core Concepts

This article outlines Kubernetes' main advantages such as container orchestration, lightweight design, open‑source nature, elastic scaling and load balancing, describes typical deployment scenarios, highlights its portability, extensibility and automation, lists current drawbacks, and explains fundamental components like master, node, pod, labels, controllers, services, volumes, and namespaces.

autoscaling
0 likes · 5 min read
AntTech
Jun 22, 2022 · Cloud Computing

Meta Reinforcement Learning Framework for Predictive Autoscaling in Cloud Environments

This article presents a cloud-native, end‑to‑end autoscaling solution that integrates traffic forecasting, CPU utilization meta‑prediction, and a reinforcement‑learning‑based scaling decision module into a fully differentiable system, achieving higher resource utilization and cost efficiency as demonstrated by ACM SIGKDD 2022 research.

Meta Learning · Predictive Modeling · autoscaling
0 likes · 10 min read
DataFunTalk
May 21, 2022 · Big Data

Exploring and Implementing Elastic Scheduling for Xiaomi Hadoop YARN

This talk presents Xiaomi's design and deployment of an elastic scheduling system for Hadoop YARN, covering background analysis, resource‑pool strategy, auto‑scaling architecture, stability challenges, label‑based resource isolation, Spark shuffle handling, cost‑saving results and future plans.

Big Data · Hadoop · Resource Management
0 likes · 16 min read
Alibaba Cloud Native
May 5, 2022 · Cloud Native

Achieving Low‑Cost, High‑Elastic Kubernetes Deployments with ACK, ECI, and OpenKruise

This article explains how to use Kubernetes native autoscaling components—HPA, VPA, Cluster Autoscaler—and cloud‑native extensions such as Alibaba Cloud's Virtual Node, Elastic Container Instance, Elastic Workload, and the open‑source OpenKruise to build a cost‑effective, highly elastic architecture on ACK clusters.

Cluster Autoscaler · Elastic Workload · HPA
0 likes · 28 min read
dbaplus Community
Apr 6, 2022 · Cloud Native

How Huolala Cuts Cloud Costs: Real‑World Kubernetes Optimization Strategies

Huolala’s architecture team shares a detailed walkthrough of their cloud‑native cost‑optimization journey, covering public‑cloud server pricing models, Kubernetes request/limit tuning, HPA and CronHPA scheduling, spot instance integration, intelligent pod placement, and practical lessons learned from scaling a global on‑demand logistics platform.

Cloud Native · Cost Optimization · Huolala
0 likes · 25 min read
Alibaba Cloud Developer
Mar 17, 2022 · Big Data

How AutoStream Scales Real‑Time Data Processing with Flink, Iceberg, and PyFlink

This article details AutoStream's evolution from a Java‑only Storm platform to a Flink‑based, Kubernetes‑native streaming system that integrates budgeting controls, automatic scaling, lakehouse architecture with Iceberg, and PyFlink support, highlighting the technical challenges, solutions, and future roadmap for real‑time analytics.

Flink · Iceberg · Lakehouse
0 likes · 23 min read
HomeTech
Mar 16, 2022 · Cloud Native

Understanding Kubernetes Horizontal Pod Autoscaler (HPA): Mechanism, Core Source Code, and Practical Insights

This article explains how Kubernetes Horizontal Pod Autoscaler (HPA) balances resource demand and workload by automatically scaling pod replicas, describes the different metric types it supports, walks through the core controller code (Run, worker, reconcile, and replica calculation), highlights current limitations, and shares practical observations from real‑world usage.

Horizontal Pod Autoscaler · Kubernetes · Metrics
0 likes · 11 min read
Ctrip Technology
Dec 30, 2021 · Cloud Computing

Ctrip’s Practice of Using AWS Spot Instances for Cost Reduction and High Availability

This article details Ctrip’s large‑scale use of AWS Spot instances on Kubernetes, explaining the cost benefits, the challenges of spot interruptions, and the architectural and operational strategies—including multi‑AZ deployment, scheduling policies, autoscaling group design, and observability—that enable a 50% reduction in container costs while maintaining system stability and reliability.

AWS Spot · Cost Optimization · Kubernetes
0 likes · 13 min read
dbaplus Community
Jul 19, 2021 · Cloud Native

Avoid These 10 Common Kubernetes Mistakes to Boost Reliability and Cost Efficiency

This article shares a practical guide to the most frequent Kubernetes pitfalls—from misconfigured resource requests and limits to improper liveness/readiness probes, load‑balancer settings, IAM misuse, pod anti‑affinity, and disruption budgets—offering concrete YAML examples and remediation steps to help operators run more reliable and cost‑effective clusters.

Cloud Native · Kubernetes · Probes
0 likes · 18 min read
ITFLY8 Architecture Home
Apr 17, 2021 · Cloud Native

How Knative Handles Cold‑Start Traffic: From Activator to Pod

This article explores Knative’s traffic routing and autoscaling mechanisms, detailing how requests are initially directed through the Activator during cold‑start, how VirtualService configurations evolve, and how newer versions shift traffic handling to Kubernetes Service/Endpoint layers, improving performance and decoupling gateway logic.

Istio · Knative · Kubernetes
0 likes · 14 min read
Alibaba Cloud Native
Apr 5, 2021 · Cloud Native

How Knative Enables Traffic‑Based Autoscaling and Gray Deployments

This article explains Knative’s traffic‑driven autoscaling and gray‑release capabilities, detailing the request flow architecture, the roles of Service, Configuration, Route and Revision, and walks through built‑in scaling strategies such as KPA, HPA, scheduled‑HPA, event‑gateway and custom plugins, with practical examples.

Cloud Native · Gray Deployment · HPA
0 likes · 10 min read
Liulishuo Tech Team
Feb 4, 2021 · Cloud Computing

Improving Cloud Cost Allocation and Resource Utilization through Catalog, Tags, and Automated Monitoring

This article describes how a tech team built a catalog‑based cost‑allocation system, leveraged cloud tags and Kubernetes labels, used Prometheus data for scaling decisions, and combined reserved, spot, and on‑demand instances to boost cloud resource utilization while keeping services stable.

Cloud Cost · autoscaling · cloud-tagging
0 likes · 8 min read
Open Source Linux
Jan 29, 2021 · Operations

Essential Kubernetes Production Best Practices for Secure, Scalable Ops

This article outlines comprehensive production‑grade Kubernetes best practices—including health probes, RBAC, resource management, network policies, monitoring, autoscaling, image security, and zero‑downtime strategies—to help teams run secure, efficient, and highly available workloads.

Kubernetes · Operations · Security
0 likes · 11 min read
Java Architect Essentials
Aug 12, 2020 · Operations

Common Kubernetes Pitfalls and How to Fix Them

This article outlines frequent Kubernetes operational mistakes—such as misconfigured resource requests, missing probes, improper load‑balancer exposure, naïve autoscaling, IAM/RBAC misuse, lack of anti‑affinity, absent PodDisruptionBudgets, multi‑tenant pitfalls, and suboptimal externalTrafficPolicy—providing concrete remediation steps and best‑practice code examples.

Kubernetes · Probes · Resource Management
0 likes · 15 min read
Efficient Ops
Jun 25, 2020 · Cloud Native

How Xiaomi Scaled Redis with Kubernetes: Deploying Redis Cluster on K8s

This article explains how Xiaomi migrated tens of thousands of Redis instances from bare‑metal servers to Kubernetes, using Redis Proxy, StatefulSets, and Ceph storage to achieve resource isolation, automated deployment, dynamic scaling, and improved reliability while addressing latency, IP‑change, and security challenges.

Ceph · Kubernetes · Proxy
0 likes · 20 min read
Tencent Cloud Developer
Sep 12, 2019 · Cloud Native

Optimizing Kubernetes Cluster Load: From Static Scheduling to Advanced Resource Management

The article explains how Kubernetes’ static scheduler leaves clusters fragmented and under‑utilized, then proposes dynamic techniques (pod resource compression, node resource oversell via admission webhooks, and an enhanced per‑HPA autoscaling controller) and outlines future scheduler extensions, monitoring integration with Tencent Cloud, and a recruitment call for senior cloud‑native engineers.

Kubernetes · Resource Compression · Static Scheduling
0 likes · 12 min read
Alibaba Cloud Native
Sep 4, 2019 · Cloud Native

How Serverless and Autoscaling Transform Kubernetes: Principles, Challenges, and Solutions

This article explains how serverless and autoscaling complement Kubernetes by detailing resource‑capacity curves, stakeholder needs, core autoscaling components, key challenges, design philosophy, classic use cases, limitations of traditional scaling, and the emerging virtual‑kubelet‑autoscaler solution.

Cluster Autoscaler · Kubernetes · Serverless
0 likes · 15 min read
Alibaba Cloud Native
Jul 1, 2019 · Cloud Native

How Alibaba Cloud’s Kubernetes Service Enables Seamless Monitoring and Autoscaling

Alibaba Cloud’s Kubernetes service integrates four native monitoring services—SLS, ARMS, AHAS, and Cloud Monitor—while offering enhanced open‑source components and autoscaling mechanisms such as HPA, VPA, cronHPA, Resizer, Cluster‑Autoscaler, and virtual‑kubelet‑autoscaler, enabling cloud‑native apps to achieve robust observability and elastic scaling.

Alibaba Cloud · Kubernetes · autoscaling
0 likes · 10 min read
Meitu Technology
Jan 30, 2019 · Cloud Native

Meitu's Container Platform: Architecture, Network, Load Balancing, Logging, Scheduling, and Autoscaling

Meitu’s container platform, built on Kubernetes with Calico networking, a custom Nginx load‑balancer, unified logging, refined scheduling, autoscaling, and comprehensive monitoring, enables seamless multi‑cluster hybrid‑cloud operations for its hundreds‑of‑millions‑user services while providing CI/CD tooling and future‑ready extensions such as service mesh and edge computing.

Cloud Native · Kubernetes · Scheduling
0 likes · 23 min read
Architecture Digest
Feb 2, 2018 · Cloud Computing

Design and Implementation of an Elastic Scaling Service on Alibaba ECS

This article explains why elastic scaling is needed for variable web traffic, describes how to build a cost‑effective, automatically adjustable service on Alibaba ECS using message queues, service refactoring, Docker deployment, logging, and a real‑time allocation algorithm, and shares practical lessons learned.

Alibaba ECS · Allocation Algorithm · Docker
0 likes · 9 min read
Liulishuo Tech Team
Dec 31, 2016 · Cloud Native

Designing Scalable and Reliable Backend Services at English Fluently: Architecture, Service Discovery, Monitoring, and Autoscaling

This article shares the engineering team’s experience of building a high‑growth, reliable backend for English Fluently, covering inter‑service communication with gRPC, service discovery, Docker‑based deployment, health‑checking, monitoring, autoscaling, Kubernetes orchestration, and multi‑cell availability strategies.

Docker · Kubernetes · Microservices
0 likes · 10 min read