
Unlocking 800% Node Overselling: Xingdou Cloud’s Smart Resource Strategies

This article details how Xingdou Cloud leverages cloud‑native techniques such as massive node overselling, custom HPA (SophonHPA), priority‑based QoS, intelligent cleanup, and quota management to achieve dramatic cost reduction and efficiency gains across its multi‑cloud platform.

Xingsheng Youxuan Technology Community

1. Introduction

In recent years, cloud‑native technologies have developed rapidly and seen wide adoption. The Xingdou Cloud team embraced this shift, building a multi‑cloud platform (Xingdou Cloud) and achieving significant cost reduction and efficiency gains. This article presents the challenges encountered along the way and the solutions applied.

2. Background

As containerization deepened, multiple clusters were merged, leading to very large clusters and increasingly complex issues. Representative problems include:

Reusing retired or idle machines through cloud‑native management.

Ensuring stability for stateful components (Ceph, Kafka, Elasticsearch, HBase, RocketMQ, Redis) when old machines fail.

Controlling resource consumption.

Maximizing resource utilization.

Cleaning up idle images and persistent volumes.

We will discuss the solutions for these problems.

3. Practice

3.1 Node Overselling

Node overselling is a common technique for squeezing more capacity out of existing hardware. By overselling, each node can host more services, raising utilization. On a 160‑core, 1 TB Dell host, we achieved an 800% oversell rate, i.e., the node advertises 1,280 schedulable cores.

3.1.1 Oversell Scheme

We use a mutating admission webhook to intercept the kubelet's node‑resource reports at the apiserver, rewrite them, and let the modified values be persisted in etcd. The flow:

Kubelet reports node resources to the apiserver.

Apiserver authenticates and authorizes the request.

During the mutating phase, our webhook rewrites the resource information.

The modified resources are persisted in etcd, enabling the scheduler to consider the oversold values.
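The admission step above can be sketched as follows. This is a minimal illustration, not our production code: the function names, the fixed ×8 CPU coefficient, and the decision to patch only CPU are assumptions for the example; the AdmissionReview response carries a base64‑encoded JSON Patch, which is how mutating webhooks return rewrites to the apiserver.

```python
import base64
import json

# Illustrative oversell coefficient (the real value is configurable).
CPU_FACTOR = 8.0

def make_oversell_patch(node: dict) -> list:
    """Build a JSON Patch that rewrites the CPU capacity the kubelet
    reported, before the apiserver persists the Node object to etcd."""
    cpu_cores = int(node["status"]["capacity"]["cpu"])
    oversold = str(int(cpu_cores * CPU_FACTOR))
    return [
        {"op": "replace", "path": "/status/capacity/cpu", "value": oversold},
        {"op": "replace", "path": "/status/allocatable/cpu", "value": oversold},
    ]

def admission_response(uid: str, patch: list) -> dict:
    """Wrap the patch in an AdmissionReview v1 response body."""
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": {
            "uid": uid,
            "allowed": True,
            "patchType": "JSONPatch",
            "patch": base64.b64encode(json.dumps(patch).encode()).decode(),
        },
    }
```

With this in place the scheduler sees the oversold capacity transparently; neither the kubelet nor the scheduler needs any modification.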

3.1.2 Oversell Modes

Two modes are supported:

Fixed ratio: Memory is not oversold; CPU capacity is derived from memory via a mem/cpu ratio (e.g., with a ratio of 4 GB per core, a 1 TB node advertises 250 CPU cores).

Specified coefficient: Both CPU and memory are multiplied by user‑defined coefficients (e.g., CPU × 2.0, memory × 1.2).
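The two modes reduce to a small amount of arithmetic; a sketch, using the example figures from above:

```python
def oversell_fixed_ratio(mem_gb: float, ratio_gb_per_core: float):
    """Fixed ratio mode: memory is untouched; advertised CPU is
    derived from memory divided by the mem/cpu ratio (GB per core)."""
    return mem_gb / ratio_gb_per_core, mem_gb

def oversell_coefficient(cpu_cores: float, mem_gb: float,
                         cpu_k: float, mem_k: float):
    """Specified coefficient mode: scale both dimensions independently."""
    return cpu_cores * cpu_k, mem_gb * mem_k
```

So a 1 TB host with ratio 4 advertises 250 cores, while coefficient mode with CPU × 2.0 and memory × 1.2 turns a 160‑core / 1 TB host into 320 cores / 1.2 TB.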

3.2 Custom HPA (SophonHPA)

The native HPA is powerful but has limitations in large‑scale burst scenarios. Issues include:

Latency of the 15‑second control loop.

Lack of concurrency when processing many HPA objects.

Metric lag due to Prometheus scraping.

No event‑driven scaling.

No support for zero‑replica workloads.

We designed SophonHPA, an operator with a custom CRD. Its architecture separates the controller (reconcile logic) from the executor (per‑CR pod that fetches metrics and performs scaling). The executor pulls real‑time metrics from Prometheus or other sources, evaluates policies, and triggers scaling with cooldown windows.
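The executor's core decision can be sketched as below. This is an illustration of the general approach, not SophonHPA's actual code: the proportional formula is the same shape as the native HPA's, and the zero‑replica branch and cooldown gate stand in for the event‑driven wake‑up and cooldown windows described above.

```python
import math

def desired_replicas(current: int, metric: float, target: float,
                     min_r: int, max_r: int) -> int:
    """Proportional scaling rule clamped to [min_r, max_r];
    unlike the native HPA, min_r may be 0 for scale-to-zero workloads."""
    if current == 0:
        # Event-driven wake-up: scale from zero as soon as load appears.
        return min(max_r, 1) if metric > 0 else 0
    want = math.ceil(current * metric / target)
    return max(min_r, min(max_r, want))

class CooldownGate:
    """Suppress repeated scale operations inside a cooldown window."""
    def __init__(self, window_s: float):
        self.window_s = window_s
        self.last = float("-inf")

    def allow(self, now: float) -> bool:
        if now - self.last >= self.window_s:
            self.last = now
            return True
        return False
```

Running one executor per CR also removes the single control loop's latency and concurrency bottlenecks, since each workload is evaluated independently.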

3.3 Priority Management

Workloads are classified into six priority levels (P0‑P5). Each level maps to a QoS class (Guaranteed, Burstable, BestEffort) with specific CPU/memory request‑limit settings. This ensures core services receive guaranteed resources while lower‑priority workloads are more elastic.
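A sketch of such a mapping follows. The exact P0–P5 → QoS assignment and the request/limit factors are internal to Xingdou Cloud and not published here, so the values below are illustrative placeholders:

```python
# Assumed mapping: the article names six levels and three QoS classes
# but not which level maps to which class.
PRIORITY_QOS = {
    "P0": "Guaranteed", "P1": "Guaranteed",
    "P2": "Burstable",  "P3": "Burstable",
    "P4": "BestEffort", "P5": "BestEffort",
}

def requests_for(priority: str, cpu_limit: float, mem_limit: float):
    """Derive pod requests from limits according to QoS class.
    The 50% Burstable factor is an illustrative placeholder."""
    qos = PRIORITY_QOS[priority]
    if qos == "Guaranteed":
        return cpu_limit, mem_limit              # requests == limits
    if qos == "Burstable":
        return cpu_limit * 0.5, mem_limit * 0.5  # requests < limits
    return 0.0, 0.0                              # BestEffort: no requests
```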

3.4 Strengthening Core Services

We protect critical nodes with taints and tolerations, enforce node affinity for important services, disperse stateful components using anti‑affinity, and enable preemption to guarantee high‑priority pods during resource contention.
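The protections above combine standard Kubernetes primitives; expressed as Python dicts, a sketch might look like the following (the names and labels — "core-service", the "dedicated=core" taint, app=kafka — are assumptions for illustration):

```python
# High PriorityClass so core pods are scheduled first and may preempt.
priority_class = {
    "apiVersion": "scheduling.k8s.io/v1",
    "kind": "PriorityClass",
    "metadata": {"name": "core-service"},
    "value": 1000000,
    "preemptionPolicy": "PreemptLowerPriority",
}

core_pod_scheduling = {
    "priorityClassName": "core-service",
    # Only pods carrying this toleration land on tainted, protected nodes.
    "tolerations": [{"key": "dedicated", "operator": "Equal",
                     "value": "core", "effect": "NoSchedule"}],
    # Anti-affinity spreads stateful replicas across distinct hosts.
    "affinity": {"podAntiAffinity": {
        "requiredDuringSchedulingIgnoredDuringExecution": [{
            "labelSelector": {"matchLabels": {"app": "kafka"}},
            "topologyKey": "kubernetes.io/hostname",
        }]}},
}
```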

3.5 Offline Mixed‑Placement

To improve resource utilization during off‑peak hours, we combine offline jobs with online services. We introduced tiered management, CronHPA for scheduled scaling, and a dedicated offline scheduling system that can dynamically adjust jobs based on node health and load.
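The CronHPA piece of this reduces to time-based replica targets; a minimal sketch (all hours and replica counts below are illustrative, not our production schedule):

```python
def scheduled_replicas(hour: int, day_target: int = 10, night_target: int = 2,
                       night_start: int = 22, night_end: int = 7) -> int:
    """CronHPA-style rule: shrink an online service overnight so
    offline batch jobs can be co-located on the freed capacity."""
    if hour >= night_start or hour < night_end:
        return night_target
    return day_target
```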

3.6 Intelligent Cleanup

For image repositories, we implement precise cleanup policies that consider usage and dependencies, avoiding accidental removal of active images. For persistent volumes, we safely reclaim unused PVs while ensuring data integrity.
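The core safety rule for image cleanup can be sketched as a single predicate; the 30-day retention window and the field names are assumptions for illustration:

```python
from datetime import datetime, timedelta

def image_removable(image: dict, in_use: set, now: datetime,
                    retention_days: int = 30) -> bool:
    """A tag may be deleted only if no running workload references it
    AND it has aged past the retention window - both checks must pass
    to avoid accidental removal of active images."""
    if image["ref"] in in_use:
        return False
    return now - image["last_pulled"] > timedelta(days=retention_days)
```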

3.7 Quota Management

We track resource consumption in “core‑hour” units, generate hourly reports, and compare usage against quotas. Alerts are sent at 60 %, 80 %, 90 % and 100 % thresholds via email, WeChat, SMS, and phone. Elastic borrowing allows lower‑level services to borrow quota from higher‑level parents when needed.
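The accounting and alerting logic above amounts to a small amount of arithmetic; a sketch (function names are illustrative, the thresholds are from the text):

```python
THRESHOLDS = [0.60, 0.80, 0.90, 1.00]

def core_hours(cpu_cores: float, hours: float) -> float:
    """Consumption unit: cores reserved multiplied by wall-clock hours."""
    return cpu_cores * hours

def crossed_thresholds(prev_used: float, used: float, quota: float):
    """Thresholds newly crossed since the previous hourly report;
    each crossing fans out to email/WeChat/SMS/phone alerts."""
    return [t for t in THRESHOLDS if prev_used / quota < t <= used / quota]
```

Comparing against the previous report rather than the absolute ratio means each threshold fires exactly once as usage climbs.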

4. Outlook

Future work includes deeper performance mining using eBPF, NUMA‑aware scheduling, and integrating big‑data and AI workloads with cloud‑native stacks to further reduce costs and improve efficiency.

5. Summary

Through extensive cloud‑native practices—node overselling, custom autoscaling, priority‑based QoS, intelligent cleanup, and quota management—Xingdou Cloud has achieved significant cost savings and operational efficiency, providing a reference for the industry.

cloud-native · kubernetes · resource management · autoscaling · cost optimization