KubeCost: Kubernetes-Based Resource Cost Analysis and Allocation System

KubeCost, developed by NetEase Cloud Music, is a low‑intrusion, scalable Kubernetes cost analysis system that allocates resource expenses using peak‑or‑usage billing models, supports hybrid‑multi‑cloud pricing, aggregates per‑pod CPU/memory/GPU costs, and stores data efficiently in ClickHouse for reliable, business‑oriented financial insight.

NetEase Cloud Music Tech Team

This article introduces KubeCost, a Kubernetes-based resource cost analysis tool developed by NetEase Cloud Music to address IT cost management challenges in the cloud-native era.

Background and Challenges:

Many internet companies have entered a stable development phase in which cost control is critical. IT costs typically account for about one third of total operating costs (the ratio of technology cost to human-resource cost is roughly 1:2 to 1:2.5). With the adoption of Kubernetes, containers, and DevOps practices, resource management has become more complex. NetEase Cloud Music reached peak resource utilization above 50% through containerization, oversubscription, unified scheduling, and hybrid cloud deployment, saving tens of millions annually. Challenges remain, however: because DevOps makes resources easy to obtain, resource consumption keeps growing quickly, and the "big ledger problem" (all spending recorded in one shared account) makes it hard to allocate costs to business lines and evaluate ROI.

Key Challenges Identified:

Decentralization: Traditional centralized financial budgeting is shifting to business-oriented distributed decision-making

Dynamic Changes: Cloud environments and elastic capabilities cause costs to vary with business load

Excess Waste: Easy access to resources often leads to over-provisioning

KubeCost Features:

Multiple Billing Models: Supports annual reserved and pay-as-you-go pricing. For reserved resources, costs are allocated by peak usage; for spot or low-utilization periods, costs are allocated by actual usage.

Hybrid/Multi-Cloud Support: Handles different pricing models across internal resources and public clouds (Aliyun, AWS).

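Billing Model: Follows the OpenCost specification. Core principle: allocate = max(usage, request), so a Pod is charged for whichever is larger, what it reserved or what it actually consumed. The base billing unit is 10 minutes, with window boundaries aligned to wall-clock time for stability.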
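The billing rule above can be sketched in a few lines. This is a minimal illustration, not KubeCost's actual code: the function names, the hourly unit price, and the choice of Unix-epoch seconds in UTC are all assumptions made for the example.

```python
from typing import Tuple

WINDOW_SECONDS = 600  # the 10-minute base billing unit described in the text


def align_window(ts: int) -> Tuple[int, int]:
    """Return the [start, end) billing window containing Unix timestamp ts.

    Windows start at multiples of 600 s since the epoch, i.e. at
    :00, :10, :20, ... on the UTC wall clock (an assumption of this sketch).
    """
    start = ts - ts % WINDOW_SECONDS
    return start, start + WINDOW_SECONDS


def allocate(usage: float, request: float) -> float:
    """Core principle from the text: allocate = max(usage, request)."""
    return max(usage, request)


def window_cost(usage: float, request: float, unit_price_per_hour: float) -> float:
    """Cost of one resource for one window; the hourly price is hypothetical."""
    hours = WINDOW_SECONDS / 3600
    return allocate(usage, request) * unit_price_per_hour * hours
```

Because every timestamp inside a window maps to the same boundaries, recomputing a window after a failure produces the same start and end times, which is one plausible reading of "aligned for stability".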

Supported Resource Types: CPU, memory, GPU, and more. The cost of a Pod is the sum of the costs of its individual resources (CPU, memory, and so on).
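As an illustration of the per-Pod aggregation, the sketch below sums per-item cost records into one total per Pod. The record shape (`pod`, `item`, `cost` keys) is hypothetical and only loosely mirrors the data model shown later in the article.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def pod_totals(records: Iterable[Mapping]) -> Dict[str, float]:
    """Sum the cost of every billing item (cpu, mem, gpu, ...) per Pod."""
    totals: Dict[str, float] = defaultdict(float)
    for rec in records:
        totals[rec["pod"]] += rec["cost"]
    return dict(totals)


# Hypothetical records for one billing window.
sample = [
    {"pod": "api-0", "item": "cpu", "cost": 0.5},
    {"pod": "api-0", "item": "mem", "cost": 0.25},
    {"pod": "train-0", "item": "gpu", "cost": 2.0},
    {"pod": "train-0", "item": "cpu", "cost": 0.5},
]
```

Here `pod_totals(sample)` gives one entry for `api-0` and one for `train-0`, each the sum of that Pod's item costs.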

Rich Filtering and Aggregation: Supports label-based filtering, and aggregation by namespace, cluster, and Pod labels.
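Filtering and aggregation of this kind might look like the following sketch. The function name, the record layout, and the `label:<name>` syntax for grouping by a Pod label are all invented for illustration; the article does not describe KubeCost's query interface.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Optional


def aggregate_costs(
    records: Iterable[Mapping],
    group_by: str,
    label_filter: Optional[Mapping[str, str]] = None,
) -> Dict[str, float]:
    """Sum costs per group, keeping only records whose labels match the filter.

    group_by is a record field ("namespace", "cluster") or "label:<name>"
    to group by a Pod label (a convention made up for this sketch).
    """
    wanted = label_filter or {}
    totals: Dict[str, float] = defaultdict(float)
    for rec in records:
        labels = rec["labels"]
        # Skip records that miss any requested label value.
        if any(labels.get(k) != v for k, v in wanted.items()):
            continue
        if group_by.startswith("label:"):
            key = labels.get(group_by[len("label:"):], "<unlabeled>")
        else:
            key = rec[group_by]
        totals[key] += rec["cost"]
    return dict(totals)
```

For example, `aggregate_costs(records, "label:team", {"env": "prod"})` would report production cost per team, which is the kind of business-line view the "big ledger problem" calls for.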

Architecture Design Principles:

Low Intrusion: Uses a sidecar-less, metrics-based collection approach

Reliability: 3+ replica deployment for ApiServer/etcd; Prometheus with dual backup; node failure has minimal impact

Scalability: Supports 100k+ Pods; uses ClickHouse for storage (~20 GB/month for 120k Pods at 10-minute intervals)
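As a rough sanity check on the storage figure, the sketch below works out how many rows a month of 10-minute windows produces and what on-disk size per row the quoted volume would imply. It assumes one row per Pod per window per billing item, a 30-day month, and 1 GB = 10^9 bytes; the number of billing items per Pod is not stated in the article, so it is left as a parameter.

```python
WINDOWS_PER_DAY = 24 * 60 // 10  # 144 ten-minute windows per day


def rows_per_month(pods: int, items_per_pod: int = 1, days: int = 30) -> int:
    """Rows written per month under the one-row-per-Pod-window-item assumption."""
    return pods * items_per_pod * WINDOWS_PER_DAY * days


def implied_bytes_per_row(total_bytes: float, pods: int,
                          items_per_pod: int = 1, days: int = 30) -> float:
    """Average stored bytes per row implied by a monthly storage volume."""
    return total_bytes / rows_per_month(pods, items_per_pod, days)
```

With 120,000 Pods and a single item per Pod this gives 518,400,000 rows per month, so 20 GB would correspond to a little under 39 bytes per row; with more items per Pod the implied per-row size shrinks proportionally.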

Extensibility: Plugin-based billing logic for flexibility
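One common way to make billing logic pluggable is a small registry that maps a model name to an allocation function. The sketch below is an assumption about how such a design could look, not a description of KubeCost's plugin interface; the model names `reserved` and `usage` and all function names are hypothetical.

```python
from typing import Callable, Dict

# An allocation function maps (usage, request) to the billable amount.
BillingFn = Callable[[float, float], float]

_PLUGINS: Dict[str, BillingFn] = {}


def register(name: str) -> Callable[[BillingFn], BillingFn]:
    """Decorator that registers a billing function under a model name."""
    def decorator(fn: BillingFn) -> BillingFn:
        _PLUGINS[name] = fn
        return fn
    return decorator


@register("reserved")
def reserved_billing(usage: float, request: float) -> float:
    # Assumed rule for reserved capacity: allocate = max(usage, request).
    return max(usage, request)


@register("usage")
def usage_billing(usage: float, request: float) -> float:
    # Assumed rule for usage-based periods: charge only what was consumed.
    return usage


def allocate_with(model: str, usage: float, request: float) -> float:
    """Look up the billing plugin for a model and apply it."""
    try:
        fn = _PLUGINS[model]
    except KeyError:
        raise ValueError(f"unknown billing model: {model}") from None
    return fn(usage, request)
```

Adding a new pricing scheme then means registering one more function, without touching the collection or storage code.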

Data Model:

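Billing records are stored in a ClickHouse ReplacingMergeTree table, which keeps storage compact and makes retries cheap: rows re-inserted with the same sorting key are deduplicated, keeping the one with the latest create_time.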

CREATE TABLE IF NOT EXISTS kubecost.kube_billing_infos
(
    create_time        Int64 COMMENT 'record create time',
    start_time         Int64 COMMENT 'billing start time',
    end_time           Int64 COMMENT 'billing end time',
    item               String COMMENT 'billing item, example: cpu, mem, gpu, etc',
    cost               Float64 COMMENT 'billing cost',
    currency           String COMMENT 'billing currency',
    entity_primary_key String COMMENT 'entity primary key, cluster/namespace/pod/container',
    usage_info Map(String, Float64) COMMENT 'etc:usage,request,allocate',
    label_info Map(String, String) COMMENT 'basic labels',
    price_info         String COMMENT 'cost price info'
) Engine = ReplacingMergeTree(create_time)
      PARTITION BY toYYYYMM(FROM_UNIXTIME(start_time))
      ORDER BY (start_time, end_time, item, entity_primary_key)
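To see why this table design makes retries safe, the sketch below models the deduplication rule in plain Python: among rows that share the sorting key (start_time, end_time, item, entity_primary_key), only the one with the highest create_time survives. It is a model of the semantics for illustration, not ClickHouse code; in ClickHouse itself the deduplication happens during background merges within a partition, so queries that need exact results typically use `FINAL` or an `argMax`-style aggregation.

```python
from typing import Dict, Iterable, List, Mapping, Tuple


def replacing_merge(rows: Iterable[Mapping]) -> List[Mapping]:
    """Keep, for each sorting key, the row with the highest create_time.

    On equal create_time the later-inserted row wins, mirroring the
    engine's documented tie-breaking.
    """
    latest: Dict[Tuple, Mapping] = {}
    for row in rows:
        key = (row["start_time"], row["end_time"],
               row["item"], row["entity_primary_key"])
        kept = latest.get(key)
        if kept is None or row["create_time"] >= kept["create_time"]:
            latest[key] = row
    return list(latest.values())


# A billing window written once, then recomputed and written again (a retry).
first = {"create_time": 100, "start_time": 0, "end_time": 600,
         "item": "cpu", "entity_primary_key": "c1/ns/pod-a/app", "cost": 0.5}
retry = dict(first, create_time=200, cost=0.75)
other = dict(first, item="mem", cost=0.25)
```

Calling `replacing_merge([first, retry, other])` leaves two rows: the retried `cpu` record with cost 0.75 and the untouched `mem` record, so a rerun overwrites rather than double-counts.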