
Master Kubernetes Capacity Planning: Detect & Optimize Unused Resources

This guide explains Kubernetes capacity planning, showing how to detect idle CPU and memory, identify wasteful namespaces, use open‑source tools like kube‑state‑metrics and cAdvisor, and apply PromQL queries to optimize resource requests and measure the impact of your improvements.


Kubernetes capacity planning is a perennial challenge for infrastructure engineers: it is hard to know how much CPU and memory a workload actually needs, and therefore what requests and limits to set.

You may over‑provision resources to ensure containers don’t run out of memory or hit CPU limits, which can lead to unnecessary cloud costs and harder scheduling. Balancing cluster stability, reliability, and efficient resource use is why capacity planning matters.

This article shows how to identify unused resources and allocate cluster capacity wisely.

Don’t Be a Greedy Developer

Sometimes containers request more resources than they need. A single container may have little impact, but when many containers over-request, the waste adds up to significant cost in large clusters.

Oversized Pods also make scheduling harder.

Two open‑source tools can help with Kubernetes capacity planning:

kube‑state‑metrics – an add‑on exporter that generates and exposes cluster‑level metrics.

cAdvisor – a resource usage analyzer for containers.

Running these tools in your cluster lets you avoid under‑utilization and adjust resource allocations.

How to Detect Under‑Utilized Resources

CPU

CPU is one of the hardest thresholds to tune: set requests too low and you starve the service of compute; set them too high and node capacity sits idle.

Detect Idle CPU

Using the metrics container_cpu_usage_seconds_total (from cAdvisor) and kube_pod_container_resource_requests (from kube-state-metrics), you can compare actual CPU usage against requested cores. The following query sums the idle (requested but unused) cores across the cluster:

sum((rate(container_cpu_usage_seconds_total{container!="POD",container!=""}[30m]) - on (namespace,pod,container) group_left avg by (namespace,pod,container)(kube_pod_container_resource_requests{resource="cpu"})) * -1 > 0)
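The arithmetic behind this query can be sketched in plain Python (the container names and figures below are invented for illustration, not real metrics):

```python
# Sketch of the query's logic: per container, idle CPU is
# (requested cores - used cores), counted only when positive, then summed.
cpu_requested = {"api": 2.0, "worker": 1.0, "cache": 0.5}    # requested cores
cpu_used      = {"api": 0.5, "worker": 1.25, "cache": 0.25}  # 30m usage rate

idle_per_container = {
    name: max(cpu_requested[name] - cpu_used[name], 0.0)
    for name in cpu_requested
}
total_idle_cores = sum(idle_per_container.values())
print(total_idle_cores)  # 1.75 idle cores across this toy "cluster"
```

Note that "worker" contributes nothing: it uses more than it requests, so (like the `> 0` filter in the query) it is excluded rather than allowed to offset waste elsewhere.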

Identify Namespaces Wasting CPU

Aggregating the previous query by namespace gives finer-grained insight, allowing you to hold teams accountable for over-provisioned workloads.

sum by (namespace)((rate(container_cpu_usage_seconds_total{container!="POD",container!=""}[30m]) - on (namespace,pod,container) group_left avg by (namespace,pod,container)(kube_pod_container_resource_requests{resource="cpu"})) * -1 > 0)

Top 10 CPU‑Hungry Containers

Use the topk function to list the containers with the highest CPU waste.

topk(10,sum by (namespace,pod,container)((rate(container_cpu_usage_seconds_total{container!="POD",container!=""}[30m]) - on (namespace,pod,container) group_left avg by (namespace,pod,container)(kube_pod_container_resource_requests{resource="cpu"})) * -1 > 0))
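The topk ranking can be sketched with Python's heapq.nlargest over per-container idle cores (labels and figures invented for illustration):

```python
import heapq

# topk(10, ...) keeps only the largest series; nlargest does the same
# over a dict keyed by (namespace, pod, container).
idle_cores = {
    ("team-a", "api-7f9", "app"):     1.5,
    ("team-b", "etl-12c", "job"):     0.75,
    ("team-a", "cache-d41", "redis"): 0.25,
}
top2 = heapq.nlargest(2, idle_cores.items(), key=lambda kv: kv[1])
for (namespace, pod, container), idle in top2:
    print(f"{namespace}/{pod}/{container}: {idle} idle cores")
```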

Memory

Proper memory planning is crucial; high usage can trigger OOM eviction, while over‑provisioning reduces the number of Pods per node.

Detect Unused Memory

The metrics container_memory_usage_bytes and kube_pod_container_resource_requests reveal how much requested memory goes unused. The following query reports the cluster-wide total in GiB:

sum((container_memory_usage_bytes{container!="POD",container!=""} - on (namespace,pod,container) avg by (namespace,pod,container)(kube_pod_container_resource_requests{resource="memory"})) * -1 > 0) / (1024*1024*1024)

In this example, about 0.8 GB of requested memory is going unused across the cluster and could be reclaimed.

Identify Namespaces Wasting Memory

Aggregate by namespace similarly to CPU.

sum by (namespace)((container_memory_usage_bytes{container!="POD",container!=""} - on (namespace,pod,container) avg by (namespace,pod,container)(kube_pod_container_resource_requests{resource="memory"})) * -1 > 0) / (1024*1024*1024)
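The per-namespace aggregation and the byte-to-GiB conversion can be sketched as follows (namespaces and byte counts are invented for illustration):

```python
from collections import defaultdict

GIB = 1024 ** 3  # the query divides by 1024*1024*1024 to report GiB

# Invented per-container figures: (namespace, pod, container) -> bytes
mem_requested = {
    ("team-a", "api-7f9", "app"):     2 * GIB,
    ("team-a", "cache-d41", "redis"): GIB // 2,
    ("team-b", "etl-12c", "job"):     GIB,
}
mem_used = {
    ("team-a", "api-7f9", "app"):     GIB // 2,
    ("team-a", "cache-d41", "redis"): GIB // 4,
    ("team-b", "etl-12c", "job"):     GIB + GIB // 4,  # over its request: no waste
}

wasted_gib_by_namespace = defaultdict(float)
for key, requested in mem_requested.items():
    wasted = max(requested - mem_used[key], 0)
    wasted_gib_by_namespace[key[0]] += wasted / GIB

print(dict(wasted_gib_by_namespace))  # team-a wastes 1.75 GiB, team-b none
```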

Top 10 Memory‑Heavy Containers

Again, topk highlights the ten containers (labeled by namespace and pod) that waste the most memory.

topk(10,sum by (namespace,pod,container)((container_memory_usage_bytes{container!="POD",container!=""} - on (namespace,pod,container) avg by (namespace,pod,container)(kube_pod_container_resource_requests{resource="memory"})) * -1 > 0) / (1024*1024*1024))

Optimizing Container Resource Utilization

To right-size requests while keeping enough headroom, start from current usage. The following PromQL query calculates the average CPU utilization of all containers belonging to the same workload (Deployment, StatefulSet, or DaemonSet).

avg by (namespace,owner_name,container)((rate(container_cpu_usage_seconds_total{container!="POD",container!=""}[5m])) * on(namespace,pod) group_left(owner_name) avg by (namespace,pod,owner_name)(kube_pod_owner{owner_kind=~"DaemonSet|StatefulSet|Deployment"}))

Based on experience, set container requests to 85%–115% of the average CPU or memory usage.
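That rule of thumb could be wrapped in a small helper, sketched below (recommended_request_band is a hypothetical name, not part of any tool; working in integer millicores keeps the arithmetic exact):

```python
def recommended_request_band(avg_usage_millicores, low_pct=85, high_pct=115):
    """Return the 85%-115% band around average usage, in millicores."""
    return (avg_usage_millicores * low_pct // 100,
            avg_usage_millicores * high_pct // 100)

low, high = recommended_request_band(200)  # container averaging 200m CPU
print(f"set the request between {low}m and {high}m")
```

The same band applies to memory: feed in average usage in MiB instead of millicores.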

Measuring the Impact of Optimization

After capacity‑planning actions, compare unused CPU cores now versus a week ago to assess the effect.

sum((rate(container_cpu_usage_seconds_total{container!="POD",container!=""}[30m]) - on (namespace,pod,container) group_left avg by (namespace,pod,container)(kube_pod_container_resource_requests{resource="cpu"})) * -1 > 0) - sum((rate(container_cpu_usage_seconds_total{container!="POD",container!=""}[30m] offset 1w) - on (namespace,pod,container) group_left avg by (namespace,pod,container)(kube_pod_container_resource_requests{resource="cpu"} offset 1w)) * -1 > 0)

Charted over time, the result shows fewer unused CPU cores after optimization.

Conclusion

You now understand the consequences of over‑provisioning, how to detect excessive resource allocation, set appropriate container requests, and measure the impact of your optimizations.

These techniques provide a solid foundation for building a comprehensive Kubernetes capacity‑planning dashboard.

Tags: monitoring, Kubernetes, resource optimization, capacity planning, PromQL
Written by

Open Source Linux

Focused on sharing Linux/Unix content, covering fundamentals, system development, network programming, automation/operations, cloud computing, and related professional knowledge.
