
Mastering Kubernetes Pod Resource Requests, Limits, and QoS

This guide explains how to configure CPU and memory requests and limits for Kubernetes pods, implement QoS classes, use LimitRange and ResourceQuota, and monitor resource usage with Prometheus queries and Grafana dashboards to ensure stable cluster operations.

Ops Development Stories

1. Overview

Pod CPU and Memory requests are critical parameters. If they are omitted, the scheduler treats the pod as needing no reserved resources and may place it on any node, including one that is already heavily loaded, so the pod can be starved of resources when the cluster is under pressure. When a node runs short of resources, the kubelet may evict pods, so critical pods (e.g., those handling data storage, login, or balance queries) must be protected. To that end, Kubernetes lets you:

Enforce resource quotas so different pods can only consume allocated resources.

Allow over‑provisioning to improve cluster utilization.

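Rely on QoS classes, which Kubernetes assigns to each pod based on its requests and limits (Guaranteed, Burstable, or BestEffort); when resources are scarce, BestEffort pods are generally evicted first and Guaranteed pods last.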

Kubernetes nodes provide compute resources (CPU, GPU, Memory). This article focuses on CPU and Memory, as most workloads do not need GPU.

CPU and Memory are specified per container via resources.requests and resources.limits. The scheduler uses the request values to find a node with sufficient capacity.
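
As a minimal sketch of where these fields live, the Python snippet below builds a pod manifest as a dictionary and prints it as JSON, which kubectl accepts alongside YAML. The pod name, container name, image, and all numbers are placeholders chosen for illustration, not values from a real workload.

```python
import json

# Hypothetical pod: the names, image, and quantities are illustrative only.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "demo-app"},
    "spec": {
        "containers": [
            {
                "name": "app",
                "image": "nginx:1.25",
                "resources": {
                    # The scheduler uses these values to choose a node.
                    "requests": {"cpu": "100m", "memory": "500Mi"},
                    # These ceilings are enforced on the node at runtime.
                    "limits": {"cpu": "1", "memory": "1Gi"},
                },
            }
        ]
    },
}

manifest = json.dumps(pod, indent=2)
print(manifest)
```

Saving the printed output to a file and running kubectl apply -f on it would be one way to try it out in a test namespace.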

2. Pod Resource Usage Guidelines

Pod CPU and Memory usage is dynamic and depends on load; it is expressed as a range (e.g., 0.1‑1 CPU, 500 Mi‑1 Gi memory). Two key concepts:

Requests – reserved resources required for normal operation.

Limits – maximum resources a container may consume. CPU is a compressible resource: a container that reaches its CPU limit is throttled, not killed. Memory is incompressible, so its limit is a hard cap.

If a container exceeds its Memory limit, it is OOM-killed and then restarted according to the pod's restart policy. Therefore, Requests and Limits must be set carefully based on actual workload needs.

Example: a pod with a 1 Gi Memory request is scheduled on a node with 1.2 Gi free, which leaves about 200 Mi spare. After three days the pod needs 1.5 Gi, 0.5 Gi above its request, but the node has only that 200 Mi left, so the node runs out of memory and the pod is killed.

Pods without Limits (or with a limit on only one of CPU and Memory) appear flexible but are less stable than pods with all four parameters set: CPU request, CPU limit, Memory request, and Memory limit.
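
The link between these four parameters and stability is the QoS class that Kubernetes derives from them. The function below is a simplified sketch of that rule, not the actual Kubernetes implementation: it takes one resources dictionary per container and compares quantities as literal strings, so "1" and "1000m" would wrongly count as different. The function name qos_class is hypothetical.

```python
def qos_class(containers):
    """Return the QoS class for a list of per-container `resources` dicts."""
    any_set = False
    guaranteed = True
    for resources in containers:
        requests = resources.get("requests", {})
        limits = resources.get("limits", {})
        if requests or limits:
            any_set = True
        for name in ("cpu", "memory"):
            limit = limits.get(name)
            # When only a limit is given, the request defaults to the limit.
            request = requests.get(name, limit)
            # Simplification: quantities are compared as plain strings.
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"  # nothing set on any container
    return "Guaranteed" if guaranteed else "Burstable"


full = {"requests": {"cpu": "100m", "memory": "500Mi"},
        "limits": {"cpu": "100m", "memory": "500Mi"}}
partial = {"requests": {"cpu": "100m", "memory": "500Mi"},
           "limits": {"memory": "1Gi"}}

print(qos_class([full]))     # all four set, requests equal to limits
print(qos_class([partial]))  # CPU limit missing
print(qos_class([{}]))       # nothing set
```

In this sketch, only a pod whose containers all set CPU and Memory limits, with requests equal to those limits, reaches Guaranteed.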

When managing hundreds of pods, manually setting Requests and Limits for each is impractical. Kubernetes provides LimitRange (default values and validation) and ResourceQuota (tenant‑level caps) to automate this.
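
A LimitRange that supplies defaults might look like the sketch below, again built as a Python dictionary and printed as JSON for kubectl. The object name, the namespace "demo", and every quantity are assumptions made for illustration; the field names (defaultRequest, default, max) are the standard LimitRange fields.

```python
import json

# Hypothetical LimitRange: name, namespace, and quantities are illustrative.
limit_range = {
    "apiVersion": "v1",
    "kind": "LimitRange",
    "metadata": {"name": "container-defaults", "namespace": "demo"},
    "spec": {
        "limits": [
            {
                "type": "Container",
                # Applied when a container declares no requests.
                "defaultRequest": {"cpu": "100m", "memory": "100Mi"},
                # Applied when a container declares no limits.
                "default": {"cpu": "120m", "memory": "120Mi"},
                # Containers asking for more than this are rejected.
                "max": {"cpu": "1", "memory": "1Gi"},
            }
        ]
    },
}

manifest = json.dumps(limit_range, indent=2)
print(manifest)
```

With such an object in place, containers created in that namespace without their own values would pick up the defaults, so not every manifest has to spell out all four parameters.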

CPU Rules

Unit: millicores (m), where 10 m = 0.01 core, 1 core = 1000 m.

Requests: estimated based on actual usage.

Limits: Requests * 1.2 (i.e., Requests + 20%).

Memory Rules

Unit: Mi, where 1024 Mi = 1 Gi.

Requests: estimated based on actual usage.

Limits: Requests * 1.2.
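
The two rules above can be sketched as a small helper that turns a request into a suggested limit. The function names are hypothetical, the parsers handle only the m, Mi, and Gi forms used in this article, and rounding up to a whole unit is an assumption the article does not specify.

```python
def cpu_to_millicores(quantity):
    """Convert '250m' or '1' (whole cores) to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(quantity) * 1000


def memory_to_mi(quantity):
    """Convert '512Mi' or '1Gi' to Mi."""
    if quantity.endswith("Gi"):
        return int(quantity[:-2]) * 1024
    if quantity.endswith("Mi"):
        return int(quantity[:-2])
    raise ValueError(f"unsupported memory quantity: {quantity}")


def with_headroom(value):
    """Return value * 1.2 rounded up, using integer arithmetic only."""
    return (value * 12 + 9) // 10


def suggest_limits(cpu_request, memory_request):
    """Apply the Requests * 1.2 rule to both resources."""
    cpu = with_headroom(cpu_to_millicores(cpu_request))
    memory = with_headroom(memory_to_mi(memory_request))
    return {"cpu": f"{cpu}m", "memory": f"{memory}Mi"}


print(suggest_limits("250m", "512Mi"))  # {'cpu': '300m', 'memory': '615Mi'}
```

Integer arithmetic is used here so that the result does not depend on floating-point rounding.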

3. Namespace Resource Management

Total Requests and Limits across all namespaces should stay at or below 80% of cluster capacity, leaving headroom for rolling updates.
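
As a rough illustration of the 80% rule, the sketch below checks whether the CPU requests planned for each namespace fit within the budget. The cluster size (32 cores), the namespace names, and their figures are invented for the example; all values are in millicores.

```python
def budget(capacity, percent=80):
    """Return the share of capacity that may be handed out."""
    return capacity * percent // 100


def fits_budget(namespace_requests, capacity, percent=80):
    """True when the summed requests stay within the budget."""
    return sum(namespace_requests.values()) <= budget(capacity, percent)


cluster_cpu = 32000  # hypothetical cluster: 32 cores, in millicores
planned = {"team-a": 12000, "team-b": 10000}

print(budget(cluster_cpu))                # 25600
print(fits_budget(planned, cluster_cpu))  # True

planned["team-c"] = 4000                  # total becomes 26000
print(fits_budget(planned, cluster_cpu))  # False
```

The same check applies to Memory by substituting Mi values for millicores.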

3.1 Multi‑Tenant Resource Strategy

Use ResourceQuota to limit resource consumption per project/team.

ResourceQuota diagram
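
For a concrete picture, the sketch below builds a ResourceQuota manifest as a Python dictionary and prints it as JSON. It reuses the quota name, namespace, and hard values that appear in the kubectl describe output in section 4.3; that the original quota was defined exactly this way is an assumption.

```python
import json

# Values mirror the "Hard" column shown in section 4.3.
resource_quota = {
    "apiVersion": "v1",
    "kind": "ResourceQuota",
    "metadata": {"name": "mem-cpu-demo", "namespace": "cloudchain--staging"},
    "spec": {
        "hard": {
            "requests.cpu": "250m",
            "requests.memory": "250Mi",
            "limits.cpu": "500m",
            "limits.memory": "500Mi",
        }
    },
}

manifest = json.dumps(resource_quota, indent=2)
print(manifest)
```

Once a quota covers requests and limits, every pod created in the namespace has to declare them, either directly or through LimitRange defaults.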

3.2 Resource Change Process

Resource change workflow

4. Resource Monitoring and Inspection

4.1 Resource Usage Monitoring

Namespace Requests usage rate

sum (kube_resourcequota{type="used",resource="requests.cpu"}) by (resource,namespace) / sum (kube_resourcequota{type="hard",resource="requests.cpu"}) by (resource,namespace) * 100

sum (kube_resourcequota{type="used",resource="requests.memory"}) by (resource,namespace) / sum (kube_resourcequota{type="hard",resource="requests.memory"}) by (resource,namespace) * 100

Namespace Limits usage rate

sum (kube_resourcequota{type="used",resource="limits.cpu"}) by (resource,namespace) / sum (kube_resourcequota{type="hard",resource="limits.cpu"}) by (resource,namespace) * 100

sum (kube_resourcequota{type="used",resource="limits.memory"}) by (resource,namespace) / sum (kube_resourcequota{type="hard",resource="limits.memory"}) by (resource,namespace) * 100

4.2 Viewing via Grafana

Grafana dashboard

CPU request rate

sum (kube_resourcequota{type="used",resource="requests.cpu",namespace=~"$NameSpace"}) by (resource,namespace) / sum (kube_resourcequota{type="hard",resource="requests.cpu",namespace=~"$NameSpace"}) by (resource,namespace)

Memory request rate

sum (kube_resourcequota{type="used",resource="requests.memory",namespace=~"$NameSpace"}) by (resource,namespace) / sum (kube_resourcequota{type="hard",resource="requests.memory",namespace=~"$NameSpace"}) by (resource,namespace)

CPU limit rate

sum (kube_resourcequota{type="used",resource="limits.cpu",namespace=~"$NameSpace"}) by (resource,namespace) / sum (kube_resourcequota{type="hard",resource="limits.cpu",namespace=~"$NameSpace"}) by (resource,namespace)

Memory limit rate

sum (kube_resourcequota{type="used",resource="limits.memory",namespace=~"$NameSpace"}) by (resource,namespace) / sum (kube_resourcequota{type="hard",resource="limits.memory",namespace=~"$NameSpace"}) by (resource,namespace)

4.3 In‑Cluster Resource Inspection

Check resource usage

[root@k8s-dev-slave04 yaml]# kubectl describe resourcequotas -n cloudchain--staging

Name:            mem-cpu-demo
Namespace:       cloudchain--staging
Resource         Used   Hard
--------         ----   ----
limits.cpu       200m   500m
limits.memory    200Mi  500Mi
requests.cpu     150m   250m
requests.memory  150Mi  250Mi
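
The Used and Hard columns above give the same usage rate that the PromQL queries in section 4.1 compute (used / hard * 100). The sketch below parses the table and works the rate out by hand; the function names are hypothetical, and the parser only handles the m and Mi suffixes that appear in this output.

```python
DESCRIBE_OUTPUT = """\
Resource         Used   Hard
--------         ----   ----
limits.cpu       200m   500m
limits.memory    200Mi  500Mi
requests.cpu     150m   250m
requests.memory  150Mi  250Mi
"""


def to_number(quantity):
    """Strip an 'Mi' or 'm' suffix and return the integer part."""
    for suffix in ("Mi", "m"):
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)])
    raise ValueError(f"unsupported quantity: {quantity}")


def usage_rates(text):
    """Map each resource to used / hard * 100."""
    rates = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        if not line.strip():
            continue
        resource, used, hard = line.split()
        rates[resource] = 100 * to_number(used) / to_number(hard)
    return rates


for resource, rate in usage_rates(DESCRIBE_OUTPUT).items():
    print(f"{resource}: {rate:.0f}%")
```

For this namespace the limits are 40% used and the requests 60% used.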

Check events for quota violations

[root@kevin ~]# kubectl get event -n default

LAST SEEN   TYPE      REASON         OBJECT                          MESSAGE
46m         Warning   FailedCreate   replicaset/hpatest-57965d8c84   Error creating: pods "hpatest-57965d8c84-s78x6" is forbidden: exceeded quota: mem-cpu-demo, requested: limits.cpu=400m,limits.memory=400Mi, used: limits.cpu=200m,limits.memory=200Mi, limited: limits.cpu=500m,limits.memory=500Mi
... (additional similar events) ...
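
The arithmetic behind this event can be reproduced with a short sketch: a request is rejected for any resource where used plus requested goes above the hard cap. The function name is hypothetical and this is not the quota admission code itself; CPU values are in millicores and Memory values in Mi, taken from the event message.

```python
def exceeded_resources(used, hard, requested):
    """Return the resources for which used + requested is above the hard cap."""
    return [
        name
        for name, amount in requested.items()
        if name in hard and used.get(name, 0) + amount > hard[name]
    ]


used = {"limits.cpu": 200, "limits.memory": 200}
hard = {"limits.cpu": 500, "limits.memory": 500}
requested = {"limits.cpu": 400, "limits.memory": 400}

print(exceeded_resources(used, hard, requested))
# ['limits.cpu', 'limits.memory']
```

Here 200 + 400 = 600 is above 500 for both resources, which matches the "exceeded quota" message. A pod asking for 300m and 300Mi would have fit exactly.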
Written by

Ops Development Stories

Maintained by a like‑minded team, covering both operations and development. Topics span Linux ops, DevOps toolchain, Kubernetes containerization, monitoring, log collection, network security, and Python or Go development. Team members: Qiao Ke, wanger, Dong Ge, Su Xin, Hua Zai, Zheng Ge, Teacher Xia.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.