Mastering Kubernetes Pod Resource Requests, Limits, and QoS
Learn how to properly configure CPU and Memory requests and limits for Kubernetes Pods, understand QoS classes, manage namespace quotas with LimitRange and ResourceQuota, and monitor resource usage using Prometheus queries and Grafana dashboards to ensure stable, efficient cluster operations.
1. Overview
CPU and Memory requests matter for every Pod. If they are not set, the scheduler treats the Pod as needing almost nothing and may place it on any node, which can cause resource starvation when the cluster is under pressure.
When resources become scarce, a node may evict Pods, so critical Pods (e.g., data storage, login, balance query) must be protected. Kubernetes achieves this through resource quotas, over‑provisioning, and QoS classes (Guaranteed, Burstable, BestEffort) that give higher‑priority Pods better guarantees.
The cluster’s compute resources include CPU, GPU, and Memory, but most workloads only need CPU and Memory, which are the focus here.
CPU and Memory are specified per container via resources.requests and resources.limits. The scheduler uses the request values to find a node with sufficient unreserved capacity; if no node qualifies, the Pod cannot be scheduled and stays Pending.
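The request-based placement described above can be sketched roughly as follows. This is a simplified illustration, not the real scheduler: the Node type, the node names, and all figures are hypothetical, and only the sum-of-requests check is modeled.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical node model: allocatable capacity and requests already placed."""
    name: str
    allocatable_cpu_m: int       # millicores
    allocatable_mem_mi: int      # MiB
    requested_cpu_m: int = 0
    requested_mem_mi: int = 0

    def fits(self, cpu_m: int, mem_mi: int) -> bool:
        # Only requests count toward placement; limits and live usage are ignored.
        return (self.requested_cpu_m + cpu_m <= self.allocatable_cpu_m
                and self.requested_mem_mi + mem_mi <= self.allocatable_mem_mi)


def feasible_nodes(nodes, cpu_m, mem_mi):
    """Return the names of nodes whose unreserved capacity covers the Pod's requests."""
    return [n.name for n in nodes if n.fits(cpu_m, mem_mi)]


nodes = [
    Node("node-a", 2000, 4096, requested_cpu_m=1900, requested_mem_mi=1024),
    Node("node-b", 2000, 4096, requested_cpu_m=500, requested_mem_mi=3900),
    Node("node-c", 2000, 4096, requested_cpu_m=500, requested_mem_mi=1024),
]

print(feasible_nodes(nodes, cpu_m=250, mem_mi=512))    # ['node-c']
print(feasible_nodes(nodes, cpu_m=3000, mem_mi=512))   # [] -> the Pod would stay Pending
```

Note that node-a is rejected for CPU and node-b for Memory even if their live usage is low, because the check only looks at what has been requested.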
2. Pod Resource Usage Guidelines
Pod CPU and Memory usage is dynamic and depends on load; it is expressed as a range (e.g., 0.1‑1 CPU, 500 Mi‑1 Gi memory). The two key parameters are:
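Requests: the amount of resources reserved for normal operation; the scheduler uses this value for placement.
Limits: the maximum resources a container may consume. CPU is a compressible resource, so a container that reaches its CPU limit is throttled; Memory is not compressible, so its limit is hard and a container that exceeds it is killed.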
If a Memory limit is set too low, the container can be killed when it exceeds the limit. Conversely, omitting limits makes the Pod’s usage elastic but less predictable.
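How requests and limits translate into a QoS class can be sketched as below. The article names QoS classes without listing the rules, so this follows the classification as documented by Kubernetes (Guaranteed, Burstable, BestEffort) in simplified form: the function name and the dict layout are assumptions for illustration, and quantities are compared as plain strings.

```python
def qos_class(containers):
    """Classify a Pod from its containers' resources (simplified sketch).

    Each container is a dict such as:
        {"requests": {"cpu": "100m", "memory": "128Mi"},
         "limits":   {"cpu": "100m", "memory": "128Mi"}}
    Quantities are compared as strings, so "1" and "1000m" count as different here.
    """
    any_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        if requests or limits:
            any_set = True
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # When only a limit is given, the request defaults to the limit.
            request = requests.get(resource, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


same = {"cpu": "500m", "memory": "256Mi"}
print(qos_class([{"requests": same, "limits": same}]))      # Guaranteed
print(qos_class([{"requests": {"cpu": "100m"}}]))           # Burstable
print(qos_class([{}]))                                      # BestEffort
```

A Pod that omits limits therefore ends up as Burstable or BestEffort, which matches the trade-off above: more elastic, but with weaker guarantees under pressure.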
When many Pods exist, manually setting all four parameters (CPU request/limit, Memory request/limit) is impractical. Kubernetes provides LimitRange to supply default and allowed values, and ResourceQuota to cap total usage per namespace.
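The way a LimitRange fills in missing values can be approximated as follows. This is a sketch under stated assumptions, not the admission controller itself: the function name, the dict keys ("default", "default_request", "min", "max"), and all figures are hypothetical, and quantities are plain integers (CPU in millicores, Memory in MiB).

```python
def apply_limit_range(resources, limit_range):
    """Fill in missing requests/limits the way a LimitRange does (simplified)."""
    requests = dict(resources.get("requests", {}))
    limits = dict(resources.get("limits", {}))
    for name in ("cpu", "memory"):
        had_limit = name in limits
        if not had_limit:
            limits[name] = limit_range["default"][name]
        if name not in requests:
            # A container that sets only a limit gets request == limit;
            # otherwise the namespace default request applies.
            requests[name] = limits[name] if had_limit else limit_range["default_request"][name]
        if requests[name] > limits[name]:
            raise ValueError(f"{name} request exceeds its limit")
        if requests[name] < limit_range["min"][name] or limits[name] > limit_range["max"][name]:
            raise ValueError(f"{name} is outside the range allowed by the LimitRange")
    return {"requests": requests, "limits": limits}


limit_range = {
    "default": {"cpu": 500, "memory": 512},          # default limits
    "default_request": {"cpu": 250, "memory": 256},  # default requests
    "min": {"cpu": 50, "memory": 64},
    "max": {"cpu": 1000, "memory": 1024},
}

print(apply_limit_range({}, limit_range))
print(apply_limit_range({"limits": {"cpu": 800}}, limit_range))
```

In the second call the container sets only a CPU limit, so its CPU request follows the limit (800) while Memory falls back to the namespace defaults.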
CPU Rules
Units are millicores (m), where 10 m = 0.01 core and 1 core = 1000 m.
Requests are estimated from actual business usage.
Limits are calculated as Requests * 1.2 (i.e., 20 % overhead).
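The CPU rules above can be turned into a small helper. This is a minimal sketch: the function names are made up for illustration, and only the two common notations (a millicore string such as "250m" or a decimal core count such as "1.5") are handled.

```python
def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "250m" or "1.5" to integer millicores."""
    quantity = quantity.strip()
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def cpu_limit_for(request: str, overhead_percent: int = 20) -> str:
    """Apply the Requests * 1.2 guideline and return a millicore string."""
    millicores = cpu_to_millicores(request)
    # Integer ceiling division keeps the result exact and never rounds below the target.
    limit = (millicores * (100 + overhead_percent) + 99) // 100
    return f"{limit}m"


print(cpu_to_millicores("0.01"))  # 10
print(cpu_limit_for("250m"))      # 300m
print(cpu_limit_for("1.5"))       # 1800m
```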
Memory Rules
Units are mebibytes (Mi), where 1024 Mi = 1 Gi.
Requests are estimated from actual usage.
Limits follow the same 20 % overhead formula: Requests * 1.2.
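The same calculation for Memory might look like this. It is a sketch that assumes only the Mi and Gi suffixes are used; the helper names are hypothetical, and the result is rounded up to a whole Mi.

```python
_MI_PER_UNIT = {"Mi": 1, "Gi": 1024}


def memory_to_mi(quantity: str) -> int:
    """Convert a Memory quantity such as "500Mi" or "1Gi" to whole mebibytes."""
    quantity = quantity.strip()
    for suffix, factor in _MI_PER_UNIT.items():
        if quantity.endswith(suffix):
            return round(float(quantity[: -len(suffix)]) * factor)
    raise ValueError(f"unsupported memory quantity: {quantity!r}")


def memory_limit_for(request: str, overhead_percent: int = 20) -> str:
    """Apply the Requests * 1.2 guideline and return a Mi string, rounded up."""
    mi = memory_to_mi(request)
    limit = (mi * (100 + overhead_percent) + 99) // 100
    return f"{limit}Mi"


print(memory_to_mi("1Gi"))        # 1024
print(memory_limit_for("500Mi"))  # 600Mi
print(memory_limit_for("1Gi"))    # 1229Mi
```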
3. Namespace Resource Management Standards
The combined requests and limits of business workloads should not exceed 80 % of the total namespace quota, so that headroom remains for rolling updates.
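The 80 % guideline can be checked with a few lines of arithmetic. The sketch below uses illustrative figures (CPU in millicores, Memory in MiB); the function names and the sample quota are assumptions, not part of any Kubernetes API.

```python
def usage_percent(used: int, hard: int) -> float:
    """Share of a namespace quota already consumed, in percent."""
    return used * 100 / hard


def has_rollout_headroom(used: int, hard: int, threshold_percent: int = 80) -> bool:
    """True when usage stays at or below the 80 % guideline."""
    return used * 100 <= hard * threshold_percent


# Hypothetical namespace quota: resource -> (used, hard)
quota = {
    "requests.cpu": (150, 250),
    "requests.memory": (150, 250),
    "limits.cpu": (450, 500),
    "limits.memory": (200, 500),
}

for resource, (used, hard) in quota.items():
    status = "ok" if has_rollout_headroom(used, hard) else "above 80 %"
    print(f"{resource}: {usage_percent(used, hard):.0f}% ({status})")
```

With these figures only limits.cpu is flagged, since 450 of 500 is 90 %; a rolling update that briefly runs old and new Pods side by side would likely hit the quota there first.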
3.1 Multi‑Tenant Resource Strategy
Use ResourceQuota to limit the resources a project team can consume.
3.2 Resource Change Process
4. Resource Monitoring and Inspection
4.1 Resource Usage Monitoring
Namespace Requests usage rate
sum (kube_resourcequota{type="used",resource="requests.cpu"}) by (resource,namespace) / sum (kube_resourcequota{type="hard",resource="requests.cpu"}) by (resource,namespace) * 100
sum (kube_resourcequota{type="used",resource="requests.memory"}) by (resource,namespace) / sum (kube_resourcequota{type="hard",resource="requests.memory"}) by (resource,namespace) * 100
Namespace Limits usage rate
sum (kube_resourcequota{type="used",resource="limits.cpu"}) by (resource,namespace) / sum (kube_resourcequota{type="hard",resource="limits.cpu"}) by (resource,namespace) * 100
sum (kube_resourcequota{type="used",resource="limits.memory"}) by (resource,namespace) / sum (kube_resourcequota{type="hard",resource="limits.memory"}) by (resource,namespace) * 100
4.2 Viewing via Grafana
CPU request rate
sum (kube_resourcequota{type="used",resource="requests.cpu",namespace=~"$NameSpace"}) by (resource,namespace) / sum (kube_resourcequota{type="hard",resource="requests.cpu",namespace=~"$NameSpace"}) by (resource,namespace)
Memory request rate
sum (kube_resourcequota{type="used",resource="requests.memory",namespace=~"$NameSpace"}) by (resource,namespace) / sum (kube_resourcequota{type="hard",resource="requests.memory",namespace=~"$NameSpace"}) by (resource,namespace)
CPU limit rate
sum (kube_resourcequota{type="used",resource="limits.cpu"}) by (resource,namespace) / sum (kube_resourcequota{type="hard",resource="limits.cpu"}) by (resource,namespace)
Memory limit rate
sum (kube_resourcequota{type="used",resource="limits.memory"}) by (resource,namespace) / sum (kube_resourcequota{type="hard",resource="limits.memory"}) by (resource,namespace)
4.3 In‑Cluster Resource Inspection
Check resource usage
[root@k8s-dev-slave04 yaml]# kubectl describe resourcequotas -n cloudchain--staging
Name:            mem-cpu-demo
Namespace:       cloudchain--staging
Resource         Used   Hard
--------         ----   ----
limits.cpu       200m   500m
limits.memory    200Mi  500Mi
requests.cpu     150m   250m
requests.memory  150Mi  250Mi
Check events for quota violations
[root@kevin ~]# kubectl get event -n default
LAST SEEN TYPE REASON OBJECT MESSAGE
46m Warning FailedCreate replicaset/hpatest-57965d8c84 Error creating: pods "hpatest-57965d8c84-s78x6" is forbidden: exceeded quota: mem-cpu-demo, requested: limits.cpu=400m,limits.memory=400Mi, used: limits.cpu=200m,limits.memory=200Mi, limited: limits.cpu=500m,limits.memory=500Mi
... (additional similar lines) ...
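The arithmetic behind the FailedCreate event can be reproduced with a short check. This is only a sketch of the comparison that the "exceeded quota" message reports, not the actual admission code; the function name is hypothetical, and the figures are taken from the event (CPU in millicores, Memory in MiB).

```python
def exceeded_resources(requested, used, hard):
    """Return the sorted quota keys that a new Pod would push past the hard cap."""
    return sorted(
        name for name, amount in requested.items()
        if name in hard and used.get(name, 0) + amount > hard[name]
    )


# Figures from the event: requested 400, used 200, limited to 500.
requested = {"limits.cpu": 400, "limits.memory": 400}
used = {"limits.cpu": 200, "limits.memory": 200}
hard = {"limits.cpu": 500, "limits.memory": 500}

print(exceeded_resources(requested, used, hard))   # ['limits.cpu', 'limits.memory']

# A smaller Pod fits: 200 + 300 <= 500 for both resources.
print(exceeded_resources({"limits.cpu": 300, "limits.memory": 300}, used, hard))   # []
```

Because 200 + 400 exceeds 500 for both resources, the ReplicaSet cannot create the Pod until usage drops or the quota is raised.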
Ops Development Stories
Maintained by a like‑minded team, covering both operations and development. Topics span Linux ops, DevOps toolchain, Kubernetes containerization, monitoring, log collection, network security, and Python or Go development. Team members: Qiao Ke, wanger, Dong Ge, Su Xin, Hua Zai, Zheng Ge, Teacher Xia.
