
Understanding Kubernetes CPU Requests vs Limits: The Secrets of Overselling

This article explains how Kubernetes uses CPU requests and limits to implement overselling, detailing the underlying Linux cgroup mechanisms, bandwidth throttling, weight‑based scheduling, and practical configuration tips for SREs to balance guaranteed resources with maximum usage.

IT Services Circle

1. Container Cloud Technology Stack

Kubernetes (K8s) is the dominant container orchestration platform and works with Docker and other container runtimes. The stack is built on Linux kernel features such as cgroups and namespaces, together with a per-container root filesystem (rootfs); cgroups are the part that handles CPU resource limits.

[Figure: Container Cloud Stack]

2. Container CPU Resource Control

2.1 CPU Bandwidth Limiting

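CPU limits are enforced through the cpu cgroup controller. In cgroup v1 the two controls are cpu.cfs_period_us (the length of the accounting period) and cpu.cfs_quota_us (the CPU time allowed within each period); cgroup v2 combines both values in cpu.max. Example for limiting a process to two cores on cgroup v1: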

# cd /sys/fs/cgroup/cpu,cpuacct
# mkdir test
# cd test
# echo 100000 > cpu.cfs_period_us   # period: 100 ms
# echo 200000 > cpu.cfs_quota_us    # quota: 200 ms of CPU time per period
# echo $pid > cgroup.procs          # $pid: PID of the process to limit

This configuration allows 200 ms of CPU time in every 100 ms period, which caps usage at two cores' worth of CPU time. The kernel scheduler uses a periodic timer (period_timer) to refill the runtime quota:

sched_cfs_period_timer
  -> do_sched_cfs_period_timer
    -> __refill_cfs_bandwidth_runtime

The refill function sets cfs_b->runtime = cfs_b->quota, granting the configured CPU time for the next period.

When a task group exhausts its runtime, the scheduler invokes throttle_cfs_rq, removing the task group from the run queue until more time is allocated.
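
The refill-and-throttle cycle described above can be sketched as a toy simulation. This is only an illustrative model, not kernel code: the function names and the idea of carrying unfinished work over as a backlog are assumptions made for the example.

```python
def cpu_cap_in_cores(quota_us, period_us):
    """Upper bound on CPU usage implied by quota/period, in cores."""
    return quota_us / period_us


def simulate_throttling(quota_us, demands_us):
    """Toy model of CFS bandwidth control for one task group.

    demands_us holds the CPU time (in microseconds) the group newly wants
    in each period. At every period boundary the budget is reset to the
    quota, mirroring cfs_b->runtime = cfs_b->quota. Work that does not fit
    is carried over (hypothetical backlog) and the period counts as throttled.
    """
    backlog = 0
    history = []
    for demand in demands_us:
        runtime = quota_us                 # refill at the period boundary
        wanted = backlog + demand
        used = min(wanted, runtime)        # cannot run past the budget
        backlog = wanted - used            # leftover waits for the next refill
        history.append((used, backlog > 0))
    return history


if __name__ == "__main__":
    print("cap in cores:", cpu_cap_in_cores(200_000, 100_000))
    for used, throttled in simulate_throttling(200_000, [150_000, 300_000, 50_000]):
        print(f"used={used} us, throttled={throttled}")
```

In the second period the group wants more than its 200 ms budget, so it is marked as throttled and the remainder runs only after the next refill.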

[Figure: CPU Throttling]

2.2 CPU Weight Allocation

Beyond bandwidth limiting, the kernel can allocate CPU proportionally using weights. In cgroup v1 this is set via cpu.shares, and in cgroup v2 via cpu.weight or cpu.weight.nice. The configured value is kept in the task group (shares) and applied as the load weight of its scheduling entity:

struct task_group {
    ...
    unsigned long shares;
};

struct sched_entity {
    struct load_weight load;
    ...
};

struct load_weight {
    unsigned long weight;
};

The fair scheduler advances each entity's virtual runtime (vruntime) by its actual runtime, scaled inversely to its weight:

vruntime += (actual_runtime * ((NICE_0_LOAD * 2^32) / weight)) >> 32

A higher weight makes vruntime grow more slowly, so the entity is picked more often and receives more CPU time. Example: on a fully loaded 8‑core machine where the weights of all competing containers sum to 8192, a container with weight 512 receives 0.5 core, weight 1024 receives 1 core, and weight 2048 receives 2 cores.
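
The weight arithmetic can be checked with a short sketch. It assumes NICE_0_LOAD = 1024 (the unscaled nice-0 weight) and that every container is runnable all the time; the function names and the "others" entry standing in for the remaining weight are hypothetical.

```python
NICE_0_LOAD = 1024  # assumed unscaled weight of a nice-0 entity


def vruntime_delta(actual_runtime_ns, weight):
    """Fixed-point form of actual_runtime * NICE_0_LOAD / weight."""
    inv_weight = (NICE_0_LOAD << 32) // weight
    return (actual_runtime_ns * inv_weight) >> 32


def cores_under_contention(weights, total_cores):
    """Split total_cores in proportion to weight, assuming full contention."""
    total_weight = sum(weights.values())
    return {name: total_cores * w / total_weight for name, w in weights.items()}


if __name__ == "__main__":
    # 1 ms of real runtime costs less vruntime as the weight grows.
    for w in (512, 1024, 2048):
        print(w, vruntime_delta(1_000_000, w))

    # Weights sum to 8192 on an 8-core machine, as in the example above.
    weights = {"a": 512, "b": 1024, "c": 2048, "others": 4608}
    print(cores_under_contention(weights, 8))
```

The proportional split only describes the fully contended case; when other containers are idle, a container can use more than its proportional share.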

[Figure: Weight Distribution]

3. Requests and Limits Semantics in Kubernetes

A typical pod spec defines resources as follows:

apiVersion: v1
kind: Pod
metadata:
  name: cpu-demo
  namespace: cpu-example
spec:
  containers:
  - name: cpu-demo-ctr
    image: vish/stress
    resources:
      limits:
        cpu: "1"
      requests:
        cpu: "0.5"

Limits translate to a cgroup period and quota, which set an upper bound on CPU time. Kubernetes does not require the sum of all containers' limits on a node to stay within the node's core count, and this is what allows overselling.
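
As a rough illustration of how a limits.cpu value could become a quota, the sketch below converts a CPU quantity to millicores and then to microseconds of quota per period. It assumes the default 100 ms period and a 1 ms minimum quota, which is my understanding of the kubelet's behavior rather than something stated in this article; the quantity parser is deliberately simplified and both function names are hypothetical.

```python
DEFAULT_PERIOD_US = 100_000  # assumed default CFS period (100 ms)
MIN_QUOTA_US = 1_000         # assumed lower bound on the quota (1 ms)


def parse_cpu_quantity(quantity):
    """Simplified parser: '500m' -> 500, '0.5' -> 500, '1' -> 1000 millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def milli_cpu_to_quota(milli_cpu, period_us=DEFAULT_PERIOD_US):
    """CPU time in microseconds allowed per period; 0 means no limit is set."""
    if milli_cpu == 0:
        return 0
    quota = milli_cpu * period_us // 1000
    return max(quota, MIN_QUOTA_US)


if __name__ == "__main__":
    for limit in ("1", "0.5", "3", "5m"):
        milli = parse_cpu_quantity(limit)
        print(limit, "->", milli_cpu_to_quota(milli), "us per period")
```

For the pod spec above, a limit of "1" corresponds to a quota equal to one full period, that is, one core's worth of CPU time.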

Requests are implemented via the cgroup weight, which guarantees a minimum share of CPU under contention. In cgroup v1 each requested core maps to a cpu.shares value of 1024; in cgroup v2 one core maps to a cpu.weight of roughly 39. When placing pods, the Kubernetes scheduler ensures that the sum of all requests on a node does not exceed the node's allocatable CPU.
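
The "1024 per core" and "roughly 39" figures can be reproduced with a small sketch. It assumes the linear mapping from the cpu.shares range [2, 262144] onto the cpu.weight range [1, 10000]; container runtimes may use a different mapping, so treat the constants and function names as assumptions rather than a definitive implementation.

```python
MIN_SHARES = 2
MAX_SHARES = 262144
SHARES_PER_CPU = 1024


def milli_cpu_to_shares(milli_cpu):
    """cgroup v1 cpu.shares for a request given in millicores."""
    if milli_cpu == 0:
        return MIN_SHARES
    shares = milli_cpu * SHARES_PER_CPU // 1000
    return min(max(shares, MIN_SHARES), MAX_SHARES)


def shares_to_weight(shares):
    """Assumed linear mapping of cpu.shares onto cgroup v2 cpu.weight."""
    return 1 + ((shares - 2) * 9999) // 262142


if __name__ == "__main__":
    for milli in (500, 1000, 2000):
        shares = milli_cpu_to_shares(milli)
        print(f"{milli}m -> shares={shares}, weight={shares_to_weight(shares)}")
```

Because the mapping has an offset, the cgroup v2 weight is only approximately proportional to the number of requested cores.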

Thus a container is guaranteed its requests under contention and can burst up to its limits when the node has idle CPU, so its usable range under load is [requests, limits]. For example, on an 8‑core node with four identical containers each requesting 2 cores (weight 2048) and limiting to 3 cores (quota 300 ms per 100 ms), the total requests equal 8 cores while total limits equal 12 cores, achieving a 1.5× oversell.
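
The oversell arithmetic for a node can be written out as a quick check. The helper below is a hypothetical sketch: it treats the node's core count as its schedulable capacity and takes each container's request and limit in cores.

```python
def oversell_report(node_cores, containers):
    """Summarize requests, limits, and the resulting oversell ratio for a node.

    containers is a list of (request_cores, limit_cores) tuples.
    """
    total_requests = sum(request for request, _ in containers)
    total_limits = sum(limit for _, limit in containers)
    return {
        "total_requests": total_requests,
        "total_limits": total_limits,
        "requests_fit": total_requests <= node_cores,  # guaranteed share is real
        "oversell_ratio": total_limits / node_cores,   # promised cap vs capacity
    }


if __name__ == "__main__":
    # Four identical containers: request 2 cores, limit 3 cores, on 8 cores.
    print(oversell_report(8, [(2, 3)] * 4))
```

A ratio above 1 means the limits promise more CPU than the node has, which is safe only as long as the containers do not all burst at the same time.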

[Figure: Oversell Illustration]

4. Summary

1) Kubernetes uses both limits (hard cap) and requests (guaranteed weight) to efficiently allocate CPU resources and enable overselling.

2) When a user is given an “8‑core” container, the “8 cores” refers to the limits.cpu value, which is a cap on CPU time rather than a set of dedicated cores; the container's threads may run on any physical core of the node.

3) As an SRE configuring oversell, ensure the sum of requests on a node does not exceed its core count, and set limits to a multiple (e.g., 1.5×) of requests to achieve the desired oversell ratio.

apiVersion: v1
kind: Pod
metadata:
  name: cpu-demo
  namespace: cpu-example
spec:
  containers:
  - name: cpu-demo-ctr
    image: vish/stress
    resources:
      limits:
        cpu: "12"
      requests:
        cpu: "8"

This pod appears as a 12‑core container to users while internally it is oversold at a 1.5× factor.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
