How Linux’s CPU‑Idle Framework Saves Power: Inside the Menu Governor

This article explains the Linux CPU‑idle framework, detailing the idle governor architecture, the entry and exit flow, key data structures, and the menu governor’s prediction and correction algorithms that balance power savings with performance latency.

OPPO Kernel Craftsman
1. Overview of the CPU idle framework

Modern operating systems do not busy-wait when there is nothing to run; to save energy, the CPU is put into a low‑power idle state instead. In Linux, when no task is runnable, the scheduler switches to the idle thread, which calls cpuidle_idle_call() and lets the cpuidle framework choose an appropriate idle state.

The framework consists of three parts: the cpuidle governor, the cpuidle driver, and the cpuidle core. The governor implements the policy that decides which idle state to enter, the driver implements the hardware‑specific entry into each state, and the core connects the two and presents a common interface to the rest of the kernel.

2. CPU idle entry and exit flow

During boot every CPU gets an idle thread; on the boot CPU, the initial kernel thread becomes the idle thread once initialization is finished and enters an infinite idle loop. In do_idle() the kernel checks whether a reschedule is pending; if no task needs the CPU, it proceeds along the idle path: do_idle() → cpuidle_idle_call() → cpuidle_select(). The active governor then chooses an idle state and the driver enters it. On Arm platforms, for example, the CPU executes the WFI (wait‑for‑interrupt) instruction and remains there until an interrupt wakes it.

After the CPU leaves the idle state, the framework calls cpuidle_reflect() so that the governor can record what happened and use it in the next selection.
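The ordering of the calls can be illustrated with a small user‑space model. This is only a sketch: struct toy_cpu and the toy_* functions are hypothetical stand‑ins, not kernel APIs, and the only thing they mirror is the select, enter, reflect sequence described above.

```c
#include <string.h>

/* Hypothetical model of one CPU; not a kernel structure. */
struct toy_cpu {
    char trace[64];  /* records the order in which the steps ran */
    int last_idx;    /* what the reflect step stored for next time */
};

/* Stands in for cpuidle_select(): the governor picks a state index. */
static int toy_select(struct toy_cpu *cpu)
{
    strcat(cpu->trace, "select;");
    return 1; /* pretend the governor chose state 1 */
}

/* Stands in for the driver's enter callback (WFI on Arm). It returns
 * the index actually entered once an "interrupt" has arrived. */
static int toy_enter(struct toy_cpu *cpu, int idx)
{
    strcat(cpu->trace, "enter;");
    return idx;
}

/* Stands in for cpuidle_reflect(): runs after wake-up. */
static void toy_reflect(struct toy_cpu *cpu, int idx)
{
    strcat(cpu->trace, "reflect;");
    cpu->last_idx = idx;
}

/* One pass through the idle path. */
static int toy_idle_once(struct toy_cpu *cpu)
{
    int idx = toy_select(cpu);
    int entered = toy_enter(cpu, idx);

    toy_reflect(cpu, entered);
    return entered;
}
```

Calling toy_idle_once() on a zero‑initialised struct toy_cpu leaves the trace "select;enter;reflect;" behind, which is the point of the model: reflection happens on the way out of idle, not when the state is chosen.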

3. CPU idle governor

3.1 Overview

The governor provides the policy for using idle states. Linux has long shipped two governors, Menu and Ladder (newer kernels add others, such as TEO). Ladder moves through the states one level at a time, from shallow to deep, which suits periodic‑tick systems. Menu can jump directly to the deepest state it expects to pay off, which makes it the better fit for the tickless systems that are common today.
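The difference in behaviour can be made concrete with a toy version of a ladder‑style step. This is a simplified sketch, not the kernel's Ladder governor: toy_ladder_step() and its two thresholds are hypothetical, and the real governor also counts repeated promotions and demotions before it moves.

```c
/* Hypothetical ladder-style policy: move at most one state per idle
 * period, based on how long the previous idle period lasted. */
static int toy_ladder_step(int current, int deepest,
                           unsigned int last_residency_us,
                           unsigned int promote_after_us,
                           unsigned int demote_below_us)
{
    /* Long idle period: go one level deeper, if there is one. */
    if (last_residency_us >= promote_after_us && current < deepest)
        return current + 1;

    /* Short idle period: back off by one level. */
    if (last_residency_us < demote_below_us && current > 0)
        return current - 1;

    return current;
}
```

Starting from state 0, such a policy needs one long idle period per level to reach the deepest state, whereas a menu‑style policy can pick the deepest state on the first long idle period it predicts.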

3.2 Decision factors

The Menu governor balances two factors:

Energy balance point (residency time): entering and leaving a C‑state itself costs energy, so a state only saves power if the CPU stays in it long enough. The governor predicts how long the CPU will stay idle and compares that with the state’s target_residency.

Performance impact (latency tolerance): deeper states have a higher exit latency; if the system cannot tolerate a slow wake‑up, for example because it is busy, deep states are avoided.

These translate into two tasks: predicting the idle duration and calculating the system’s latency tolerance.
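The idea of an energy balance point can be worked through with a simple model. The sketch below is an illustration under stated assumptions, not how drivers derive target_residency: toy_break_even_us() is a hypothetical helper, and it assumes that a deep state costs a fixed transition energy and then draws a lower constant power than the shallow state.

```c
/* Break-even idle time in microseconds.
 *
 * Shallow state over t:  E = p_shallow * t
 * Deep state over t:     E = transition + p_deep * t
 * They are equal at      t = transition / (p_shallow - p_deep)
 *
 * Units: power in mW, transition energy in uJ. Since 1 mW * 1 us = 1 nJ,
 * the transition energy is converted to nJ before dividing.
 * Returns a negative value if the deep state never pays off. */
static double toy_break_even_us(double p_shallow_mw, double p_deep_mw,
                                double transition_uj)
{
    if (p_shallow_mw <= p_deep_mw)
        return -1.0;

    return transition_uj * 1000.0 / (p_shallow_mw - p_deep_mw);
}
```

With hypothetical figures of 120 mW in the shallow state, 20 mW in the deep state, and 50 µJ per transition, the deep state only pays off for idle periods longer than 500 µs. A predicted idle time below that value argues for the shallow state.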

3.3 Core data structures

Key structures:

struct cpuidle_state – describes a C‑state:

name – state name

desc – brief description

exit_latency – exit latency (µs)

power_usage – power consumption

target_residency – desired minimum residency (µs)

enter – callback invoked on entry

struct menu_device – per‑CPU data kept by the Menu governor:

last_state_idx – index of the previously selected state

needs_update – flag set by the reflect step after each exit, so that the statistics are updated before the next selection

tick_wakeup – indicates whether the last wake‑up was caused by a tick

next_timer_us – time until the next timer event

bucket – position in the correction‑factor table

correction_factor – array of factors used to adjust predictions

intervals / interval_ptr – recent residency samples for variance calculation
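The two structures can be pictured with the simplified declarations below. They are a sketch for orientation only: the toy_* names, the example state table, and the table sizes (12 buckets, 8 intervals) are assumptions, and the real kernel definitions carry more fields and a different enter() signature.

```c
#include <stddef.h>

/* Simplified picture of a C-state description. */
struct toy_cpuidle_state {
    const char *name;
    const char *desc;
    unsigned int exit_latency;     /* us */
    int power_usage;               /* mW */
    unsigned int target_residency; /* us */
    int (*enter)(int index);       /* simplified callback signature */
};

static int toy_enter_stub(int index) { return index; }

/* Hypothetical table, ordered from shallow to deep. */
static const struct toy_cpuidle_state toy_states[] = {
    { "WFI",           "clock gating",       1,   100, 1,    toy_enter_stub },
    { "cpu-sleep",     "core power-down",    300, 20,  1500, toy_enter_stub },
    { "cluster-sleep", "cluster power-down", 800, 5,   5000, toy_enter_stub },
};

#define TOY_STATE_COUNT (sizeof(toy_states) / sizeof(toy_states[0]))

#define TOY_BUCKETS   12 /* assumed size of the correction-factor table */
#define TOY_INTERVALS 8  /* number of recent residency samples kept */

/* Simplified picture of the Menu governor's per-CPU data. */
struct toy_menu_device {
    int last_state_idx;
    int needs_update;
    int tick_wakeup;
    unsigned int next_timer_us;
    unsigned int bucket;
    unsigned int correction_factor[TOY_BUCKETS];
    unsigned int intervals[TOY_INTERVALS];
    int interval_ptr;
};

/* intervals[] is used as a ring buffer: the newest sample overwrites
 * the oldest once all slots have been filled. */
static void toy_record_interval(struct toy_menu_device *d,
                                unsigned int measured_us)
{
    d->intervals[d->interval_ptr++] = measured_us;
    if (d->interval_ptr >= TOY_INTERVALS)
        d->interval_ptr = 0;
}
```

Keeping the samples in a fixed‑size ring buffer means the variance calculation always looks at the most recent eight idle periods, regardless of how long the CPU has been running.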

3.4 Core functions

The heart of the Menu governor is menu_select(), which performs:

Compute a baseline predicted_us from next_timer_us.

Apply a correction factor selected from the bucket to adjust the prediction.

Compute a second prediction from recent history: if the last eight residency samples have a low standard deviation, their average is taken as the typical interval.

Take the minimum of the two predictions.
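The four steps can be sketched as follows. This is a simplified model rather than the kernel code: the toy_* names are hypothetical, the DECAY constant of 8 and the two "low spread" thresholds are assumptions, and the real typical‑interval routine also discards outliers and retries.

```c
#include <limits.h>
#include <stdint.h>

#define TOY_RESOLUTION 1024u
#define TOY_DECAY      8u /* assumption: unity factor is RESOLUTION * DECAY */
#define TOY_INTERVALS  8

/* Steps 1 and 2: scale the time to the next timer by the correction
 * factor of the current bucket, rounding to the nearest microsecond. */
static unsigned int toy_timer_prediction(unsigned int next_timer_us,
                                         unsigned int factor)
{
    uint64_t unity = (uint64_t)TOY_RESOLUTION * TOY_DECAY;
    uint64_t scaled = (uint64_t)next_timer_us * factor;

    return (unsigned int)((scaled + unity / 2) / unity);
}

/* Step 3: average of the recent samples if their spread is low,
 * otherwise UINT_MAX to signal "no usable pattern". Intended for
 * microsecond-scale samples; very large values could overflow. */
static unsigned int toy_typical_interval(const unsigned int *samples)
{
    uint64_t sum = 0, variance = 0, avg;
    int i;

    for (i = 0; i < TOY_INTERVALS; i++)
        sum += samples[i];
    avg = sum / TOY_INTERVALS;

    for (i = 0; i < TOY_INTERVALS; i++) {
        int64_t diff = (int64_t)samples[i] - (int64_t)avg;
        variance += (uint64_t)(diff * diff);
    }
    variance /= TOY_INTERVALS;

    /* Low spread: stddev <= 20 us, or average > 6 * stddev
     * (both compared in squared form to avoid a square root). */
    if (variance <= 400 || avg * avg > variance * 36)
        return (unsigned int)avg;
    return UINT_MAX;
}

/* Step 4: the smaller of the two predictions wins. */
static unsigned int toy_predict_us(unsigned int next_timer_us,
                                   unsigned int factor,
                                   const unsigned int *samples)
{
    unsigned int by_timer = toy_timer_prediction(next_timer_us, factor);
    unsigned int by_pattern = toy_typical_interval(samples);

    return by_timer < by_pattern ? by_timer : by_pattern;
}
```

Taking the minimum is the conservative choice: if either the timer or the recent wake‑up pattern suggests a short idle period, the governor plans for the short one.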

The governor also calculates two latency‑tolerance values: one taken from the system's PM QoS CPU latency constraint, and an interactivity requirement obtained by dividing the prediction by the performance multiplier, predicted_us / (1 + 10 * iowaiters), where iowaiters is the number of I/O‑waiting tasks on the CPU. The smaller of the two is used.
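As a minimal sketch of that calculation, assuming the first tolerance arrives as an externally supplied limit in microseconds (in the kernel it comes from PM QoS) and using the hypothetical name toy_latency_limit():

```c
/* Returns the tighter of the two latency tolerances, in microseconds.
 * Each I/O-waiting task shrinks the interactivity requirement, so a CPU
 * with pending I/O is steered towards states that wake up quickly. */
static unsigned int toy_latency_limit(unsigned int qos_latency_us,
                                      unsigned int predicted_us,
                                      unsigned int iowaiters)
{
    unsigned int interactivity_us = predicted_us / (1 + 10 * iowaiters);

    return qos_latency_us < interactivity_us ? qos_latency_us
                                             : interactivity_us;
}
```

For a predicted idle time of 1000 µs and a 2000 µs external limit, the tolerance is 1000 µs with no I/O waiters but only 47 µs with two of them (1000 / 21 in integer arithmetic), which rules out any state with a long exit latency.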

Finally, the governor selects the deepest C‑state whose target_residency does not exceed predicted_us and whose exit_latency fits within the latency tolerance.

After the CPU exits idle, the governor updates its internal statistics via menu_update(), adjusting the correction factor with:

new_factor += RESOLUTION * measured_us / data->next_timer_us; // RESOLUTION = 1024
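Taken on its own, that line only adds to the factor. The sketch below shows one way the full update can look, under the assumption that the old factor first decays by 1/DECAY (with DECAY = 8) and that the measured time is capped at the timer distance; toy_update_factor() is a hypothetical name, not the kernel function.

```c
#include <stdint.h>

#define TOY_RESOLUTION 1024u
#define TOY_DECAY      8u /* assumption: weight of history vs. new sample */

/* Exponential moving average of measured_us / next_timer_us, stored
 * as a fixed-point value where RESOLUTION * DECAY means "ratio 1.0". */
static unsigned int toy_update_factor(unsigned int factor,
                                      unsigned int measured_us,
                                      unsigned int next_timer_us)
{
    unsigned int new_factor = factor;

    /* Assumption: never let the ratio exceed 1.0. */
    if (measured_us > next_timer_us)
        measured_us = next_timer_us;

    /* Forget 1/DECAY of the old value... */
    new_factor -= new_factor / TOY_DECAY;

    /* ...and add the newest ratio, scaled by RESOLUTION. */
    if (next_timer_us > 0)
        new_factor += (unsigned int)((uint64_t)TOY_RESOLUTION *
                                     measured_us / next_timer_us);
    else
        new_factor += TOY_RESOLUTION;

    return new_factor ? new_factor : 1; /* keep the factor non-zero */
}
```

Starting from the unity value 8192, a CPU that sleeps all the way to its timer keeps the factor at 8192, while one that wakes after half the expected time moves it to 7680. Repeated early wake‑ups therefore pull later predictions for that bucket downwards.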

This feedback loop enables the Menu governor to adaptively choose idle states that minimize power while respecting performance constraints.
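To close the loop, the selection rule itself (deepest state that fits both the predicted idle time and the latency tolerance) can be sketched as a single pass over the state table. This is an illustration only: struct toy_state and toy_pick_state() are hypothetical, and the table is assumed to be ordered from shallow to deep with state 0 as the fallback.

```c
#include <stddef.h>

struct toy_state {
    unsigned int exit_latency_us;
    unsigned int target_residency_us;
};

/* Walk from shallow to deep and stop at the first state that would
 * either not pay off or wake up too slowly. */
static int toy_pick_state(const struct toy_state *states, size_t count,
                          unsigned int predicted_us,
                          unsigned int latency_limit_us)
{
    int best = 0; /* the shallowest state is always allowed */
    size_t i;

    for (i = 1; i < count; i++) {
        if (states[i].target_residency_us > predicted_us)
            break; /* idle period too short to pay off */
        if (states[i].exit_latency_us > latency_limit_us)
            break; /* wake-up would be too slow */
        best = (int)i;
    }
    return best;
}
```

With a hypothetical table of three states whose (exit latency, target residency) pairs are (1, 1), (300, 1500), and (800, 5000) µs, a 2000 µs prediction selects state 1, a 6000 µs prediction selects state 2, and the same 6000 µs prediction with a 500 µs latency tolerance falls back to state 1.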

References

[1] https://www.kernel.org/doc/html/v5.0/admin-guide/pm/cpuidle.html

[2] http://www.wowotech.net/pm_subsystem/cpuidle_menu_governor.html

[3] https://www.cnblogs.com/LoyenWang/p/11379937.html

[4] S. Pattanayak and B. Thangaraju, “Linux CPU‑Idle Menu Governor with Online Reinforcement Learning and Scheduler Load Balancing Statistics,” 2019 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT), Bangalore, India, 2019.
