
Linux CPU Power Management: P‑states and C‑states in Kernel 2.6 and 4.18

This article explains how Linux kernels 2.6 and 4.18 manage CPU P‑states and C‑states, covering BIOS control, cpupower usage, idle drivers, the menu governor, MONITOR/MWAIT effects, and performance comparisons between power‑saving and performance modes on Intel Xeon E5‑2630 v4.

58 Tech

In the previous section we discussed several CPU performance states and how the MONITOR/MWAIT instructions can affect Linux performance.

Linux manages CPU power mainly through P‑states (operating points that trade frequency and voltage against power, bounded by a policy such as performance or powersave) and C‑states (idle states of increasing depth). The BIOS may hand P‑state control to the OS; when it does, the cpupower utility sets the desired policy by writing to files under /sys/devices/system/cpu/cpu*/cpufreq/.
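
The policy files mentioned above can be read directly. The following is a minimal sketch, assuming the usual cpufreq attribute names (scaling_driver, scaling_governor, scaling_min_freq, scaling_max_freq); the helper name read_cpufreq_policy is hypothetical, and not every driver exposes every attribute, so missing files are simply skipped.

```python
from pathlib import Path

# Common cpufreq sysfs attribute names (an assumption; drivers may expose fewer).
CPUFREQ_ATTRS = (
    "scaling_driver",
    "scaling_governor",
    "scaling_min_freq",
    "scaling_max_freq",
)


def read_cpufreq_policy(cpufreq_dir):
    """Return the cpufreq attributes that exist in one cpufreq directory."""
    base = Path(cpufreq_dir)
    policy = {}
    for attr in CPUFREQ_ATTRS:
        path = base / attr
        if path.is_file():
            # Values are single-line text; frequencies are reported in kHz.
            policy[attr] = path.read_text().strip()
    return policy
```

For example, read_cpufreq_policy("/sys/devices/system/cpu/cpu0/cpufreq") returns the attributes for the first CPU, or an empty dictionary when the BIOS has not handed P‑state control to the OS and the directory is absent.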

On each timer tick the kernel computes the next CPU frequency as 1.25 * max_frequency * CPU_utilization, maps the result to a P‑state level, and instructs the CPU to adjust its frequency and voltage accordingly. With the 1.25 factor, the computed value reaches max_frequency once utilization hits 80%.
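
The arithmetic can be worked through in a few lines. This is only an illustrative sketch: the function names, the cap at max_frequency, and the rule of rounding up to the lowest frequency that covers the target are assumptions, and the frequency table in the usage note is hypothetical rather than taken from any real CPU.

```python
def target_frequency_khz(max_freq_khz, utilization):
    """Apply 1.25 * max_frequency * utilization, capped at max_frequency."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return min(max_freq_khz, 1.25 * max_freq_khz * utilization)


def pick_pstate(available_khz, target_khz):
    """Return the lowest available frequency that still covers the target.

    Falls back to the highest frequency if none is high enough.
    """
    candidates = [f for f in sorted(available_khz) if f >= target_khz]
    return candidates[0] if candidates else max(available_khz)
```

With a hypothetical table of 1200000, 1600000, 2000000 and 2200000 kHz, a utilization of 0.5 gives a target of 1375000 kHz, which this sketch maps to the 1600000 kHz level.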

C‑states offer only an enable/disable switch. When they are enabled, the deepest reachable state (e.g., C6) depends on the BIOS settings and the MONITOR/MWAIT configuration. Intel’s Enhanced Halt State (C1E) further reduces voltage while the CPU is in C1.

Kernel 2.6 provides four built‑in idle methods (poll_idle, mwait_idle, c1e_idle, default_idle) plus driver‑based idle through cpuidle_idle_call. A specific method can be forced with the idle= kernel boot parameter. Two cpuidle drivers are available: intel_idle (requires an Intel CPU with MWAIT support) and acpi_idle (requires deeper sleep support). The active driver determines which idle instructions (halt, mwait, inb) are used.
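
To check which driver is active and which idle states it exposes, the cpuidle sysfs tree can be inspected. The sketch below assumes the usual layout (a current_driver file under /sys/devices/system/cpu/cpuidle/ and per‑CPU state0, state1, ... directories containing name, latency and usage); the helper names are hypothetical, and attributes may differ between kernel versions.

```python
from pathlib import Path


def read_text_or_none(path):
    """Return the stripped contents of a sysfs file, or None if it is absent."""
    path = Path(path)
    return path.read_text().strip() if path.is_file() else None


def read_idle_states(cpu_cpuidle_dir):
    """List name, exit latency (us) and usage count for each stateN directory."""
    base = Path(cpu_cpuidle_dir)
    # Sort numerically so that state10 comes after state9.
    state_dirs = sorted(base.glob("state[0-9]*"), key=lambda p: int(p.name[5:]))
    states = []
    for state_dir in state_dirs:
        states.append({
            "name": read_text_or_none(state_dir / "name"),
            "latency_us": int(read_text_or_none(state_dir / "latency") or 0),
            "usage": int(read_text_or_none(state_dir / "usage") or 0),
        })
    return states
```

On a live system, read_text_or_none("/sys/devices/system/cpu/cpuidle/current_driver") would typically report intel_idle or acpi_idle, and read_idle_states("/sys/devices/system/cpu/cpu0/cpuidle") lists the states that driver registered for the first CPU.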

The menu governor selects the next C‑state based on the predicted idle residency and the system latency request (default 2000 s, which is effectively no limit; configurable via /dev/cpu_dma_latency or /sys/kernel/debug/pm_qos/cpu_dma_latency, for example by the tuned service).
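
The selection rule can be sketched as follows. This is a simplified illustration, not the kernel's menu governor, which also corrects its prediction using recent history and load; the IdleState type, the function name, and the latency and residency numbers in the usage note are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IdleState:
    name: str
    exit_latency_us: int      # time needed to wake up from the state
    target_residency_us: int  # minimum idle time for the state to pay off


def select_idle_state(states, predicted_idle_us, latency_limit_us):
    """Pick the deepest state that fits the prediction and the latency limit.

    `states` must be ordered from shallowest to deepest; the shallowest
    state is always allowed as a fallback.
    """
    chosen = states[0]
    for state in states[1:]:
        if state.target_residency_us > predicted_idle_us:
            break  # idle period too short to be worth entering this state
        if state.exit_latency_us > latency_limit_us:
            break  # waking up would violate the latency request
        chosen = state
    return chosen
```

Given an illustrative table of C1 (2 µs latency, 2 µs residency), C1E (10, 20), C3 (40, 100) and C6 (133, 400), a predicted idle time of 500 µs selects C6 when the latency limit is large, but only C1E when the limit is 20 µs. This is how a low latency request keeps the CPU out of deep C‑states.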

In practice, disabling MONITOR/MWAIT limits the CPU to C1 even if C‑states are enabled.

Kernel 4.18 makes cpuidle_idle_call the default idle method. When the intel_idle driver is active, idle entry uses only the MONITOR/MWAIT instructions. The menu governor works similarly, but the latency interface is found at /sys/kernel/debug/pm_qos/cpu_dma_latency.

Performance tests on an Intel Xeon E5‑2630 v4 show that the power‑saving mode can be 20% faster than the performance mode for single‑threaded workloads: with idle cores parked in deeper C‑states, the active core can reach a higher Turbo Boost frequency. The performance mode, in turn, excels in low‑latency, high‑concurrency scenarios.

Recommendations: for latency‑sensitive services, disable P‑states in BIOS and keep C‑states enabled with performance policy; for general workloads, enable P‑states with the performance policy, enable Turbo Boost, and let the BIOS manage power.

Written by

58 Tech

Official tech channel of 58, a platform for tech innovation, sharing, and communication.
