What Is Load Average? Uncovering the Truth Behind System Load Metrics
Load Average tracks the average number of runnable and uninterruptible processes over 1-, 5-, and 15-minute windows. It is not the same as CPU usage and is easy to misinterpret. This article explains how the kernel calculates it, how to judge overload, how to troubleshoot CPU, I/O, and process-count problems, and how to handle container-specific distortions with cgroup v2 and LXCFS.
Overview
Load Average on Linux is an exponentially weighted moving average (EWMA) of the number of processes in the Running/Runnable (R) state and the Uninterruptible Sleep (D) state. The kernel samples the active task count every 5 seconds and updates three time windows (1 min, 5 min, 15 min) using the formula:
load(t) = load(t-1) * e^(-5/(60*W)) + active_tasks * (1 - e^(-5/(60*W)))
# W = 1, 5, 15 (minutes)
The decay factors e^(-5/(60*W)) are approximately 0.9200 (1 min), 0.9835 (5 min) and 0.9945 (15 min). The calculation lives in kernel/sched/loadavg.c (function calc_global_load()).
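The update rule can be tried out with a short simulation. This is a floating-point sketch of the formula only; the kernel itself uses fixed-point arithmetic, and the function names below are illustrative, not kernel symbols.

```python
import math

SAMPLE_INTERVAL_S = 5  # the active task count is sampled every 5 seconds


def decay_factor(window_min: float) -> float:
    """Return e^(-5 / (60 * W)) for a window of W minutes."""
    return math.exp(-SAMPLE_INTERVAL_S / (60.0 * window_min))


def update_load(prev_load: float, active_tasks: int, window_min: float) -> float:
    """Apply one EWMA step: load(t) = load(t-1) * d + active_tasks * (1 - d)."""
    d = decay_factor(window_min)
    return prev_load * d + active_tasks * (1.0 - d)


def simulate(active_tasks: int, seconds: int, window_min: float, start: float = 0.0) -> float:
    """Feed a constant active task count for `seconds` and return the final load."""
    load = start
    for _ in range(seconds // SAMPLE_INTERVAL_S):
        load = update_load(load, active_tasks, window_min)
    return load


if __name__ == "__main__":
    # Hypothetical scenario: 4 tasks become active on an idle system.
    for w in (1, 5, 15):
        print(f"W={w:>2} min  decay={decay_factor(w):.4f}  "
              f"load after 60 s = {simulate(4, 60, w):.2f}")
```

With a constant active task count, each window converges toward that count, and the 1-minute window gets there fastest. That is why comparing the three values reveals a trend.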
Load Average vs. CPU Utilization
Load Average counts both CPU-ready processes and those blocked on I/O (D state). CPU utilization measures only the fraction of time the CPUs spend executing (user code as %us, kernel code as %sy, and so on); processes waiting in D state contribute nothing to it. Two concrete scenarios illustrate the difference:
Scenario A: Load Average = 8, CPU usage = 95% # CPU‑bound workload
Scenario B: Load Average = 8, CPU usage = 10% # I/O-bound workload (many D-state processes)
Therefore a high Load does not necessarily mean the CPU is saturated.
Determining Whether Load Is Too High
Load is an absolute number; compare it with the number of CPU cores to obtain a load ratio:
load_ratio = Load_Average / CPU_core_count
Typical thresholds (per-core ratio) are:
< 0.7 – Healthy, ample headroom.
0.7 – 1.0 – Normal‑high, watch the trend.
1.0 – 2.0 – Overloaded, queues form, latency degrades.
2.0 – 5.0 – Severe overload, system becomes sluggish.
> 5.0 – Critical, the system may be barely responsive.
When Hyper‑Threading is enabled, use the physical core count for a conservative estimate.
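The ratio and the bands above can be expressed as a small helper. This is a minimal sketch: the band labels are shorthand for the list above, and os.cpu_count() returns logical CPUs, so on a Hyper-Threading host you would pass the physical core count yourself for the conservative estimate.

```python
import os


def load_ratio(load_avg: float, cores: int) -> float:
    """Return Load Average divided by the core count."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load_avg / cores


def classify(ratio: float) -> str:
    """Map a per-core load ratio onto the bands listed above."""
    if ratio < 0.7:
        return "healthy"
    if ratio <= 1.0:
        return "normal-high"
    if ratio <= 2.0:
        return "overloaded"
    if ratio <= 5.0:
        return "severe"
    return "critical"


if __name__ == "__main__":
    cores = os.cpu_count() or 1  # logical CPUs, not physical cores
    try:
        load1, load5, load15 = os.getloadavg()
    except OSError:
        # Load average is not obtainable on this platform; use sample values.
        load1, load5, load15 = 4.52, 3.18, 2.76
    ratio = load_ratio(load5, cores)
    print(f"5-min load {load5:.2f} on {cores} CPUs -> ratio {ratio:.2f} ({classify(ratio)})")
```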
Common High‑Load Causes
CPU-intensive (high us%)
Infinite loops or O(n³) algorithms.
Regular‑expression backtracking (ReDoS).
GC storms (frequent Full GC in Java, heavy allocation pressure in Go).
Heavy encryption/compression (TLS handshakes, large file compression).
Cryptomining malware.
Quick check:
top -bn1 | head -5
# Example output line:
%Cpu(s): 92.3 us, 3.2 sy, 4.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
If us% dominates, use perf top or language-specific profilers (e.g., async-profiler for Java, pprof for Go) to locate hot functions.
I/O-wait (high wa%)
Disk bottleneck (HDD IOPS saturated, SSD write amplification).
Slow database queries (full table scans).
Log‑write storms.
NFS or remote storage latency.
Swap thrashing.
Quick check:
top -bn1 | head -5
# Example output line:
%Cpu(s): 5.1 us, 3.8 sy, 12.3 id, 78.2 wa, 0.0 hi, 0.6 si, 0.0 st
Then inspect disk I/O with iostat -xz 1 3 (focus on %util and await) or iotop to pinpoint the offending process.
Process‑count explosion
Fork bomb (malicious or accidental infinite fork).
Oversized worker pool in web servers (Nginx, Apache).
Cron jobs overlapping.
Connection storms spawning many handler processes.
Kubernetes scheduler placing too many Pods on a single node.
Quick check:
# Total processes
ps aux | wc -l
# Count R and D states
ps -eo stat | grep -c "^R"
ps -eo stat | grep -c "^D"
# The fourth field of /proc/loadavg shows runnable/total scheduling entities (processes and threads)
cat /proc/loadavg
If the number of R- and D-state processes far exceeds the core count, or the total is far above the host's normal baseline, investigate the source (fork bomb, worker configuration) and apply limits via ulimit or cgroup quotas.
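The two grep counts can also be done in one pass. The sketch below tallies process states from the text that ps -eo stat prints. The sample text is made up for illustration, and count_states and active_tasks are hypothetical helper names.

```python
from collections import Counter


def count_states(ps_stat_output: str) -> Counter:
    """Tally processes by the first letter of their STAT column (R, D, S, ...)."""
    counts: Counter = Counter()
    for line in ps_stat_output.splitlines():
        line = line.strip()
        if not line or line == "STAT":  # skip blank lines and the header
            continue
        counts[line[0]] += 1
    return counts


def active_tasks(counts: Counter) -> int:
    """R + D is the quantity that feeds Load Average."""
    return counts["R"] + counts["D"]


if __name__ == "__main__":
    # Illustrative sample, shaped like the output of `ps -eo stat`.
    sample = "STAT\nSs\nR+\nD\nS\nRl\nI<\n"
    counts = count_states(sample)
    print(dict(counts), "active:", active_tasks(counts))
```

On a live host the input could come from subprocess.run(["ps", "-eo", "stat"], capture_output=True, text=True).stdout. Comparing active_tasks() with the core count gives an instantaneous view of the same quantity that Load Average smooths over time.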
High‑Load Investigation Toolchain
Step 1 – uptime
$ uptime
14:23:05 up 45 days, 3:12, 2 users, load average: 4.52, 3.18, 2.76
Observe the three numbers and their trend: a 1-min value well above the 5-min and 15-min values (e.g., 28, 12, 5) indicates a rapid rise.
Step 2 – top / htop
# Refresh every second
top -d 1
Key fields in the header line:
us – user-mode CPU time.
sy – kernel-mode CPU time.
id – idle time (low = busy).
wa – I/O wait.
Press 1 to expand per‑CPU view, P to sort by CPU, M to sort by memory.
Step 3 – vmstat
vmstat 1 10
Important columns:
r – runnable processes (queue length).
b – processes in D state (I/O wait).
si/so – swap in/out.
bi/bo – block device I/O (KB/s).
Step 4 – pidstat
# CPU and I/O per process, 1‑second interval, 5 samples
pidstat -ud 1 5
Shows per-process %CPU and I/O rates (kB_rd/s, kB_wr/s).
Step 5 – perf
# Live hotspot view
perf top -p <PID>
# Record 30 s and generate flame graph
perf record -g -p <PID> -- sleep 30
perf script | stackcollapse-perf.pl | flamegraph.pl > flamegraph.svg
Use for deep code-level analysis of CPU-bound hot paths.
/proc Files and PSI
/proc/loadavg provides five fields:
1-min average.
5-min average.
15-min average.
running/total processes (e.g., 12/487).
Most recent PID.
The first line of /proc/stat (cpu) lists jiffies spent in user, nice, system, idle, iowait, irq, softirq, steal. These are the raw values used by top to compute percentages.
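Both files are easy to parse by hand. The following sketch splits /proc/loadavg into its five fields and derives percentages from two snapshots of the aggregate cpu line in /proc/stat, in the spirit of what top does. The helper names are illustrative, and only the eight counters named above are used.

```python
from typing import Dict, NamedTuple


class LoadAvg(NamedTuple):
    load1: float
    load5: float
    load15: float
    running: int
    total: int
    last_pid: int


def parse_loadavg(text: str) -> LoadAvg:
    """Parse the single line of /proc/loadavg."""
    one, five, fifteen, tasks, last_pid = text.split()[:5]
    running, total = tasks.split("/")
    return LoadAvg(float(one), float(five), float(fifteen),
                   int(running), int(total), int(last_pid))


CPU_FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(text: str) -> Dict[str, int]:
    """Parse the aggregate 'cpu' line of /proc/stat into named jiffy counters."""
    for line in text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            return dict(zip(CPU_FIELDS, map(int, parts[1:1 + len(CPU_FIELDS)])))
    raise ValueError("no aggregate 'cpu' line found")


def cpu_percentages(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, float]:
    """Turn the jiffy deltas between two snapshots into percentages."""
    deltas = {k: after[k] - before[k] for k in CPU_FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("snapshots are identical or out of order")
    return {k: 100.0 * v / total for k, v in deltas.items()}


if __name__ == "__main__":
    # Illustrative sample values, not real measurements.
    print(parse_loadavg("4.52 3.18 2.76 12/487 31337\n"))
    before = parse_cpu_line("cpu  100 0 50 800 50 0 0 0 0 0\n")
    after = parse_cpu_line("cpu  150 0 60 830 60 0 0 0 0 0\n")
    print(cpu_percentages(before, after))
```

The counters in /proc/stat are cumulative since boot, so a single read only yields an average since boot; the difference between two reads gives the current mix.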
Pressure Stall Information (PSI) introduced in Linux 4.20 gives percentage‑based pressure metrics:
# CPU pressure
cat /proc/pressure/cpu
# IO pressure
cat /proc/pressure/io
# Memory pressure
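cat /proc/pressure/memory
Each file reports a some line (at least one task stalled) and a full line (all non-idle tasks stalled). Their avg10, avg60, and avg300 fields are percentages of time over 10-, 60-, and 300-second windows, making them easier to interpret than raw Load.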
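A PSI file is a few lines of key=value pairs, so it can be parsed in a handful of lines. This is a minimal sketch with made-up sample values; parse_psi is an illustrative name, and the same parser should work for the cgroup-level pressure files, which use the same layout.

```python
from typing import Dict


def parse_psi(text: str) -> Dict[str, Dict[str, float]]:
    """Parse PSI text into {"some": {"avg10": ..., ...}, "full": {...}}."""
    result: Dict[str, Dict[str, float]] = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        kind, fields = parts[0], parts[1:]
        # avg10/avg60/avg300 are percentages; total is cumulative microseconds.
        result[kind] = {key: float(value)
                        for key, value in (field.split("=", 1) for field in fields)}
    return result


if __name__ == "__main__":
    # Illustrative sample in the layout of /proc/pressure/io.
    sample = (
        "some avg10=12.50 avg60=8.30 avg300=2.10 total=123456789\n"
        "full avg10=3.00 avg60=1.20 avg300=0.40 total=2345678\n"
    )
    psi = parse_psi(sample)
    print(f"some avg10 = {psi['some']['avg10']}%  full avg10 = {psi['full']['avg10']}%")
```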
cgroup v2 Load Isolation
Modern distributions (Ubuntu 24.04, RHEL 9) enable cgroup v2 by default. The CPU controller provides two key knobs:
cpu.max – hard quota/period (e.g., 200000 100000 = 2 CPU cores).
cpu.weight – relative share (1-10000, default 100).
cgroup-level PSI files (cpu.pressure, io.pressure) expose per-group pressure, allowing accurate per-Pod load monitoring in Kubernetes.
Kubernetes cgroup paths:
Guaranteed QoS:
/sys/fs/cgroup/kubepods.slice/kubepods-pod<UID>.slice/
Burstable QoS:
/sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod<UID>.slice/
BestEffort QoS:
/sys/fs/cgroup/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod<UID>.slice/
Reading cpu.stat inside a Pod yields fields such as usage_usec, nr_throttled, and throttled_usec, which indicate actual CPU consumption and throttling.
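The cpu.max and cpu.stat files are plain text and can be turned into a CPU limit and a throttle ratio directly. The sketch below assumes the cgroup v2 layouts ("quota period" for cpu.max, one "key value" pair per line for cpu.stat) and uses the nr_periods field alongside the fields named above. The helper names and sample numbers are illustrative.

```python
from typing import Dict


def parse_cpu_max(text: str, host_cpus: int) -> float:
    """Return the CPU limit in cores; a quota of 'max' means no limit."""
    quota, period = text.split()
    if quota == "max":
        return float(host_cpus)
    return int(quota) / int(period)


def parse_cpu_stat(text: str) -> Dict[str, int]:
    """Parse 'key value' lines from cpu.stat into a dict of integers."""
    stat: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stat[parts[0]] = int(parts[1])
    return stat


def throttle_ratio(stat: Dict[str, int]) -> float:
    """Fraction of enforcement periods in which the group was throttled."""
    periods = stat.get("nr_periods", 0)
    return stat.get("nr_throttled", 0) / periods if periods else 0.0


if __name__ == "__main__":
    # Illustrative sample contents, not real measurements.
    limit = parse_cpu_max("200000 100000\n", host_cpus=64)
    stat = parse_cpu_stat(
        "usage_usec 9000000\n"
        "nr_periods 100\n"
        "nr_throttled 25\n"
        "throttled_usec 1500000\n"
    )
    print(f"limit = {limit} cores, throttled in {throttle_ratio(stat):.0%} of periods")
```

A high throttle ratio with modest usage_usec growth suggests bursts that hit the quota, which a host-wide Load Average cannot show.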
Container‑Specific Load Pitfalls
Inside a container, /proc/loadavg, /proc/cpuinfo and /proc/stat all reflect the host values because these interfaces are global and not namespace‑aware. Consequently, a container limited to 2 CPU cores may see a host Load of 48 and mistakenly conclude a catastrophic overload.
LXCFS – Virtualising /proc for Containers
LXCFS is a FUSE daemon that intercepts reads of /proc/loadavg, /proc/cpuinfo, /proc/stat, and /proc/meminfo, returning values derived from the container’s cgroup limits.
Typical DaemonSet deployment (YAML omitted for brevity) mounts the FUSE filesystem and then each Pod mounts the virtual files, e.g.:
volumeMounts:
  - name: lxcfs-proc-loadavg
    mountPath: /proc/loadavg
  - name: lxcfs-proc-cpuinfo
    mountPath: /proc/cpuinfo
  - name: lxcfs-proc-stat
    mountPath: /proc/stat
Limitations:
Load values are approximations, not the kernel’s exact EWMA.
FUSE adds a small overhead to each read, on the order of microseconds.
If the LXCFS daemon crashes, the virtual /proc becomes inaccessible.
Each Pod must be explicitly configured (or automated via a MutatingAdmissionWebhook).
Preferred Modern Approach – Direct cgroup Metrics
Instead of faking /proc/loadavg, read native cgroup files:
# CPU usage (microseconds)
cat /sys/fs/cgroup/cpu.stat
# PSI pressure for the cgroup
cat /sys/fs/cgroup/cpu.pressure
cat /sys/fs/cgroup/io.pressure
cat /sys/fs/cgroup/memory.pressure
Kubernetes' cAdvisor (built into kubelet) already exports these as Prometheus metrics such as container_cpu_usage_seconds_total and container_cpu_cfs_throttled_periods_total, providing a more accurate per-container view.
Monitoring & Alerting Strategies
Physical/VM Hosts
# Prometheus alert: load ratio > 0.7 for 5m (warning)
# The core count drops both the cpu and mode labels so that its label set matches node_load5.
- alert: HighLoadAverage
  expr: node_load5 / count without (cpu, mode) (node_cpu_seconds_total{mode="idle"}) > 0.7
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Load Average elevated on {{ $labels.instance }}"
    description: "5-min load ratio = {{ $value | printf \"%.2f\" }}, exceeds warning threshold."
# Critical overload (ratio > 1.0)
- alert: CriticalLoadAverage
  expr: node_load5 / count without (cpu, mode) (node_cpu_seconds_total{mode="idle"}) > 1.0
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: "Load Average critical on {{ $labels.instance }}"
    description: "5-min load ratio = {{ $value | printf \"%.2f\" }}, system overloaded."
# Sudden surge: 1-min > 3× 15-min and > 4
- alert: LoadAverageSurge
  expr: (node_load1 / node_load15 > 3) and (node_load1 > 4)
  for: 2m
  labels:
    severity: warning
  annotations:
    summary: "Load Average surge on {{ $labels.instance }}"
    description: "1-min load is {{ $value | printf \"%.1f\" }} times the 15-min average, indicating a burst."
Container / Kubernetes Environments
# Container CPU throttling > 25% (warning)
- alert: ContainerCPUThrottling
  expr: rate(container_cpu_cfs_throttled_periods_total[5m]) / rate(container_cpu_cfs_periods_total[5m]) > 0.25
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Container CPU throttling high ({{ $labels.pod }})"
    description: "Throttle ratio = {{ $value | humanizePercentage }}"
# Container CPU usage near limit > 85% (warning)
- alert: ContainerCPUNearLimit
  expr: rate(container_cpu_usage_seconds_total[5m]) / (container_spec_cpu_quota / container_spec_cpu_period) > 0.85
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Container CPU usage near limit ({{ $labels.pod }})"
    description: "Usage = {{ $value | printf \"%.2f\" }} of configured limit."
# PSI CPU pressure "some" > 20% (warning)
# A *_seconds_total metric is a counter, so rate() converts it into the fraction of time stalled.
# The exact metric name depends on the cAdvisor/kubelet version in use.
- alert: ContainerCPUPressureHigh
  expr: rate(container_cpu_pressure_seconds_total{type="some"}[5m]) > 0.2
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Container CPU pressure high ({{ $labels.pod }})"
    description: "CPU PSI some = {{ $value | humanizePercentage }}, exceeds 20%."
Quick Reference Thresholds
Host Load Ratio: warning > 0.7, critical > 1.0 (5 min window).
Host Load Surge: 1-min / 15-min > 3 (warning), > 5 (critical).
Container CPU Throttle: warning > 25%, critical > 50%.
Container CPU Usage / Limit: warning > 85%, critical > 95%.
PSI CPU "some": warning > 20%, critical > 50%.
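The two host thresholds can be checked in a few lines, for example in a local health script. This sketch mirrors the table above and additionally requires the 1-min load to exceed 4 before reporting a surge, as the surge alert rule does. The function name and the tuple format are illustrative choices.

```python
from typing import List, Tuple


def host_load_alerts(load1: float, load5: float, load15: float,
                     cores: int) -> List[Tuple[str, str]]:
    """Return (check, severity) pairs for the host thresholds listed above."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    alerts: List[Tuple[str, str]] = []

    # Load ratio on the 5-minute window: warning > 0.7, critical > 1.0.
    ratio = load5 / cores
    if ratio > 1.0:
        alerts.append(("load_ratio", "critical"))
    elif ratio > 0.7:
        alerts.append(("load_ratio", "warning"))

    # Surge: 1-min / 15-min > 3 (warning), > 5 (critical), only when load1 > 4.
    if load15 > 0 and load1 > 4:
        surge = load1 / load15
        if surge > 5:
            alerts.append(("load_surge", "critical"))
        elif surge > 3:
            alerts.append(("load_surge", "warning"))
    return alerts


if __name__ == "__main__":
    # Illustrative values: a rapid rise on a 16-core host.
    print(host_load_alerts(28.0, 12.0, 5.0, cores=16))
```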
Key Takeaways
Load Average = EWMA of R + D processes; not a CPU percentage.
Three time windows (1/5/15 min) together reveal load trends.
Divide Load by core count to obtain a load ratio; > 0.7 warrants attention, > 1.0 indicates overload.
High Load originates from three distinct categories – CPU‑bound, I/O‑bound, or process‑count explosion – each requiring a different diagnostic path.
Investigation toolchain: uptime → top → vmstat → pidstat → perf.
In containers, /proc/loadavg shows host values; use cgroup v2 metrics or LXCFS for container‑level visibility.
Monitoring: host alerts based on load ratio and surge; container alerts based on CPU throttling, usage‑to‑limit ratio, and PSI pressure.
Raymond Ops
Linux ops automation, cloud-native, Kubernetes, SRE, DevOps, Python, Golang and related tech discussions.
