
What Is Load Average? Uncovering the Truth Behind System Load Metrics

Load Average is the average number of runnable and uninterruptible processes over 1, 5, and 15‑minute windows. It is not the same as CPU usage and is often misread. This article explains how the kernel calculates it, how to judge overload, how to troubleshoot CPU, I/O, and process‑count problems, and how to handle container‑specific distortions with cgroup v2 and LXCFS.

Raymond Ops

Overview

Load Average on Linux is the exponentially weighted moving average (EWMA) of the number of tasks in the Running/Runnable (R) state and the Uninterruptible Sleep (D) state. The kernel samples the active task count roughly every 5 seconds and updates three time windows (1 min, 5 min, 15 min) using the formula:

load(t) = load(t-1) * e^(-5/(60*W)) + active_tasks * (1 - e^(-5/(60*W)))
# W = 1, 5, 15 (window length in minutes); 5 = sampling interval in seconds

Decay factors are approximately 0.92 (1 min), 0.9835 (5 min) and 0.9945 (15 min). The calculation lives in kernel/sched/loadavg.c (function calc_global_load()).
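The update rule above can be sketched in a few lines of Python. This is a floating‑point illustration only: the kernel uses fixed‑point arithmetic, and the function names here (decay_factor, update_load, simulate) are made up for the example.

```python
import math

SAMPLE_INTERVAL_S = 5  # the kernel samples roughly every 5 seconds


def decay_factor(window_minutes: float) -> float:
    """Return e^(-5 / (60 * W)) for a window of W minutes."""
    return math.exp(-SAMPLE_INTERVAL_S / (60.0 * window_minutes))


def update_load(previous: float, active_tasks: int, window_minutes: float) -> float:
    """One EWMA step: blend the old average with the current active task count."""
    d = decay_factor(window_minutes)
    return previous * d + active_tasks * (1.0 - d)


def simulate(active_tasks: int, seconds: int, window_minutes: float, start: float = 0.0) -> float:
    """Apply the update every 5 s for `seconds` with a constant active task count."""
    load = start
    for _ in range(seconds // SAMPLE_INTERVAL_S):
        load = update_load(load, active_tasks, window_minutes)
    return load


if __name__ == "__main__":
    for w in (1, 5, 15):
        print(f"W={w:>2} min  decay={decay_factor(w):.4f}  "
              f"load after 60 s of 4 active tasks={simulate(4, 60, w):.2f}")
```

With a constant 4 active tasks starting from an idle system, the 1‑minute value reaches only about 63% of 4 after one minute (1 - 1/e), which is why a "1‑minute" load average lags behind sudden changes.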

Load Average vs. CPU Utilization

Load Average counts both CPU‑ready processes and those blocked on I/O (D state). CPU utilization measures only the fraction of time the CPUs spend doing work (user time %us, system time %sy, and the other busy categories); it says nothing about tasks that are waiting. Two concrete scenarios illustrate the difference:

Scenario A: Load Average = 8, CPU usage = 95%   # CPU‑bound workload
Scenario B: Load Average = 8, CPU usage = 10%   # I/O‑bound workload (many D‑state processes)

Therefore a high Load does not necessarily mean the CPU is saturated.

Determining Whether Load Is Too High

Load is an absolute number; compare it with the number of CPU cores to obtain a load ratio:

load_ratio = Load_Average / CPU_core_count

Typical thresholds (per‑core ratio) are:

< 0.7 – Healthy, ample headroom.

0.7 – 1.0 – Normal‑high, watch the trend.

1.0 – 2.0 – Overloaded, queues form, latency degrades.

2.0 – 5.0 – Severe overload, system becomes sluggish.

> 5.0 – Critical, the system may be barely responsive.

When Hyper‑Threading is enabled, use the physical core count for a conservative estimate.
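As a rough sketch, the ratio and the threshold bands above can be expressed as follows. The band labels and function names are illustrative, and os.cpu_count() returns logical CPUs, so with Hyper‑Threading enabled you would pass the physical core count yourself.

```python
import os


def load_ratio(load_average: float, cpu_cores: int) -> float:
    """Per-core load ratio: Load_Average / CPU_core_count."""
    if cpu_cores <= 0:
        raise ValueError("cpu_cores must be positive")
    return load_average / cpu_cores


def classify(ratio: float) -> str:
    """Map a per-core ratio onto the bands listed above (labels are illustrative)."""
    if ratio < 0.7:
        return "healthy"
    if ratio <= 1.0:
        return "normal-high"
    if ratio <= 2.0:
        return "overloaded"
    if ratio <= 5.0:
        return "severe"
    return "critical"


def current_ratio() -> float:
    """Ratio for this host from the 5-minute average and the logical CPU count."""
    _, load5, _ = os.getloadavg()
    return load_ratio(load5, os.cpu_count() or 1)
```

For example, classify(load_ratio(8, 16)) gives "healthy", while the same load of 8 on 4 cores lands in "overloaded".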

Common High‑Load Causes

CPU‑intensive (high us% )

Infinite loops or O(n³) algorithms.

Regular‑expression backtracking (ReDoS).

GC storms (e.g., frequent Full GC in Java, or back‑to‑back GC cycles in Go).

Heavy encryption/compression (TLS handshakes, large file compression).

Cryptomining malware.

Quick check:

top -bn1 | head -5
# Example output line:
%Cpu(s): 92.3 us, 3.2 sy, 4.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st

If us% dominates, use perf top or language‑specific profilers (e.g., async-profiler for Java, pprof for Go) to locate hot functions.

I/O‑wait (high wa% )

Disk bottleneck (HDD IOPS saturated, SSD write amplification).

Slow database queries (full table scans).

Log‑write storms.

NFS or remote storage latency.

Swap thrashing.

Quick check:

top -bn1 | head -5
# Example output line:
%Cpu(s): 5.1 us, 3.8 sy, 12.3 id, 78.2 wa, 0.0 hi, 0.6 si, 0.0 st

Then inspect disk I/O with iostat -xz 1 3 (focus on %util and await ) or iotop to pinpoint the offending process.

Process‑count explosion

Fork bomb (malicious or accidental infinite fork).

Oversized worker pool in web servers (Nginx, Apache).

Cron jobs overlapping.

Connection storms spawning many handler processes.

Kubernetes scheduler placing too many Pods on a single node.

Quick check:

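# Total processes (no header line)
ps -e --no-headers | wc -l
# Count R and D states (per process; "stat=" suppresses the header)
ps -eo stat= | grep -c "^R"
ps -eo stat= | grep -c "^D"
# Load counts threads, so also check per-thread states
ps -eLo stat= | grep -c "^[RD]"
# /proc/loadavg fourth field shows runnable/total scheduling entities (threads)
cat /proc/loadavg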

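A total process count above the core count is normal on any system. What matters is the number of runnable (R) or D‑state tasks: if that far exceeds the core count, or the total is growing abnormally, investigate the source (fork bomb, worker configuration) and apply limits via ulimit or cgroup limits such as pids.max.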
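A small parser for /proc/loadavg makes the runnable‑versus‑cores comparison scriptable. This is a minimal sketch; the LoadAvg type and the helper names are assumptions for illustration, and the sample line is invented.

```python
from typing import NamedTuple


class LoadAvg(NamedTuple):
    load1: float
    load5: float
    load15: float
    runnable: int   # currently runnable scheduling entities
    total: int      # total scheduling entities on the system
    last_pid: int   # most recently created PID


def parse_loadavg(text: str) -> LoadAvg:
    """Parse the single line of /proc/loadavg, e.g. '4.52 3.18 2.76 12/487 31337'."""
    fields = text.split()
    runnable, total = fields[3].split("/")
    return LoadAvg(float(fields[0]), float(fields[1]), float(fields[2]),
                   int(runnable), int(total), int(fields[4]))


def runnable_per_core(la: LoadAvg, cpu_cores: int) -> float:
    """Runnable tasks per core; values well above 1 mean tasks are queuing."""
    return la.runnable / cpu_cores


if __name__ == "__main__":
    sample = "4.52 3.18 2.76 12/487 31337\n"  # invented sample line
    la = parse_loadavg(sample)
    print(la, runnable_per_core(la, 4))
```

On a real host, replace the sample string with the contents of /proc/loadavg, for example via open("/proc/loadavg").read().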

High‑Load Investigation Toolchain

Step 1 – uptime

$ uptime
14:23:05 up 45 days, 3:12, 2 users, load average: 4.52, 3.18, 2.76

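Compare the three numbers to read the trend: 1‑min > 5‑min > 15‑min (here 4.52 > 3.18 > 2.76) means load is rising, and a steep gap such as 28 > 12 > 5 indicates a rapid rise. The reverse order means load is falling.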
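The trend reading can be sketched as a small helper. The 5% tolerance used to ignore noise is an arbitrary assumption, not a value from the kernel or from uptime.

```python
def load_trend(load1: float, load5: float, load15: float, tolerance: float = 0.05) -> str:
    """Classify the trend from the three averages.

    'rising'  : 1-min > 5-min > 15-min, each by more than `tolerance`
    'falling' : 1-min < 5-min < 15-min, each by more than `tolerance`
    otherwise : 'steady or mixed'
    """
    if load1 > load5 * (1 + tolerance) and load5 > load15 * (1 + tolerance):
        return "rising"
    if load1 < load5 * (1 - tolerance) and load5 < load15 * (1 - tolerance):
        return "falling"
    return "steady or mixed"


if __name__ == "__main__":
    print(load_trend(4.52, 3.18, 2.76))
```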

Step 2 – top / htop

# Refresh every second
top -d 1

Key fields in the header line:

us – user‑mode CPU time.

sy – kernel‑mode CPU time.

id – idle time (low = busy).

wa – I/O wait.

Press 1 to expand per‑CPU view, P to sort by CPU, M to sort by memory.

Step 3 – vmstat

vmstat 1 10

Important columns:

r – runnable processes (queue length).

b – processes in D state (I/O wait).

si/so – swap in/out.

bi/bo – block device I/O (KB/s).

Step 4 – pidstat

# CPU and I/O per process, 1‑second interval, 5 samples
pidstat -ud 1 5

Shows per‑process %CPU and I/O rates ( kB_rd/s , kB_wr/s ).

Step 5 – perf

# Live hotspot view
perf top -p <PID>
# Record 30 s and generate flame graph
# (stackcollapse-perf.pl and flamegraph.pl come from the FlameGraph scripts and must be on PATH)
perf record -g -p <PID> -- sleep 30
perf script | stackcollapse-perf.pl | flamegraph.pl > flamegraph.svg

Use for deep code‑level analysis of CPU‑bound hot paths.

/proc Files and PSI

/proc/loadavg provides five fields:

1‑min average.

5‑min average.

15‑min average.

running/total processes (e.g., 12/487).

Most recent PID.

The first line of /proc/stat ( cpu) lists cumulative jiffies spent in user, nice, system, idle, iowait, irq, softirq, and steal. These are the raw values used by top to compute percentages.
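Because the /proc/stat counters are cumulative, percentages come from the difference between two readings. The sketch below shows that calculation on two invented sample lines; the function names are illustrative, and only the eight fields named above are used (the trailing guest fields are ignored because guest time is already included in user time).

```python
from typing import Dict

FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line: str) -> Dict[str, int]:
    """Parse the aggregate 'cpu' line of /proc/stat into named counters."""
    parts = line.split()
    if parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return dict(zip(FIELDS, (int(p) for p in parts[1:1 + len(FIELDS)])))


def cpu_percentages(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, float]:
    """Percentage of elapsed ticks spent in each state between two readings."""
    deltas = {k: after[k] - before[k] for k in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        return {k: 0.0 for k in FIELDS}
    return {k: 100.0 * v / total for k, v in deltas.items()}


if __name__ == "__main__":
    # Invented sample readings taken some interval apart.
    t0 = parse_cpu_line("cpu  1000 0 500 8000 500 0 0 0 0 0")
    t1 = parse_cpu_line("cpu  1050 0 520 8010 620 0 0 0 0 0")
    for name, pct in cpu_percentages(t0, t1).items():
        print(f"{name:8s} {pct:5.1f}%")
```

In this sample most of the interval is spent in iowait, which would point to the I/O‑wait investigation path rather than CPU profiling.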

Pressure Stall Information (PSI) introduced in Linux 4.20 gives percentage‑based pressure metrics:

# CPU pressure
cat /proc/pressure/cpu
# IO pressure
cat /proc/pressure/io
# Memory pressure
cat /proc/pressure/memory

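Each file has a some line (at least one task stalled on the resource) and a full line (all non‑idle tasks stalled at the same time). The avg10, avg60, and avg300 fields are percentages of wall‑clock time over the last 10, 60, and 300 seconds, which makes them easier to interpret than raw Load; total is the cumulative stall time in microseconds.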
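PSI files are plain key=value text, so they are easy to parse. A minimal sketch follows, using an invented sample; parse_psi is a hypothetical helper name.

```python
from typing import Dict


def parse_psi(text: str) -> Dict[str, Dict[str, float]]:
    """Parse a PSI file into {'some': {...}, 'full': {...}}.

    Each line looks like:
        some avg10=12.50 avg60=8.30 avg300=3.10 total=123456789
    avg* values are percentages; total is cumulative stall time in microseconds.
    """
    result: Dict[str, Dict[str, float]] = {}
    for line in text.strip().splitlines():
        kind, *pairs = line.split()
        result[kind] = {key: float(value)
                        for key, value in (pair.split("=", 1) for pair in pairs)}
    return result


if __name__ == "__main__":
    sample = (  # invented sample content
        "some avg10=12.50 avg60=8.30 avg300=3.10 total=123456789\n"
        "full avg10=0.00 avg60=0.00 avg300=0.00 total=0\n"
    )
    psi = parse_psi(sample)
    print("CPU some avg10:", psi["some"]["avg10"], "%")
```

The same function works for /proc/pressure/io, /proc/pressure/memory, and the per‑cgroup *.pressure files, since they share the format.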

cgroup v2 Load Isolation

Modern distributions (Ubuntu 24.04, RHEL 9) enable cgroup v2 by default. The CPU controller provides two key knobs:

cpu.max – hard quota/period (e.g., 200000 100000 = 2 CPU cores).

cpu.weight – relative share (1‑10000, default 100).
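The quota/period arithmetic for cpu.max can be sketched like this. cpu_limit_cores is a hypothetical helper; the literal value "max" in the quota position means no limit is set.

```python
from typing import Optional


def cpu_limit_cores(cpu_max_text: str) -> Optional[float]:
    """Convert the contents of cpu.max ('<quota> <period>') into a core count.

    Returns None when the quota is 'max', i.e. the cgroup is unlimited.
    """
    quota, period = cpu_max_text.split()
    if quota == "max":
        return None
    return int(quota) / int(period)


if __name__ == "__main__":
    for sample in ("200000 100000", "50000 100000", "max 100000"):
        print(f"{sample!r:18} -> {cpu_limit_cores(sample)}")
```

Dividing a container's CPU usage by this value, rather than by the host core count, gives a load figure that reflects the container's actual limit.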

cgroup‑level PSI files ( cpu.pressure, io.pressure) expose per‑group pressure, allowing accurate per‑Pod load monitoring in Kubernetes.

Kubernetes cgroup paths:

Guaranteed QoS:

/sys/fs/cgroup/kubepods.slice/kubepods-pod<UID>.slice/

Burstable QoS:

/sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod<UID>.slice/

BestEffort QoS:

/sys/fs/cgroup/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod<UID>.slice/

Reading cpu.stat inside a Pod yields fields such as usage_usec, nr_throttled, and throttled_usec, which indicate actual CPU consumption and throttling.
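Since the cpu.stat counters are cumulative, a throttling ratio needs two readings. The sketch below assumes the cgroup v2 fields nr_periods and nr_throttled are present (they appear when the CPU controller is enabled for the cgroup); the sample values and function names are invented.

```python
from typing import Dict


def parse_cpu_stat(text: str) -> Dict[str, int]:
    """Parse cpu.stat ('key value' per line) into a dict of integers."""
    stats: Dict[str, int] = {}
    for line in text.strip().splitlines():
        key, value = line.split()
        stats[key] = int(value)
    return stats


def throttle_ratio(before: Dict[str, int], after: Dict[str, int]) -> float:
    """Fraction of enforcement periods in which the cgroup was throttled."""
    periods = after["nr_periods"] - before["nr_periods"]
    if periods <= 0:
        return 0.0
    return (after["nr_throttled"] - before["nr_throttled"]) / periods


if __name__ == "__main__":
    # Invented sample readings taken some interval apart.
    t0 = parse_cpu_stat("usage_usec 5000000\nnr_periods 1000\nnr_throttled 100\nthrottled_usec 800000\n")
    t1 = parse_cpu_stat("usage_usec 5900000\nnr_periods 1100\nnr_throttled 140\nthrottled_usec 1200000\n")
    print(f"throttled in {throttle_ratio(t0, t1):.0%} of periods")
```

This is the same ratio that the Prometheus expression on container_cpu_cfs_throttled_periods_total and container_cpu_cfs_periods_total computes, just taken directly from the cgroup file.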

Container‑Specific Load Pitfalls

Inside a container, /proc/loadavg, /proc/cpuinfo and /proc/stat all reflect the host values because these interfaces are global and not namespace‑aware. Consequently, a container limited to 2 CPU cores may see a host Load of 48 and mistakenly conclude a catastrophic overload.

LXCFS – Virtualising /proc for Containers

LXCFS is a FUSE daemon that intercepts reads of /proc/loadavg, /proc/cpuinfo, /proc/stat, and /proc/meminfo, returning values derived from the container’s cgroup limits.

Typical DaemonSet deployment (YAML omitted for brevity) mounts the FUSE filesystem and then each Pod mounts the virtual files, e.g.:

volumeMounts:
- name: lxcfs-proc-loadavg
  mountPath: /proc/loadavg
- name: lxcfs-proc-cpuinfo
  mountPath: /proc/cpuinfo
- name: lxcfs-proc-stat
  mountPath: /proc/stat

Limitations:

Load values are approximations, not the kernel’s exact EWMA.

FUSE adds a small per‑read overhead (on the order of microseconds).

If the LXCFS daemon crashes, the virtual /proc becomes inaccessible.

Each Pod must be explicitly configured (or automated via a MutatingAdmissionWebhook).

Preferred Modern Approach – Direct cgroup Metrics

Instead of faking /proc/loadavg, read native cgroup files:

# CPU usage (microseconds)
cat /sys/fs/cgroup/cpu.stat
# PSI pressure for the cgroup
cat /sys/fs/cgroup/cpu.pressure
cat /sys/fs/cgroup/io.pressure
cat /sys/fs/cgroup/memory.pressure

Kubernetes’ cAdvisor (built into kubelet) already exports these as Prometheus metrics such as container_cpu_usage_seconds_total and container_cpu_cfs_throttled_periods_total, providing a more accurate per‑container view.

Monitoring & Alerting Strategies

Physical/VM Hosts

# Prometheus alert: load ratio > 0.7 for 5m (warning)
- alert: HighLoadAverage
  # Drop the mode label as well, otherwise the division finds no matching series.
  expr: node_load5 / count without (cpu, mode) (node_cpu_seconds_total{mode="idle"}) > 0.7
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Load Average elevated on {{ $labels.instance }}"
    description: "5‑min load ratio = {{ $value | printf \"%.2f\" }}, exceeds warning threshold."

# Critical overload (ratio > 1.0)
- alert: CriticalLoadAverage
  # Drop the mode label as well, otherwise the division finds no matching series.
  expr: node_load5 / count without (cpu, mode) (node_cpu_seconds_total{mode="idle"}) > 1.0
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: "Load Average critical on {{ $labels.instance }}"
    description: "5‑min load ratio = {{ $value | printf \"%.2f\" }}, system overloaded."

# Sudden surge: 1‑min > 3× 15‑min and > 4
- alert: LoadAverageSurge
  expr: node_load1 / node_load15 > 3 and node_load1 > 4
  for: 2m
  labels:
    severity: warning
  annotations:
    summary: "Load Average surge on {{ $labels.instance }}"
    description: "1‑min load is {{ $value | printf \"%.1f\" }} times the 15‑min average, indicating a burst."

Container / Kubernetes Environments

# Container CPU throttling > 25% (warning)
- alert: ContainerCPUThrottling
  expr: rate(container_cpu_cfs_throttled_periods_total[5m]) / rate(container_cpu_cfs_periods_total[5m]) > 0.25
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Container CPU throttling high ({{ $labels.pod }})"
    description: "Throttle ratio = {{ $value | humanizePercentage }}"

# Container CPU usage near limit > 85% (warning)
- alert: ContainerCPUNearLimit
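  # Aggregate both sides to the same labels so the division matches; quota / period = limit in cores.
  expr: sum by (namespace, pod, container) (rate(container_cpu_usage_seconds_total{container!=""}[5m])) / sum by (namespace, pod, container) (container_spec_cpu_quota{container!=""} / container_spec_cpu_period{container!=""}) > 0.85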
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Container CPU usage near limit ({{ $labels.pod }})"
    description: "Usage = {{ $value | printf \"%.2f\" }} of configured limit."

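# PSI CPU pressure "some" > 20% (warning)
# The metric is a counter of stalled seconds, so take its rate to get a 0-1 fraction.
# The metric name depends on the exporter and version; adjust it to what your cluster exposes.
- alert: ContainerCPUPressureHigh
  expr: rate(container_cpu_pressure_seconds_total{type="some"}[5m]) > 0.2
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Container CPU pressure high ({{ $labels.pod }})"
    description: "CPU PSI some = {{ $value | humanizePercentage }}, exceeds 20%."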

Quick Reference Thresholds

Host Load Ratio : warning > 0.7, critical > 1.0 (5 min window).

Host Load Surge : 1‑min / 15‑min > 3 (warning), > 5 (critical).

Container CPU Throttle : warning > 25%, critical > 50%.

Container CPU Usage / Limit : warning > 85%, critical > 95%.

PSI CPU "some" : warning > 20%, critical > 50%.
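For ad‑hoc scripts outside Prometheus, the table above can be encoded as data. This is only a sketch: the metric keys and the severity function are made‑up names, and ratios are expressed as fractions (0.25 = 25%).

```python
from typing import Dict, Tuple

# (warning, critical) thresholds from the quick reference above.
THRESHOLDS: Dict[str, Tuple[float, float]] = {
    "host_load_ratio": (0.7, 1.0),
    "host_load_surge": (3.0, 5.0),
    "container_cpu_throttle": (0.25, 0.50),
    "container_cpu_usage": (0.85, 0.95),
    "psi_cpu_some": (0.20, 0.50),
}


def severity(metric: str, value: float) -> str:
    """Return 'ok', 'warning', or 'critical' for a metric value."""
    warning, critical = THRESHOLDS[metric]
    if value > critical:
        return "critical"
    if value > warning:
        return "warning"
    return "ok"


if __name__ == "__main__":
    print(severity("host_load_ratio", 0.9))
    print(severity("container_cpu_throttle", 0.6))
```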

Key Takeaways

Load Average = EWMA of R + D processes; not a CPU percentage.

Three time windows (1/5/15 min) together reveal load trends.

Divide Load by core count to obtain a load ratio; > 0.7 warrants attention, > 1.0 indicates overload.

High Load originates from three distinct categories – CPU‑bound, I/O‑bound, or process‑count explosion – each requiring a different diagnostic path.

Investigation toolchain: uptime → top → vmstat → pidstat → perf.

In containers, /proc/loadavg shows host values; use cgroup v2 metrics or LXCFS for container‑level visibility.

Monitoring: host alerts based on load ratio and surge; container alerts based on CPU throttling, usage‑to‑limit ratio, and PSI pressure.
