Understanding Linux Load Average: Calculation, Exposure, and Relation to CPU Usage
Linux load average is a key performance metric. The kernel computes it by periodically aggregating per‑CPU counts of runnable and uninterruptible tasks into a global instantaneous load, then applying an exponentially weighted moving average to produce 1‑, 5‑, and 15‑minute averages, which it exposes to user space through the /proc/loadavg pseudo‑file.
1. Understanding the Load Viewing Process
We often use the top command to view the system load. A typical top output shows three numbers representing the average load over the past 1, 5, and 15 minutes.
# top
Load Avg: 1.25, 1.30, 1.95 ...
The values displayed by top are read from the pseudo‑file /proc/loadavg. Tracing top with strace shows that it opens this file.
# strace top
...
openat(AT_FDCWD, "/proc/loadavg", O_RDONLY) = 7
The kernel defines the open function for /proc/loadavg. When a user‑space process accesses the file, the kernel reads its internal load‑average variables, formats them, and returns the result.
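The same file can be read from any user-space program, not only top. The following is a minimal sketch, assuming the usual five-field layout of /proc/loadavg (three averages, running/total scheduling entities, last PID). The struct and function names are hypothetical and chosen for illustration.

```c
#include <stdio.h>

/* Hypothetical helper type: the five fields of one /proc/loadavg line. */
struct loadavg_sample {
    double avg1, avg5, avg15;   /* 1-, 5-, and 15-minute averages */
    long running;               /* currently runnable scheduling entities */
    long total;                 /* total scheduling entities (threads) */
    long last_pid;              /* most recently allocated PID */
};

/* Parse a line such as "1.25 1.30 1.95 2/345 6789". Returns 0 on success. */
int parse_loadavg_line(const char *line, struct loadavg_sample *out)
{
    int n = sscanf(line, "%lf %lf %lf %ld/%ld %ld",
                   &out->avg1, &out->avg5, &out->avg15,
                   &out->running, &out->total, &out->last_pid);
    return n == 6 ? 0 : -1;
}

/* Read and parse /proc/loadavg (Linux only). Returns 0 on success. */
int read_loadavg(struct loadavg_sample *out)
{
    char buf[128];
    FILE *f = fopen("/proc/loadavg", "r");

    if (!f)
        return -1;
    if (!fgets(buf, sizeof buf, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return parse_loadavg_line(buf, out);
}
```

Keeping the parser separate from the file access makes it possible to check the field handling against a fixed string on systems that have no /proc file system.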
The implementation resides in fs/proc/loadavg.c, where the file is created and its file‑operations structure (loadavg_proc_fops) is registered.
// file: fs/proc/loadavg.c
static int __init proc_loadavg_init(void)
{
    proc_create("loadavg", 0, NULL, &loadavg_proc_fops);
    return 0;
}
The open method points to loadavg_proc_open, which eventually calls loadavg_proc_show to produce the output.
// file: fs/proc/loadavg.c
static int loadavg_proc_show(struct seq_file *m, void *v)
{
    unsigned long avnrun[3];

    /* The FIXED_1/200 offset adds 0.005 so the two-decimal output rounds correctly. */
    get_avenrun(avnrun, FIXED_1/200, 0);
    seq_printf(m, "%lu.%02lu %lu.%02lu %lu.%02lu %ld/%d %d\n",
        LOAD_INT(avnrun[0]), LOAD_FRAC(avnrun[0]),
        LOAD_INT(avnrun[1]), LOAD_FRAC(avnrun[1]),
        LOAD_INT(avnrun[2]), LOAD_FRAC(avnrun[2]),
        nr_running(), nr_threads,
        task_active_pid_ns(current)->last_pid);
    return 0;
}
In loadavg_proc_show two actions occur: get_avenrun fetches the current load averages, and seq_printf formats and prints them.
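The averages are stored as fixed-point integers, which is why the output code uses LOAD_INT and LOAD_FRAC rather than a floating-point format. The user-space sketch below reproduces that formatting step. The macro definitions follow the kernel's fixed-point helpers (11 fractional bits) but are written with an unsigned long constant here, and format_load is a hypothetical name used only for this illustration.

```c
#include <stdio.h>

#define FSHIFT   11                  /* bits of fractional precision */
#define FIXED_1  (1UL << FSHIFT)     /* 1.0 in fixed point (2048) */
#define LOAD_INT(x)  ((x) >> FSHIFT)
#define LOAD_FRAC(x) LOAD_INT(((x) & (FIXED_1 - 1)) * 100)

/* Format one fixed-point load value as "int.frac" with two decimals,
 * applying the same FIXED_1/200 rounding offset that loadavg_proc_show
 * passes to get_avenrun. Returns the snprintf result. */
int format_load(char *buf, size_t len, unsigned long fixed)
{
    unsigned long v = fixed + FIXED_1 / 200;   /* add about 0.005 */

    return snprintf(buf, len, "%lu.%02lu", LOAD_INT(v), LOAD_FRAC(v));
}
```

For instance, the fixed-point value 2560 (1.25 × 2048) formats as "1.25", and 3072 formats as "1.50".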
2. Kernel Calculation of Load
The kernel calculates load in two stages:
Per‑CPU periodic aggregation of instantaneous load.
Timed computation of the 1‑, 5‑, and 15‑minute averages using an exponential weighted moving average (EWMA).
2.1 Per‑CPU Periodic Aggregation
A high‑resolution timer in the time subsystem fires periodically on each CPU and folds that CPU's count of runnable and uninterruptible tasks into the global variable calc_load_tasks.
// file: kernel/time/tick-sched.c
void tick_setup_sched_timer(void)
{
    /* ts refers to this CPU's struct tick_sched; its declaration is elided here. */
    hrtimer_init(&ts->sched_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
    ts->sched_timer.function = tick_sched_timer;
    ...
}
The timer invokes tick_sched_timer, which eventually calls scheduler_tick. That function updates the per‑CPU run‑queue load and adds the change since the last update to the global instantaneous load.
// file: kernel/sched/core.c
void scheduler_tick(void)
{
    int cpu = smp_processor_id();
    struct rq *rq = cpu_rq(cpu);

    update_cpu_load_active(rq);
    ...
}

static void update_cpu_load_active(struct rq *this_rq)
{
    calc_load_account_active(this_rq);
}

/* Add this CPU's change in active tasks to the global counter. */
static void calc_load_account_active(struct rq *this_rq)
{
    long delta = calc_load_fold_active(this_rq);

    if (delta)
        atomic_long_add(delta, &calc_load_tasks);
}

/* Return how much this CPU's active count changed since the last fold. */
static long calc_load_fold_active(struct rq *this_rq)
{
    long nr_active = this_rq->nr_running + (long)this_rq->nr_uninterruptible;
    long delta = 0;

    if (nr_active != this_rq->calc_load_active) {
        delta = nr_active - this_rq->calc_load_active;
        this_rq->calc_load_active = nr_active;
    }
    return delta;
}
Thus calc_load_tasks holds the system‑wide instantaneous load as of each CPU's most recent fold.
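The delta-folding idea can be modelled outside the kernel: each CPU remembers what it last contributed and adds only the difference, so the global counter stays equal to the sum of all per-CPU counts without being recomputed from scratch. The sketch below is a single-threaded illustration under that assumption. The names rq_model, calc_load_tasks_model, fold_active, and fold_all are hypothetical, and a plain long stands in for the kernel's atomic counter.

```c
#include <stddef.h>

/* Hypothetical user-space stand-in for the kernel's per-CPU run queue. */
struct rq_model {
    long nr_running;          /* runnable tasks on this CPU */
    long nr_uninterruptible;  /* D-state tasks accounted to this CPU */
    long calc_load_active;    /* value last folded into the global sum */
};

/* Stand-in for the global calc_load_tasks (not atomic in this sketch). */
static long calc_load_tasks_model;

/* Mirror of calc_load_fold_active: return the change since the last fold. */
long fold_active(struct rq_model *rq)
{
    long nr_active = rq->nr_running + rq->nr_uninterruptible;
    long delta = nr_active - rq->calc_load_active;

    rq->calc_load_active = nr_active;
    return delta;
}

/* Fold every modelled CPU into the global count and return the new total. */
long fold_all(struct rq_model *rqs, size_t ncpus)
{
    for (size_t i = 0; i < ncpus; i++)
        calc_load_tasks_model += fold_active(&rqs[i]);
    return calc_load_tasks_model;
}
```

With two modelled CPUs holding 2 runnable plus 1 uninterruptible task and 0 runnable plus 3 uninterruptible tasks, the first fold yields a total of 6. A second fold with no changes adds nothing, because every delta is zero.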
2.2 Timed Computation of Average Load
The timer code also calls calc_global_load. About every five seconds (the LOAD_FREQ interval; the interval check is omitted from the excerpt below), it reads calc_load_tasks and updates the three averages using EWMA.
// file: kernel/sched/core.c
void calc_global_load(unsigned long ticks)
{
    long active = atomic_long_read(&calc_load_tasks);

    /* Scale the task count to fixed point before averaging. */
    active = active > 0 ? active * FIXED_1 : 0;

    avenrun[0] = calc_load(avenrun[0], EXP_1, active);
    avenrun[1] = calc_load(avenrun[1], EXP_5, active);
    avenrun[2] = calc_load(avenrun[2], EXP_15, active);
    ...
}

static unsigned long calc_load(unsigned long load, unsigned long exp, unsigned long active)
{
    load *= exp;
    load += active * (FIXED_1 - exp);
    load += 1UL << (FSHIFT - 1);    /* round to nearest */
    return load >> FSHIFT;
}
The EWMA update is a1 = a0 * e + a * (1 - e), where a0 is the previous average, a is the current instantaneous load, and e is the decay factor (EXP_1, EXP_5, or EXP_15 divided by FIXED_1). It weights recent samples more heavily and needs only the previous average and the current sample, so no history buffer is required.
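To see how the three windows respond differently to the same input, the update can be run in user space. The sketch below assumes the kernel's usual constants (FSHIFT of 11, and EXP_1, EXP_5, EXP_15 of 1884, 2014, and 2037, which correspond to exp(-5s/window) in fixed point) and an update roughly every five seconds. The simulate helper is hypothetical, and the decay parameter is named decay rather than exp only to avoid clashing with the C math function.

```c
#define FSHIFT   11
#define FIXED_1  (1UL << FSHIFT)   /* 1.0 in fixed point */
#define EXP_1    1884UL            /* FIXED_1 * exp(-5s / 1min)  */
#define EXP_5    2014UL            /* FIXED_1 * exp(-5s / 5min)  */
#define EXP_15   2037UL            /* FIXED_1 * exp(-5s / 15min) */

/* One EWMA step in fixed point. Both load and active are scaled by
 * FIXED_1; decay is one of the EXP_* constants. */
unsigned long calc_load(unsigned long load, unsigned long decay,
                        unsigned long active)
{
    load *= decay;
    load += active * (FIXED_1 - decay);
    load += 1UL << (FSHIFT - 1);   /* round to nearest */
    return load >> FSHIFT;
}

/* Hypothetical helper: apply `steps` updates with a constant number of
 * active tasks and return the resulting fixed-point average. */
unsigned long simulate(unsigned long load, unsigned long decay,
                       unsigned long nr_active, int steps)
{
    unsigned long active = nr_active * FIXED_1;

    for (int i = 0; i < steps; i++)
        load = calc_load(load, decay, active);
    return load;
}
```

Starting from zero with one constantly active task, twelve steps (about one minute) bring the 1-minute average to roughly 63% of FIXED_1, while the 15-minute average has barely moved. After many steps the 1-minute value settles close to FIXED_1, which is the expected behavior of an exponential average with a one-minute time constant.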
3. Relationship Between Load Average and CPU Consumption
Early Linux kernels counted only runnable tasks, so the load average tracked demand for CPU. Modern kernels also count tasks in the uninterruptible (D) state, which consume no CPU while they wait for I/O or another resource. A high load may therefore reflect CPU saturation, an I/O bottleneck, or contention for some other resource.
A 1993 patch added TASK_UNINTERRUPTIBLE (together with the TASK_SWAPPING state that existed at the time) to the load calculation, on the reasoning that load should represent overall system demand, not just CPU demand.
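One practical way to tell CPU pressure from I/O pressure is to look at which states the contributing tasks are in. The sketch below extracts the state letter from a line of /proc/<pid>/stat, whose layout is "pid (comm) state ...", and classifies it. The function names are hypothetical, and treating every R or D task as a load contributor is a simplification of the kernel's own accounting.

```c
#include <string.h>

/* Extract the state letter from one line of /proc/<pid>/stat.
 * The command name sits in parentheses and may itself contain spaces or
 * parentheses, so search for the last ')' instead of splitting on spaces.
 * Returns 0 if the line is malformed. */
char stat_state(const char *line)
{
    const char *close = strrchr(line, ')');

    if (!close || close[1] != ' ' || close[2] == '\0')
        return 0;
    return close[2];
}

/* Simplified rule: a task counts toward load while it is runnable (R)
 * or in uninterruptible sleep (D). */
int counts_toward_load(char state)
{
    return state == 'R' || state == 'D';
}
```

Counting R tasks and D tasks separately across all processes gives a rough indication of whether a high load average comes mainly from CPU demand or from tasks blocked on I/O.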
4. Summary
The Linux load average is computed in three steps:
The kernel periodically aggregates per‑CPU runnable and uninterruptible task counts into a global instantaneous load.
An exponential weighted moving average converts this instantaneous load into 1‑, 5‑, and 15‑minute averages.
User‑space processes read the averages from the pseudo‑file /proc/loadavg .
Consequently, load reflects the overall demand for system resources; a high value may indicate CPU pressure, I/O contention, or other bottlenecks.
IT Services Circle