
Understanding Linux Deadlocks: Spinlocks, Semaphores, and Built‑in Detection Mechanisms

This article explains how deadlocks arise in the Linux kernel, compares spinlocks and semaphores, and walks through the kernel's built-in detection mechanisms: hung-task for D-state stalls, soft-lockup for R-state stalls, and the NMI watchdog for long interrupt-disable periods. Key code snippets are included.

ITPUB

Deadlock Overview

A deadlock occurs when two or more processes (or threads) compete for resources and end up waiting for each other indefinitely; without external intervention the system cannot make progress. Deadlocks can arise only when multiple execution contexts share resources or communicate with one another, for example processes, threads, and interrupt handlers.

Spinlock Characteristics and Common Pitfalls

Spinlocks busy‑wait (consume CPU cycles) when the resource is unavailable, whereas semaphores put the caller to sleep. Typical spinlock‑related deadlock scenarios include:

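Recursive acquisition: the same thread takes a spinlock it already holds without releasing it first.

Calling a function that may sleep (e.g., copy_from_user(), or kmalloc() with GFP_KERNEL) while holding a spinlock.

Taking a spinlock in process context without disabling local interrupts when an interrupt handler on the same CPU also takes it: the handler interrupts the holder and then spins on a lock the holder can no longer release.

Sharing a lock between interrupt handlers, bottom halves, and process context without using the matching variant (spin_lock_irqsave(), spin_lock_bh()).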
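The first pitfall can be reproduced in user space. The sketch below is a toy test-and-set lock built on C11 atomics, not the kernel's spinlock_t; every toy_* name is hypothetical. The second acquisition uses a trylock so that the demonstration returns instead of hanging.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space lock for illustration; not the kernel's spinlock_t. */
typedef struct {
    atomic_flag held;
} toy_spinlock;

void toy_spin_init(toy_spinlock *l)
{
    assert(l != NULL);
    atomic_flag_clear(&l->held);
}

/* Try once: returns true if we took the lock, false if it was already held. */
bool toy_spin_trylock(toy_spinlock *l)
{
    assert(l != NULL);
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-wait until the lock is free; burns CPU cycles instead of sleeping. */
void toy_spin_lock(toy_spinlock *l)
{
    while (!toy_spin_trylock(l))
        ; /* spin */
}

void toy_spin_unlock(toy_spinlock *l)
{
    assert(l != NULL);
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/*
 * Returns true if a second acquisition by the current holder would fail.
 * Calling toy_spin_lock() for the second attempt would spin forever,
 * because the only thread able to release the lock is the one spinning.
 */
bool nested_acquire_would_spin_forever(void)
{
    toy_spinlock l;
    bool stuck;

    toy_spin_init(&l);
    toy_spin_lock(&l);              /* first acquisition succeeds */
    stuck = !toy_spin_trylock(&l);  /* second attempt sees the lock held */
    toy_spin_unlock(&l);
    return stuck;
}
```

The lock has no notion of an owner, which is why it cannot tell a legitimate waiter from the holder asking again.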

Spinlock States on Different Kernel Configurations

Spinlock behavior depends on CPU count and preemption settings:

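Single-CPU, non-preemptible kernel: spin_lock() is effectively a no-op (absent spinlock debugging), because nothing else can run until the current context gives up the CPU. The lock itself cannot deadlock.

Single-CPU, preemptible kernel: spin_lock() reduces to disabling preemption, so a task cannot be preempted while it holds the lock. Sleeping while holding it is still a bug, because the kernel would be asked to schedule with preemption disabled.

Multi-CPU kernel (SMP): spin_lock() disables preemption on the local CPU and then spins on the lock word. The holder is not preempted, but waiters on other CPUs spin forever if the holder sleeps, if it is interrupted by a handler that wants the same lock, or if two CPUs take two locks in opposite order.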

Semaphore Characteristics and Typical Deadlock Cases

Recursive acquisition: a task that calls down() again on a semaphore it already holds, with the count at zero, sleeps forever, because only that task could call up().

One task takes a semaphore and then blocks indefinitely on something else; every other task waiting for the same semaphore blocks with it.

Interrupt handlers must not sleep; calling down() from an interrupt handler on a semaphore that would block leads to deadlock.

Two tasks each hold a semaphore the other needs, forming a classic circular wait.
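The circular-wait case can be pictured as a cycle in a "waits-for" graph. The following is a minimal user-space sketch, assuming each blocked task waits on exactly one holder; the array layout and the names has_circular_wait, MAX_TASKS, and NO_TASK are hypothetical and do not correspond to any kernel data structure.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_TASKS 8
#define NO_TASK   (-1)

/*
 * waits_for[i] is the task holding the semaphore that task i is blocked on,
 * or NO_TASK if task i is not blocked. Returns true if following the
 * waits-for chain from some task leads back to that same task.
 */
bool has_circular_wait(const int waits_for[], int ntasks)
{
    assert(ntasks >= 0 && ntasks <= MAX_TASKS);

    for (int start = 0; start < ntasks; start++) {
        int cur = start;

        /* A cycle through 'start' is at most ntasks edges long. */
        for (int steps = 0; steps < ntasks && cur != NO_TASK; steps++) {
            assert(cur >= 0 && cur < ntasks);
            cur = waits_for[cur];
            if (cur == start)
                return true;
        }
    }
    return false;
}
```

With waits_for = {1, 0} (task 0 waits on task 1 and vice versa) the function reports a cycle; a single entry {0} models recursive acquisition, where a task waits on itself.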

Linux Built‑in D‑State Deadlock Detection (Hung‑Task)

A D-state deadlock is a task stuck in TASK_UNINTERRUPTIBLE for a long time (120 s by default). The kernel detects such tasks with the hung-task mechanism, implemented mainly in hung_task.c, which runs a kernel thread named khungtaskd.

static int __init hung_task_init(void)
{
    atomic_notifier_chain_register(&panic_notifier_list, &panic_block);
    watchdog_task = kthread_run(watchdog, NULL, "khungtaskd");
    return 0;
}

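static int watchdog(void *dummy)
{
    set_user_nice(current, 0);

    /* Loop forever: sleep for one timeout period, then scan the task list. */
    for ( ; ; ) {
        unsigned long timeout = sysctl_hung_task_timeout_secs;

        /* A nonzero return means an early wakeup: re-read the timeout and sleep again. */
        while (schedule_timeout_interruptible(timeout_jiffies(timeout)))
            timeout = sysctl_hung_task_timeout_secs;

        check_hung_uninterruptible_tasks(timeout);
    }

    return 0;
}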

static void check_hung_uninterruptible_tasks(unsigned long timeout)
{
    int max_count = sysctl_hung_task_check_count;   /* cap on tasks examined per scan */
    int batch_count = HUNG_TASK_BATCHING;
    struct task_struct *g, *t;

    rcu_read_lock();
    do_each_thread(g, t) {
        if (!max_count--)
            goto unlock;
        /* Every HUNG_TASK_BATCHING tasks, briefly drop the RCU read lock. */
        if (!--batch_count) {
            batch_count = HUNG_TASK_BATCHING;
            rcu_lock_break(g, t);
            /* Stop if either task exited while the lock was dropped. */
            if (t->state == TASK_DEAD || g->state == TASK_DEAD)
                goto unlock;
        }
        /* Exact match only: tasks with additional state bits set are skipped. */
        if (t->state == TASK_UNINTERRUPTIBLE)
            check_hung_task(t, timeout);
    } while_each_thread(g, t);
unlock:
    rcu_read_unlock();
}

static void check_hung_task(struct task_struct *t, unsigned long timeout)
{
    unsigned long switch_count = t->nvcsw + t->nivcsw;
    /* additional analysis omitted for brevity */
}

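By default the khungtaskd thread wakes every 120 seconds (sysctl_hung_task_timeout_secs) and walks the task list. For each task in TASK_UNINTERRUPTIBLE it compares the context-switch count (nvcsw + nivcsw) with the value seen on the previous scan. A task whose count has not changed is reported as hung, and its stack trace is printed.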
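The analysis elided from check_hung_task() can be modeled in user space. The sketch below assumes the check works by comparing the current switch count against a value saved on the previous scan; struct toy_task and toy_check_hung_task are hypothetical stand-ins, not the kernel's task_struct or its actual function body.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few per-task fields this check needs. */
struct toy_task {
    unsigned long nvcsw;             /* voluntary context switches */
    unsigned long nivcsw;            /* involuntary context switches */
    unsigned long last_switch_count; /* total recorded by the previous scan */
};

/*
 * Called once per scan for a task found in the uninterruptible state.
 * Returns true if the task should be reported as hung.
 */
bool toy_check_hung_task(struct toy_task *t)
{
    unsigned long switch_count;

    assert(t != NULL);
    switch_count = t->nvcsw + t->nivcsw;

    /* A task that has never been switched out is too new to judge. */
    if (switch_count == 0)
        return false;

    /* The task ran since the last scan: remember the new count and move on. */
    if (switch_count != t->last_switch_count) {
        t->last_switch_count = switch_count;
        return false;
    }

    /* Still uninterruptible, and no context switch for a whole scan interval. */
    return true;
}
```

Under this model a task is flagged only on the second consecutive scan that sees the same count, so a task that merely happens to be in D state when the scan runs is not reported.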

Linux Built‑in R‑State Deadlock Detection (Soft‑Lockup)

An R-state deadlock occurs when a task stays in TASK_RUNNING for an extended period (default 60 s), monopolizing a CPU so that nothing else on it, including the per-CPU watchdog thread, gets scheduled. The kernel detects this with the soft-lockup mechanism in softlockup.c.

static int __init spawn_softlockup_task(void)
{
    void *cpu = (void *)(long)smp_processor_id();
    int err;
    if (nosoftlockup)
        return 0;
    err = cpu_callback(&cpu_nfb, CPU_UP_PREPARE, cpu);
    if (err == NOTIFY_BAD) {
        BUG();
        return 1;
    }
    cpu_callback(&cpu_nfb, CPU_ONLINE, cpu);
    register_cpu_notifier(&cpu_nfb);
    atomic_notifier_chain_register(&panic_notifier_list, &panic_block);
    return 0;
}

static int __cpuinit cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
{
    int hotcpu = (unsigned long)hcpu;
    struct task_struct *p;
    switch (action) {
    case CPU_UP_PREPARE:
    case CPU_UP_PREPARE_FROZEN:
        BUG_ON(per_cpu(softlockup_watchdog, hotcpu));
        p = kthread_create(watchdog, hcpu, "watchdog/%d", hotcpu);
        if (IS_ERR(p)) {
            printk(KERN_ERR "watchdog for %i failed\n", hotcpu);
            return NOTIFY_BAD;
        }
        per_cpu(softlockup_touch_ts, hotcpu) = 0;
        per_cpu(softlockup_watchdog, hotcpu) = p;
        kthread_bind(p, hotcpu);
        break;
    case CPU_ONLINE:
    case CPU_ONLINE_FROZEN:
        wake_up_process(per_cpu(softlockup_watchdog, hotcpu));
        break;
    }
    return NOTIFY_OK;
}

static int watchdog(void *__bind_cpu)
{
    struct sched_param param = { .sched_priority = MAX_RT_PRIO-1 };
    sched_setscheduler(current, SCHED_FIFO, &param);
    __touch_softlockup_watchdog();
    set_current_state(TASK_INTERRUPTIBLE);
    while (!kthread_should_stop()) {
        __touch_softlockup_watchdog();
        schedule();
        if (kthread_should_stop())
            break;
        set_current_state(TASK_INTERRUPTIBLE);
    }
    __set_current_state(TASK_RUNNING);
    return 0;
}

void softlockup_tick(void)
{
    int this_cpu = smp_processor_id();
    unsigned long touch_ts = per_cpu(softlockup_touch_ts, this_cpu);
    unsigned long now;
    if (!per_cpu(softlockup_watchdog, this_cpu) || softlockup_thresh <= 0)
        return;
    /* compare timestamps; if elapsed > threshold, print warning */
}

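The soft-lockup watchdog is a per-CPU kernel thread scheduled as SCHED_FIFO at the highest real-time priority. Each time it runs, it refreshes a per-CPU timestamp (softlockup_touch_ts). The timer-interrupt hook softlockup_tick() checks whether that timestamp has been refreshed within the threshold; if not, the watchdog thread has been starved of the CPU, and the kernel logs a warning.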
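The timestamp comparison that softlockup_tick() leaves as a comment can be sketched as a user-space model. The struct and function names below are hypothetical, time is passed in as plain seconds, and a zero timestamp is treated as "watchdog not started yet", which is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU state; the kernel keeps this in per-CPU variables. */
struct toy_cpu {
    unsigned long touch_ts; /* seconds; last time the watchdog thread ran */
    unsigned long thresh;   /* seconds without a refresh before warning */
};

/* What the watchdog thread does each time it gets the CPU. */
void toy_touch_watchdog(struct toy_cpu *cpu, unsigned long now)
{
    assert(cpu != NULL);
    cpu->touch_ts = now;
}

/* What the timer tick does: returns true if a soft lockup should be reported. */
bool toy_softlockup_tick(const struct toy_cpu *cpu, unsigned long now)
{
    assert(cpu != NULL);

    /* Detection disabled, or the timestamp was never set. */
    if (cpu->thresh == 0 || cpu->touch_ts == 0)
        return false;

    /* The watchdog thread has been starved for longer than the threshold. */
    return now > cpu->touch_ts && now - cpu->touch_ts > cpu->thresh;
}
```

The design relies on priority: because the watchdog thread outranks ordinary tasks, a stale timestamp suggests that the CPU never reached the scheduler at all.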

Long Interrupt‑Disable Detection (NMI Watchdog)

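The NMI watchdog relies on a non-maskable interrupt, which the CPU still delivers while ordinary interrupts are disabled. A hardware counter raises the NMI periodically, and the handler checks whether the kernel has made progress since the previous one, for example whether the timer interrupt has fired. If a task keeps interrupts disabled for too long, that progress indicator stops advancing, and the handler reports the lockup (dumping a stack trace and, depending on configuration, panicking or resetting the system), thereby exposing prolonged interrupt-disable situations.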
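One way to picture the check is a counter that only ordinary timer interrupts can advance, inspected from the NMI. The sketch below is a user-space model under that assumption; struct toy_nmi_state and both functions are hypothetical, and real implementations differ by architecture and kernel version.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU state for the model. */
struct toy_nmi_state {
    unsigned long timer_irqs;       /* bumped by every timer interrupt */
    unsigned long timer_irqs_saved; /* value seen by the previous NMI */
};

/* Runs in the maskable timer interrupt, so never while interrupts are off. */
void toy_timer_interrupt(struct toy_nmi_state *s)
{
    assert(s != NULL);
    s->timer_irqs++;
}

/*
 * Runs in the periodic NMI, which fires even with interrupts disabled.
 * Returns true if no timer interrupt arrived since the previous NMI.
 */
bool toy_nmi_check(struct toy_nmi_state *s)
{
    assert(s != NULL);

    if (s->timer_irqs == s->timer_irqs_saved)
        return true; /* interrupts appear to be stuck off */

    s->timer_irqs_saved = s->timer_irqs;
    return false;
}
```

In this model the NMI period acts as the detection threshold, so it has to be comfortably longer than the timer-interrupt period to avoid false reports.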

Conclusion

Linux provides three complementary deadlock detection mechanisms: hung‑task for uninterruptible D‑state stalls, soft‑lockup for CPU‑monopolizing R‑state stalls, and the NMI watchdog for excessively long interrupt‑disable periods. Understanding the underlying lock types (spinlocks vs. semaphores) and their proper usage helps avoid these deadlocks.
