
Understanding Linux Workqueue Internals: Asynchronous Mechanism and Kernel Thread Coordination

This article explains the low‑level logic of Linux workqueues, detailing how asynchronous tasks are represented, how work items interact with kernel threads, the key data structures involved, scheduling strategies, and practical code examples for creating, queuing, and managing workqueue tasks.

Deepin Linux

1. Introduction to Linux Workqueues

In Linux kernel development, workqueues provide a core mechanism for asynchronous task processing, improving system efficiency. Understanding the underlying logic—especially the asynchronous mechanism and its coordination with kernel threads—is essential for mastering kernel task scheduling and building high‑performance asynchronous handlers.

2. Core Concepts

2.1 What Is a Workqueue?

A workqueue is a kernel‑provided mechanism that executes tasks asynchronously using kernel threads. It involves three key concepts:

Work (work_struct) : Represents a unit of work. Its most important member is func, a pointer to the function that performs the task. The data member packs status flags (such as the pending bit) together with internal bookkeeping, and entry links the item into a pending list.

Workqueue (workqueue_struct) : A container that holds a series of work items and is associated with worker threads.

Worker thread (kworker) : The kernel thread that actually runs the work items.

An analogy: the kernel is a factory, work items are orders, the workqueue is the order bin, and worker threads are the workers that pick up and process orders.

2.2 Why Use Workqueues?

Although the kernel already provides softirqs and tasklets, workqueues have distinct advantages:

They run in process context, allowing them to sleep, which is impossible in interrupt context. This makes them suitable for I/O‑bound or resource‑waiting tasks.

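Worker threads are ordinary kernel threads, so the scheduler can preempt them and balance them against other tasks. Deferred work therefore competes fairly with the rest of the system instead of running ahead of it, as softirqs and tasklets do.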

They are ideal for non‑time‑critical background work such as filesystem metadata updates, periodic cleanup, or driver initialization.

2.3 Data‑Structure Details

The kernel implements workqueues with a hierarchy of structures:

work_struct : Basic work item, containing atomic_long_t data, list_head entry, and work_func_t func.

workqueue_struct : Represents a workqueue; holds lists of associated pool_workqueue objects and a name.

worker_pool : Manages a pool of worker threads, containing a spinlock, CPU affinity, and lists for pending work.

pool_workqueue : Bridges a workqueue with its worker pool, tracking active and maximum concurrent tasks.

2.4 Practical Implementation Example

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Define a simple task node
typedef struct TaskNode {
    char task_name[32];   // task name
    struct TaskNode *next; // next node
} TaskNode;

// Define a simple workqueue
typedef struct {
    TaskNode *front; // queue head
    TaskNode *rear;  // queue tail
} WorkQueue;

void InitWorkQueue(WorkQueue *queue) {
    if (queue == NULL) return;
    queue->front = NULL;
    queue->rear = NULL;
}

int Enqueue(WorkQueue *queue, const char *task_name) {
    if (queue == NULL || task_name == NULL) {
        printf("Invalid parameters\n");
        return -1;
    }
    TaskNode *new_node = (TaskNode *)malloc(sizeof(TaskNode));
    if (new_node == NULL) {
        printf("Memory allocation failed\n");
        return -1;
    }
    strncpy(new_node->task_name, task_name, sizeof(new_node->task_name) - 1);
    new_node->task_name[sizeof(new_node->task_name) - 1] = '\0';
    new_node->next = NULL;
    if (queue->front == NULL) {
        queue->front = new_node;
        queue->rear = new_node;
    } else {
        queue->rear->next = new_node;
        queue->rear = new_node;
    }
    return 0;
}

int Dequeue(WorkQueue *queue, char *out_task, size_t buf_len) {
    if (queue == NULL || out_task == NULL || buf_len == 0) {
        printf("Invalid parameters\n");
        return -1;
    }
    if (queue->front == NULL) {
        printf("Queue is empty\n");
        return -1;
    }
    TaskNode *temp = queue->front;
    strncpy(out_task, temp->task_name, buf_len - 1);
    out_task[buf_len - 1] = '\0';
    queue->front = queue->front->next;
    free(temp);
    if (queue->front == NULL) {
        queue->rear = NULL;
    }
    return 0;
}

void DestroyWorkQueue(WorkQueue *queue) {
    if (queue == NULL) return;
    TaskNode *p = queue->front;
    while (p != NULL) {
        TaskNode *tmp = p;
        p = p->next;
        free(tmp);
    }
    queue->front = NULL;
    queue->rear = NULL;
}

int main() {
    WorkQueue queue;
    char task_buf[32];
    InitWorkQueue(&queue);
    Enqueue(&queue, "Kernel log inspection task");
    Enqueue(&queue, "System memory reclamation task");
    Enqueue(&queue, "Peripheral interrupt response task");
    if (Dequeue(&queue, task_buf, sizeof(task_buf)) == 0) {
        printf("Executing task: %s\n", task_buf);
    }
    if (Dequeue(&queue, task_buf, sizeof(task_buf)) == 0) {
        printf("Executing task: %s\n", task_buf);
    }
    DestroyWorkQueue(&queue);
    return 0;
}

This example is a user‑space model of a workqueue's FIFO list only: it shows task insertion, removal, and cleanup, but it has no worker thread, so nothing runs asynchronously.

3. Asynchronous Execution Mechanism

When a module needs to perform an asynchronous operation, it creates a work_struct and initializes it with INIT_WORK, which stores the task function in func. The work item is then added to a workqueue via queue_work (or queue_delayed_work, which takes a delayed_work and a delay in jiffies). Worker threads do not busy‑poll: an idle worker sleeps until queue_work wakes it, then takes pending work items off the list and invokes the stored function. Because the worker runs in process context, the function can safely sleep (e.g., msleep) while waiting for I/O.
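
The submission path above can be sketched in user space. This is a minimal model, not kernel code: the names init_work, queue_work, run_pending, and struct my_device are hypothetical stand‑ins, and a plain function call replaces the worker thread. It illustrates two details of the real API: queue_work reports false when the item is already pending, and the handler recovers its enclosing object with container_of.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-ins; the names mirror the kernel API but are not the kernel's. */
struct work;
typedef void (*work_func_t)(struct work *w);

struct work {
    work_func_t func;   /* handler to run, like work_struct.func */
    bool pending;       /* models the kernel's pending bit */
    struct work *next;  /* FIFO link */
};

struct wq {
    struct work *head, *tail;
};

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static void init_work(struct work *w, work_func_t fn)
{
    w->func = fn;
    w->pending = false;
    w->next = NULL;
}

/* Returns true if the item was queued, false if it was already pending. */
static bool queue_work(struct wq *q, struct work *w)
{
    if (w->pending)
        return false;
    w->pending = true;
    w->next = NULL;
    if (q->tail)
        q->tail->next = w;
    else
        q->head = w;
    q->tail = w;
    return true;
}

/* Stand-in for a worker thread: drain the list, return how many items ran. */
static int run_pending(struct wq *q)
{
    int ran = 0;
    while (q->head) {
        struct work *w = q->head;
        q->head = w->next;
        if (!q->head)
            q->tail = NULL;
        w->pending = false;  /* cleared before the handler runs */
        w->func(w);
        ran++;
    }
    return ran;
}

/* Hypothetical driver object that embeds its work item. */
struct my_device {
    int events_handled;
    struct work work;
};

static void my_device_work(struct work *w)
{
    struct my_device *dev = container_of(w, struct my_device, work);
    dev->events_handled++;
}
```

Embedding the work item in the object it operates on, rather than allocating it separately, is the usual reason container_of appears in work handlers.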

3.1 Event‑Driven Model

Workqueues fit an event‑driven model, although the kernel has no single event‑loop thread. Event sources such as hardware interrupt handlers, network receive paths, and timer callbacks do the minimum urgent work in their own context and then queue a work item, deferring the rest to a worker thread.

3.2 Callback Functions

When a long‑running I/O operation completes, its completion callback (often running in interrupt context, where sleeping is not allowed) can queue a follow‑up work item. Chaining work items this way forms an asynchronous pipeline in which no stage blocks the context that started it.
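
A two‑stage chain of this kind can be sketched in user space as follows. Everything here is a hypothetical model: read_fn, io_complete, parse_fn, and the single global list are illustrative names, and run_all stands in for the worker thread. The point is only that the callback queues the next stage instead of running it inline.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct work;
typedef void (*work_func_t)(struct work *w);
struct work { work_func_t func; struct work *next; };

static struct work *head, *tail;  /* one global FIFO, for brevity */
static char trace[64];            /* records the order in which stages run */

static void queue_work(struct work *w)
{
    w->next = NULL;
    if (tail)
        tail->next = w;
    else
        head = w;
    tail = w;
}

/* Stand-in for the worker: pop each item before calling it, so a
 * handler may queue more work while it runs. */
static void run_all(void)
{
    while (head) {
        struct work *w = head;
        head = w->next;
        if (!head)
            tail = NULL;
        w->func(w);
    }
}

/* Stage 2: process the data that stage 1 produced. */
static void parse_fn(struct work *w)
{
    (void)w;
    strcat(trace, "parse;");
}
static struct work parse_work = { parse_fn, NULL };

/* Completion callback: does no processing itself, only queues stage 2. */
static void io_complete(void)
{
    strcat(trace, "io_done;");
    queue_work(&parse_work);
}

/* Stage 1: pretend to perform the I/O, then signal completion. */
static void read_fn(struct work *w)
{
    (void)w;
    strcat(trace, "read;");
    io_complete();
}
static struct work read_work = { read_fn, NULL };

static const char *demo_pipeline(void)
{
    trace[0] = '\0';
    queue_work(&read_work);
    run_all();
    return trace;
}
```

Calling demo_pipeline() runs the read stage, the completion callback, and then the parse stage as a separately queued item.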

3.3 Lightweight Coroutine Adaptation

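The workqueue subsystem does not implement coroutines, and setjmp/longjmp are not used for this purpose in kernel code. The lightweight multiplexing comes from the concurrency‑managed workqueue (cmwq) design: many workqueues share a small set of worker pools, so a large number of logical tasks run on a few kernel threads. When a running work item sleeps, the pool wakes or creates another worker so the CPU stays busy, which avoids dedicating a thread to every queue.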

4. Kernel Thread Coordination

4.1 Creation and Management

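During system boot, the kernel creates worker pools for every CPU, each with its own worker threads. Threads bound to a CPU are named kworker/N:M (CPU N, worker ID M; an H suffix marks the high‑priority pool), while threads serving unbound workqueues are named kworker/uP:M (e.g., kworker/u2:0) and are not tied to one CPU. Custom workqueues are created with alloc_workqueue; create_workqueue is a legacy wrapper around it. The call allocates a workqueue_struct and links it to the worker pools through pool_workqueue objects (older kernels used a per‑CPU cpu_workqueue_struct in this role). Passing WQ_UNBOUND lets the work run on any allowed CPU rather than on the CPU that queued it.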

4.2 Execution Flow

Task submission : A driver or subsystem creates a work_struct, sets func, and calls queue_work.

Task acquisition : A worker thread takes the first item from its pool's pending list. If the list is empty, the worker does not spin; it sleeps (TASK_IDLE on recent kernels, TASK_INTERRUPTIBLE on older ones) until queue_work wakes it.

Task execution : The thread calls work->func(work). Because it runs in process context, the function may sleep while waiting for I/O.

Task completion : The pending flag is cleared just before the function is invoked, so the item can be queued again even while it is running. After the function returns, the thread loops back to check for more work.
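
The four stages can be traced with a small user‑space model. It is a sketch under simplifying assumptions: there is one worker, its sleep and wake‑up are reduced to a state field, and the names submit, worker_loop, and struct pool are hypothetical rather than the kernel's own. The wakeups counter shows that a worker is woken only when it is idle, not once per queued item.

```c
#include <assert.h>
#include <stddef.h>

struct work;
typedef void (*work_func_t)(struct work *w);
struct work { work_func_t func; struct work *next; };

enum worker_state { WORKER_IDLE, WORKER_RUNNING };

struct pool {
    struct work *head, *tail;  /* pending work list */
    enum worker_state state;   /* state of the single modelled worker */
    int wakeups;               /* how many times the worker was woken */
};

/* Submission: append the item, and wake the worker only if it is idle. */
static void submit(struct pool *p, struct work *w)
{
    w->next = NULL;
    if (p->tail)
        p->tail->next = w;
    else
        p->head = w;
    p->tail = w;
    if (p->state == WORKER_IDLE) {
        p->state = WORKER_RUNNING;
        p->wakeups++;
    }
}

/* Acquisition, execution, completion: run until the list is empty,
 * then go back to idle. Returns the number of items executed. */
static int worker_loop(struct pool *p)
{
    int executed = 0;
    while (p->head) {
        struct work *w = p->head;  /* acquisition */
        p->head = w->next;
        if (!p->head)
            p->tail = NULL;
        w->func(w);                /* execution */
        executed++;                /* completion, then loop again */
    }
    p->state = WORKER_IDLE;        /* nothing left: sleep until woken */
    return executed;
}

static int hits;

/* Hypothetical work function that only counts its invocations. */
static void count_hit(struct work *w)
{
    (void)w;
    hits++;
}
```

Queuing two items before the worker runs produces a single wake‑up, after which the worker drains both items and returns to the idle state.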

4.3 Optimization Strategies

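Thread‑pool reuse : Worker pools are shared by all workqueues and kept alive across work items, so queuing work does not create a thread. A pool adds workers only when its current ones are all blocked and retires workers that stay idle, avoiding frequent creation/destruction overhead.

Load balancing : A bound workqueue runs each item on the CPU that queued it, which preserves cache locality. Work that is long‑running or should not pin a CPU can use a WQ_UNBOUND workqueue, leaving placement to the scheduler. The max_active argument of alloc_workqueue caps how many items of one workqueue execute concurrently.

Priority scheduling : A work_struct has no priority field. Priority is chosen per workqueue: WQ_HIGHPRI routes items to the high‑priority worker pool, whose threads run at an elevated nice level. Within a pool, items run in FIFO order.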
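
The per‑workqueue priority choice can be illustrated with a user‑space sketch. The flag SKETCH_WQ_HIGHPRI and the functions pool_for and next_work_id are hypothetical, and draining the high‑priority pool first is a deliberate simplification: in the kernel the two pools have separate worker threads and the scheduler merely favors the high‑priority ones.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_WQ_HIGHPRI 0x1u  /* hypothetical flag standing in for WQ_HIGHPRI */

struct work { int id; struct work *next; };  /* id must be non-negative */
struct pool { struct work *head, *tail; };
struct wq   { unsigned int flags; };

/* Two pools, as for a single CPU: normal and high priority. */
static struct pool normal_pool, highpri_pool;

/* The workqueue's flags, not the work item, decide which pool is used. */
static struct pool *pool_for(const struct wq *q)
{
    return (q->flags & SKETCH_WQ_HIGHPRI) ? &highpri_pool : &normal_pool;
}

static void queue_work(const struct wq *q, struct work *w)
{
    struct pool *p = pool_for(q);
    w->next = NULL;
    if (p->tail)
        p->tail->next = w;
    else
        p->head = w;
    p->tail = w;
}

/* Remove the oldest item from a pool; returns its id, or -1 if empty. */
static int pop(struct pool *p)
{
    struct work *w = p->head;
    if (!w)
        return -1;
    p->head = w->next;
    if (!p->head)
        p->tail = NULL;
    return w->id;
}

/* Simplifying assumption: serve the high-priority pool before the
 * normal one. Within each pool the order stays FIFO. */
static int next_work_id(void)
{
    int id = pop(&highpri_pool);
    return id >= 0 ? id : pop(&normal_pool);
}
```

Items queued on the flagged workqueue are served ahead of earlier items on the normal one, while items within the same pool keep their submission order.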

5. Practical Walk‑through

5.1 Environment Preparation

Install a Linux distribution (e.g., Ubuntu 20.04), the build‑essential package, and matching kernel headers:

sudo apt update
sudo apt install build-essential
sudo apt install linux-headers-$(uname -r)

5.2 Writing a Kernel Module

Define a work function, create a workqueue, and schedule the work:

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/workqueue.h>

static void my_work_func(struct work_struct *work) {
    printk(KERN_INFO "This is my work function. Executing task...\n");
    // Insert real task code here (e.g., data processing, file I/O)
}

static DECLARE_WORK(my_work, my_work_func);

static struct workqueue_struct *my_wq;

static int __init my_module_init(void) {
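    // alloc_workqueue() is the current interface; create_workqueue() is a legacy wrapper around it
    my_wq = alloc_workqueue("my_wq", 0, 0);
    if (!my_wq) {
        printk(KERN_ERR "Failed to create workqueue\n");
        return -ENOMEM;
    }
    // queue_work() returns false only when the item was already pending
    if (!queue_work(my_wq, &my_work)) {
        printk(KERN_WARNING "Work item was already pending\n");
    }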
    return 0;
}

static void __exit my_module_exit(void) {
    cancel_work_sync(&my_work);
    destroy_workqueue(my_wq);
}

module_init(my_module_init);
module_exit(my_module_exit);
MODULE_LICENSE("GPL");

The module creates a dedicated workqueue, schedules a work item, and cleans up on unload.

5.3 Debugging Tips

Check the return value of queue_work: it returns true when the item was queued and false when the item was already pending, in which case it is not queued a second time.

Use printk at the start of the work function to confirm execution.

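If tasks share data, protect it with a mutex (struct mutex) or a spinlock (spinlock_t) to avoid race conditions. A mutex is fine between work functions because they may sleep; data that is also touched from an interrupt handler needs a spinlock taken with spin_lock_irqsave.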

Free any memory allocated with kmalloc using kfree to prevent leaks.

5.4 Observing Results

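Build and insert the module (this assumes the source is saved as my_module.c next to a Makefile containing obj-m += my_module.o):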

make -C /lib/modules/$(uname -r)/build M=$(pwd) modules
sudo insmod my_module.ko

Check kernel logs:

dmesg | grep "This is my work function"

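Optionally, record timestamps with ktime_get_ns() to measure execution time (getnstimeofday is deprecated and absent from newer kernels), and log the CPU handling the work with raw_smp_processor_id(), which is safe to call from preemptible context when the value is only informational.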

6. Conclusion

Linux workqueues combine event‑driven submission, callbacks, and the multiplexing of many work items onto shared worker pools to provide a flexible asynchronous execution framework. By understanding the underlying structures (work_struct, workqueue_struct, worker_pool, pool_workqueue) and the flow from task submission through worker wake‑up, execution, and completion, developers can design efficient kernel‑level background processing, avoid common pitfalls such as deadlocks or memory leaks, and apply techniques like worker‑pool reuse and unbound or high‑priority workqueues in high‑performance systems.
