
Unveiling Linux Process Creation: How Nginx Forks Workers and the Kernel Builds task_struct

This article provides a deep, step‑by‑step exploration of Linux process creation, using Nginx’s worker forking as a concrete example, and walks through the task_struct layout, process states, PID management, address‑space handling, file system structures, namespaces, and the internal logic of the fork system call.

Liangxu Linux

Nginx creates worker processes with fork

The Nginx master process starts a configurable number of workers (the worker_processes directive) by calling ngx_spawn_process in a loop inside ngx_start_worker_processes, in src/os/unix/ngx_process_cycle.c:

// file: src/os/unix/ngx_process_cycle.c
static void ngx_start_worker_processes(...){
    for (i = 0; i < n; i++) {
        ngx_spawn_process(cycle, ngx_worker_process_cycle,
            (void *)(intptr_t)i, "worker process", type);
    }
}

ngx_spawn_process (see src/os/unix/ngx_process.c) calls fork() and, in the child, runs the worker entry function:

// file: src/os/unix/ngx_process.c
ngx_pid_t ngx_spawn_process(ngx_cycle_t *cycle, ngx_spawn_proc_pt proc, ...){
    pid = fork();
    switch (pid) {
        case -1: /* error */ ...
        case 0:  /* child */
            proc(cycle, data);
            break;
        ...
    }
    ...
}
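
The same fork-in-a-loop pattern can be tried in user space. The following is a minimal sketch, not Nginx code: spawn_workers, demo_worker and MAX_WORKERS are hypothetical names, and the worker body is a placeholder for a real event loop.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_WORKERS 16  /* hypothetical cap, only for this sketch */

/* Hypothetical worker entry point; stands in for ngx_worker_process_cycle. */
typedef void (*worker_fn)(int id);

void demo_worker(int id)
{
    (void)id; /* a real worker would enter its event loop here */
}

/* Fork n workers, wait for each one, and return how many exited with status 0. */
int spawn_workers(int n, worker_fn proc)
{
    pid_t pids[MAX_WORKERS];
    int started = 0, clean = 0;

    assert(n >= 0 && n <= MAX_WORKERS);

    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == -1)          /* fork failed: stop spawning */
            break;
        if (pid == 0) {         /* child: run the worker, never return to the loop */
            proc(i);
            _exit(0);
        }
        pids[started++] = pid;  /* parent: remember the child and keep looping */
    }

    for (int i = 0; i < started; i++) {
        int status;
        if (waitpid(pids[i], &status, 0) == pids[i] &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            clean++;
    }
    return clean;
}
```

Calling spawn_workers(3, demo_worker) from a main function forks three children and returns 3 once all of them have exited cleanly. The child calls _exit rather than returning so that it cannot fall back into the parent's loop and fork again.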

Linux internal representation of a process

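Every Linux task (process or thread) is described by a struct task_struct, defined in include/linux/sched.h. The listings in this article follow an older 3.x-series kernel, and several names, such as do_fork and the state field, have changed in later releases. The most relevant fields are: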

// file: include/linux/sched.h
struct task_struct {
    volatile long state;               // process state flags
    pid_t pid;                         // thread PID
    pid_t tgid;                        // thread‑group ID
    struct task_struct __rcu *parent;  // parent pointer
    struct list_head children;         // child list
    struct list_head sibling;          // sibling list
    struct task_struct *group_leader;  // leader of thread group
    int prio, static_prio, normal_prio; // scheduling priorities
    unsigned int rt_priority;          // real‑time priority
    struct mm_struct *mm, *active_mm;  // address space
    struct fs_struct *fs;              // cwd, root
    struct files_struct *files;        // open file table
    struct nsproxy *nsproxy;           // namespaces
    ...
};

Process state

The state field holds the values below, defined in include/linux/sched.h. Apart from TASK_RUNNING, which is 0, each value is a distinct bit, so states can be combined and tested with bitwise operations:

#define TASK_RUNNING          0
#define TASK_INTERRUPTIBLE    1
#define TASK_UNINTERRUPTIBLE  2
#define __TASK_STOPPED        4
#define __TASK_TRACED         8
/* ... */
#define TASK_DEAD            64
#define TASK_WAKEKILL        128
#define TASK_WAKING          256
#define TASK_PARKED          512
#define TASK_STATE_MAX      1024
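
Because the states are bit flags, the kernel builds composite states by OR-ing them and tests them with a mask. The sketch below copies a few values from the listing and adds the kernel's TASK_KILLABLE and TASK_NORMAL composites; state_matches and is_runnable are hypothetical helper names used only for illustration.

```c
#include <assert.h>

/* Values copied from the listing above. */
#define TASK_RUNNING          0
#define TASK_INTERRUPTIBLE    1
#define TASK_UNINTERRUPTIBLE  2
#define TASK_WAKEKILL         128

/* Composite states: ORs of the single-bit flags. */
#define TASK_KILLABLE  (TASK_WAKEKILL | TASK_UNINTERRUPTIBLE)
#define TASK_NORMAL    (TASK_INTERRUPTIBLE | TASK_UNINTERRUPTIBLE)

/* Nonzero if a task in `state` is covered by a wakeup aimed at `mask`. */
int state_matches(long state, long mask)
{
    assert(mask != 0);
    return (state & mask) != 0;
}

/* TASK_RUNNING is 0, so it has to be compared, not tested as a bit. */
int is_runnable(long state)
{
    return state == TASK_RUNNING;
}
```

With these definitions, a TASK_KILLABLE sleeper (value 130) matches a TASK_NORMAL mask because both contain the TASK_UNINTERRUPTIBLE bit, while a TASK_INTERRUPTIBLE sleeper does not match a TASK_UNINTERRUPTIBLE mask.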

PID and thread group

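Each task has a unique pid. Threads of one process share a tgid, which equals the pid of the thread-group leader, so for a single-threaded process pid and tgid are identical. PID numbers are allocated from a bitmap kept in the PID namespace, which costs only one bit per possible PID.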
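
From user space the two fields are visible through different calls: getpid() reports the tgid, while the gettid system call reports the per-task pid. A minimal sketch, assuming Linux and a single-threaded caller; task_pid and task_tgid are hypothetical wrapper names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Per-task ID of the calling thread (task_struct.pid). */
pid_t task_pid(void)
{
    return (pid_t)syscall(SYS_gettid);
}

/* Thread-group ID (task_struct.tgid); this is what getpid() returns. */
pid_t task_tgid(void)
{
    return getpid();
}
```

In the main thread of a process both functions return the same number. In any additional thread, task_pid() would differ while task_tgid() stays the same.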

Process tree

The parent, children and sibling pointers form a tree rooted at the init process. Tools like pstree visualize this hierarchy.
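
The parent link is also observable from user space through getppid(). A small sketch, assuming the parent stays alive until the child has checked; child_sees_parent is a hypothetical name.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if a forked child reports the caller as its parent, 0 if not, -1 on error. */
int child_sees_parent(void)
{
    pid_t parent = getpid();
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0)                              /* child: compare and report via exit code */
        _exit(getppid() == parent ? 0 : 1);

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0;
}
```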

Scheduling priority

static_prio : static priority (100‑139), derived from the nice value (‑20 to 19)

rt_priority : real‑time priority (1‑99 for real‑time policies, 0 otherwise)

prio : dynamic priority actually used by the scheduler

normal_prio : derived from the static priority and the scheduling policy
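
The relation between nice and static_prio is a fixed offset: a nice value of 0 corresponds to priority 120. A minimal sketch of that mapping; the constants mirror the kernel's MAX_RT_PRIO and DEFAULT_PRIO, while the two function names are hypothetical.

```c
#include <assert.h>

#define MAX_RT_PRIO   100                 /* priorities 0..99 are reserved for real-time tasks */
#define DEFAULT_PRIO  (MAX_RT_PRIO + 20)  /* static_prio of a nice-0 task: 120 */

/* Map a nice value (-20..19) to a static priority (100..139). */
int nice_to_static_prio(int nice)
{
    assert(nice >= -20 && nice <= 19);
    return DEFAULT_PRIO + nice;
}

/* Inverse mapping: static priority (100..139) back to nice (-20..19). */
int static_prio_to_nice(int static_prio)
{
    assert(static_prio >= MAX_RT_PRIO && static_prio <= MAX_RT_PRIO + 39);
    return static_prio - DEFAULT_PRIO;
}
```

A lower number means a higher priority, so nice -20 maps to 100 and nice 19 maps to 139.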

Address space (mm_struct)

The mm_struct (defined in include/linux/mm_types.h) describes a user process’s virtual memory layout:

// file: include/linux/mm_types.h
struct mm_struct {
    struct vm_area_struct *mmap;   // list of VMAs
    struct rb_root mm_rb;
    unsigned long mmap_base;
    unsigned long task_size;
    unsigned long start_code, end_code;
    unsigned long start_data, end_data;
    unsigned long start_brk, brk, start_stack;
    unsigned long arg_start, arg_end, env_start, env_end;
    ...
};
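
The brk field marks the current end of the heap, and a process can observe and move it with sbrk(). The following is only an illustrative sketch, assuming Linux with glibc and that nothing else moves the break in between; grow_heap is a hypothetical name.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Grow the heap by `bytes`, report how far the program break moved,
 * then shrink it back. Returns -1 if the break could not be moved. */
long grow_heap(long bytes)
{
    assert(bytes >= 0);

    char *before = sbrk(0);                 /* current break (mm->brk) */
    if (before == (char *)-1)
        return -1;
    if (sbrk((intptr_t)bytes) == (void *)-1)
        return -1;

    char *after = sbrk(0);                  /* break after growing */
    long moved = (long)(after - before);

    sbrk(-(intptr_t)bytes);                 /* restore the original break */
    return moved;
}
```

Real programs normally leave the break to malloc; the sketch calls sbrk directly only to make the field visible.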

Filesystem information (fs_struct)

Current working directory and root are stored in fs_struct (see include/linux/fs_struct.h).

// file: include/linux/fs_struct.h
struct fs_struct {
    struct path root, pwd; // each path contains a vfsmount and dentry
};

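Open file table (files_struct)

Each task points to a files_struct whose descriptor table is an array of struct file * pointers; a file descriptor is simply an index into that array. On fork the kernel gives the child its own copy of the table, unless the CLONE_FILES flag is set, in which case parent and child share one table.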

// file: include/linux/fdtable.h
struct files_struct {
    int next_fd;               // next free descriptor
    struct fdtable __rcu *fdt; // pointer to the descriptor table
    ...
};

Namespaces (nsproxy)

Namespaces isolate resources such as PID numbers, mount points, and network stacks. The nsproxy pointer in task_struct links a task to its set of namespaces.

// file: include/linux/nsproxy.h
struct nsproxy {
    struct uts_namespace   *uts_ns;
    struct ipc_namespace   *ipc_ns;
    struct mnt_namespace   *mnt_ns;
    struct pid_namespace   *pid_ns;
    struct net             *net_ns;
    atomic_t count;
};

Decoding the fork system call

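The fork system call is defined in kernel/fork.c. It is a thin wrapper that calls do_fork with SIGCHLD as the only flag, so no resource is shared and the parent receives SIGCHLD when the child exits: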

// file: kernel/fork.c
SYSCALL_DEFINE0(fork) {
    return do_fork(SIGCHLD, 0, 0, NULL, NULL);
}

do_fork creates a new task_struct by invoking copy_process, then wakes the child and puts it on the run queue.

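// file: kernel/fork.c
long do_fork(unsigned long clone_flags, unsigned long stack_start,
            unsigned long stack_size, int __user *parent_tidptr,
            int __user *child_tidptr) {
    struct task_struct *p;
    int trace = 0;                 // ptrace event to report, if any
    ...
    p = copy_process(clone_flags, stack_start, stack_size,
                     child_tidptr, NULL, trace);
    if (IS_ERR(p))
        return PTR_ERR(p);         // copy_process reports failure via ERR_PTR
    wake_up_new_task(p);           // make the child runnable
    ...
}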
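
Since fork is just do_fork with SIGCHLD as its flags, the same effect can be obtained from user space with the raw clone system call. This is a sketch under assumptions: it targets x86-64 or arm64 Linux, where SYS_clone exists and all-zero stack and tid arguments are valid, and fork_via_clone is a hypothetical name.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Create a child with clone(SIGCHLD), the flags fork() passes to do_fork.
 * The child exits with exit_code; returns that status, or -1 on error. */
int fork_via_clone(int exit_code)
{
    /* Only the flags argument is nonzero, so argument order differences
     * between architectures do not matter here. */
    long pid = syscall(SYS_clone, (unsigned long)SIGCHLD, 0UL, 0UL, 0UL, 0UL);

    if (pid == -1)
        return -1;
    if (pid == 0)               /* child: same return-value convention as fork */
        _exit(exit_code);

    int status;
    if (waitpid((pid_t)pid, &status, 0) != (pid_t)pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The child does nothing except call _exit, because a raw clone bypasses whatever bookkeeping the C library performs around fork.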

copy_process – the heart of process creation

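copy_process builds the child task step by step. With the flags that fork passes, each resource is duplicated; when the matching CLONE_* flag is set, the resource is shared with the parent instead: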

Allocate a fresh task_struct via dup_task_struct.

Copy the parent’s files_struct (unless CLONE_FILES).

Copy the parent’s fs_struct (unless CLONE_FS).

Duplicate the address space (mm_struct) unless CLONE_VM.

Share the parent’s namespaces (nsproxy) unless a CLONE_NEW* flag requests new ones.

Allocate a new PID with alloc_pid and store it in pid / tgid.

After copy_process returns, do_fork places the new task on the run queue with wake_up_new_task.

Duplicate task_struct

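// file: kernel/fork.c
static struct task_struct *dup_task_struct(struct task_struct *orig) {
    int node = tsk_fork_get_node(orig);      // preferred NUMA node
    struct task_struct *tsk = alloc_task_struct_node(node);

    if (!tsk)
        return NULL;
    arch_dup_task_struct(tsk, orig); // shallow copy: pointers still refer to the parent's resources
    ...                              // kernel stack setup omitted
    return tsk;
}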

Copy files_struct

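// file: kernel/fork.c
static int copy_files(unsigned long clone_flags, struct task_struct *tsk) {
    struct files_struct *oldf = current->files, *newf;
    int error = 0;

    if (!oldf)                        // no file table: nothing to copy
        return 0;
    if (clone_flags & CLONE_FILES) {
        atomic_inc(&oldf->count);     // share: just take another reference
        tsk->files = oldf;
        return 0;
    }
    newf = dup_fd(oldf, &error);      // fork: copy the descriptor table
    if (!newf)
        return error;
    tsk->files = newf;
    return 0;
}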
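
Copying the descriptor table does not copy the open files themselves: parent and child get separate tables whose entries point to the same struct file objects. The sketch below probes both effects from user space, assuming a writable temporary directory for tmpfile(); fd_table_copied_file_shared is a hypothetical name.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if, after fork, (a) the child closing a descriptor leaves the
 * parent's descriptor open and (b) the file offset is shared; -1 on error. */
int fd_table_copied_file_shared(void)
{
    FILE *tmp = tmpfile();
    if (!tmp)
        return -1;
    int fd = fileno(tmp);

    pid_t pid = fork();
    if (pid == -1) {
        fclose(tmp);
        return -1;
    }
    if (pid == 0) {                               /* child */
        off_t pos = lseek(fd, 5, SEEK_SET);       /* moves the shared offset */
        close(fd);                                /* affects only the child's table */
        _exit(pos == 5 ? 0 : 1);
    }

    int status;
    int child_ok = waitpid(pid, &status, 0) == pid &&
                   WIFEXITED(status) && WEXITSTATUS(status) == 0;
    int still_open = fcntl(fd, F_GETFD) != -1;        /* parent's entry survived */
    int shared_offset = lseek(fd, 0, SEEK_CUR) == 5;  /* offset set by the child */

    fclose(tmp);
    if (!child_ok)
        return -1;
    return still_open && shared_offset;
}
```

This is why a worker forked by a master process can keep using listening sockets the master opened: the descriptors were copied, and they refer to the same underlying open files.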

Copy fs_struct

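// file: kernel/fork.c
static int copy_fs(unsigned long clone_flags, struct task_struct *tsk) {
    struct fs_struct *fs = current->fs;

    if (clone_flags & CLONE_FS) {
        spin_lock(&fs->lock);
        fs->users++;                  // users is a plain int guarded by fs->lock
        spin_unlock(&fs->lock);
        return 0;                     // tsk->fs already points at the shared struct
    }
    tsk->fs = copy_fs_struct(fs);     // fork: private copy of root and pwd
    if (!tsk->fs)
        return -ENOMEM;
    return 0;
}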
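
Because fork gives the child its own fs_struct, a chdir in the child leaves the parent's working directory untouched. A minimal user-space sketch of that behavior; child_chdir_is_private is a hypothetical name and the 4096-byte buffers are an assumption about path length.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the parent's cwd is unchanged after the child calls chdir("/"),
 * 0 if it changed, -1 on error. */
int child_chdir_is_private(void)
{
    char before[4096], after[4096];

    if (!getcwd(before, sizeof before))
        return -1;

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)                            /* child: change only its own pwd */
        _exit(chdir("/") == 0 ? 0 : 1);

    int status;
    if (waitpid(pid, &status, 0) != pid ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    if (!getcwd(after, sizeof after))
        return -1;
    return strcmp(before, after) == 0;
}
```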

Copy mm_struct

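// file: kernel/fork.c
static int copy_mm(unsigned long clone_flags, struct task_struct *tsk) {
    struct mm_struct *oldmm = current->mm, *mm;

    if (!oldmm)                        // kernel thread: no user address space
        return 0;
    if (clone_flags & CLONE_VM) {
        atomic_inc(&oldmm->mm_users);  // threads share the parent's mm
        mm = oldmm;
    } else {
        mm = dup_mm(tsk);              // fork: new mm, pages shared copy-on-write
        if (!mm)
            return -ENOMEM;
    }
    tsk->mm = mm;
    tsk->active_mm = mm;
    return 0;
}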
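
The visible consequence of a duplicated address space is that writes in the child never reach the parent. The following sketch checks this from user space; child_write_is_private is a hypothetical name.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if a write made by the child is invisible to the parent,
 * 0 otherwise, -1 on error. */
int child_write_is_private(void)
{
    volatile int value = 1;

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        value = 2;          /* lands in the child's private copy of the page */
        _exit(value);       /* report what the child sees */
    }

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 2 && value == 1;
}
```

A task created with CLONE_VM, such as a thread, would share the address space instead, and the parent would then observe the write.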

Copy namespaces

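// file: kernel/nsproxy.c
int copy_namespaces(unsigned long flags, struct task_struct *tsk) {
    struct nsproxy *old_ns = tsk->nsproxy;  // inherited through the shallow struct copy

    get_nsproxy(old_ns);                    // take a reference on the parent's set
    if (!(flags & (CLONE_NEWNS | CLONE_NEWUTS | CLONE_NEWIPC |
                   CLONE_NEWPID | CLONE_NEWNET)))
        return 0;                           // plain fork: share with parent
    // a CLONE_NEW* flag was passed: allocate new namespace structures (omitted for brevity)
    ...
}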

Allocate PID

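// file: kernel/pid.c
struct pid *alloc_pid(struct pid_namespace *ns) {
    struct pid *pid = kmem_cache_alloc(ns->pid_cachep, GFP_KERNEL);
    struct pid_namespace *tmp = ns;
    int i, nr;

    if (!pid)
        return NULL;
    pid->level = ns->level;
    for (i = ns->level; i >= 0; i--) {
        nr = alloc_pidmap(tmp);        // one number per namespace level
        if (nr < 0)
            goto out_free;
        pid->numbers[i].nr = nr;
        pid->numbers[i].ns = tmp;
        tmp = tmp->parent;             // walk up toward the initial namespace
    }
    ...
    return pid;
out_free:
    ...                                // release numbers already taken, free pid
    return NULL;
}

The kernel records used PIDs in a bitmap inside each pid_namespace, so a single bit represents the occupancy of one PID number. alloc_pidmap searches for the next clear bit after the most recently allocated PID and wraps around when it reaches the maximum.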
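
The bitmap idea can be sketched in a few lines of portable C. This is an illustration of the technique, not the kernel's implementation: struct pidmap, pidmap_alloc, pidmap_free and the tiny PID_MAX of 64 are all hypothetical, and the kernel additionally handles locking, page-sized maps and a reserved low range.

```c
#include <assert.h>
#include <limits.h>

#define PID_MAX        64   /* hypothetical tiny limit for the sketch */
#define BITS_PER_WORD  (sizeof(unsigned long) * CHAR_BIT)

struct pidmap {
    unsigned long bits[(PID_MAX + BITS_PER_WORD - 1) / BITS_PER_WORD];
    int last_pid;           /* most recently allocated PID */
};

/* Set bit nr and return whether it was already set. */
static int test_and_set(struct pidmap *m, int nr)
{
    unsigned long mask = 1UL << (nr % BITS_PER_WORD);
    unsigned long *word = &m->bits[nr / BITS_PER_WORD];
    int was_set = (*word & mask) != 0;

    *word |= mask;
    return was_set;
}

/* Scan upward from last_pid + 1, wrapping once; return the PID, or -1 if full. */
int pidmap_alloc(struct pidmap *m)
{
    for (int i = 1; i <= PID_MAX; i++) {
        int nr = (m->last_pid + i) % PID_MAX;
        if (nr == 0)        /* PID 0 is never handed out */
            continue;
        if (!test_and_set(m, nr)) {
            m->last_pid = nr;
            return nr;
        }
    }
    return -1;
}

/* Clear the bit so the number can be reused after the scan wraps around. */
void pidmap_free(struct pidmap *m, int nr)
{
    assert(nr > 0 && nr < PID_MAX);
    m->bits[nr / BITS_PER_WORD] &= ~(1UL << (nr % BITS_PER_WORD));
}
```

Starting the scan after the last allocated PID, rather than at the lowest free bit, delays reuse of a number that was just released, which is why freshly freed PIDs are not handed out again immediately.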

Putting the new task on the run queue

// file: kernel/fork.c
wake_up_new_task(p);

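wake_up_new_task makes the child runnable. When the scheduler first selects it, the child does not start at main: it resumes at the same point as the parent, returning from fork() with a return value of 0. In Nginx this means the case 0 branch of ngx_spawn_process runs and calls the worker entry function, proc(cycle, data).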

Summary

This walkthrough shows how a high‑level operation such as Nginx’s worker creation maps to low‑level kernel actions: the fork system call invokes do_fork, whose copy_process step builds a new task_struct, duplicates or shares the parent’s resources (open files, filesystem info, address space, namespaces), and allocates a PID from a per‑namespace bitmap; do_fork then enqueues the child for scheduling. Following each step clarifies the purpose of the corresponding fields in task_struct.

Written by

Liangxu Linux

Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge—fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc. (Reply “Linux” to receive essential resources.)
