Unveiling Linux Process Creation: How Nginx Forks Workers and the Kernel Builds task_struct
This article provides a deep, step‑by‑step exploration of Linux process creation, using Nginx’s worker forking as a concrete example, and walks through the task_struct layout, process states, PID management, address‑space handling, file system structures, namespaces, and the internal logic of the fork system call.
Nginx creates worker processes with fork
The Nginx master process spawns a configurable number of workers by calling ngx_spawn_process in a loop. The loop lives in ngx_start_worker_processes in src/os/unix/ngx_process_cycle.c:
// file: src/os/unix/ngx_process_cycle.c
static void ngx_start_worker_processes(...){
    for (i = 0; i < n; i++) {
        ngx_spawn_process(cycle, ngx_worker_process_cycle,
                          (void *)(intptr_t)i, "worker process", type);
    }
}
ngx_spawn_process (see src/os/unix/ngx_process.c) calls fork() and, in the child, runs the worker entry function:
// file: src/os/unix/ngx_process.c
ngx_pid_t ngx_spawn_process(ngx_cycle_t *cycle, ngx_spawn_proc_pt proc, ...){
    pid = fork();
    switch (pid) {
    case -1: /* error */ ...
    case 0: /* child */
        proc(cycle, data);
        break;
    ...
    }
    ...
}
Linux internal representation of a process
Every Linux task is described by struct task_struct defined in include/linux/sched.h. The most relevant fields are:
// file: include/linux/sched.h
struct task_struct {
    volatile long state; // process state flags
    pid_t pid; // thread PID
    pid_t tgid; // thread‑group ID
    struct task_struct __rcu *parent; // parent pointer
    struct list_head children; // child list
    struct list_head sibling; // sibling list
    struct task_struct *group_leader; // leader of thread group
    int prio, static_prio, normal_prio; // scheduling priorities
    unsigned int rt_priority; // real‑time priority
    struct mm_struct *mm, *active_mm; // address space
    struct fs_struct *fs; // cwd, root
    struct files_struct *files; // open file table
    struct nsproxy *nsproxy; // namespaces
    ...
};
Process state
The state field holds flag values such as TASK_RUNNING, TASK_INTERRUPTIBLE, TASK_UNINTERRUPTIBLE, etc., defined in include/linux/sched.h:
#define TASK_RUNNING 0
#define TASK_INTERRUPTIBLE 1
#define TASK_UNINTERRUPTIBLE 2
#define __TASK_STOPPED 4
#define __TASK_TRACED 8
/* ... */
#define TASK_DEAD 64
#define TASK_WAKEKILL 128
#define TASK_WAKING 256
#define TASK_PARKED 512
#define TASK_STATE_MAX 1024
PID and thread group
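Each task has a unique pid, and all threads of one process share a tgid, which is the value getpid() returns to user space. For a single-threaded process pid and tgid are identical. In the kernel version shown here, PIDs are allocated from a bitmap inside the PID namespace, so tracking each possible PID costs a single bit.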
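The pid/tgid distinction can be observed from user space. The sketch below assumes Linux and uses the raw gettid system call, which reports the kernel's per-task pid, next to getpid(), which reports the tgid. The helper names kernel_pid, kernel_tgid and is_group_leader are made up for this illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Value of task_struct.pid for the calling thread (Linux-specific syscall). */
static pid_t kernel_pid(void) {
    return (pid_t)syscall(SYS_gettid);
}

/* Value of task_struct.tgid; this is what getpid() reports. */
static pid_t kernel_tgid(void) {
    return getpid();
}

/* True when the caller is the thread-group leader, i.e. pid == tgid. */
static int is_group_leader(void) {
    pid_t pid = kernel_pid();
    assert(pid > 0);
    return pid == kernel_tgid();
}
```

Called from the main thread of a program, is_group_leader() returns 1. Called from any additional thread it returns 0, because that thread has its own pid but keeps the tgid of the process.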
Process tree
The parent, children and sibling pointers form a tree rooted at the init process. Tools like pstree visualize this hierarchy.
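One edge of that tree can be checked from user space: after fork(), the child's getppid() should report the PID of the process that created it. The following is a minimal sketch; child_reports_parent is a hypothetical helper name, and the child passes its verdict back through its exit status.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork once and check that the child's getppid() equals our own PID.
 * Returns 1 if it matches, 0 if it does not, -1 if fork or wait failed. */
static int child_reports_parent(void) {
    pid_t self = getpid();
    assert(self > 0);

    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0)                       /* child: compare and exit */
        _exit(getppid() == self ? 0 : 1);

    int status;
    if (waitpid(child, &status, 0) != child || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

The parent waits for the child before returning, so the child cannot be re-parented while it performs the comparison.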
Scheduling priority
static_prio : static priority (range 100‑139, set via nice)
rt_priority : real‑time priority (0‑99)
prio : dynamic priority used by the scheduler
normal_prio : derived from static priority and scheduling policy
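As a worked example of these ranges, the sketch below reproduces the arithmetic that maps a nice value to static_prio and a real-time priority to the kernel's internal priority scale, where lower numbers mean higher priority. The constants mirror the kernel's MAX_RT_PRIO and DEFAULT_PRIO values; the two function names are hypothetical.

```c
#include <assert.h>

#define MAX_RT_PRIO  100   /* internal priorities below this are real-time */
#define DEFAULT_PRIO 120   /* static_prio of a task with nice 0 */

/* nice -20..19 maps onto static_prio 100..139. */
static int nice_to_static_prio(int nice) {
    assert(nice >= -20 && nice <= 19);
    return DEFAULT_PRIO + nice;
}

/* A higher rt_priority yields a lower (stronger) internal priority. */
static int rt_to_prio(unsigned int rt_priority) {
    assert(rt_priority <= 99);
    return MAX_RT_PRIO - 1 - (int)rt_priority;
}
```

For instance, nice 0 gives static_prio 120, nice 19 gives 139, and rt_priority 99 lands at internal priority 0.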
Address space (mm_struct)
The mm_struct (defined in include/linux/mm_types.h) describes a user process’s virtual memory layout:
// file: include/linux/mm_types.h
struct mm_struct {
    struct vm_area_struct *mmap; // list of VMAs
    struct rb_root mm_rb;
    unsigned long mmap_base;
    unsigned long task_size;
    unsigned long start_code, end_code;
    unsigned long start_data, end_data;
    unsigned long start_brk, brk, start_stack;
    unsigned long arg_start, arg_end, env_start, env_end;
    ...
};
Filesystem information (fs_struct)
Current working directory and root are stored in fs_struct (see include/linux/fs_struct.h).
// file: include/linux/fs_struct.h
struct fs_struct {
    struct path root, pwd; // each path contains a vfsmount and dentry
};
Open file table (files_struct)
Each task owns a files_struct that holds an array of struct file * pointers indexed by file descriptor number. The kernel allocates a fresh copy of this table for a new process unless the CLONE_FILES flag is set, in which case parent and child share one table.
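The effect of a private descriptor table is easy to observe: closing a descriptor in a forked child does not close it in the parent. The sketch below uses a pipe as a convenient pair of descriptors; fd_table_is_private_after_fork is a hypothetical helper name.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if descriptors closed in a forked child are still open in
 * the parent, 0 if they are not, -1 on error. */
static int fd_table_is_private_after_fork(void) {
    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    assert(fds[0] >= 0 && fds[1] >= 0);

    pid_t child = fork();
    if (child < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (child == 0) {                     /* affects only the child's table */
        close(fds[0]);
        close(fds[1]);
        _exit(0);
    }

    int status;
    int reaped = waitpid(child, &status, 0) == child;
    int still_open = fcntl(fds[0], F_GETFD) != -1 &&
                     fcntl(fds[1], F_GETFD) != -1;
    close(fds[0]);
    close(fds[1]);
    return reaped ? still_open : -1;
}
```

Only the table is copied. Each copied entry still refers to the same open file description as the parent's entry, so file offsets and status flags remain shared between the two processes.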
// file: include/linux/fdtable.h
struct files_struct {
    int next_fd; // next free descriptor
    struct fdtable __rcu *fdt; // pointer to the descriptor table
    ...
};
Namespaces (nsproxy)
Namespaces isolate resources such as PID, mount points, and network stacks. The nsproxy pointer in task_struct links a task to its namespace set.
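On systems that expose /proc/self/ns, a process can inspect which namespaces it belongs to: each entry is a symbolic link whose target names the namespace type and an identifier. The sketch below assumes procfs is mounted and compares the link seen by a parent with the one seen by its forked child. Both helper names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Read the target of /proc/self/ns/<name> into buf. Returns 0 on success. */
static int ns_identity(const char *name, char *buf, size_t len) {
    char path[64];
    assert(len > 1);
    snprintf(path, sizeof path, "/proc/self/ns/%s", name);
    ssize_t n = readlink(path, buf, len - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return 0;
}

/* Returns 1 if a forked child is in the same <name> namespace as the
 * caller, 0 if not, -1 if the namespace links could not be read. */
static int child_shares_namespace(const char *name) {
    char mine[64], theirs[64];
    if (ns_identity(name, mine, sizeof mine) != 0)
        return -1;

    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {                     /* /proc/self now means the child */
        if (ns_identity(name, theirs, sizeof theirs) != 0)
            _exit(2);
        _exit(strcmp(mine, theirs) == 0 ? 0 : 1);
    }

    int status;
    if (waitpid(child, &status, 0) != child || !WIFEXITED(status))
        return -1;
    if (WEXITSTATUS(status) == 2)
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

For a plain fork(), child_shares_namespace("pid") is expected to return 1, since no new-namespace flag is passed and the child keeps the parent's namespace set.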
// file: include/linux/nsproxy.h
struct nsproxy {
    struct uts_namespace *uts_ns;
    struct ipc_namespace *ipc_ns;
    struct mnt_namespace *mnt_ns;
    struct pid_namespace *pid_ns;
    struct net *net_ns;
    atomic_t count;
};
Decoding the fork system call
The user‑visible fork() entry is defined in kernel/fork.c:
// file: kernel/fork.c
SYSCALL_DEFINE0(fork) {
    return do_fork(SIGCHLD, 0, 0, NULL, NULL);
}
do_fork creates a new task_struct by invoking copy_process, then wakes the child and places it on a run queue.
// file: kernel/fork.c
long do_fork(unsigned long clone_flags, unsigned long stack_start,
             unsigned long stack_size, int __user *parent_tidptr,
             int __user *child_tidptr) {
    struct task_struct *p;
    int trace = 0; // ptrace event to report, if any
    p = copy_process(clone_flags, stack_start, stack_size,
                     child_tidptr, NULL, trace);
    wake_up_new_task(p); // make the child runnable
    ...
}
copy_process – the heart of process creation
copy_process builds the child step by step. Depending on the clone flags, each resource is either duplicated or shared with the parent:
Allocate a fresh task_struct via dup_task_struct.
Copy the parent’s files_struct (unless CLONE_FILES).
Copy the parent’s fs_struct (unless CLONE_FS).
Duplicate the address space (mm_struct) unless CLONE_VM.
Share the parent’s namespaces (nsproxy) unless a CLONE_NEW* flag requests new ones.
Allocate a new PID with alloc_pid and store its number in pid (and in tgid when a new process, rather than a thread, is created).
Back in do_fork, place the new task on the run queue with wake_up_new_task.
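The share-or-copy decisions in the steps above depend only on bits in clone_flags. The sketch below models that decision in user space with the real CLONE_* constants from <sched.h>; struct share_plan and plan_for are hypothetical names and not kernel code.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdbool.h>

/* The low byte of clone_flags carries the exit signal, so a CLONE_* bit
 * must never fall inside it. */
static_assert((CLONE_VM & 0xff) == 0, "CLONE_VM overlaps the signal byte");

/* true = the child shares the parent's structure, false = it gets a copy. */
struct share_plan {
    bool files;   /* files_struct */
    bool fs;      /* fs_struct    */
    bool vm;      /* mm_struct    */
};

static struct share_plan plan_for(unsigned long clone_flags) {
    struct share_plan p = {
        .files = (clone_flags & CLONE_FILES) != 0,
        .fs    = (clone_flags & CLONE_FS) != 0,
        .vm    = (clone_flags & CLONE_VM) != 0,
    };
    return p;
}
```

fork() passes only SIGCHLD, so plan_for(SIGCHLD) reports a private copy of everything. A thread-style clone that sets CLONE_VM | CLONE_FS | CLONE_FILES reports all three as shared.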
Duplicate task_struct
// file: kernel/fork.c
static struct task_struct *dup_task_struct(struct task_struct *orig) {
    int node = tsk_fork_get_node(orig); // preferred NUMA node
    struct task_struct *tsk = alloc_task_struct_node(node);
    if (!tsk)
        return NULL;
    arch_dup_task_struct(tsk, orig); // shallow copy of the struct
    return tsk;
}
Copy files_struct
// file: kernel/fork.c
static int copy_files(unsigned long clone_flags, struct task_struct *tsk) {
    struct files_struct *oldf = current->files, *newf;
    int error = 0;
    if (clone_flags & CLONE_FILES) {
        atomic_inc(&oldf->count); // share the parent's table
        tsk->files = oldf;
        return 0;
    }
    newf = dup_fd(oldf, &error); // private copy of the descriptor table
    if (!newf)
        return error;
    tsk->files = newf;
    return 0;
}
Copy fs_struct
// file: kernel/fork.c
static int copy_fs(unsigned long clone_flags, struct task_struct *tsk) {
    struct fs_struct *fs = current->fs;
    if (clone_flags & CLONE_FS) {
        fs->users++; // share; the kernel does this while holding fs->lock
        tsk->fs = fs;
        return 0;
    }
    tsk->fs = copy_fs_struct(fs); // private copy of root and pwd
    if (!tsk->fs)
        return -ENOMEM;
    return 0;
}
Copy mm_struct
// file: kernel/fork.c
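static int copy_mm(unsigned long clone_flags, struct task_struct *tsk) {
    struct mm_struct *oldmm = current->mm, *mm;
    if (!oldmm) // kernel threads have no user address space
        return 0;
    if (clone_flags & CLONE_VM) {
        atomic_inc(&oldmm->mm_users); // share the parent's address space
        mm = oldmm;
    } else {
        mm = dup_mm(tsk); // copy it; pages are shared copy-on-write
        if (!mm)
            return -ENOMEM;
    }
    tsk->mm = mm;
    tsk->active_mm = mm;
    return 0;
}
Copy namespaces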
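// file: kernel/nsproxy.c
int copy_namespaces(unsigned long clone_flags, struct task_struct *tsk) {
    struct nsproxy *old_ns = tsk->nsproxy; // inherited via the shallow copy
    if (!(clone_flags & (CLONE_NEWNS | CLONE_NEWUTS | CLONE_NEWIPC |
                         CLONE_NEWPID | CLONE_NEWNET))) {
        get_nsproxy(old_ns); // share with parent and take a reference
        return 0;
    }
    // allocate new namespace structures (omitted for brevity)
    return 0;
}
Allocate PID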
// file: kernel/pid.c
struct pid *alloc_pid(struct pid_namespace *ns) {
    struct pid *pid = kmem_cache_alloc(ns->pid_cachep, GFP_KERNEL);
    struct pid_namespace *tmp = ns;
    int i, nr;
    if (!pid)
        return NULL;
    pid->level = ns->level;
    for (i = ns->level; i >= 0; i--) { // one number per enclosing namespace
        nr = alloc_pidmap(tmp);
        pid->numbers[i].nr = nr;
        pid->numbers[i].ns = tmp;
        tmp = tmp->parent;
    }
    return pid;
}
The kernel stores used PIDs in a bitmap inside each pid_namespace, so a single bit records whether a PID number is in use. With the default limit of 32,768 PIDs, the whole map fits in one 4 KiB page.
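To make the bitmap idea concrete, here is a small user-space model of such an allocator. It is a sketch of the technique only: the demo_* names and the 1024-entry limit are invented for the example, and the kernel's locking, page-sized map chunks and reserved low PID range are left out.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PID_MAX  1024   /* hypothetical limit, far below the kernel's */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

struct demo_pidmap {
    unsigned long bits[DEMO_PID_MAX / BITS_PER_WORD]; /* 1 bit per PID */
    int last_pid;                                     /* last PID handed out */
};

static void demo_pidmap_init(struct demo_pidmap *m) {
    memset(m, 0, sizeof *m);
    m->bits[0] = 1UL;        /* PID 0 is never handed out */
}

/* Set bit nr and report whether it was already set. */
static int demo_test_and_set(struct demo_pidmap *m, int nr) {
    unsigned long mask = 1UL << ((size_t)nr % BITS_PER_WORD);
    unsigned long *word = &m->bits[(size_t)nr / BITS_PER_WORD];
    int was_set = (*word & mask) != 0;
    *word |= mask;
    return was_set;
}

/* Scan forward from last_pid + 1, wrapping around once.
 * Returns the allocated PID, or -1 when every PID is in use. */
static int demo_alloc_pid(struct demo_pidmap *m) {
    for (int tried = 0; tried < DEMO_PID_MAX; tried++) {
        int nr = (m->last_pid + 1 + tried) % DEMO_PID_MAX;
        if (!demo_test_and_set(m, nr)) {
            m->last_pid = nr;
            return nr;
        }
    }
    return -1;
}

static void demo_free_pid(struct demo_pidmap *m, int nr) {
    assert(nr > 0 && nr < DEMO_PID_MAX);
    m->bits[(size_t)nr / BITS_PER_WORD] &= ~(1UL << ((size_t)nr % BITS_PER_WORD));
}
```

Because the search resumes after the last PID handed out, a freed number is not reused right away: after allocating 1 and 2 and freeing 1, the next allocation in this model returns 3.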
Putting the new task on the run queue
// file: kernel/fork.c
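wake_up_new_task(p);
When the scheduler first selects the child, it does not start at a separate entry point. It returns from fork() at the same place as the parent, but with a return value of 0. In Nginx that selects the case 0 branch of ngx_spawn_process, which then calls the worker entry function.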
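The same pattern can be reproduced in a few lines of user-space C. The sketch below is a simplified stand-in for the Nginx code, not a copy of it: spawn_proc_pt, spawn_process, demo_worker and run_workers are hypothetical names, and each worker simply exits with its own index so the parent can verify which child ran.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define DEMO_MAX_WORKERS 16

typedef void (*spawn_proc_pt)(void *data);   /* worker entry function */

/* fork() returns twice: 0 in the child, the child's PID in the parent. */
static pid_t spawn_process(spawn_proc_pt proc, void *data) {
    pid_t pid = fork();
    if (pid == 0) {          /* child resumes here and runs the worker */
        proc(data);
        _exit(0);
    }
    return pid;              /* parent (or -1 on failure) */
}

/* Demo worker: report its index through the exit status. */
static void demo_worker(void *data) {
    _exit((int)(intptr_t)data);
}

/* Spawn n workers, then reap them; exit_codes[i] receives worker i's
 * exit status. Returns 0 on success, -1 on failure. */
static int run_workers(int n, int *exit_codes) {
    pid_t pids[DEMO_MAX_WORKERS];
    assert(n >= 0 && n <= DEMO_MAX_WORKERS);

    for (int i = 0; i < n; i++) {
        pids[i] = spawn_process(demo_worker, (void *)(intptr_t)i);
        if (pids[i] < 0)
            return -1;
    }
    for (int i = 0; i < n; i++) {
        int status;
        if (waitpid(pids[i], &status, 0) != pids[i] || !WIFEXITED(status))
            return -1;
        exit_codes[i] = WEXITSTATUS(status);
    }
    return 0;
}
```

Calling run_workers(3, codes) should leave 0, 1 and 2 in codes, showing that each child continued from the fork() call with its own copy of the loop index.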
Summary
This walkthrough shows how a high‑level operation such as Nginx’s worker creation maps to low‑level kernel actions: the fork syscall invokes do_fork, which calls copy_process to build a new task_struct, duplicate or share the parent’s resources (open files, filesystem info, address space, namespaces) and allocate a PID from a bitmap. do_fork then enqueues the child for scheduling. Understanding each step clarifies the purpose of the fields in task_struct and shows how the kernel keeps per‑process bookkeeping compact.
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge—fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc. (Reply “Linux” to receive essential resources.)
