What Makes a Linux Process Tick? Deep Dive into Creation, Execution, and Termination
This article explains the fundamental concepts of Linux processes, their relationship to programs, threads, kernels, and memory, details the internal task_struct implementation, and walks through the full lifecycle from creation with fork, loading via execve, execution, and termination including exit_group and zombie handling.
1. Basic Concepts of Processes
A process is the dynamic execution instance of a program, residing in memory and disappearing when the system powers off, while the program file remains on storage. In UNIX‑like systems the common executable format is ELF, typically compiled from C or C++ source.
1.1 Process vs. Program
Programs are static files; processes are the live, in‑memory execution of those files. Loading a program into memory creates a process.
1.2 Process vs. Thread
Historically a process contained a single execution flow (the main thread). Modern systems use threads to provide multiple concurrent flows within one process. Threads can be implemented in kernel space (kernel‑level threads, also called lightweight processes) or user space (user‑level threads). Kernel‑level threads are visible to the scheduler; user‑level threads are managed by a library.
1.3 Process vs. Kernel
The kernel runs in privileged mode in its own address space, while processes run in user space. System calls provide a controlled interface for user processes to request kernel services.
1.4 Process vs. Memory
Processes see a contiguous virtual address space; the kernel maps this to physical memory transparently. Allocation functions like brk, sbrk, and mmap request virtual memory, while the C library provides malloc / free for convenience.
1.5 Process States
Linux defines several states: runnable (TASK_RUNNING, which covers both tasks waiting on a run queue and the task currently on a CPU), interruptible sleep, uninterruptible sleep, stopped, traced, dead, and the exit states (EXIT_ZOMBIE, EXIT_DEAD). A newly created process starts in the transient "new" state, quickly becomes ready, then alternates between running and blocked until it finally exits.
1.6 Process Relationships
All processes form a parent-child tree rooted at the init process (PID 1). The zero-process (PID 0) is the idle task. Threads within a process share the same TGID, which user space sees as the process ID, while each thread has its own thread ID. Groups such as sessions and process groups help manage collections of processes for job control.
2. Implementation of Processes in Linux
Linux uses a single task_struct to represent both processes and threads. Historically Linux only supported a single thread per process, so task_struct acted as the process control block. When multithreading was added, the same structure was reused, with separate fields for thread‑specific and process‑specific data.
2.1 Basic Principle
The kernel does not have distinct process and thread structures; instead, each thread (including the main thread) is a task_struct. All threads of the same process share pointers to common resources such as mm_struct (virtual memory), files_struct (open files), and signal_struct (signal handling).
2.2 The task_struct Layout
struct task_struct {
#ifdef CONFIG_THREAD_INFO_IN_TASK
struct thread_info thread_info;
#endif
unsigned int __state;
void *stack;
unsigned int flags;
int on_cpu;
unsigned int cpu;
int prio;
int static_prio;
int normal_prio;
unsigned int rt_priority;
const struct sched_class *sched_class;
struct sched_entity se;
struct sched_rt_entity rt;
struct sched_dl_entity dl;
unsigned int policy;
int nr_cpus_allowed;
cpumask_t cpus_mask;
struct sched_info sched_info;
struct list_head tasks;
struct mm_struct *mm;
struct mm_struct *active_mm;
struct vmacache vmacache;
int exit_state;
int exit_code;
int exit_signal;
pid_t pid;
pid_t tgid;
struct task_struct __rcu *real_parent;
struct task_struct __rcu *parent;
struct list_head children;
struct list_head sibling;
struct task_struct *group_leader;
unsigned long nvcsw;
unsigned long nivcsw;
u64 start_time;
u64 start_boottime;
unsigned long min_flt;
unsigned long maj_flt;
char comm[TASK_COMM_LEN];
struct fs_struct *fs;
struct files_struct *files;
struct signal_struct *signal;
struct sighand_struct __rcu *sighand;
sigset_t blocked;
sigset_t real_blocked;
sigset_t saved_sigmask;
struct sigpending pending;
struct thread_struct thread;
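};
Key fields: mm points to the process's virtual address space; kernel threads have no address space of their own (mm is NULL), so active_mm records the one they borrow while running. files manages the file descriptor table. signal holds the signal state shared by the whole thread group, sighand holds the table of signal handlers, and blocked and pending are per-thread.
State fields like __state and exit_state, and scheduling fields (prio, rt_priority), drive the scheduler.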
2.3 Process Identifiers (PID and TGID)
In Linux the thread ID (tid) is the pid field of task_struct. The thread group ID (tgid) is the pid of the first thread (the main thread) of the process. User space sees the tgid as the process ID (it is what getpid returns), while the pid field differs for each thread (it is what gettid returns).
2.4 Process State Flags
#define TASK_RUNNING 0x0000
#define TASK_INTERRUPTIBLE 0x0001
#define TASK_UNINTERRUPTIBLE 0x0002
#define __TASK_STOPPED 0x0004
#define __TASK_TRACED 0x0008
#define EXIT_DEAD 0x0010
#define EXIT_ZOMBIE 0x0020
#define EXIT_TRACE (EXIT_ZOMBIE | EXIT_DEAD)
#define TASK_PARKED 0x0040
#define TASK_DEAD 0x0080
#define TASK_WAKEKILL 0x0100
#define TASK_WAKING 0x0200
#define TASK_NOLOAD 0x0400
#define TASK_NEW 0x0800
These constants are used in task_struct.__state and task_struct.exit_state to represent the current execution status.
3. Process Lifecycle
3.1 Process Creation (fork)
Linux separates process creation (fork) from program execution (execve). fork clones the calling task, returning the child's PID to the parent and 0 to the child.
#include <unistd.h>
pid_t fork(void);
Example:
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdlib.h>
int main(void)
{
    pid_t pid = fork();
    if (pid == -1) {
        /* fork failed: no child was created */
        printf("fork error, exit\n");
        exit(EXIT_FAILURE);
    } else if (pid == 0) {
        /* child: fork returned 0 */
        printf("I am child process, pid:%d\n", getpid());
        pause();    /* sleep until a signal arrives */
    } else {
        /* parent: fork returned the child's PID */
        printf("I am parent process, pid:%d, my child is pid:%d\n", getpid(), pid);
        waitpid(pid, NULL, 0);    /* block until the child terminates */
    }
    return 0;
}
The kernel implements fork via kernel_clone, which ultimately calls copy_process to duplicate the task_struct and associated resources.
SYSCALL_DEFINE0(fork)
{
struct kernel_clone_args args = { .exit_signal = SIGCHLD };
return kernel_clone(&args);
}
3.2 Process Loading (execve)
After fork, the child typically calls execve to replace its memory image with a new program.
#include <unistd.h>
int execve(const char *pathname, char *const argv[], char *const envp[]);
The kernel parses the ELF header, loads program segments, sets up the stack, and finally transfers control to the entry point.
SYSCALL_DEFINE3(execve,
const char __user *filename,
const char __user *const __user *argv,
const char __user *const __user *envp)
{
return do_execve(getname(filename), argv, envp);
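}
Key steps inside do_execve include allocating a linux_binprm structure, counting arguments, loading the ELF binary and, if it names one, its interpreter, setting up memory mappings, and finally invoking START_THREAD, which sets the saved user registers so that the task resumes at the entry point when it returns to user space.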
3.3 Interpreter Loading (Dynamic Linking)
If the ELF file specifies an interpreter in its PT_INTERP segment (e.g., /lib64/ld-linux-x86-64.so.2), the kernel maps that interpreter alongside the program and hands control to the interpreter's entry point first. The interpreter then loads all required shared objects (.so files) by parsing the .dynamic section, mapping each segment, and performing relocations.
3.4 Process Initialization
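After the binary and interpreter are mapped, control reaches the program's entry point, _start, in user space (via the interpreter for a dynamically linked program; the kernel does not call _start as a function). _start sets up the runtime environment and invokes __libc_start_main, which performs libc initialization and finally calls the user's main function.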
3.5 Process Execution
During normal execution the process repeatedly moves between the runnable, running, and blocked states as scheduled by the kernel.
3.6 Process Termination
When a process finishes (e.g., it returns from main, calls exit, or receives a fatal signal), it enters the EXIT_ZOMBIE state. It stays a zombie, keeping its task_struct and exit status, until the parent reaps it with wait / waitpid, at which point it moves to EXIT_DEAD and is released.
The exit_group system call (used by the C library's exit and _exit) terminates the calling thread and sends SIGKILL to every other thread in the same thread group.
SYSCALL_DEFINE1(exit_group, int, error_code)
{
do_group_exit((error_code & 0xff) << 8);
return 0; /* NOTREACHED */
}
The core of do_exit cleans up resources: memory, file descriptors, signal handlers, accounting, and finally removes the task from the scheduler.
void __noreturn do_exit(long code)
{
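struct task_struct *tsk = current;
int group_dead;
// ... various sanity checks ...
exit_signals(tsk); // set PF_EXITING
if (tsk->mm)
sync_mm_rss(tsk->mm);
// true when this is the last live thread of the thread group
group_dead = atomic_dec_and_test(&tsk->signal->live);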
// release resources
exit_mm();
exit_files(tsk);
exit_fs(tsk);
exit_thread(tsk);
// notify parent
exit_notify(tsk, group_dead);
do_task_dead();
}
4. Review and Summary
Linux treats processes and threads uniformly via task_struct. The first thread creates the process and its shared resources; the last thread's exit frees those resources. Process creation, execution, and termination involve a series of well-defined system calls (fork, execve, exit_group) and kernel data structures that manage memory, files, signals, and scheduling.
References: Linux Kernel Development, Understanding the Linux Kernel, The Linux Programming Interface, Advanced Programming in the UNIX Environment, Linkers & Loaders, and various man‑pages (fork, execve, exit, wait).
Liangxu Linux
Liangxu, a self-taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge: fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc.
