Inside Linux’s fork: How the Kernel Creates a New Process
This article dissects the Linux kernel’s fork implementation, explaining why the child receives a return value of 0, how parent and child share code execution, the internal steps of sys_fork, kernel_clone, copy_process, thread handling, scheduling initialization, and the final insertion of the new task into the run‑queue.
Background
When a command such as ls -l is run in a Linux bash shell, the shell forks a new process to execute the command. The article investigates what the kernel does during this fork operation.
Key Questions
Why does fork return 0 in the newly created process?
How do the parent and child execute the same code?
When is the new process added to the scheduler’s queue?
User‑Space Fork
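Calling fork() creates a child process and returns twice: once in the parent, where the return value is the child's PID, and once in the child, where the return value is 0. Both processes continue from the instruction after the call, and the return value lets each one tell which side of the fork it is on. In the shell example, the child goes on to execute the command while the parent typically waits for it to finish.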
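A minimal user-space sketch of the two return values may help here. The function name fork_and_collect_status and the exit status 42 are illustrative choices, not something taken from the original article.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Forks once. The child exits immediately with status 42; the parent
 * waits for it and returns that status, or -1 on any error.
 * The name and the value 42 are illustrative only.
 */
int fork_and_collect_status(void)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;              /* fork failed: no child was created */

    if (pid == 0) {
        /* Child: fork() returned 0 on this side. */
        _exit(42);
    }

    /* Parent: fork() returned the child's PID (a positive number). */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child uses _exit rather than exit so that it does not flush stdio buffers inherited from the parent.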
Kernel Entry
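The user-space fork enters the kernel through a system call: sys_fork where the architecture provides it (glibc's fork() wrapper typically issues clone instead). Both paths forward to kernel_clone, which returns a pid_t: the new process identifier on success, or a negative error code on failure.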
copy_process – Core Operation
The kernel then calls copy_process, which performs the essential work of duplicating the parent’s resources. Key steps include:
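Allocate memory for the new task_struct and its kernel stack (which also holds thread_info on kernels that keep it there).
Copy the parent's task_struct as a starting point, then reinitialize the fields that must differ in the child (pid, tgid, state, and so on).
Set up the kernel stack (8 KB in the configuration described here; the size is architecture-dependent) and associate thread_info with the new task.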
Important Flags
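The SIGNAL_UNKILLABLE flag marks init processes, such as Android's init (both the global init and the init of a PID namespace carry it). copy_process rejects a clone request that combines CLONE_PARENT with a caller carrying this flag, so an init process cannot create a sibling of itself; any task it creates can only be its child.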
dup_task_struct
During copy_process, dup_task_struct creates a copy of the parent’s task_struct, which holds the process control block (PCB) containing identifiers, scheduling policy, state, memory descriptors, file tables, and more.
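The copy-then-fix-up idea can be sketched in user space. This is a toy model under stated assumptions: struct toy_task, toy_dup_task, toy_free_task, and the 8 KB TOY_STACK_SIZE are hypothetical names and values for illustration, and the real task_struct is far larger.

```c
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for the kernel's task_struct; the fields are illustrative. */
struct toy_task {
    int pid;
    int tgid;
    int policy;
    void *stack;             /* each task needs its own kernel stack */
};

#define TOY_STACK_SIZE 8192  /* assumed size, matching the article's 8 KB */

/* Copy the parent's descriptor, then give the copy a private stack. */
struct toy_task *toy_dup_task(const struct toy_task *parent)
{
    struct toy_task *child = malloc(sizeof(*child));
    if (!child)
        return NULL;

    *child = *parent;                       /* bulk copy of every field */

    child->stack = malloc(TOY_STACK_SIZE);  /* never share the parent's stack */
    if (!child->stack) {
        free(child);
        return NULL;
    }
    memset(child->stack, 0, TOY_STACK_SIZE);
    return child;
}

/* Release a task created by toy_dup_task. */
void toy_free_task(struct toy_task *t)
{
    if (t) {
        free(t->stack);
        free(t);
    }
}
```

Right after the copy, the toy child still carries the parent's pid and tgid. In this model, as in the kernel's flow, assigning the child's own identifiers is a later step.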
Thread Information and Kernel Stack
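In the layout described here, the kernel allocates an 8 KB kernel stack and places a small thread_info structure at its base (the lowest address). Co-locating the two saves memory and lets the kernel find the thread metadata from the stack pointer alone. The stack size is architecture-dependent, and newer kernels built with CONFIG_THREAD_INFO_IN_TASK keep thread_info inside task_struct instead.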
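When thread_info sits at the base of a stack that is aligned to its own size, it can be recovered from any address inside the stack by masking off the low bits. The following is a user-space sketch of that trick; toy_thread_info, toy_alloc_stack, toy_thread_info_from_sp, and the 8 KB TOY_THREAD_SIZE are assumed names and values, not kernel definitions.

```c
#include <stdint.h>
#include <stdlib.h>

#define TOY_THREAD_SIZE 8192u   /* assumed 8 KB stack, as in the article */

/* Toy stand-in for thread_info; the fields are illustrative. */
struct toy_thread_info {
    int cpu;
    unsigned long flags;
};

/*
 * Allocate a stack aligned to its own size. The toy_thread_info
 * occupies the lowest addresses; the stack grows down toward it.
 */
void *toy_alloc_stack(void)
{
    return aligned_alloc(TOY_THREAD_SIZE, TOY_THREAD_SIZE);
}

/* Given any address inside the stack, recover the base by masking. */
struct toy_thread_info *toy_thread_info_from_sp(const void *sp)
{
    uintptr_t base = (uintptr_t)sp & ~((uintptr_t)TOY_THREAD_SIZE - 1);
    return (struct toy_thread_info *)base;
}
```

The mask only works because the allocation is aligned to TOY_THREAD_SIZE, which is why the sketch uses aligned_alloc rather than malloc.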
copy_thread – Register Setup
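copy_thread sets the child's kernel-side resume address to ret_from_fork and copies the parent's saved user registers into the child, overwriting the return-value register with 0. Because the rest of the saved state, including the user instruction pointer, is identical, the child resumes in user space at the same instruction as the parent but sees fork return 0.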
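The register handling can be modeled in a few lines. This is a simplified sketch, not the kernel's code: struct toy_regs and toy_copy_thread are hypothetical, and the register names follow x86 conventions only for illustration.

```c
/* Toy stand-in for the saved user-mode register frame. */
struct toy_regs {
    unsigned long ip;   /* user-space instruction pointer */
    unsigned long sp;   /* user-space stack pointer */
    unsigned long ax;   /* return-value register (x86 naming) */
};

/*
 * Give the child the same user-space context as the parent,
 * except for the system call's return value.
 */
void toy_copy_thread(struct toy_regs *child, const struct toy_regs *parent)
{
    *child = *parent;   /* same ip and sp: resume at the same instruction */
    child->ax = 0;      /* the child will see fork() return 0 */
}
```

The parent's own return-value register is filled in separately, on the normal system call return path, with the child's PID.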
ret_from_fork – Register Restoration
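The ret_from_fork routine is the first code the child runs. It finishes the context switch that scheduled the child, then restores the saved user registers (on x86 this includes segment registers such as CS and SS) and transitions the child back to user mode.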
Global PID and Namespace Handling
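The child's PID is allocated in the PID namespace the parent uses for its children, with a number reserved in each ancestor namespace up to the global one. For a plain fork, tgid equals the new PID, while the process group (pgid) and session (sid) are inherited from the parent. The task is then linked into the corresponding kernel lists.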
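From user space, one visible consequence is that the PID the parent receives from fork() is the same number the child sees from getpid(), as long as both run in the same PID namespace. The sketch below checks this through a pipe; the function name child_pid_matches_fork_return is an illustrative choice.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if the child's getpid() equals the value fork() returned
 * in the parent, 0 if it differs, and -1 on any error.
 */
int child_pid_matches_fork_return(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: report our own PID to the parent through the pipe. */
        pid_t self = getpid();
        close(fds[0]);
        ssize_t n = write(fds[1], &self, sizeof(self));
        _exit(n == (ssize_t)sizeof(self) ? 0 : 1);
    }

    /* Parent: read the child's view of its PID and compare. */
    close(fds[1]);
    pid_t reported = -1;
    ssize_t n = read(fds[0], &reported, sizeof(reported));
    close(fds[0]);

    if (waitpid(pid, NULL, 0) != pid)
        return -1;
    if (n != (ssize_t)sizeof(reported))
        return -1;

    return reported == pid;
}
```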
Scheduling Initialization
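After copy_process returns, kernel_clone calls wake_up_new_task. This function invokes activate_task, which, for a task in the fair scheduling class, eventually calls the CFS scheduler's enqueue_task_fair to place the new task into the red-black tree run-queue. This is the point at which the child becomes runnable and can be picked by the scheduler.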
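To illustrate what "placing a task into an ordered run-queue" means, the sketch below keeps runnable entities sorted by a virtual-runtime key so that the leftmost one is the next candidate to run. It deliberately swaps the kernel's red-black tree for a sorted singly linked list to stay short, and toy_entity, toy_rq, and toy_enqueue are hypothetical names.

```c
#include <stddef.h>

/* Toy scheduling entity, ordered by virtual runtime. */
struct toy_entity {
    unsigned long long vruntime;
    struct toy_entity *next;
};

/* Toy run-queue: a sorted list standing in for the red-black tree. */
struct toy_rq {
    struct toy_entity *leftmost;  /* smallest vruntime; next to run */
    unsigned int nr_running;
};

/* Insert se so the list stays sorted by ascending vruntime. */
void toy_enqueue(struct toy_rq *rq, struct toy_entity *se)
{
    struct toy_entity **link = &rq->leftmost;

    /* Walk past every entity that should run before (or tie with) se. */
    while (*link && (*link)->vruntime <= se->vruntime)
        link = &(*link)->next;

    se->next = *link;
    *link = se;
    rq->nr_running++;
}
```

A list makes insertion linear in the number of runnable tasks; a balanced tree keeps it logarithmic, which is the usual motivation for using one in a scheduler.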
Conclusion
The first part of the “fork journey” has mapped the major kernel steps from user‑space request to scheduler insertion, while many deeper details remain for further exploration, such as ELF loading and shared‑object handling in the next installment.
OPPO Kernel Craftsman
Sharing Linux kernel-related cutting-edge technology, technical articles, technical news, and curated tutorials
