
Unlocking Linux: How ELF Files Transform Into Running Processes

This article explains the ELF file format: its types, its internal structure, and the compilation and linking steps that produce it. It then follows how the Linux kernel loads an ELF binary, from process creation with fork and exec through dynamic linking and relocation to the finished process address space, giving developers and system engineers a closer look at how Linux executes programs.

Deepin Linux

ELF Files: The Soul Container of Linux

ELF (Executable and Linkable Format) is the standard binary format on Linux, analogous to Windows .exe files, and includes executable files, relocatable object files, shared objects (.so), and core dump files.

What Is an ELF File?

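The e_type field of the file header records which kind a file is: ET_REL for relocatable object files (.o) that feed the linker, ET_EXEC for executables linked at a fixed address, ET_DYN for shared libraries and position-independent executables, and ET_CORE for core dumps used in post-mortem debugging.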

ELF "Identity Card"

The first four bytes of an ELF file are the magic number 0x7f 0x45 0x4c 0x46, that is, the byte 0x7f followed by the ASCII characters 'E', 'L', 'F'. The kernel checks these bytes to verify the file format.
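The magic-number check can be sketched in a few lines of user-space C. This is only an illustration of the comparison, not the kernel's code; the function name is_elf_magic is made up for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if buf starts with the four ELF
 * identification bytes 0x7f 'E' 'L' 'F', and 0 otherwise
 * (including when fewer than four bytes are available). */
int is_elf_magic(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    if (buf == NULL || len < sizeof magic)
        return 0;
    return memcmp(buf, magic, sizeof magic) == 0;
}
```

A caller would read the first bytes of a file into a buffer and pass them in; a Windows executable, which starts with "MZ", is rejected.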

Internal Structure of an ELF File

The ELF file consists of several key components:

File Header : Contains magic number, file type, target architecture, entry point address, and offsets to program and section header tables.

Program Header Table : Describes the segments used at run time, most importantly the loadable ones (type PT_LOAD), with their file offsets, virtual addresses, sizes, and permissions.

Section Header Table : Provides information for linkers and debuggers about each section, such as .text, .data, .bss, .rodata, .symtab, and .strtab.

Sections are the linker's view of the file, while segments are the loader's view: each segment groups one or more related sections into a unit that is mapped into memory at run time.
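To make the relationship between the file header and the program header table concrete, here is a minimal sketch that reads both from a file and counts the segments of a given type. It assumes a 64-bit ELF file in the host's byte order, and the name count_segments is a hypothetical helper, not part of any library.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: counts program headers of type `wanted`
 * (for example PT_LOAD) in a 64-bit ELF file.
 * Returns the count, or -1 if the file cannot be read or is not ELF64. */
int count_segments(const char *path, unsigned int wanted)
{
    Elf64_Ehdr eh;
    Elf64_Phdr ph;
    int count = 0;
    FILE *f = fopen(path, "rb");

    if (f == NULL)
        return -1;

    /* The file header sits at offset 0 and locates the other tables. */
    if (fread(&eh, sizeof eh, 1, f) != 1 ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_phentsize != sizeof ph) {
        fclose(f);
        return -1;
    }

    /* e_phoff is the file offset of the table, e_phnum its entry count. */
    for (unsigned int i = 0; i < eh.e_phnum; i++) {
        long off = (long)(eh.e_phoff + (Elf64_Off)i * eh.e_phentsize);

        if (fseek(f, off, SEEK_SET) != 0 ||
            fread(&ph, sizeof ph, 1, f) != 1) {
            fclose(f);
            return -1;
        }
        if (ph.p_type == wanted)
            count++;
    }

    fclose(f);
    return count;
}
```

Calling count_segments("/proc/self/exe", PT_LOAD) inspects the running program itself; a typical executable has a handful of PT_LOAD entries, one per distinct permission set.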

From ELF File to Linux Process

Compilation and Linking

Source code is first compiled into object files (.o) using a compiler like GCC, then linked either statically or dynamically to produce an ELF executable.

#include <stdio.h>
int main(void) {
    printf("Hello, World!\n");
    return 0;
}

Static linking copies all required code into the final executable, while dynamic linking records references to shared libraries that are loaded at runtime.

Process Creation

Linux creates a new process with the fork() system call, which duplicates the parent process. The child then calls an exec family function (e.g., execve) to replace its memory image with the ELF binary.

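#include <stdio.h>
#include <unistd.h>
int main(void) {
    /* fork() returns 0 in the child and the child's PID in the parent. */
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return 1;
    } else if (pid == 0) {
        printf("Child PID %d\n", (int)getpid());
    } else {
        printf("Parent PID %d, child %d\n", (int)getpid(), (int)pid);
    }
    return 0;
}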
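The listing above stops at fork(). The second half of the pattern, where the child replaces its image with a new ELF binary and the parent collects the result, might look like the following sketch. The helper name run_and_wait is an assumption made for illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: runs the ELF binary at `path` in a child process
 * and returns its exit status, or -1 on failure or abnormal termination. */
int run_and_wait(const char *path, char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;              /* fork failed */

    if (pid == 0) {
        /* Child: replace this process image with the new program. */
        execve(path, argv, environ);
        _exit(127);             /* reached only if execve failed */
    }

    /* Parent: wait for the child and decode how it ended. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing "/bin/sh" with the argument vector { "sh", "-c", "exit 7", NULL } returns 7. Note that execve only returns on error, which is why the child calls _exit right after it.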

Loading the ELF File

When execve is invoked, the kernel reads the ELF header, verifies the magic number, and parses the program header table. For each PT_LOAD segment, the kernel maps the segment into the process's virtual address space using do_mmap, applying the appropriate permissions (read, write, execute). If a segment’s memory size exceeds its file size, the extra memory is zero‑initialized.

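/* Simplified sketch of the PT_LOAD loop in the kernel's ELF loader.
 * The do_mmap() argument list is abbreviated here and does not match
 * the real kernel signature. */
for (i = 0; i < phnum; i++) {
    if (phdr[i].p_type == PT_LOAD) {
        int prot = 0;
        /* Translate ELF segment flags into mmap protection bits. */
        if (phdr[i].p_flags & PF_R) prot |= PROT_READ;
        if (phdr[i].p_flags & PF_W) prot |= PROT_WRITE;
        if (phdr[i].p_flags & PF_X) prot |= PROT_EXEC;
        /* Map the file-backed part of the segment at its virtual address. */
        do_mmap(file, phdr[i].p_offset, phdr[i].p_filesz, prot, MAP_PRIVATE, phdr[i].p_vaddr);
    }
}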
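One detail the loop glosses over is alignment: mappings start on page boundaries, so the address and file offset of a segment have to be rounded down to the start of a page, and the bytes where p_memsz exceeds p_filesz must read as zero. The sketch below computes those values for one segment in user space. The struct map_plan and both function names are hypothetical, and the code assumes that p_vaddr and p_offset are congruent modulo the page size, as the ELF format requires for loadable segments.

```c
#include <assert.h>
#include <elf.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical description of the mapping needed for one PT_LOAD segment. */
struct map_plan {
    uint64_t addr;        /* page-aligned virtual address */
    uint64_t offset;      /* page-aligned file offset */
    uint64_t length;      /* file-backed bytes, counted from addr */
    uint64_t zero_bytes;  /* bytes past the file data that must be zero */
    int prot;             /* PROT_* bits derived from p_flags */
};

/* Translate ELF segment flags (PF_*) into mmap protection bits (PROT_*). */
int pflags_to_prot(uint32_t p_flags)
{
    int prot = PROT_NONE;

    if (p_flags & PF_R) prot |= PROT_READ;
    if (p_flags & PF_W) prot |= PROT_WRITE;
    if (p_flags & PF_X) prot |= PROT_EXEC;
    return prot;
}

/* Work out what to map for one segment; page_size must be a power of two. */
struct map_plan plan_segment(const Elf64_Phdr *ph, uint64_t page_size)
{
    struct map_plan plan;
    uint64_t slack;

    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    /* Distance of the segment start from the beginning of its page. */
    slack = ph->p_vaddr & (page_size - 1);

    plan.addr = ph->p_vaddr - slack;
    plan.offset = ph->p_offset - slack;
    plan.length = ph->p_filesz + slack;
    plan.zero_bytes = ph->p_memsz - ph->p_filesz;
    plan.prot = pflags_to_prot(ph->p_flags);
    return plan;
}
```

For a data segment with p_vaddr 0x3e10, p_offset 0x2e10, p_filesz 0x200, and p_memsz 0x208 on 4 KiB pages, the plan maps 0x1010 bytes from file offset 0x2000 at address 0x3000 and leaves 8 bytes to be zero-filled, which is where .bss lives.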

Dynamic Linking

If the ELF file is dynamically linked, the kernel loads the interpreter named by the PT_INTERP segment (which holds the .interp section), e.g., /lib64/ld-linux-x86-64.so.2. The dynamic linker then resolves DT_NEEDED dependencies, loads the required shared libraries, and applies relocations. Calls into shared libraries go through the Procedure Linkage Table (PLT) and the Global Offset Table (GOT), which is what makes lazy binding possible.
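The interpreter path is stored in the file as a plain NUL-terminated string, so it can be read back with ordinary file I/O. The following is a minimal sketch under the same assumptions as before (64-bit ELF, host byte order); read_interp is a hypothetical helper name.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies the interpreter path of a 64-bit ELF file
 * into `out`. Returns 1 if a PT_INTERP segment was found, 0 if the file
 * has none (for example a statically linked binary), and -1 on error. */
int read_interp(const char *path, char *out, size_t out_size)
{
    Elf64_Ehdr eh;
    Elf64_Phdr ph;
    int result = 0;
    FILE *f = fopen(path, "rb");

    if (f == NULL)
        return -1;
    if (fread(&eh, sizeof eh, 1, f) != 1 ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_phentsize != sizeof ph) {
        fclose(f);
        return -1;
    }

    /* Scan the program header table until PT_INTERP turns up. */
    for (unsigned int i = 0; i < eh.e_phnum && result == 0; i++) {
        long off = (long)(eh.e_phoff + (Elf64_Off)i * sizeof ph);

        if (fseek(f, off, SEEK_SET) != 0 ||
            fread(&ph, sizeof ph, 1, f) != 1) {
            result = -1;
        } else if (ph.p_type == PT_INTERP) {
            if (ph.p_filesz == 0 || ph.p_filesz > out_size ||
                fseek(f, (long)ph.p_offset, SEEK_SET) != 0 ||
                fread(out, 1, ph.p_filesz, f) != ph.p_filesz) {
                result = -1;
            } else {
                out[ph.p_filesz - 1] = '\0';  /* enforce termination */
                result = 1;
            }
        }
    }

    fclose(f);
    return result;
}
```

This is the same information that readelf -l prints as "Requesting program interpreter".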

Process Address Space Layout

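Together, the loaded ELF and the kernel determine the process's memory layout: a read-only executable code segment (.text), a writable data segment (.data), a zero-filled BSS segment, a heap for dynamic allocation, a stack for function calls, and mapped regions for shared libraries. The ELF file describes the first three, the kernel sets up the heap and the stack, and the dynamic linker maps the shared libraries.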
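On Linux the resulting layout can be inspected through /proc/self/maps, where each line describes one mapping with its address range and permissions. The sketch below looks up the mapping that contains a given address; find_mapping is a hypothetical helper written for this illustration.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: finds the mapping in /proc/self/maps that contains
 * `addr`. Returns 1 and copies the permission string (such as "r-xp")
 * into perms on success, 0 if no mapping matches, and -1 if the maps
 * file cannot be opened. */
int find_mapping(const void *addr, char perms[5])
{
    char line[1024];
    char p[5];
    unsigned long start, end;
    uintptr_t target = (uintptr_t)addr;
    int found = 0;
    FILE *f = fopen("/proc/self/maps", "r");

    if (f == NULL)
        return -1;

    /* Each line begins with "start-end perms", e.g. "55d0...-55d1... r-xp". */
    while (fgets(line, sizeof line, f) != NULL) {
        if (sscanf(line, "%lx-%lx %4s", &start, &end, p) != 3)
            continue;
        if (target >= start && target < end) {
            memcpy(perms, p, sizeof p);
            found = 1;
            break;
        }
    }

    fclose(f);
    return found;
}
```

Passing the address of a local variable lands in the stack mapping, which is readable and writable; passing the address of a function lands in a mapping of the code segment, which is executable.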

Kernel Page Tables and MMU

The MMU translates the process’s virtual addresses to physical memory using page tables. Each page table entry maps a virtual page to a physical frame with appropriate access rights, enabling the CPU to fetch instructions and data from the correct locations.
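As an illustration of how a virtual address selects page table entries, the sketch below splits an address into its table indexes. It assumes x86-64 with 4-level paging and 4 KiB pages, which the article does not specify: each of the four levels is indexed by 9 bits, and the low 12 bits are the offset within the page. The struct and function names are made up for this example.

```c
#include <stdint.h>

/* Index fields of a virtual address under x86-64 4-level paging with
 * 4 KiB pages (an assumption made for this example). */
struct va_parts {
    unsigned int pml4;    /* bits 47..39: top-level table index */
    unsigned int pdpt;    /* bits 38..30 */
    unsigned int pd;      /* bits 29..21 */
    unsigned int pt;      /* bits 20..12: selects the final page table entry */
    unsigned int offset;  /* bits 11..0: byte offset inside the page */
};

struct va_parts split_va(uint64_t va)
{
    struct va_parts parts;

    parts.pml4 = (unsigned int)((va >> 39) & 0x1ff);
    parts.pdpt = (unsigned int)((va >> 30) & 0x1ff);
    parts.pd = (unsigned int)((va >> 21) & 0x1ff);
    parts.pt = (unsigned int)((va >> 12) & 0x1ff);
    parts.offset = (unsigned int)(va & 0xfff);
    return parts;
}
```

The MMU performs this walk in hardware; the entry found at the last level supplies the physical frame number and the access rights for the page.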

Full Kernel Loading Flow

Initial ELF Inspection

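The kernel opens the ELF file and reads its first BINPRM_BUF_SIZE bytes into bprm->buf (128 bytes in older kernels, 256 in newer ones). That is more than enough to cover the ELF header, which is 64 bytes for ELF64, and from it the kernel extracts the entry point and the program header offset. The header also records the section header offset, but the loader does not need sections to run the program.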

struct linux_binprm *bprm;  /* per-exec state, set up by the caller */
loff_t pos = 0;             /* read from the start of the file */
kernel_read(bprm->file, bprm->buf, BINPRM_BUF_SIZE, &pos);
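Once the header is in memory, the loader checks more than the magic number before going further. A user-space analogue of those sanity checks might look like the sketch below. It is an approximation for illustration, not the kernel's code; the function name and the choice to pass the expected machine as a parameter are assumptions.

```c
#include <elf.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the header describes a 64-bit ELF
 * file that could be loaded as a program for `machine` (an e_machine
 * value such as EM_X86_64), and 0 otherwise. */
int elf_header_loadable(const Elf64_Ehdr *eh, unsigned int machine)
{
    /* Magic number and word size. */
    if (memcmp(eh->e_ident, ELFMAG, SELFMAG) != 0)
        return 0;
    if (eh->e_ident[EI_CLASS] != ELFCLASS64)
        return 0;

    /* Only executables and shared objects (including PIE) can be run. */
    if (eh->e_type != ET_EXEC && eh->e_type != ET_DYN)
        return 0;

    /* The binary must target the CPU architecture we are running on. */
    if (eh->e_machine != machine)
        return 0;

    /* There must be a program header table with entries of the expected size. */
    if (eh->e_phentsize != sizeof(Elf64_Phdr) || eh->e_phnum == 0)
        return 0;

    return 1;
}
```

A relocatable object file (ET_REL) fails the type check, which matches the fact that a .o file cannot be executed directly.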

Mapping Segments

Using the program header table, the kernel maps each PT_LOAD segment into memory with do_mmap, setting protections based on PF_R, PF_W, and PF_X flags.

Loading the Dynamic Interpreter

The PT_INTERP segment (which holds the .interp section) provides the path to the dynamic linker, which the kernel maps into the same address space in much the same way as the main executable.

if (phdr[i].p_type == PT_INTERP) {
    // load dynamic linker
}

Relocation and Symbol Resolution

The dynamic linker parses the .dynamic section, loads needed shared libraries, and applies relocations. Lazy binding defers symbol resolution until the first call.

Program Initialization and Start

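For a dynamically linked program, the kernel does not jump to the program's own entry point. Once the program's segments and the interpreter are mapped, it sets up the initial stack (arguments, environment, and auxiliary vector) and transfers control to the dynamic linker's entry point. The dynamic linker loads and relocates the shared libraries, then jumps to the program's entry point (e_entry, normally _start), which eventually calls main. For a statically linked program, the kernel jumps straight to e_entry.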
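The kernel hands the program's entry point to user space through the auxiliary vector on the initial stack, and a running process can read it back with getauxval() from glibc. The two wrapper names below are hypothetical; AT_ENTRY and AT_BASE are the standard auxiliary vector keys.

```c
#include <elf.h>
#include <sys/auxv.h>

/* AT_ENTRY: run-time address of the program's own entry point. */
unsigned long program_entry(void)
{
    return getauxval(AT_ENTRY);
}

/* AT_BASE: load address of the dynamic linker. Expected to be 0 when the
 * program was started without an interpreter, as with a static binary. */
unsigned long interpreter_base(void)
{
    return getauxval(AT_BASE);
}
```

The dynamic linker reads AT_ENTRY itself to learn where to jump once relocation is finished.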

Practical Example

A simple C program that sums two numbers is compiled with gcc -o sum sum.c. Running readelf -h, readelf -l, and readelf -S on the result shows the ELF header, program headers, and section headers, confirming the layout described above. Tracing execution with strace -f -o sum_trace.txt ./sum shows system calls such as execve() for the program itself, followed by openat() and mmap() calls that load libc.so, illustrating the runtime behavior of ELF loading. The fork that created the process happens in the parent before execve(), so the trace begins at the execve() call.
