
Understanding Compilation, Linking, and Loading of C Programs on Linux

On Linux, C source becomes a running program in three stages: gcc compiles it into object files, the linker combines object files and libraries into an ELF executable, and the kernel loads that executable. The process depends on CPU architecture, binary formats, symbol tables, dynamic relocation, and the startup code that runs before main.

Meituan Technology Team

Modern software development often favors high‑level languages (Java, Python, PHP) that run on virtual machines or interpreters, leaving many engineers unfamiliar with the low‑level processes of compiling, linking, and loading native code.

This article explains the fundamental principles of how source code is transformed into an executable binary, covering CPU architectures, operating system binary formats, and the role of compilers, linkers, and loaders.

CPU Architecture

Most PCs and servers use the x86_64 (CISC) instruction set, while RISC architectures such as SPARC or PowerPC have different instruction encodings. Binary code compiled for one ISA cannot run on another because the instruction streams are incompatible.
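One way to see that a binary is tied to a single ISA is to ask the compiler which target it is generating code for. The sketch below relies on the predefined macros that GCC and Clang set per target; the function name target_isa is hypothetical and not part of the original article.

```c
#include <stddef.h>

/* Returns a label for the ISA this translation unit was compiled for.
 * The macros are predefined by GCC/Clang for each target; exactly one
 * branch survives preprocessing, so the choice is fixed at compile time. */
const char *target_isa(void) {
#if defined(__x86_64__)
    return "x86_64";
#elif defined(__i386__)
    return "x86 (32-bit)";
#elif defined(__aarch64__)
    return "aarch64";
#elif defined(__powerpc64__) || defined(__powerpc__)
    return "powerpc";
#elif defined(__sparc__)
    return "sparc";
#else
    return "unknown";
#endif
}
```

Because the branch is selected by the preprocessor, the resulting object file contains instructions for exactly one architecture; running on another ISA requires recompiling.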

Operating System Binary Formats

Different OSes use distinct executable formats: Windows uses PE, macOS uses Mach‑O, and Linux uses ELF. These formats dictate how the loader finds code, data, and initialization sections, which is why a Windows .exe cannot run on Linux and vice versa.
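A loader recognizes its own format from the first bytes of the file. As a minimal sketch, the hypothetical helper below checks for the ELF magic number (0x7f followed by "ELF") and for the "MZ" signature that begins a PE file's DOS header; Mach-O detection is omitted to keep the example short.

```c
#include <stddef.h>
#include <string.h>

typedef enum { FMT_UNKNOWN, FMT_ELF, FMT_PE } binary_format;

/* Classifies a file by its leading bytes.
 * buf/len describe the first bytes read from the file. */
binary_format sniff_format(const unsigned char *buf, size_t len) {
    /* "\x7f" and "ELF" are separate literals so the 'E' is not
     * swallowed by the hexadecimal escape sequence. */
    if (len >= 4 && memcmp(buf, "\x7f" "ELF", 4) == 0)
        return FMT_ELF;
    /* PE files start with the legacy DOS header signature "MZ". */
    if (len >= 2 && buf[0] == 'M' && buf[1] == 'Z')
        return FMT_PE;
    return FMT_UNKNOWN;
}
```

This is only the first check a real loader performs; after the magic number it still has to validate the header and locate the code and data segments.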

Source‑Code Compilation

On Linux, the GNU toolchain (gcc / g++) compiles C/C++ source files into object files (.o). The compiler translates high‑level constructs (declarations, definitions, static vs. non‑static variables, functions) into machine instructions and generates a symbol table.

Example C snippet used in the article:

int g_a = 1;            // defined global variable with initializer
int g_b;                 // defined global variable without initializer
static int g_c;         // defined static global variable
extern int g_x;         // declaration of an external variable
extern int sub();       // function declaration

int sum(int m, int n) { return m+n; }

int main(int argc, char* argv[]) {
    static int s_a = 0; // static local variable
    int l_a = 0;        // automatic local variable
    sum(g_a,g_b);
    return 0;
}

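Running gcc -c test.c -o test.o && nm test.o prints the object file's symbol table. Each entry has a value (an offset within its section, since final addresses are not assigned until link time), a type letter, and a name. The letters show how the compiler classified each symbol: D for initialized data (g_a), C for a common symbol (g_b; GCC 10 and later default to -fno-common and report B instead), b for local BSS data (g_c and s_a), T for code (sum, main), and U for undefined symbols that are referenced here but defined elsewhere. A declaration that is never referenced, such as g_x or sub in the snippet as shown, produces no symbol entry on its own.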
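As a reading aid for nm output, the type letters can be summarized in a small lookup function. This is a sketch covering only the letters discussed here (plus B, which newer GCC versions emit for uninitialized globals); the function name nm_type_meaning is hypothetical, and nm defines more letters than these.

```c
#include <stddef.h>

/* Maps an nm type letter to a short description.
 * Uppercase letters are global symbols, lowercase are local (static).
 * Returns NULL for letters this sketch does not cover. */
const char *nm_type_meaning(char t) {
    switch (t) {
    case 'T': return "global symbol defined in the text (code) section";
    case 'D': return "global symbol defined in the initialized data section";
    case 'B': return "global symbol defined in the BSS (zero-initialized) section";
    case 'b': return "local (static) symbol defined in the BSS section";
    case 'C': return "common symbol: uninitialized global, placed at link time";
    case 'U': return "undefined symbol: must be resolved by the linker";
    default:  return NULL;
    }
}
```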

Linking Object Files

The linker resolves undefined symbols by searching the other object files and libraries, merges sections of the same type, and applies relocations so that every symbol reference points at its final address in the output file. The article demonstrates linking test.o with test2.o (which defines g_x and sub) to produce a runnable ELF executable.
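The summary does not reproduce test2.c, so the following is a hypothetical stand-in that supplies the two definitions test.c declares as extern. The value of g_x and the body of sub are assumptions chosen only for illustration.

```c
/* test2.c (hypothetical contents; the source only states that this
 * file defines g_x and sub) */

int g_x = 10;          /* definition matching "extern int g_x;" in test.c */

int sub(void) {        /* definition matching "extern int sub();" in test.c */
    return g_x - 1;
}
```

Compiling it with gcc -c test2.c -o test2.o and then running gcc test.o test2.o -o test should let the linker match any undefined references to g_x and sub in test.o against these definitions (the output name test is also an assumption).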

Dynamic Linking and Loading

Executable files may contain undefined symbols that are defined in shared libraries (e.g., printf, strncpy). For each such function the linker creates a PLT (Procedure Linkage Table) entry that jumps through a slot in the GOT (Global Offset Table). The dynamic linker (ld.so) fills in the real address at run time, either when the program is loaded or, with lazy binding, the first time the function is called.
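The indirection through the PLT and GOT can be modeled in plain C with a function pointer. The sketch below is a toy analogy, not the real mechanism: got_slot, resolve_and_call, and plt_strlen are hypothetical names, and the "resolver" simply knows the target instead of searching the loaded libraries.

```c
#include <stddef.h>
#include <string.h>

typedef size_t (*strlen_fn)(const char *);

size_t resolve_and_call(const char *s);

/* GOT-like slot: initially points at the resolver stub. */
strlen_fn got_slot = resolve_and_call;

/* Counts how often the resolver runs, to show binding happens once. */
int resolve_count = 0;

/* Stands in for the dynamic linker's resolver: find the real function,
 * patch the slot, then forward the original call. */
size_t resolve_and_call(const char *s) {
    resolve_count++;
    got_slot = strlen;      /* later calls bypass the resolver */
    return got_slot(s);
}

/* PLT-like entry: callers always jump through the slot. */
size_t plt_strlen(const char *s) {
    return got_slot(s);
}
```

The first call to plt_strlen goes through the resolver and patches the slot; every later call reaches strlen directly, which is the essence of lazy binding.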

Running ldd test3 shows the dependent libraries (e.g., libc.so.6) and their load addresses, which vary between executions because of address space layout randomization (ASLR).

Program Startup Flow

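The kernel maps the ELF file into memory. For a dynamically linked executable it also maps the dynamic linker named in the file (ld.so), which loads the shared libraries and then transfers control to the program's _start symbol. _start calls __libc_start_main, passing a pointer to main; in glibc versions before 2.34 it also passes __libc_csu_init, which runs _init and the global constructors. Only after this initialization does __libc_start_main invoke main.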
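That code runs before main can be observed with the constructor attribute, a GCC and Clang extension that registers a function among the initializers the startup code executes. This is a minimal sketch; the names constructor_ran and before_main are hypothetical.

```c
/* Zero at load time; set by the constructor during startup. */
int constructor_ran = 0;

/* GCC/Clang extension: the startup code calls this function
 * as part of initialization, before main is entered. */
__attribute__((constructor))
static void before_main(void) {
    constructor_ran = 1;
}
```

By the time the first statement of main executes, constructor_ran already holds 1, even though nothing in main calls before_main.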

Performance Example

The article compares two equivalent C loops: one with the if outside the loop (Program 1) and one with the if inside the loop (Program 2). Disassembly shows that Program 1 performs a single branch before the loop, while Program 2 repeats the branch on every iteration, which adds instructions to the loop body and can stall the pipeline whenever the branch is mispredicted. Consequently, Program 1 can be faster for tight loops, whereas Program 2 is cleaner and preferable when the code is not a bottleneck.
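The summary does not include the two programs, so the following is a hypothetical reconstruction of the pattern rather than the article's code. Both functions compute the same result; they differ only in where the loop-invariant test on negate sits.

```c
#include <stddef.h>

/* "Program 1" style: the condition is tested once, outside the loop. */
long sum_branch_outside(const int *a, size_t n, int negate) {
    long total = 0;
    if (negate) {
        for (size_t i = 0; i < n; i++) total -= a[i];
    } else {
        for (size_t i = 0; i < n; i++) total += a[i];
    }
    return total;
}

/* "Program 2" style: the same condition is tested on every iteration. */
long sum_branch_inside(const int *a, size_t n, int negate) {
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (negate) total -= a[i];
        else        total += a[i];
    }
    return total;
}
```

An optimizing compiler may hoist the invariant test out of the second loop on its own at higher optimization levels, so any difference is best confirmed by inspecting the disassembly (for example with objdump -d) and measuring.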

Conclusion

The piece provides a concise overview of the end‑to‑end process from C source to a running Linux process, emphasizing that compilers, linkers, and loaders work together to satisfy language semantics and OS requirements. The same principles apply to other languages that target ELF binaries (e.g., Go).

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
