Understanding Compilation, Linking, and Loading of C Programs on Linux
On Linux, C source becomes an ELF executable in three stages: gcc compiles it into object files, the linker combines those object files with libraries, and the kernel loads the result. The process depends on CPU architecture, binary formats, symbol tables, dynamic relocation, and the startup code that runs before main.
Modern software development often favors high‑level languages (Java, Python, PHP) that run on virtual machines or interpreters, leaving many engineers unfamiliar with the low‑level processes of compiling, linking, and loading native code.
This article explains the fundamental principles of how source code is transformed into an executable binary, covering CPU architectures, operating system binary formats, and the role of compilers, linkers, and loaders.
CPU Architecture
Most PCs and servers use the x86_64 instruction set, a CISC design, while RISC architectures such as SPARC or PowerPC encode their instructions differently. Binary code compiled for one ISA cannot run natively on another, because the CPU cannot decode the other architecture's instruction stream.
Operating System Binary Formats
Different OSes use distinct executable formats: Windows uses PE, macOS uses Mach‑O, and Linux uses ELF. These formats dictate how the loader finds code, data, and initialization sections, which is why a Windows .exe cannot run on Linux and vice versa.
Source‑Code Compilation
On Linux, the GNU toolchain (gcc/g++) compiles C/C++ source files into object files (.o). The compiler translates high‑level constructs (declarations, definitions, static vs. non‑static variables, functions) into machine instructions and generates a symbol table.
Example C snippet used in the article:
int g_a = 1; // defined global variable with initializer
int g_b; // defined global variable without initializer
static int g_c; // defined static global variable
extern int g_x; // declaration of an external variable
extern int sub(); // function declaration
int sum(int m, int n) { return m+n; }
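int main(int argc, char* argv[]) {
    static int s_a = 0; // static local variable
    int l_a = 0; // automatic local variable
    sum(g_a, g_b);
    return 0;
}
Running gcc -c test.c -o test.o && nm test.o prints the object file's symbol table. Each line gives a symbol's value (an offset within its section, since final addresses are not assigned yet), a type letter, and a name. D marks initialized data (g_a), C a common symbol (g_b; GCC 10 and later default to -fno-common and report B instead), b local BSS data (g_c, and s_a with a compiler-generated suffix), T code in the text section (sum, main), and U a symbol that is referenced here but defined elsewhere. A declaration alone, such as g_x or sub above, produces a U entry only once the code actually uses it. The automatic variable l_a lives on the stack and has no entry at all.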
Linking Object Files
The linker resolves undefined symbols by searching the other object files and libraries, merges sections of the same type, assigns final addresses, and patches each reference to match (relocation). The article demonstrates linking test.o with test2.o (which defines g_x and sub) to produce a runnable ELF executable.
Dynamic Linking and Loading
An executable may still contain undefined symbols that are defined in shared libraries (e.g., printf, strncpy). For calls to such functions the linker emits PLT (Procedure Linkage Table) stubs that jump through GOT (Global Offset Table) entries, and the dynamic loader fills in the real addresses at run time: either at startup or, with lazy binding, on the first call.
Running ldd test3 lists the dependent libraries (e.g., libc.so.6) and their load addresses, which change between runs because of address space layout randomization (ASLR).
Program Startup Flow
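The kernel maps the ELF file into memory and, for a dynamically linked program, first runs the dynamic loader named in the file. Control then reaches the _start symbol, which calls __libc_start_main with pointers to __libc_csu_init and main. __libc_csu_init runs _init and global constructors before main is finally invoked. (glibc 2.34 and later fold this initialization into __libc_start_main, so __libc_csu_init no longer appears in newly linked binaries.)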
Performance Example
The article compares two equivalent C loops: one with the if outside the loop (Program 1) and one with the if inside the loop (Program 2). Disassembly shows that Program 1 performs a single branch before the loop, while Program 2 repeats the branch on each iteration, which can cost extra cycles on modern CPUs, particularly when the branch is mispredicted and the pipeline stalls. Consequently, Program 1 can be faster for tight loops (an optimizing compiler may also hoist the test itself), whereas Program 2 is cleaner and preferable when the code is not a bottleneck.
Conclusion
The piece provides a concise overview of the end‑to‑end process from C source to a running Linux process, emphasizing that compilers, linkers, and loaders work together to satisfy language semantics and OS requirements. The same principles apply to other languages that target ELF binaries (e.g., Go).
Meituan Technology Team
Over 10,000 engineers powering China’s leading lifestyle services e‑commerce platform. Supporting hundreds of millions of consumers, millions of merchants across 2,000+ industries. This is the public channel for the tech teams behind Meituan, Dianping, Meituan Waimai, Meituan Select, and related services.