Mastering the GCC Toolchain: From Preprocessing to ELF Analysis
This guide explains how high‑level C/C++ source code is transformed into executable binary code on Linux using the GCC toolchain, covering preprocessing, compilation, assembly, linking, and ELF file inspection with practical command examples.
Introduction
High‑level languages such as C and C++ must be translated into machine code before execution. On Linux the GNU Compiler Collection (GCC) together with Binutils performs this translation, producing an ELF executable that can be inspected with various utilities.
GCC Toolchain Overview
GCC
GCC drives the compilation pipeline: preprocessing, compilation to assembly, assembly to object files, and linking.
Binutils
Binutils provides binary utilities such as addr2line, ar, as, ld, objcopy, objdump, readelf, and size, which are used for assembling, linking, and examining binaries. The ldd command used later in this guide ships with glibc rather than Binutils.
C Runtime Library
The C runtime library (CRT), which on most Linux systems is glibc, supplies the actual implementations of the functions declared in standard headers such as <stdio.h>. C++ has a comparable runtime library (libstdc++ when building with GCC).
Preparation
All commands assume a Linux environment. The example program, saved as hello.c, is a minimal "Hello World" application:
#include <stdio.h>
int main(void) {
    printf("Hello World!\n");
    return 0;
}
Compilation Process
1. Preprocessing
The preprocessor expands macros, resolves #include directives, removes comments, and inserts line markers for debugging.
$ gcc -E hello.c -o hello.i # -E stops after preprocessing
Sample fragment of hello.i:
// hello.i fragment
extern void funlockfile (FILE *__stream) __attribute__ ((__nothrow__, __leaf__));
# 942 "/usr/include/stdio.h" 3 4
# 2 "hello.c" 2
int main(void) {
    printf("Hello World!\n");
    return 0;
}
2. Compilation
Compilation translates the preprocessed source into assembly language after lexical, syntactic and semantic analysis.
$ gcc -S hello.i -o hello.s # -S stops after generating assembly
Example assembly fragment:
main:
.LFB0:
    .cfi_startproc
    pushq %rbp
    .cfi_def_cfa_offset 16
    movq %rsp, %rbp
    movl $.LC0, %edi
    call puts
    movl $0, %eax
    popq %rbp
    ret
    .cfi_endproc
3. Assembly
The assembler converts the assembly code into an ELF relocatable object file (.o).
$ gcc -c hello.s -o hello.o # -c stops after assembling
# or directly using the assembler
$ as hello.s -o hello.o
4. Linking
Linking combines one or more object files and libraries into a final executable. Both static and dynamic linking are demonstrated.
Static linking copies code from static libraries (.a) into the executable, increasing its size.
Dynamic linking records references to shared libraries (.so) that are loaded at runtime.
Typical link-time library search order on Linux: directories given with -L on the command line → LIBRARY_PATH → the toolchain's default directories such as /lib, /usr/lib, and /usr/local/lib. At run time, the dynamic loader additionally consults LD_LIBRARY_PATH and the directories listed in /etc/ld.so.conf.
# Dynamic linking (default)
$ gcc hello.c -o hello
$ size hello
text data bss dec hex filename
1183 552 8 1743 6cf hello
$ ldd hello
linux-vdso.so.1 => (0x00007fffefd7c000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fadcdd82000)
/lib64/ld-linux-x86-64.so.2 (0x00007fadce14c000)
# Static linking
$ gcc -static hello.c -o hello
$ size hello
text data bss dec hex filename
823726 7284 6360 837370 cc6fa hello
$ ldd hello
not a dynamic executable
ELF File Analysis
1. ELF Sections
An ELF executable consists of several sections, each serving a distinct purpose:
.text – executable code
.rodata – read‑only data (constants)
.data – initialized global/static variables
.bss – uninitialized global/static variables
.debug_* – debugging information (for example .debug_info and .debug_line), present when the program is compiled with -g
Listing sections with readelf -S:
$ readelf -S hello
There are 31 section headers, starting at offset 0x19d8:
[ 0] NULL ...
[11] .init PROGBITS ...
[14] .text PROGBITS ...
[15] .fini PROGBITS ...
...
2. Disassembling ELF
Because ELF files are binary, objdump is used to view the machine instructions.
$ objdump -D hello
0000000000400526 <main>:
55 push %rbp
48 89 e5 mov %rsp,%rbp
bf c4 05 40 00 mov $0x4005c4,%edi
e8 cc fe ff ff callq 400400 <puts@plt>
b8 00 00 00 00 mov $0x0,%eax
5d pop %rbp
c3 retq
Combining source with disassembly (debug build):
$ gcc -g -o hello hello.c
$ objdump -S hello
0000000000400526 <main>:
#include <stdio.h>
int main(void) {
55 push %rbp
48 89 e5 mov %rsp,%rbp
printf("Hello World!\n");
bf c4 05 40 00 mov $0x4005c4,%edi
e8 cc fe ff ff callq 400400 <puts@plt>
return 0;
b8 00 00 00 00 mov $0x0,%eax
}
5d pop %rbp
c3 retq
This walkthrough provides the essential commands and concepts required to compile C/C++ programs on Linux, understand each GCC component’s role, and inspect the resulting ELF binaries.
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge: fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc.
