
How Does GCC Turn C Code into Executable Binaries? A Step‑by‑Step Guide

This article explains the complete transformation of C/C++ source code into processor‑executable binary files using the GCC toolchain, covering preprocessing, compilation, assembly, linking, ELF structure, and practical command‑line examples on a Linux system.

MaGe Linux Operations

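Computer programming languages are generally divided into machine language, assembly language, and high‑level languages. High‑level languages must be translated into machine code before a processor can run them, either ahead of time by a compiler (e.g., C, C++) or at run time by an interpreter (e.g., Python, Ruby, MATLAB, JavaScript). Java sits in between: it is compiled to bytecode that a virtual machine then executes.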

The process of converting a C/C++ program into binary code that a processor can execute consists of four steps: preprocessing, compilation, assembly, and linking.

Preprocessing

Compilation

Assembly

Linking

GCC Toolchain Overview

GCC (GNU Compiler Collection) is a widely used compilation suite on Linux. The GCC toolchain includes GCC, Binutils, and the C runtime library.

GCC

GCC (originally the GNU C Compiler) is the compiler driver: it compiles C/C++ source files and calls the assembler and linker to produce executable binaries.

Binutils

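Binutils is a collection of tools for working with binary files, including addr2line, ar, objcopy, objdump, as, ld, readelf, and size. The list below also covers ldd, which ships with the C library rather than Binutils but is commonly used alongside these tools. Brief descriptions: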

addr2line : Converts program addresses to source file names and line numbers.

as : Assembler for converting assembly code to machine instructions.

ld : Linker for combining object files into executables.

ar : Creates and modifies archives; most commonly used to build static libraries (.a files).

ldd : Lists shared libraries required by an executable.

objcopy : Copies object files and converts them between formats (e.g., ELF to raw binary).

objdump : Disassembles object files.

readelf : Displays ELF file information.

size : Shows the size of each section in an executable.

C Runtime Library

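The C standard specifies the language syntax and a standard library whose interfaces are declared in header files (e.g., stdio.h declares printf). The headers contain only declarations; the actual implementations are provided by a C runtime library (CRT), such as glibc on most Linux systems, which the compiler driver links in by default. C++ has a corresponding C++ runtime library.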

Preparation

Because the GCC toolchain is primarily used on Linux, the examples assume a Linux environment. A simple Hello World program is used as the demonstration source:

#include <stdio.h>

// Simple program that prints a string
int main(void)
{
    printf("Hello World!\n");
    return 0;
}

Compilation Process

1. Preprocessing

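Preprocessing expands macros, processes #include directives, evaluates conditional directives such as #ifdef, removes comments, inserts line markers (the lines beginning with #) so that later stages can report the original file names and line numbers, and retains #pragma directives for the compiler.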

Command:

$ gcc -E hello.c -o hello.i   # Stop after preprocessing

The resulting hello.i file can be inspected; a fragment looks like:

// hello.i fragment
extern void funlockfile (FILE *__stream) __attribute__ ((__nothrow__ , __leaf__));
# 942 "/usr/include/stdio.h" 3 4
# 2 "hello.c" 2
# 3 "hello.c"
int
main(void)
{
  printf("Hello World!\n");
  return 0;
}

2. Compilation

The compiler performs lexical, syntax, and semantic analysis on the preprocessed file and generates assembly code.

Command:

$ gcc -S hello.i -o hello.s   # Stop after compilation, produce assembly

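Sample assembly fragment (GCC has replaced the printf call with puts, because the format string contains no conversions and ends with a newline):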

// hello.s fragment
main:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    movl    $.LC0, %edi
    call    puts
    movl    $0, %eax
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc

3. Assembly

The assembler translates assembly instructions into machine code, producing object files with a .o suffix.

Command:

$ gcc -c hello.s -o hello.o   # Assemble to object file
# or directly use as
$ as hello.s -o hello.o

The resulting hello.o is an ELF relocatable object.

4. Linking

Linking combines object files and libraries into a final executable. It can be static or dynamic.

Static linking incorporates library code into the executable, producing a larger file.

Dynamic linking records references to shared libraries, which are loaded at runtime.

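At link time, GCC searches for libraries in the directories given with -L, then in those listed in the LIBRARY_PATH environment variable, and finally in the system defaults such as /lib, /usr/lib, and /usr/local/lib. At run time, the dynamic loader additionally consults LD_LIBRARY_PATH and the directories configured in /etc/ld.so.conf.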

Dynamic linking example:

$ gcc hello.c -o hello
$ size hello
   text    data     bss    dec    hex filename
   1183     552       8   1743   6cf hello
$ ldd hello
    linux-vdso.so.1 =>  (0x00007fffefd7c000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fadcdd82000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fadce14c000)

Static linking example:

$ gcc -static hello.c -o hello
$ size hello
   text    data     bss    dec    hex filename
 823726    7284    6360 837370  cc6fa hello
$ ldd hello
    not a dynamic executable

The final ELF executable contains sections such as .text, .rodata, .data, and .bss, plus .debug_* sections when it is built with -g.

Analyzing ELF Files

1. ELF Sections

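An ELF file consists of an ELF header, a program header table (used when loading executables and shared objects), a series of sections, and a section header table that describes them. Common sections include: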

.text – executable code

.rodata – read‑only data (constants)

.data – initialized global/static variables

.bss – uninitialized or zero‑initialized global/static variables; occupies no space in the file

.debug_* – debugging information (e.g., .debug_info, .debug_line), present when compiling with -g

Section information can be displayed with:

$ readelf -S hello
There are 31 section headers, starting at offset 0x19d8:
[ 0] ...
[11] .init      PROGBITS ...
[14] .text      PROGBITS ...
[15] .fini      PROGBITS ...

2. Disassembling ELF

Because ELF files are binary, they must be disassembled to view instructions. Use objdump -D, which disassembles every section (objdump -d limits the output to sections that contain code):

$ objdump -D hello
0000000000400526 <main>:
  400526: 55                push   %rbp
  400527: 48 89 e5          mov    %rsp,%rbp
  40052a: bf c4 05 40 00    mov    $0x4005c4,%edi
  40052f: e8 cc fe ff ff    callq  400400 <puts@plt>
  400534: b8 00 00 00 00    mov    $0x0,%eax
  400539: 5d                pop    %rbp
  40053a: c3                retq

To intermix source code with disassembly, use objdump -S after compiling with -g:

$ gcc -o hello -g hello.c
$ objdump -S hello
0000000000400526 <main>:
#include <stdio.h>
int main(void)
{...
  400526: 55                push   %rbp
  400527: 48 89 e5          mov    %rsp,%rbp
  printf("Hello World!\n");
  40052a: bf c4 05 40 00    mov    $0x4005c4,%edi
  40052f: e8 cc fe ff ff    callq  400400 <puts@plt>
  return 0;
  400534: b8 00 00 00 00    mov    $0x0,%eax
}
  400539: 5d                pop    %rbp
  40053a: c3                retq

These commands illustrate the full transformation from high‑level C source to machine‑level ELF executable.

