What Happens Inside a Hello World Program? Unveiling Object Files, Linking, and Loading
This article explains how a simple Hello World program is transformed from source code into machine code, detailing the roles of object files, sections, linking (static and dynamic), and loading, while showing practical examples with GCC, objdump, and readelf on Linux.
Introduction
Almost every programming tutorial starts with a "Hello World" program, but many developers are unaware of what actually happens between writing that source file and the CPU executing it.
Hidden Process of Development Platforms
Compiling source code into an executable involves several hidden steps that IDEs abstract away. In simplified terms the process can be divided into three stages: (1) translate source code to machine code forming an intermediate file (File A), (2) link File A with required libraries (File B) to produce a combined file (File A+), and (3) load the combined file into memory for execution.
Object Files
"Any problem in computer science can be solved by adding another layer of indirection." – Chinese proverb
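Object files are intermediate binary files that store compiled code and data in named sections (loosely called segments in this article, although in ELF a segment strictly means a group of sections that the loader maps together). Both the PE format used on Windows and the ELF format used on Linux descend from COFF (Common Object File Format).
On Unix-like systems, compilers such as gcc name their output file a.out by default. The name is a holdover from an older format of the same name; on Linux the file is actually in ELF format. Its typical structure includes: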
ELF header : basic metadata such as version, target architecture, and entry point.
Text segment (.text) : contains the executable machine code.
Data segment (.data) : stores initialized global and static variables.
BSS segment (.bss) : reserves space for uninitialized (zero-initialized) global and static variables; it occupies no space in the file itself.
RO-data segment (.rodata) : read-only data such as string literals.
Comment segment (.comment) : compiler version information.
Relocation segment (for example .rela.text) : records where symbol addresses need to be fixed up during linking.
Symbol table (.symtab) : lists all symbols (functions, variables) with their attributes.
String table (.strtab) : stores symbol names as strings.
Below is a typical layout of an a.out file (image omitted for brevity).
Examining a Simple Hello World
The following C source is compiled with gcc hello.c producing a.out:
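```c
#include <stdio.h>

int main(void) {
    int a = 5;               /* local variable: lives on the stack, not in .data */
    printf("hello world\n"); /* the string literal is stored in .rodata */
    return 0;
}
```
Running objdump -h a.out lists the section headers, including the .text, .data, .bss, .rodata, and .comment segments described above (a fully linked executable contains many more). objdump -s a.out displays the raw hexadecimal contents, revealing the string "hello world" in the .rodata segment and compiler version info in the .comment segment.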
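Disassembling with objdump -d a.out shows the assembly for main. Relocation entries mark where the addresses of external symbols such as printf must be patched in; they can be listed with objdump -r on the unlinked object file produced by gcc -c hello.c. Note that GCC may replace this particular printf call with puts, because the string has no format specifiers and ends in a newline.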
A Simple Overview of Linking
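Linking combines one or more object files and libraries into a single executable. Static linking resolves all symbols at build time and copies the needed library code into the binary, resulting in larger files but no runtime dependency on those libraries. Dynamic linking defers resolution until the program is loaded (or, with lazy binding, until a function is first called), allowing a shared library to be loaded into memory once and used by multiple programs.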
A Simple Explanation of Loading
Loading maps the executable's segments into the process's virtual address space. Modern operating systems use virtual memory, so each process sees a full address space while the OS and the memory management unit translate virtual addresses to actual physical locations. The loader assigns virtual addresses to each segment, then transfers control to the entry point recorded in the ELF header; for a dynamically linked program, the dynamic linker runs first and then jumps to that entry point.
Summary
This article walks through the hidden steps from source code to a running program, covering object file structure, sections, symbol and relocation tables, static vs. dynamic linking, and the loading process that finally places the code into memory for execution.