Mastering GNU Linker Scripts: Control Binary Layout and Memory Mapping

This guide explains how GNU ld linker scripts work: their purpose, basic concepts such as sections, symbols, and VMA/LMA, and the syntax of SECTIONS, INPUT, OUTPUT, and other commands, with practical examples and code snippets for embedded and desktop ELF binaries.

Liangxu Linux

Overview

The GNU linker (ld) uses linker scripts (commonly with a .ld or .lds extension) to control how input object files are combined into an output binary and how the resulting sections are placed in the program’s address space.

Basic Concepts

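Each input object file contains sections such as .text, .data, and .bss. The linker script defines how these input sections are mapped to output sections and assigns each output section a virtual memory address (VMA), the address it has when the program runs, and a load memory address (LMA), the address at which it is loaded. The two are usually equal; they differ, for example, when initialized data is stored in ROM and copied to RAM at startup. A loadable section has contents that are loaded into memory when the program runs, while a section with no contents may be allocatable, meaning memory is set aside for it (and in some cases zeroed) but nothing is loaded there.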

Symbols in object files have addresses. List an object file's sections with objdump -h and its symbols with nm or objdump -t. A section that is neither loadable nor allocatable typically holds debugging information.

Script Syntax

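A linker script is a text file containing a series of commands. Each command is either a keyword, possibly followed by arguments, or an assignment to a symbol. Commands may be separated by semicolons (;), and whitespace is generally ignored. Comments are written between /* and */. The core command is SECTIONS, which contains output‑section descriptions.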

SECTIONS {
  /* output‑section commands */
}

Key Commands

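ENTRY(symbol): sets the program entry point, the first instruction to execute.

INCLUDE "file.ld": includes another script at that point (similar to C #include).

INPUT(file ...): names files to include in the link, as if they were given on the command line.

GROUP(file ...): like INPUT, but the named files should be archives, which are searched repeatedly until no new undefined references are created.

OUTPUT(filename): names the output file, like the -o option; if both are given, the command-line option takes precedence.

SEARCH_DIR(path): adds a directory to the library search path, like -L.

STARTUP(file): like INPUT, but makes the file the first input to be linked.

OUTPUT_FORMAT(bfdname): selects the output BFD format, like --oformat (the -b option selects the input format instead).

OUTPUT_ARCH(bfdarch): sets the output machine architecture.

ASSERT(expr, "msg"): stops the link with an error and prints msg if expr is zero.

EXTERN(symbol ...): enters each symbol in the output file as an undefined symbol, like the -u option, typically so that additional modules are pulled in from libraries.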
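As a rough sketch of how several of these commands might sit together in one script, assuming a 32-bit little-endian ARM target: the entry symbol _start, the directory /opt/toolchain/lib, and the 256 KiB limit are all hypothetical and would need to match your own toolchain and hardware.

/* Sketch only: target names, paths, and limits are assumptions. */
OUTPUT_FORMAT("elf32-littlearm")
OUTPUT_ARCH(arm)
ENTRY(_start)                      /* hypothetical entry symbol */
SEARCH_DIR("/opt/toolchain/lib")   /* hypothetical library directory */
GROUP(-lc -lgcc)                   /* rescan both archives until resolved */

SECTIONS {
  . = 0x10000;
  .text : { *(.text) }
  .data : { *(.data) }
  .bss  : { *(.bss) }
}

/* Fail the link if the code no longer fits in 256 KiB. */
ASSERT(SIZEOF(.text) <= 0x40000, ".text exceeds 256 KiB")

The ASSERT is placed after SECTIONS only for readability; it checks the final size of .text and stops the link with the given message if the condition is false.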

Simple Example

SECTIONS {
  . = 0x10000;
  .text : { *(.text) }
  . = 0x8000000;
  .data : { *(.data) }
  .bss : { *(.bss) }
}

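This script places the .text output section at address 0x10000 and .data at 0x8000000; .bss follows immediately after .data. The dot (.) is the location counter: assigning to it moves it, and each output section advances it by the section's size. To align it, assign the result of ALIGN, for example . = ALIGN(8); for an 8‑byte boundary.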
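The location counter can also be moved relative to its current value. The variation below is a minimal sketch of that idea; the 4 KiB page size and the symbol name _end are assumptions chosen for illustration.

SECTIONS {
  . = 0x10000;
  .text : { *(.text) }
  . = ALIGN(0x1000);   /* round up to the next 4 KiB boundary */
  .data : { *(.data) }
  . = ALIGN(8);        /* start .bss on an 8-byte boundary */
  .bss : { *(.bss) }
  _end = .;            /* hypothetical symbol: first address past .bss */
}

Here .data no longer has a fixed address; it simply starts on the first page boundary after the end of .text.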

Symbol Assignment

Symbols can be given values inside the script, similar to C assignments but affecting addresses:

my_symbol = 0x2000;

An assignment has the form SYMBOL = EXPRESSION ; and the terminating semicolon is required. It can appear in three places: as a command in its own right at the top level, as a statement inside a SECTIONS block, or within an output‑section description.
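The three positions might look like this in practice. All symbol names here (stack_size, text_start, text_end) are hypothetical and chosen only to show where each assignment sits.

stack_size = 0x2000;      /* 1: top level, a plain constant */

SECTIONS {
  . = 0x10000;
  text_start = .;         /* 2: inside SECTIONS, between output sections */
  .text : {
    *(.text)
    text_end = .;         /* 3: inside an output-section description */
  }
}

C code can then refer to these symbols through extern declarations and take their addresses to obtain the values.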

Output Section Description

The general form is:

SECTION [ADDRESS] [(TYPE)] : [AT(LMA)] {
  OUTPUT‑SECTION‑COMMANDS
} [>REGION] [AT>LMA_REGION] [:PHDR …] [=FILLEXP]

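ADDRESS sets the section's VMA, and AT(LMA) sets a different load address; if AT is omitted, the LMA normally equals the VMA. The NOLOAD type marks a section as not loadable, so its contents are not loaded into memory when the program runs. DSECT, COPY, INFO, and OVERLAY are kept for backward compatibility and are rarely used; they all mark the section as not allocatable, so no memory is reserved for it at run time.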
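A common use of a separate load address is a ROM-based image whose initialized data must run from RAM. The following is a sketch under assumed addresses (code at 0x1000, RAM at 0x20000000); the .noinit section and the underscore-prefixed symbols are hypothetical names.

SECTIONS {
  .text 0x1000 : { *(.text) _etext = .; }

  /* Runs at 0x20000000 (VMA) but is stored right after .text (LMA). */
  .data 0x20000000 : AT(ADDR(.text) + SIZEOF(.text)) {
    _data = .;
    *(.data)
    _edata = .;
  }
  _data_load = LOADADDR(.data);   /* where startup code copies from */

  /* Space is reserved, but nothing is loaded at run time. */
  .noinit (NOLOAD) : { *(.noinit) }
}

Startup code would copy the bytes from _data_load to the range _data.._edata before main runs; the linker only records the two addresses, it does not perform the copy.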

Common Section Commands

SYMBOL = EXPRESSION ; – assigns a value to a symbol, often the current address (.).

*(.text) – includes the .text input sections from all input files.

*(COMMON) – includes common symbols (usually placed in .bss).

KEEP(*(.text)) – prevents the listed sections from being discarded when --gc-sections is in effect.
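Put together inside output-section descriptions, these commands might be used as follows. The .vectors section name and the bss_start/bss_end symbols are assumptions for illustration.

SECTIONS {
  .text : {
    KEEP(*(.vectors))   /* hypothetical vector table, never discarded */
    *(.text)
    *(.text.*)          /* per-function sections from -ffunction-sections */
  }
  .bss : {
    bss_start = .;
    *(.bss)
    *(COMMON)           /* uninitialized common symbols */
    bss_end = .;
  }
}

The bss_start and bss_end symbols give startup code the range it needs to zero.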

Memory Regions

Using the MEMORY command you can define named memory regions with origin and length, then assign output sections to them with > REGION. Example:

MEMORY {
  rom (rx) : ORIGIN = 0x0, LENGTH = 256K
  ram (!rx) : ORIGIN = 0x40000000, LENGTH = 4M
}

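The attribute list in parentheses is optional: r means read-only, w read/write, x executable, a allocatable, i or l initialized, and ! inverts the sense of the attributes that follow. For sections that the script does not map to a region explicitly, the linker uses these attributes to choose one; with the definitions above, read-only and executable sections go to rom and all others go to ram.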
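Explicit placement with > REGION, and a separate load region with AT>, could look like the sketch below. It reuses the rom and ram regions from the example above; the choice of which sections go where is an assumption typical of ROM-based systems.

MEMORY {
  rom (rx)  : ORIGIN = 0x0,        LENGTH = 256K
  ram (!rx) : ORIGIN = 0x40000000, LENGTH = 4M
}

SECTIONS {
  .text : { *(.text) } > rom
  .data : { *(.data) } > ram AT> rom   /* runs in ram, stored in rom */
  .bss  : { *(.bss) *(COMMON) } > ram
}

With regions defined, the linker also reports an error if the sections assigned to a region do not fit within its LENGTH.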

Program Headers (PHDRS)

For ELF outputs you can describe program headers explicitly:

PHDRS {
  headers PT_PHDR PHDRS ;
  text PT_LOAD FILEHDR PHDRS ;
  data PT_LOAD ;
}

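Each entry gives a name and a type (for example PT_LOAD or PT_DYNAMIC), optionally followed by the keywords FILEHDR and PHDRS (the segment includes the ELF file header or the program headers themselves), AT(address), and FLAGS(flags). An output section is placed in a segment by appending :phdr, where phdr is the entry's name, to its description. Later sections inherit that choice until another :phdr appears, and :NONE keeps a section out of every segment.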
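As an illustration of attaching sections to segments, the following sketch defines two loadable segments and assigns sections to them. The segment names text and data, the start address, and the 4 KiB alignment are assumptions; the FLAGS values are the standard ELF permission bits (4 = read, 2 = write, 1 = execute).

PHDRS {
  text PT_LOAD FLAGS(5);   /* read + execute */
  data PT_LOAD FLAGS(6);   /* read + write */
}

SECTIONS {
  . = 0x10000;
  .text   : { *(.text) }   :text
  .rodata : { *(.rodata) }          /* inherits :text */
  . = ALIGN(0x1000);
  .data   : { *(.data) }   :data
  .bss    : { *(.bss) }             /* inherits :data */
}

Running readelf -l on the output is a convenient way to check that the segments came out as intended.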

Version Scripts

When building shared libraries, version scripts control symbol versioning:

VERSION {
  GLIBC_2.0 {
    global:
      printf;
  };
}

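The VERSION command embeds version information directly in a linker script. Alternatively, put only the version nodes (without the VERSION { } wrapper) in a separate file and pass it to the linker with --version-script=FILE.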
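A standalone version script for a library of your own might look like the sketch below. The file name libfoo.map, the version node VERS_1.0, and the symbols foo_init and foo_run are all hypothetical.

/* libfoo.map: export two symbols, hide everything else. */
VERS_1.0 {
  global:
    foo_init;
    foo_run;
  local:
    *;
};

When linking through the compiler driver, such a file would typically be passed as -Wl,--version-script=libfoo.map. The local: *; line keeps every symbol that is not listed out of the library's exported interface.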

Practical Tips

Use ld --verbose to view the default built‑in script.

Use ld -M (or --print-map) to print a link map on standard output, or -Map=FILE to write it to a file; the map shows where each input section was placed.

Remember that each input section is placed only once: it goes to the first pattern in the script that matches it, so a later pattern that matches the same section receives nothing.
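A small sketch of this first-match behavior, using a hypothetical object file fast.o and made-up output section names:

SECTIONS {
  .fast : { fast.o(.text) }   /* claims .text from fast.o first */
  .text : { *(.text) }        /* gets .text from every other file */
  .dup  : { *(.text) }        /* matches nothing that is left */
}

Ordering therefore matters: the more specific pattern has to appear before the general wildcard, otherwise *(.text) would take the section from fast.o as well.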

When using --gc-sections, keep needed sections with KEEP().

By understanding these commands and concepts, you can precisely control the layout, memory usage, and loading behavior of ELF binaries for both embedded systems and desktop applications.
