Unlocking Linux: Deep Dive into the 2.2.5 Kernel Boot Process
This article explores the motivations for analyzing Linux kernel source code and provides a comprehensive guide to navigating the 2.2.5 i386 kernel tree, detailing the boot sequence—from BIOS to real mode initialization, bootsect loader, setup, and protected‑mode startup—while highlighting key files, structures, and parameters.
One of Linux's greatest advantages is its open source code, which attracts countless enthusiasts who study and modify the kernel to deepen their understanding of computer technology.
Analyzing the kernel provides a strong sense of achievement and teaches low‑level concepts such as system boot, interrupt mechanisms, virtual memory, multitasking, and protection, as well as overall OS design principles and professional coding practices.
Method 1: Locating Modules and Understanding the Source Tree
The Linux kernel source is typically installed under /usr/src/linux. Its top‑level entries include:
COPYING : GPL license notice.
CREDITS : List of major contributors.
MAINTAINERS : Maintainer information for each subsystem.
Makefile : Organises module compilation and dependencies.
README : Brief introduction and build instructions.
Rules.make : Common Makefile rules.
REPORTING-BUGS : Bug‑report guidelines.
arch/ : Architecture-specific code (e.g., arch/i386).
include/ : Header files (platform-independent in include/linux, architecture-specific in include/asm-i386).
init/ : Kernel initialization code (main.c, version.c).
mm/ : Architecture-independent memory management; the architecture-specific parts live in arch/*/mm.
kernel/ : Core kernel functions (e.g., sched.c).
drivers/ : Device drivers, each class in its own subdirectory.
Documentation/ : Helpful documents (mostly in English).
fs/ : Filesystem implementations.
ipc/ : Inter-process communication code.
lib/ : Kernel libraries.
net/ : Networking code.
modules/ : Directory for compiled module objects.
scripts/ : Configuration scripts.
Each subdirectory usually contains a Makefile and often a README, both of which are valuable for understanding how the code fits together.
Entry Points for Analysis
The two main entry points are the system boot/initialization (from power‑on to kernel execution) and system calls (the interface for user programs).
Method 2: Analyzing the Boot Process
The boot sequence can be divided into three stages: system boot, real‑mode initialization, and protected‑mode initialization.
Boot Loader Files
/arch/i386/boot/bootsect.S
/include/linux/config.h
/include/asm/boot.h
/include/linux/autoconf.h
When the machine powers on, the BIOS performs hardware tests and then loads the first sector of the boot device (the boot sector, or the MBR on a hard disk) into memory at 0x07C0:0x0000, which is physical address 0x7C00. Control is transferred to this boot sector, which moves itself to 0x9000:0x0000 and continues execution from there.
Key constants defined in bootsect.S:
BOOTSEG = 0x07C0 (BIOS load address)
INITSEG = 0x9000 (where bootsect relocates)
SETUPSEG = 0x9020 (setup code segment)
SYSSEG = 0x1000 (system segment)
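The four constants above are real-mode segment values, not physical addresses. As a minimal sketch of the arithmetic (the function name `physical_address` is only illustrative and does not appear in the kernel source), a real-mode segment is shifted left by four bits and added to the offset:

```python
# Real-mode addressing: physical = segment * 16 + offset.
# The segment values are the ones listed in the article; the helper name
# is hypothetical and exists only for this illustration.
BOOTSEG = 0x07C0
INITSEG = 0x9000
SETUPSEG = 0x9020
SYSSEG = 0x1000


def physical_address(segment, offset=0):
    """Return the physical address of a real-mode segment:offset pair."""
    return (segment << 4) + offset


for name, seg in [("BOOTSEG", BOOTSEG), ("INITSEG", INITSEG),
                  ("SETUPSEG", SETUPSEG), ("SYSSEG", SYSSEG)]:
    print(f"{name:8s} {seg:#06x} -> physical {physical_address(seg):#x}")

# Distance between the relocated boot sector and the setup code, in bytes.
setup_gap = physical_address(SETUPSEG) - physical_address(INITSEG)
print("setup code starts", setup_gap, "bytes after INITSEG")
```

The 512-byte gap between INITSEG and SETUPSEG is exactly one disk sector, which is consistent with the setup code being placed directly behind the relocated boot sector.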
After relocation, bootsect sets up a temporary stack, builds a new disk-parameter table, loads the setup code (four sectors, assembled from setup.S) using BIOS interrupt 0x13, then loads the kernel image itself, and finally jumps to the setup code at SETUPSEG to continue initialization.
INITSEG Parameter Table (selected entries)
PARAM_CURSOR_POS (offset 0x0000, 2 bytes) – video cursor position.
Extended memory size (offset 0x0002, 2 bytes) – size of extended memory.
PARAM_VIDEO_PAGE (0x0004, 2 bytes) – video page.
PARAM_VIDEO_MODE (0x0006, 1 byte) – video mode.
PARAM_VIDEO_COLS (0x0007, 1 byte) – columns.
PARAM_VIDEO_LINES (0x000e, 1 byte) – lines.
PARAM_HAVE_VGA (0x000f, 1 byte) – VGA presence flag.
PARAM_FONT_POINTS (0x0010, 2 bytes) – font size.
PARAM_LFB_WIDTH (0x0012, 2 bytes) – linear frame buffer width.
PARAM_LFB_HEIGHT (0x0014, 2 bytes) – height.
PARAM_LFB_DEPTH (0x0016, 2 bytes) – color depth.
PARAM_LFB_BASE (0x0018, 4 bytes) – base address.
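To make the table concrete, here is a small sketch that decodes those fields from a raw copy of the parameter block. It assumes little-endian fields (as on i386) at the offsets listed above. The name `EXT_MEM_SIZE`, the function `parse_params`, and all sample values are made up for illustration and are not taken from the kernel.

```python
import struct

# (offset, struct format) per field, following the table in the article.
# "<H" = 16-bit, "<B" = 8-bit, "<I" = 32-bit, all little-endian.
PARAM_FIELDS = {
    "PARAM_CURSOR_POS": (0x00, "<H"),
    "EXT_MEM_SIZE": (0x02, "<H"),  # hypothetical name for the unnamed entry
    "PARAM_VIDEO_PAGE": (0x04, "<H"),
    "PARAM_VIDEO_MODE": (0x06, "<B"),
    "PARAM_VIDEO_COLS": (0x07, "<B"),
    "PARAM_VIDEO_LINES": (0x0E, "<B"),
    "PARAM_HAVE_VGA": (0x0F, "<B"),
    "PARAM_FONT_POINTS": (0x10, "<H"),
    "PARAM_LFB_WIDTH": (0x12, "<H"),
    "PARAM_LFB_HEIGHT": (0x14, "<H"),
    "PARAM_LFB_DEPTH": (0x16, "<H"),
    "PARAM_LFB_BASE": (0x18, "<I"),
}


def parse_params(block):
    """Decode every known field from a bytes-like parameter block."""
    return {name: struct.unpack_from(fmt, block, offset)[0]
            for name, (offset, fmt) in PARAM_FIELDS.items()}


# Build a made-up sample block: 80x25 text mode with a VGA adapter.
sample = bytearray(0x20)
struct.pack_into("<B", sample, 0x06, 3)    # video mode
struct.pack_into("<B", sample, 0x07, 80)   # columns
struct.pack_into("<B", sample, 0x0E, 25)   # lines
struct.pack_into("<B", sample, 0x0F, 1)    # VGA present
struct.pack_into("<H", sample, 0x10, 16)   # font height

params = parse_params(bytes(sample))
for name, value in params.items():
    print(f"{name:18s} {value:#x}")
```

Keeping the offsets in one dictionary mirrors how the assembly source names each offset with a constant, so a field can be checked against the table at a glance.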
Real‑Mode Initialization
Corresponding source file: /arch/i386/boot/setup.S. This stage builds the INITSEG parameter table, gathers hardware information, and prepares data for the protected-mode phase.
Protected‑Mode Initialization
Key source files:
/arch/i386/boot/compressed/head.S
/arch/i386/kernel/head.S
/arch/i386/boot/compressed/misc.c
/arch/i386/boot/setup.S
/include/asm/segment.h
/arch/i386/kernel/traps.c
/include/asm-i386/desc.h
/include/asm-i386/processor.h
Major steps performed:
Decompress the kernel to address 0x100000.
Create the page directory and the initial page table (pg0), then enable paging (virtual memory).
Copy the hardware information gathered in real mode to empty_zero_page and initialize the command-line buffer.
Detect CPU type and check for a coprocessor.
Rebuild the Global Descriptor Table (GDT) and Interrupt Descriptor Table (IDT) for protected mode.
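The page directory and pg0 step above relies on standard i386 two-level paging with 4 KB pages: a 32-bit linear address splits into a 10-bit directory index, a 10-bit table index, and a 12-bit offset. The sketch below shows that split. The function name is illustrative, and the virtual base 0xC0000000 used in the second example is an assumption that the article itself does not state.

```python
PAGE_SIZE = 4096          # 4 KB pages
ENTRIES_PER_TABLE = 1024  # entries in a page directory or a page table


def split_linear_address(addr):
    """Split a 32-bit linear address into (directory index, table index, offset)."""
    directory_index = (addr >> 22) & 0x3FF
    table_index = (addr >> 12) & 0x3FF
    offset = addr & 0xFFF
    return directory_index, table_index, offset


# One page table such as pg0 covers 1024 pages of 4 KB each.
bytes_per_table = ENTRIES_PER_TABLE * PAGE_SIZE
print("one page table maps", bytes_per_table // (1024 * 1024), "MB")

# The decompressed kernel's load address from the article.
print(split_linear_address(0x100000))

# Assumed kernel virtual base plus the same 1 MB offset (not from the article).
print(split_linear_address(0xC0000000 + 0x100000))
```

Under that assumption both addresses select the same page-table slot and differ only in the directory index, which is why a single pg0 can serve an identity mapping and a high kernel mapping when two directory entries point at it.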
Memory layout after initialization (selected regions):
0x101000 – swapper_pg_dir (page directory, 4 KB).
0x102000 – pg0 (page table, 4 KB).
0x105000 – empty_zero_page (hardware parameters, 2 KB).
0x106000 – gdt_table (global descriptor table, 4192 B).
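Each entry in gdt_table is an 8-byte i386 segment descriptor whose base and limit are scattered across the quadword. As a hedged illustration of how such an entry can be read, the sketch below decodes the value 0x00CF9A000000FFFF, a commonly seen flat 4 GB ring-0 code-segment descriptor. That value is chosen for illustration rather than quoted from the article, and `decode_descriptor` is a hypothetical helper name.

```python
def decode_descriptor(raw):
    """Decode a 64-bit i386 segment descriptor into its main fields."""
    limit = (raw & 0xFFFF) | ((raw >> 32) & 0xF0000)              # bits 0-15, 48-51
    base = ((raw >> 16) & 0xFFFFFF) | ((raw >> 32) & 0xFF000000)  # bits 16-39, 56-63
    access = (raw >> 40) & 0xFF                                   # bits 40-47
    flags = (raw >> 52) & 0xF                                     # bits 52-55
    granular = bool(flags & 0x8)    # G bit: limit counted in 4 KB units
    return {
        "base": base,
        "limit": limit,
        "present": bool(access & 0x80),
        "dpl": (access >> 5) & 0x3,          # privilege level 0..3
        "executable": bool(access & 0x08),   # code segment when set
        "granular": granular,
        "size_bytes": (limit + 1) * (4096 if granular else 1),
    }


# Illustrative descriptor value (an assumption, not taken from the article).
example = decode_descriptor(0x00CF9A000000FFFF)
for key, value in example.items():
    print(f"{key:11s} {value}")
```

Reading descriptors this way makes it easier to compare the entries in the kernel's descriptor table with the segment selectors defined in segment.h.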
The analysis demonstrates that tracing the execution flow—from BIOS to real mode, then to protected mode—provides a clear understanding of how the Linux kernel boots and prepares the system for multitasking.
In summary, following the program‑flow‑as‑a‑thread approach, supported by diagrams and careful reading of the relevant source files, is an effective method for dissecting complex kernel code.
MaGe Linux Operations
