Linux Boot Process: From BIOS Initialization to Kernel Startup

This article explains the Linux boot sequence on x86 hardware, covering BIOS ROM loading, real‑mode memory layout, the MBR, the setup() assembly routine, transition to protected mode, and the startup_32() function that prepares paging before the kernel begins execution.

政采云技术 (ZCY Technology Team)

Preface

When you press the power button, the Linux kernel is not yet in memory: something has to load it and start it executing. Early engineers called this process "bootstrapping". A bootstrap is the small loop at the back of a boot that is used to pull it on, and the term alludes to "pulling oneself up by one's bootstraps": the system has to start itself.

BIOS

The CPU fetches instructions from memory, but RAM holds no usable contents at power-on, so the first instruction after power-on must come from ROM. In the early 1970s, read-only memory (ROM) was introduced, and a system program was burned into it: the Basic Input/Output System (BIOS). The BIOS code is therefore the first code executed after power-on.

Both ROM and RAM are addressed using the same address space; the hardware maps ROM to two regions: one just below 4 GB (0xffffffff) and another just below 1 MB (0xfffff). This mapping lets the CPU fetch and execute BIOS code at a fixed address.

When the computer powers up, a special hardware circuit sets registers such as CS and EIP to fixed values, so that the CPU fetches its first instruction from physical address 0xfffffff0, 16 bytes below the 4 GB boundary.
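The reset address can be checked with a little arithmetic. The sketch below assumes the reset state documented for 32-bit x86 CPUs (a hidden CS base of 0xFFFF0000 and an EIP of 0xFFF0); these two constants are not given in this article and are used here as assumptions.

```python
# Assumed reset state of a 32-bit x86 CPU (not stated in the article).
CS_BASE_AT_RESET = 0xFFFF0000  # hidden base of the CS segment after reset
EIP_AT_RESET = 0x0000FFF0      # instruction pointer after reset

FOUR_GB = 1 << 32

# The first fetch address is the segment base plus the instruction pointer.
reset_vector = CS_BASE_AT_RESET + EIP_AT_RESET

# Number of bytes between the reset vector and the top of the address space.
bytes_left = FOUR_GB - reset_vector

print(hex(reset_vector), bytes_left)
```

With so little room left before the top of the address space, the code at this address can do little more than jump elsewhere.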

On a typical PC BIOS, the instruction at 0xfffffff0 is a far jump such as:

0xfffffff0: ljmp $0xf000,$0xe05b

This long jump transfers execution from the high-address region (just below 4 GB) to the low-address region (just below 1 MB). Early 80x86 CPUs formed a physical address by shifting a 16-bit segment register left by 4 bits and adding a 16-bit offset, which yields a 20-bit address space of 1 MB; this addressing scheme is known as real mode. Modern x86 machines still start in real mode for compatibility.
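As a worked example of real-mode addressing, the sketch below computes the physical address as segment × 16 + offset and applies it to the jump target shown above. The helper name real_mode_address is hypothetical and chosen only for illustration.

```python
def real_mode_address(segment, offset):
    """Return the physical address of segment:offset in real mode."""
    # The 16-bit segment is shifted left by 4 bits (multiplied by 16)
    # and the 16-bit offset is added to it.
    return (segment << 4) + offset


# Target of the far jump: segment 0xf000, offset 0xe05b.
jump_target = real_mode_address(0xF000, 0xE05B)

# Low-memory alias of the reset vector: segment 0xf000, offset 0xfff0.
low_reset_alias = real_mode_address(0xF000, 0xFFF0)

print(hex(jump_target), hex(low_reset_alias))
```

Both results fall just below 1 MB (0x100000), inside the region where the BIOS ROM is mapped.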

After the long jump, the BIOS program resides below 1 MB and runs in real mode.

Memory layout at this stage:

0 – 640 KB: ordinary RAM for the OS and applications.

640 KB – 1 MB: a 384 KB region reserved for ROM and memory-mapped devices, including the BIOS code.
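The two regions can be expressed as a small lookup, which also confirms the 384 KB figure. This is only an illustrative sketch; the function name region_of and the region labels are hypothetical.

```python
KB = 1024
CONVENTIONAL_END = 640 * KB  # 0xA0000, end of ordinary RAM
ONE_MB = 1024 * KB           # 0x100000, end of the real-mode address space


def region_of(address):
    """Classify a physical address within the first 1 MB."""
    if not 0 <= address < ONE_MB:
        raise ValueError("address is outside the first 1 MB")
    if address < CONVENTIONAL_END:
        return "RAM"
    return "ROM area"


# Size of the upper region in KB.
upper_region_kb = (ONE_MB - CONVENTIONAL_END) // KB

print(region_of(0x7C00), region_of(0xFE05B), upper_region_kb)
```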

The BIOS first performs a hardware self‑test, initializes devices to avoid IRQ and I/O port conflicts, and then reads the first 512 bytes of the selected boot device. If the last two bytes are 0x55 and 0xAA, the device is bootable; this 512‑byte sector is the Master Boot Record (MBR).

The BIOS loads the MBR into RAM at address 0x00007c00 and transfers control to it, completing the BIOS phase.
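The signature test the BIOS applies can be sketched in a few lines. The example builds its sectors in memory instead of reading a real disk, and the helper name is_bootable is hypothetical.

```python
SECTOR_SIZE = 512
BOOT_SIGNATURE = b"\x55\xaa"  # bytes at offsets 510 and 511


def is_bootable(sector):
    """Return True if a 512-byte sector ends with the 0x55 0xAA signature."""
    return len(sector) == SECTOR_SIZE and sector[510:512] == BOOT_SIGNATURE


# A sector of zeros carries no signature.
blank_sector = bytes(SECTOR_SIZE)

# The same sector with the signature in its last two bytes.
boot_sector = bytes(SECTOR_SIZE - 2) + BOOT_SIGNATURE

print(is_bootable(blank_sector), is_bootable(boot_sector))
```

To try this on a disk image, the first 512 bytes of the file can be read in binary mode and passed to the same function.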

Linux Boot Assembly – setup()

setup()

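In the classic boot path described here (modern systems normally hand this job to a boot loader such as GRUB), the Linux boot sector is loaded as the MBR, starts executing at 0x00007c00, and performs the following steps: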

Copy the 512‑byte Linux MBR to address 0x00090000.

Set up the initial memory layout, assigning base addresses to the code and data segment registers and placing the stack away from the code.

From the second sector of the disk, copy the Linux setup() function to memory starting at 0x00090200.

Continue copying data from the disk until the entire operating system image is in memory.

At this point the executable code resides at 0x00090200, which is the entry point of the setup() assembly routine. The setup() function, placed at offset 0x200 in the kernel image, initializes hardware and creates the environment for the kernel.
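The addresses in these steps are easier to follow as real-mode segment values. The sketch below uses the names BOOTSEG, INITSEG, and SETUPSEG; the names and values follow the historical bootsect.S source and are an assumption here, since the article gives only the resulting physical addresses.

```python
SECTOR_SIZE = 512

# Segment values assumed from the historical Linux boot sector source.
BOOTSEG = 0x07C0   # where the BIOS loads the boot sector
INITSEG = 0x9000   # where the boot sector copies itself
SETUPSEG = 0x9020  # where setup() is loaded


def segment_base(segment):
    """Physical address at which a real-mode segment starts."""
    return segment << 4


for name, segment in (("BOOTSEG", BOOTSEG),
                      ("INITSEG", INITSEG),
                      ("SETUPSEG", SETUPSEG)):
    print(name, hex(segment_base(segment)))
```

setup() starts exactly one sector (0x200 bytes) after the relocated boot sector, which matches its offset of 0x200 in the kernel image.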

Although the BIOS has already initialized most of the hardware, Linux does not rely on that state; setup() performs its own hardware detection and initialization.

Once memory has been arranged in its final layout, setup() switches the CPU from real mode to protected mode. This requires enabling the A20 line (to allow addressing beyond the original 20-bit limit) and setting the PE bit in the CR0 register.
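The effect of the A20 line and of the PE bit can be modelled with plain integer arithmetic. This is a simulation sketch, not code that touches hardware; the function names and the sample CR0 value are hypothetical.

```python
A20_MASK = ~(1 << 20)  # clears address bit 20 when the A20 line is disabled
CR0_PE = 1 << 0        # Protection Enable is bit 0 of CR0


def physical_address(segment, offset, a20_enabled):
    """Real-mode address, with bit 20 forced to zero if A20 is disabled."""
    address = (segment << 4) + offset
    if not a20_enabled:
        address &= A20_MASK
    return address


def enter_protected_mode(cr0):
    """Return the CR0 value with the PE bit set."""
    return cr0 | CR0_PE


# 0xffff:0x0010 points at exactly 1 MB, the first byte past the 20-bit limit.
wrapped = physical_address(0xFFFF, 0x0010, a20_enabled=False)
reached = physical_address(0xFFFF, 0x0010, a20_enabled=True)

print(hex(wrapped), hex(reached), hex(enter_protected_mode(0x10)))
```

With A20 disabled the address wraps around to 0, so memory at and above 1 MB, where the kernel is later placed, would be unreachable.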

The last instruction of setup() jumps to the startup_32() function.

startup_32()

There are actually two functions named startup_32(). The first decompresses the Linux kernel image (which is stored compressed), places the uncompressed data at address 0x00100000, and then jumps there.

The second startup_32(), at the start of the decompressed kernel, prepares the system for paging, the memory-management scheme used by Linux. In protected mode, segment registers hold selectors that refer to segment descriptors rather than raw base addresses. Linux simplifies segmentation by setting all segment base addresses to zero, so that an offset is used directly as the linear address.
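To make the "base address of zero" idea concrete, the sketch below packs a segment descriptor in the standard 8-byte x86 layout and shows that a zero base leaves the offset unchanged. The access and flag values (0x9A, 0x92, 0xC) are commonly used flat-model settings chosen for illustration and are not taken from this article; the function names are hypothetical.

```python
def encode_descriptor(base, limit, access, flags):
    """Pack an 8-byte x86 segment descriptor into a 64-bit integer."""
    descriptor = limit & 0xFFFF                # limit bits 0-15
    descriptor |= (base & 0xFFFFFF) << 16      # base bits 0-23
    descriptor |= (access & 0xFF) << 40        # access byte
    descriptor |= ((limit >> 16) & 0xF) << 48  # limit bits 16-19
    descriptor |= (flags & 0xF) << 52          # granularity and size flags
    descriptor |= ((base >> 24) & 0xFF) << 56  # base bits 24-31
    return descriptor


def linear_address(segment_base, offset):
    """Linear address produced by segmentation (32-bit wraparound)."""
    return (segment_base + offset) & 0xFFFFFFFF


# Flat 4 GB segments: base 0, limit 0xFFFFF counted in 4 KB units.
flat_code = encode_descriptor(base=0, limit=0xFFFFF, access=0x9A, flags=0xC)
flat_data = encode_descriptor(base=0, limit=0xFFFFF, access=0x92, flags=0xC)

print(hex(flat_code), hex(flat_data), hex(linear_address(0, 0x00100000)))
```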

To support paging, startup_32() creates temporary page tables that map linear addresses to physical addresses, loads the address of the global page directory into the CR3 register, and sets the PG bit in CR0.
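A toy model helps show what the page tables do. The sketch below assumes classic two-level 32-bit paging with 4 KB pages (10 bits of directory index, 10 bits of table index, 12 bits of offset) and an identity mapping of the first 4 MB; the data structures and function names are hypothetical and do not reproduce the kernel's actual tables.

```python
PAGE_SIZE = 4096
CR0_PG = 1 << 31  # the Paging bit is bit 31 of CR0


def split_linear(address):
    """Split a 32-bit linear address into (directory, table, offset)."""
    return address >> 22, (address >> 12) & 0x3FF, address & 0xFFF


def translate(page_directory, address):
    """Walk the toy page tables and return the physical address."""
    directory_index, table_index, offset = split_linear(address)
    page_table = page_directory[directory_index]  # KeyError if unmapped
    frame_number = page_table[table_index]
    return frame_number * PAGE_SIZE + offset


# One page table of 1024 entries that maps the first 4 MB onto itself.
identity_table = list(range(1024))

# A sparse page directory with only entry 0 present.
page_directory = {0: identity_table}

print(split_linear(0x00100ABC), hex(translate(page_directory, 0x00100ABC)))
```

In this model an address just above 0x00100000, where the decompressed kernel lives, resolves through directory entry 0 and table entry 0x100 to the same physical address.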

It also installs a temporary Interrupt Descriptor Table (IDT) and Global Descriptor Table (GDT), loads their addresses into the IDTR and GDTR registers, and finally transfers control to the C function start_kernel(), where the kernel continues in C.

Conclusion

Linux has been evolving for over 30 years, now powering smartphones, routers, embedded devices, servers, and countless distributions. As an open‑source operating system, it benefits from contributions by thousands of developers worldwide, resulting in a codebase of tens of millions of lines.

This article only covered the initial assembly code executed during Linux boot, but each concept introduced (BIOS, real mode, protected mode, paging, etc.) could be expanded into a full paper. Readers are encouraged to consult the referenced materials for deeper understanding.

References:

https://www.intel.cn/content/www/cn/zh/architecture-and-technology/64-ia-32-architectures-software-developer-vol-1-manual.html

https://xuanxuanblingbling.github.io/ctf/pwn/2020/03/10/bios
