Unlocking Computer Fundamentals: From CPU Basics to Assembly Language Explained

Explore the essential building blocks of modern computing, covering CPU architecture, memory hierarchy, binary operations, compression techniques, operating system fundamentals, and assembly language, with clear explanations, diagrams, and code examples that demystify how hardware and software interact at the lowest level.

macrozheng

CPU

Every programmer dreams of becoming a "big shot," but focusing only on frameworks overlooks the essential foundations of computing. Understanding the CPU, the core component of a computer, is crucial for long‑term growth.

CPU Internal Process

The CPU processes every instruction in three stages: fetch, decode, and execute. It fetches the instruction from main memory, decodes its meaning, and then performs the required operation.

What the CPU interprets and executes in this cycle is machine-language (native) code, the final form a program takes after compilation.

The two central parts of the CPU are the Control Unit and the Arithmetic Logic Unit (ALU).

Control Unit: fetches instructions from memory and decodes them.

ALU: performs arithmetic and logical operations.

The CPU is the computer’s brain. Internally it combines registers (such as the Program Counter), the Control Unit, the ALU, and a clock, and it works together with main memory and I/O devices.

Registers

Accumulator: holds operands and the results of calculations.

Flag Register: reflects the processor’s state and the results of operations.

Program Counter: holds the address of the next instruction to execute.

Base Register: stores the start address of a memory segment.

Index Register: stores an offset relative to the base address.

General-Purpose Register: stores arbitrary data.

Instruction Register: holds the currently executing instruction (cannot be accessed directly by programmers).

Stack Register: holds the address of the stack area (on x86, esp points to the current top of the stack).

Only the Program Counter, Accumulator, Flag Register, Instruction Register, and Stack Register exist as a single instance; other registers usually have multiple copies.

Program Counter

The Program Counter (PC) stores the address of the next instruction. When a program starts, the PC points to the first instruction (e.g., address 0100). In this simplified model the PC increments by 1 after each instruction (on real hardware it advances by the length of the instruction), unless a jump instruction loads a different address into it.
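The fetch, decode, execute cycle and the role of the PC can be sketched as a toy interpreter. This is only an illustration: the four-instruction set, the struct layout, and the function names are invented here and do not correspond to any real CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy instruction set, invented for illustration only. */
enum opcode { OP_LOAD, OP_ADD, OP_JUMP, OP_HALT };

struct instruction {
    enum opcode op;
    int operand;
};

/* Runs a program on a toy CPU with one accumulator and a program counter.
   Returns the accumulator value when OP_HALT is reached. */
int run_program(const struct instruction *program, size_t length)
{
    size_t pc = 0;   /* program counter: index of the next instruction */
    int acc = 0;     /* accumulator */

    for (;;) {
        assert(pc < length);                        /* PC must stay inside the program */
        struct instruction current = program[pc];   /* fetch */
        pc += 1;                                    /* default: fall through to the next one */

        switch (current.op) {                       /* decode and execute */
        case OP_LOAD: acc = current.operand; break;
        case OP_ADD:  acc += current.operand; break;
        case OP_JUMP: pc = (size_t)current.operand; break;
        case OP_HALT: return acc;
        }
    }
}

/* Example program: the jump at index 1 skips the instruction at index 2. */
int demo_result(void)
{
    static const struct instruction program[] = {
        { OP_LOAD, 123 },   /* 0: acc = 123 */
        { OP_JUMP, 3 },     /* 1: pc = 3 */
        { OP_ADD, 1000 },   /* 2: never executed */
        { OP_ADD, 456 },    /* 3: acc = 123 + 456 */
        { OP_HALT, 0 },     /* 4: stop */
    };
    return run_program(program, sizeof program / sizeof program[0]);
}
```

Calling demo_result() yields 579: the PC advances by one after each fetch, and only the jump overwrites it.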

Memory

Main memory is the primary storage that the CPU reads and writes directly during program execution.

Memory is built from many integrated circuits and comes in three major types:

RAM (Random Access Memory) – volatile, loses data when power is removed.

ROM (Read‑Only Memory) – non‑volatile, data persists without power.

Cache – a small, fast memory (L1, L2, L3) placed between CPU and RAM.

Memory Operations

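Consider a memory IC with power pins VCC (+5 V) and GND (0 V), address pins A0-A9, data pins D0-D7, and the control pins RD (read) and WR (write). Ten address pins select 2^10 = 1024 locations of one byte each, so the chip stores 1 KB. With power applied, a byte is written by placing the address on A0-A9, placing the data on D0-D7, and setting WR to 1. To read, the address is placed on A0-A9 and RD is set to 1; the stored byte then appears on D0-D7.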
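The read and write cycles can be modeled in a few lines of C. This is a rough sketch: the struct, the function names, and the way the pins are passed as parameters are assumptions made for illustration, not a description of real hardware timing.

```c
#include <assert.h>
#include <stdint.h>

#define ADDRESS_PINS 10
#define CELL_COUNT (1u << ADDRESS_PINS)   /* 2^10 = 1024 one-byte cells */

/* Hypothetical model of the memory IC; names are illustrative. */
struct memory_ic {
    uint8_t cells[CELL_COUNT];
};

/* One access cycle. `address` models pins A0-A9, `*data` models pins D0-D7,
   and `wr` / `rd` model the WR and RD control signals. */
void memory_cycle(struct memory_ic *ic, unsigned address,
                  uint8_t *data, int wr, int rd)
{
    assert(address < CELL_COUNT);   /* only 10 address bits exist */
    assert(!(wr && rd));            /* never read and write in the same cycle */

    if (wr) {
        ic->cells[address] = *data; /* WR = 1: store the byte on the data pins */
    } else if (rd) {
        *data = ic->cells[address]; /* RD = 1: put the stored byte on the data pins */
    }
}

/* Writes a byte, clears the data lines, reads the byte back, and returns it. */
uint8_t write_then_read(unsigned address, uint8_t value)
{
    static struct memory_ic ic;     /* static storage, so zero-initialized */
    uint8_t data = value;
    memory_cycle(&ic, address, &data, 1, 0);   /* write cycle */
    data = 0;
    memory_cycle(&ic, address, &data, 0, 1);   /* read cycle */
    return data;
}
```

For example, write_then_read(0x3FF, 0xA5) stores a byte at the highest address the ten pins can select and returns 0xA5.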

Virtual Memory and Disk Interaction

When RAM is insufficient, the operating system uses part of the disk as virtual memory (on Windows, the page file). Windows implements this with paging: memory is divided into fixed-size pages, 4 KB each on typical x86 PCs, and pages are swapped between RAM and disk as needed.
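With 4 KB pages, every address splits into a page number and an offset inside that page. The helpers below are a minimal sketch of that arithmetic; the function names are made up for this example, and real operating systems do this in hardware-assisted page tables rather than in code like this.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u   /* 4 KB pages, as described above */

/* Index of the page that contains the given address. */
uint32_t page_number(uint32_t address)
{
    return address / PAGE_SIZE;
}

/* Byte offset of the address within its page. */
uint32_t page_offset(uint32_t address)
{
    return address % PAGE_SIZE;
}

/* Rebuilds an address from a page number and an offset. */
uint32_t make_address(uint32_t page, uint32_t offset)
{
    assert(offset < PAGE_SIZE);   /* an offset never reaches into the next page */
    return page * PAGE_SIZE + offset;
}

/* Number of whole pages needed to hold `bytes` bytes (rounded up). */
uint32_t pages_needed(uint32_t bytes)
{
    return bytes / PAGE_SIZE + (bytes % PAGE_SIZE != 0);
}
```

Address 0x12345, for instance, lies in page 0x12 at offset 0x345, and a 14862-byte file occupies four pages.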

Binary

Computers use binary (base‑2) numbers to represent all data. Each bit is either 0 or 1, and the value of a binary number is calculated using powers of two.

For example, the binary 00100111 equals decimal 39 (32 + 4 + 2 + 1).
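The same conversion can be written as a short C function. This is a sketch for illustration; the function name is arbitrary and the input is assumed to be a string of '0' and '1' characters short enough to fit in an unsigned int.

```c
#include <assert.h>
#include <stddef.h>

/* Converts a string of '0' and '1' characters to its unsigned value. */
unsigned binary_to_decimal(const char *bits)
{
    unsigned value = 0;
    for (size_t i = 0; bits[i] != '\0'; ++i) {
        assert(bits[i] == '0' || bits[i] == '1');
        /* Shifting left doubles the running value; then the new bit is added. */
        value = (value << 1) | (unsigned)(bits[i] - '0');
    }
    return value;
}
```

binary_to_decimal("00100111") returns 39, matching the sum 32 + 4 + 2 + 1.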

Shift Operations and Two’s Complement

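A left shift moves every bit toward the high end and fills the vacated low bits with zeros; as long as no bits are lost off the top, each position shifted doubles the value. A right shift is either logical (the vacated high bits are filled with zeros) or arithmetic (they are filled with copies of the sign bit, which preserves the sign). Negative numbers are represented in two’s complement: to negate a value, invert all of its bits and add 1.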
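The following sketch works on 8-bit patterns so the results are easy to check by hand. The function names are illustrative. The arithmetic shift is written out manually because right-shifting a negative signed value is implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Logical right shift: vacated high bits are filled with zeros. */
uint8_t shift_right_logical(uint8_t value, unsigned count)
{
    assert(count < 8);
    return (uint8_t)(value >> count);
}

/* Arithmetic right shift on an 8-bit pattern: vacated high bits
   are filled with copies of the sign bit (bit 7). */
uint8_t shift_right_arithmetic(uint8_t value, unsigned count)
{
    assert(count < 8);
    uint8_t shifted = (uint8_t)(value >> count);
    if (value & 0x80u) {
        /* Sign bit was 1: set the top `count` bits to 1. */
        shifted |= (uint8_t)(0xFFu << (8 - count));
    }
    return shifted;
}

/* Two's complement negation: invert all bits, then add 1. */
uint8_t twos_complement(uint8_t value)
{
    return (uint8_t)(~value + 1u);
}
```

Taking 39 (0x27) as the starting point: twos_complement(0x27) is 0xD9, the 8-bit pattern for -39. For 0xD8 (the pattern for -40), a logical shift by 2 gives 0x36, while the arithmetic shift gives 0xF6, the pattern for -10.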

Compression Algorithms

Compression reduces file size by removing redundancy. Two main categories are lossless (e.g., RLE, Huffman, LZW) and lossy (e.g., JPEG, MPEG). Compression can also be symmetric (encoding and decoding have similar complexity) or asymmetric.

Run‑Length Encoding (RLE)

RLE stores each character followed by its repeat count. The 17-byte string AAAAAABBCDDEEEEEF becomes the 12-byte string A6B2C1D2E5F1, about 71 % of the original size (a 29 % reduction).
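A minimal encoder for this character-plus-count scheme might look like the sketch below. The function name and the buffer-based interface are choices made for this example; real RLE formats usually work on bytes with a binary count rather than decimal text.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Encodes `input` as each character followed by its decimal repeat count.
   `output` must have room for `capacity` bytes including the terminator.
   Returns the encoded length, or 0 if the buffer is too small. */
size_t rle_encode(const char *input, char *output, size_t capacity)
{
    assert(capacity > 0);
    output[0] = '\0';

    size_t out = 0;
    size_t i = 0;
    while (input[i] != '\0') {
        /* Measure the run of identical characters starting at i. */
        size_t run = 1;
        while (input[i + run] == input[i]) {
            ++run;
        }
        int written = snprintf(output + out, capacity - out,
                               "%c%zu", input[i], run);
        if (written < 0 || (size_t)written >= capacity - out) {
            return 0;   /* output buffer too small */
        }
        out += (size_t)written;
        i += run;
    }
    return out;
}
```

Encoding "AAAAAABBCDDEEEEEF" into a sufficiently large buffer produces "A6B2C1D2E5F1" with a length of 12. Note that text without long runs grows under this scheme, since every single character gains a count.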

Huffman Coding

Huffman coding builds a binary tree from the symbol frequencies and assigns shorter codes to more frequent symbols. For the same 17-byte example, Huffman coding compresses the data to 5 bytes (40 bits), about 29 % of the original size (a 71 % reduction).
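The 40-bit figure can be checked without building the tree explicitly. The sketch below repeatedly merges the two lightest subtrees, as Huffman's algorithm does, and adds up the merged weights; that sum equals the total encoded length. The function name is made up for this example, and the size of the code table that a real file would also need is not counted.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the number of bits needed to encode `text` with an optimal
   Huffman code. Each merge of the two lightest subtrees pushes every
   symbol beneath them one level deeper, so it adds their combined
   weight to the total bit count. */
size_t huffman_encoded_bits(const char *text)
{
    assert(text != NULL);

    /* Count how often each byte value occurs. */
    size_t freq[256] = {0};
    for (size_t i = 0; text[i] != '\0'; ++i) {
        ++freq[(unsigned char)text[i]];
    }

    /* Collect the weights of the symbols that actually occur. */
    size_t weights[256];
    size_t count = 0;
    for (size_t s = 0; s < 256; ++s) {
        if (freq[s] > 0) {
            weights[count++] = freq[s];
        }
    }
    if (count == 0) {
        return 0;
    }
    if (count == 1) {
        return weights[0];   /* a lone symbol still needs 1 bit per occurrence */
    }

    size_t total_bits = 0;
    while (count > 1) {
        /* Move the two smallest weights to positions 0 and 1. */
        for (size_t k = 0; k < 2; ++k) {
            size_t min = k;
            for (size_t j = k + 1; j < count; ++j) {
                if (weights[j] < weights[min]) {
                    min = j;
                }
            }
            size_t tmp = weights[k];
            weights[k] = weights[min];
            weights[min] = tmp;
        }
        size_t merged = weights[0] + weights[1];
        total_bits += merged;
        weights[0] = merged;               /* merged subtree replaces the first */
        weights[1] = weights[count - 1];   /* last entry fills the freed slot */
        --count;
    }
    return total_bits;
}
```

For "AAAAAABBCDDEEEEEF" the merges have weights 2, 4, 6, 11, and 17, which sum to 40 bits, or 5 bytes.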

File Type    Before         After         Compression Ratio (after / before)
Text         14862 bytes    4119 bytes    28 %
Image        96062 bytes    9456 bytes    10 %
EXE          24576 bytes    4652 bytes    19 %

Operating System

An OS abstracts hardware and provides APIs for applications. Windows, Linux, and macOS each expose different system calls, making direct porting non‑trivial.

Key features of Windows include:

32-bit and 64-bit versions.

System calls provided as a set of API functions: the Win32 API on 32-bit Windows and its 64-bit counterpart, often called the Win64 API.

Graphical User Interface (GUI).

WYSIWYG printing.

Multitasking via time‑slicing.

Network and database middleware.

Plug‑and‑Play device driver installation.

Process Creation and API Calls

Applications call OS services through APIs such as MessageBox() (found in user32.dll). Different OSes have different APIs, which is why Windows programs cannot run unchanged on Linux.

Assembly Language and Native Code

The CPU can only execute native (machine) code directly. Assembly language represents each machine instruction with a mnemonic opcode (e.g., mov, add) and its operands, which makes the instructions readable for humans.

Typical assembly syntax (Intel style, as used in the listings here): opcode destination, source. For example, mov eax, ebx copies the value of ebx into eax.

Compiling C to Assembly (Borland C++ 5.5 Example)

// Sample C code
int AddNum(int a, int b) {
    return a + b;
}

void MyFunc() {
    int c;
    c = AddNum(123, 456);
}

Compiling with bcc32 -c -S Sample4.c produces Sample4.asm:

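_AddNum proc near
    push ebp                      ; save the caller's frame pointer
    mov ebp, esp                  ; start this function's stack frame
    mov eax, dword ptr [ebp+8]    ; eax = a (first argument)
    add eax, dword ptr [ebp+12]   ; eax = a + b (second argument)
    pop ebp                       ; restore the caller's frame pointer
    ret                           ; return; the result is in eax
_AddNum endp

_MyFunc proc near
    push ebp                      ; save the caller's frame pointer
    mov ebp, esp                  ; start this function's stack frame
    push 456                      ; second argument
    push 123                      ; first argument
    call _AddNum                  ; push the return address, jump to _AddNum
    add esp, 8                    ; discard the two 4-byte arguments
    pop ebp                       ; restore the caller's frame pointer
    ret
_MyFunc endp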

Function Call Mechanism

When MyFunc calls AddNum:

The arguments are pushed onto the stack in right-to-left order (456, then 123).

call _AddNum pushes the return address onto the stack and jumps to AddNum.

AddNum saves ebp, computes the result in eax, restores ebp, and executes ret, which pops the return address and jumps back to it.

Back in MyFunc, the caller removes the arguments from the stack (add esp, 8).
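To make the stack traffic concrete, the sketch below replays the same sequence on a toy machine. Everything here is an assumption made for illustration: the struct, the word-addressed stack (so [ebp+8] becomes index ebp + 2), and the placeholder return address 0x1000. It models only the bookkeeping, not real x86 behavior.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

/* Hypothetical toy machine: a word-addressed stack that grows downward. */
struct machine {
    uint32_t stack[STACK_WORDS];
    unsigned esp;   /* index of the top-of-stack word */
    unsigned ebp;   /* frame pointer */
    uint32_t eax;   /* return-value register */
};

static void push(struct machine *m, uint32_t value)
{
    assert(m->esp > 0);              /* stack overflow check */
    m->stack[--m->esp] = value;
}

static uint32_t pop(struct machine *m)
{
    assert(m->esp < STACK_WORDS);    /* stack underflow check */
    return m->stack[m->esp++];
}

/* Mirrors _AddNum. Returns the address that ret pops and jumps to. */
static uint32_t add_num(struct machine *m)
{
    push(m, m->ebp);                  /* push ebp */
    m->ebp = m->esp;                  /* mov ebp, esp */
    m->eax = m->stack[m->ebp + 2];    /* mov eax, [ebp+8]  (first argument) */
    m->eax += m->stack[m->ebp + 3];   /* add eax, [ebp+12] (second argument) */
    m->ebp = pop(m);                  /* pop ebp */
    return pop(m);                    /* ret: pop the return address */
}

/* Mirrors the call sequence in _MyFunc and returns the value left in eax. */
uint32_t simulate_call(void)
{
    struct machine m = { .esp = STACK_WORDS, .ebp = STACK_WORDS, .eax = 0 };
    const uint32_t return_address = 0x1000;   /* placeholder value */

    push(&m, 456);                    /* push 456 (arguments go right to left) */
    push(&m, 123);                    /* push 123 */
    push(&m, return_address);         /* call: push the return address ... */
    uint32_t resumed_at = add_num(&m);/* ... and jump to the callee */
    assert(resumed_at == return_address);

    m.esp += 2;                       /* add esp, 8: drop two 4-byte arguments */
    assert(m.esp == STACK_WORDS);     /* the stack is balanced again */
    return m.eax;
}
```

simulate_call() returns 579, and the final assertion confirms that the stack pointer is back where it started once the caller has removed the arguments.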

Registers Used in Calls

ebp – frame pointer, points to the base of the current stack frame.

esp – stack pointer.

eax – accumulator, holds return values.

ebx, ecx, edx, esi, edi – general-purpose registers.

Global vs. Local Variables

Global variables are placed in the _DATA (initialized) or _BSS (uninitialized) sections. Local variables reside in registers when possible; otherwise they are allocated on the stack.

_DATA segment dword public use32 'DATA'
    _a1 dd 1
    _a2 dd 2
    _a3 dd 3
    _a4 dd 4
    _a5 dd 5
_DATA ends

_BSS segment dword public use32 'BSS'
    _b1 db 4 dup(?)
    _b2 db 4 dup(?)
    _b3 db 4 dup(?)
    _b4 db 4 dup(?)
    _b5 db 4 dup(?)
_BSS ends

In a function, the compiler may allocate up to five integer locals to registers (eax, edx, ecx, ebx, esi) and the rest to stack slots such as [ebp-4], [ebp-8], etc.

Control Flow: Loops and Branches

A C for loop such as:

for (int i = 0; i < 10; ++i) {
    MySub();
}

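is compiled to assembly that first initializes i with xor ebx, ebx (ebx holds i, and XORing a register with itself yields 0). The loop body then runs call _MySub, inc ebx, and cmp ebx, 10, and jl short L4 jumps back to the top of the loop while i < 10.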
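The shape of that assembly can be mirrored in C with a label and a goto. This is a sketch for illustration only: the label name, the call counter, and the run_loop wrapper are additions made here so the behavior can be checked.

```c
#include <assert.h>

static int call_count = 0;   /* counts calls so the loop can be checked */

static void MySub(void)
{
    ++call_count;
}

/* Runs the loop the way the assembly does; returns the number of MySub calls. */
int run_loop(void)
{
    int i;
    call_count = 0;

    i = 0;                 /* xor ebx, ebx */
loop_top:                  /* L4: */
    MySub();               /* call _MySub */
    ++i;                   /* inc ebx */
    if (i < 10)            /* cmp ebx, 10 */
        goto loop_top;     /* jl short L4 */

    assert(i == 10);       /* the loop exits once i reaches 10 */
    return call_count;
}
```

run_loop() returns 10. The comparison sits at the bottom of the loop, which is safe here because the first test (0 < 10) is known to succeed.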

Conditional statements use cmp followed by a conditional jump (such as jle or jge) to select the appropriate branch, with an unconditional jmp to skip over the branches that are not taken.

Multithreading Pitfalls

When multiple threads modify a shared global variable without synchronization, race conditions can occur. For example, two threads executing:

counter *= 2;

may both read the original value before either writes its result back, so one update is lost and the final value reflects only a single multiplication. Proper locking or atomic operations are required to avoid this.
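One way to make the update atomic is a compare-and-swap loop from C11's <stdatomic.h>, sketched below. This assumes a C11 compiler with atomics support. The helper names are made up for this example, and a mutex around counter *= 2 would work just as well. The check at the end is single-threaded and only confirms the arithmetic; it does not demonstrate the race itself.

```c
#include <assert.h>
#include <stdatomic.h>

/* Atomically replaces *target with *target * factor and returns the new value.
   C11 has no atomic multiply, so a compare-and-swap loop is used instead. */
int atomic_multiply(atomic_int *target, int factor)
{
    int observed = atomic_load(target);
    int desired;
    do {
        desired = observed * factor;
        /* If another thread changed *target in the meantime, the exchange
           fails, `observed` is refreshed, and the loop recomputes. */
    } while (!atomic_compare_exchange_weak(target, &observed, desired));
    return desired;
}

/* Applies the doubling twice, as the two threads in the example would. */
int double_twice(int start)
{
    atomic_int counter;
    atomic_init(&counter, start);

    int first = atomic_multiply(&counter, 2);
    assert(first == start * 2);
    atomic_multiply(&counter, 2);

    return atomic_load(&counter);
}
```

double_twice(100) returns 400. Because each multiplication is applied to the value that is actually in memory at the moment of the swap, neither update can be lost when the two calls run on different threads.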
