Unlocking Computer Fundamentals: From CPU Basics to Assembly Language Explained
Explore the essential building blocks of modern computing, covering CPU architecture, memory hierarchy, binary operations, compression techniques, operating system fundamentals, and assembly language, with clear explanations, diagrams, and code examples that demystify how hardware and software interact at the lowest level.
CPU
Every programmer dreams of becoming a "big shot," but focusing only on frameworks overlooks the essential foundations of computing. Understanding the CPU, the core component of a computer, is crucial for long‑term growth.
CPU Internal Process
The CPU processes each instruction in three stages: fetch, decode, and execute. It fetches the instruction from main memory, decodes its meaning, and then performs the required operation.
What the CPU ultimately interprets and runs is machine‑language code.
The CPU consists of two main parts: the Control Unit and the Arithmetic Logic Unit (ALU).
Control Unit: extracts and decodes instructions from memory.
ALU: performs arithmetic and logical operations.
The CPU is the computer’s brain. Internally it is built from registers (such as the Program Counter), the Control Unit, the ALU, and the clock, and it works together with memory and I/O devices.
Registers
| Register | Function |
| --- | --- |
| Accumulator | Stores running data and results of calculations. |
| Flag Register | Reflects the processor’s state and results of operations. |
| Program Counter | Holds the address of the next instruction to execute. |
| Base Register | Stores the start address of a memory segment. |
| Index Register | Stores an offset relative to the base address. |
| General‑Purpose Register | Stores arbitrary data. |
| Instruction Register | Holds the currently executing instruction (cannot be accessed directly by programmers). |
| Stack Register | Points to the start of the stack area. |
Only the Program Counter, Accumulator, Flag Register, Instruction Register, and Stack Register exist as a single instance; other registers usually have multiple copies.
Program Counter
The Program Counter (PC) stores the address of the next instruction. When a program starts, the PC points to the first instruction (e.g., address 0100). After each instruction the PC increments by 1 in this simplified model (a real CPU advances it by the length of the instruction it just fetched), unless a jump instruction changes its value.
Memory
Memory (RAM) is the primary storage that the CPU uses to read and write data during program execution. It is also called the main memory.
Memory is built from many integrated circuits and comes in three major types:
RAM (Random Access Memory) – volatile, loses data when power is removed.
ROM (Read‑Only Memory) – non‑volatile, data persists without power.
Cache – a small, fast memory (L1, L2, L3) placed between CPU and RAM.
Memory Operations
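The memory IC is powered through its VCC (+5 V) and GND (0 V) pins. To write a byte, the CPU selects the address using pins A0‑A9, places the data on D0‑D7, and sets the WR (write) signal to 1. To read, it sets the address and sets the RD (read) signal to 1; the stored byte then appears on D0‑D7. Ten address pins select one of 1,024 locations, so this IC holds 1 KB.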
Virtual Memory and Disk Interaction
When RAM is insufficient, the operating system uses part of the disk as "virtual memory" (page file). Windows uses a paging system with 4 KB pages; data is swapped between RAM and disk as needed.
Binary
Computers use binary (base‑2) numbers to represent all data. Each bit is either 0 or 1, and the value of a binary number is calculated using powers of two.
For example, the binary 00100111 equals decimal 39 (32 + 4 + 2 + 1).
Shift Operations and Two’s Complement
Left shift adds zeros on the right; right shift can be logical (fills with zeros) or arithmetic (fills with the sign bit). Two’s complement is used to represent negative numbers: invert all bits and add 1.
Compression Algorithms
Compression reduces file size by removing redundancy. Two main categories are lossless (e.g., RLE, Huffman, LZW) and lossy (e.g., JPEG, MPEG). Compression can also be symmetric (encoding and decoding have similar complexity) or asymmetric.
Run‑Length Encoding (RLE)
RLE stores a character followed by its repeat count. The 17‑character string AAAAAABBCDDEEEEEF becomes the 12‑character A6B2C1D2E5F1, about 71 % of the original size (a reduction of roughly 29 %).
Huffman Coding
Huffman builds a binary tree based on symbol frequencies, assigning shorter codes to more frequent symbols. For the same example, Huffman can compress the data to 5 bytes (40 bits), a 71 % reduction.
| File Type | Before | After | Compression Ratio (after ÷ before) |
| --- | --- | --- | --- |
| Text | 14862 bytes | 4119 bytes | 28 % |
| Image | 96062 bytes | 9456 bytes | 10 % |
| EXE | 24576 bytes | 4652 bytes | 19 % |
Operating System
An OS abstracts hardware and provides APIs for applications. Windows, Linux, and macOS each expose different system calls, making direct porting non‑trivial.
Taking Windows as the example, its key features include:
32‑bit and 64‑bit versions.
Win32 API (for 32‑bit) and Win64 API (for 64‑bit).
Graphical User Interface (GUI).
WYSIWYG printing.
Multitasking via time‑slicing.
Network and database middleware.
Plug‑and‑Play device driver installation.
Process Creation and API Calls
Applications call OS services through APIs such as MessageBox() (found in user32.dll). Different OSes have different APIs, which is why Windows programs cannot run unchanged on Linux.
Assembly Language and Native Code
The CPU can only execute native (machine) code. Assembly language uses mnemonic opcodes (e.g., mov, add) and operands to represent machine instructions, making them readable for humans.
Typical assembly syntax: opcode destination, source. For example, mov eax, ebx copies the value of ebx into eax.
Compiling C to Assembly (Borland C++ 5.5 Example)
<code>// Sample C code
int AddNum(int a, int b) {
    return a + b;
}
void MyFunc() {
    int c;
    c = AddNum(123, 456);
}
</code>
Compiling with bcc32 -c -S Sample4.c produces Sample4.asm:
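<code>_AddNum proc near
    push ebp                        ; save the caller's frame pointer
    mov  ebp, esp                   ; set up this function's frame
    mov  eax, dword ptr [ebp+8]     ; eax = a (first argument)
    add  eax, dword ptr [ebp+12]    ; eax += b (second argument)
    pop  ebp                        ; restore the caller's frame pointer
    ret                             ; return to the caller, result in eax
_AddNum endp
_MyFunc proc near
    push ebp
    mov  ebp, esp
    push 456                        ; second argument
    push 123                        ; first argument
    call _AddNum
    add  esp, 8                     ; discard the two 4-byte arguments
    pop  ebp
    ret
_MyFunc endp
</code>
Function Call Mechanism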
When MyFunc calls AddNum:
Arguments are pushed onto the stack (right‑to‑left order).
call _AddNum pushes the return address and jumps to AddNum.
AddNum computes the result in eax and executes ret, which pops the return address.
After the call, the caller cleans the argument space (e.g., add esp, 8).
Registers Used in Calls
ebp – frame pointer, points to the base of the current stack frame.
esp – stack pointer.
eax – accumulator, holds return values.
ebx, ecx, edx, esi, edi – general‑purpose registers.
Global vs. Local Variables
Global variables are placed in the _DATA (initialized) or _BSS (uninitialized) sections. Local variables reside in registers when possible; otherwise they are allocated on the stack.
<code>_DATA segment dword public use32 'DATA'
_a1 dd 1
_a2 dd 2
_a3 dd 3
_a4 dd 4
_a5 dd 5
_DATA ends
_BSS segment dword public use32 'BSS'
_b1 db 4 dup(?)
_b2 db 4 dup(?)
_b3 db 4 dup(?)
_b4 db 4 dup(?)
_b5 db 4 dup(?)
_BSS ends
</code>
In a function, the compiler may allocate up to five integer locals to registers (eax, edx, ecx, ebx, esi) and the rest to stack slots like [ebp-4], [ebp-8], etc.
Control Flow: Loops and Branches
A C for loop such as:
<code>for (int i = 0; i < 10; ++i) {
    MySub();
}
</code>
is compiled to assembly using xor ebx, ebx (initialize i to 0), call _MySub, inc ebx, cmp ebx, 10, and jl short L4 to repeat while i < 10.
Conditional statements use cmp followed by a conditional jump (such as jle or jge) to select the appropriate branch, often with an unconditional jmp to skip over the branch that is not taken.
Multithreading Pitfalls
When multiple threads modify a shared global variable without synchronization, race conditions can occur. For example, two threads executing:
<code>counter *= 2;
</code>
may both read the original value before either writes back, resulting in only a single multiplication. Proper locking or atomic operations are required to avoid this.
macrozheng
Dedicated to Java tech sharing and dissecting top open-source projects. Topics include Spring Boot, Spring Cloud, Docker, Kubernetes and more. Author’s GitHub project “mall” has 50K+ stars.