
Understanding CPU Cache, Memory Hierarchy, and Concurrency Control

This article explains the principles of CPU cache, the multi‑level memory hierarchy, virtual memory, data consistency, and concurrency control mechanisms, illustrating how they together bridge the speed gap between the fast processor and slower memory/storage in modern computer systems.


In computer system interviews, CPU cache is a key topic that reveals a candidate’s deep understanding of performance optimization.

A: Theoretical Overview

Based on the von Neumann architecture, both code and data reside in memory.

When a C++ program runs on Linux, the loader (e.g., ld‑linux.so) loads the ELF executable into memory, initializing text, data, BSS, heap and stack segments.

The CPU processes instructions through the classic five‑stage pipeline: Fetch, Decode, Execute, Memory Access, and Write‑Back.

B: Memory Hierarchy

The hierarchy consists of registers, multiple levels of cache (L1, L2, L3), main RAM, and external storage such as HDD/SSD.

Registers are the fastest, smallest storage inside the CPU. Cache sits between CPU and RAM, with L1 split into instruction and data caches, L2 larger and slightly slower, and L3 shared among cores.

RAM provides large capacity but slower access; disks offer massive capacity but much slower speeds, relying on OS paging when RAM is insufficient.

C: Memory Read/Write Process

Read: CPU sends address on the address bus and receives data on the data bus.

Write: CPU sends address and data on the respective buses.

//@author www.yaoxiaowen.com
// N is assumed to be a compile-time constant (e.g., #define N 1024)
int sum1(int array[N])
{
    int i, sum = 0;
    for (i = 0; i < N; i++)
        sum += array[i];   // consecutive addresses: good spatial locality
    return sum;
}

The example demonstrates good temporal locality for sum and i, and spatial locality for the sequentially accessed array.

//@author www.yaoxiaowen.com
// M and N are assumed to be compile-time constants
int sum2(int array[M][N])
{
    int i, j, sum = 0;
    for (i = 0; i < N; i++) {      // outer loop walks the columns
        for (j = 0; j < M; j++)
            sum += array[j][i];    // column-wise: each step jumps N ints
    }
    return sum;
}

This second example exhibits poor spatial locality because it accesses the array column‑wise, causing many cache misses.

D: Data Consistency and Concurrency Control

Multi‑core CPUs have private caches; without coordination, caches can become inconsistent. Write‑through and write‑back policies manage how modifications propagate to memory.

Write‑through updates both cache and memory immediately; write‑back marks a cache line dirty and writes back only when evicted.

Cache‑coherency protocols such as bus snooping broadcast write requests so other cores can invalidate or update their copies, though scalability suffers as core count grows.

Common concurrency primitives include mutexes, read‑write locks, spinlocks, reentrant locks, condition variables, optimistic and pessimistic locks, semaphores, etc.

E: Role of Virtual Memory

Linux uses virtual memory to give each process an isolated address space, with the MMU translating virtual to physical addresses, allowing the OS to reclaim resources instantly when a process exits.

F: Full Summary

The CPU fetches instructions from memory, decodes them, executes operations, accesses memory when needed, and writes results back to registers or memory. Linux’s virtual memory, paging, and caching mechanisms (CPU caches, disk caches) together bridge the speed gap between the fast CPU and slower memory/storage, ensuring efficient and secure execution of programs.

Written by Deepin Linux

Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and Linux kernel, etc.