Understanding Cache: Concepts, Mechanisms, and Consistency
This article gives an overview of cache memory: why caches are needed, how data is placed and looked up, replacement policies, write-handling strategies, and coherence protocols such as MESI. It offers essential background for computer architecture and system design.
1. Why Cache is Needed
CPU performance has improved far faster than DRAM latency, and no single memory technology offers both large capacity and high speed. Caches bridge this gap by exploiting data locality, storing frequently accessed data in a small, fast memory close to the processor.
1.1 Locality
Most programs exhibit temporal locality (recently accessed data is likely to be accessed again) and spatial locality (data near a recent access is likely to be accessed soon). Because of this, keeping recently used data and its neighbors in a fast cache dramatically reduces average access latency.
2. Cache Working Principles
2.1 Data Placement
Cache lines can be organized in three ways:
Fully associative – a block may be placed in any line.
Direct mapped – a block can occupy only one specific line (e.g., with 8 lines, block 12 maps to line 12 mod 8 = 4).
Set associative – a block can be placed in one of several lines within a set (e.g., 2‑way set associative).
Example: a main memory with 32 blocks and a cache with 8 lines. To store block 12, the placement method determines which line(s) can hold it.
The classic loop-interchange example shows why access order matters: x is stored row-major, but the inner loop walks down a column, so consecutive iterations touch addresses 100 elements apart and land in different cache lines:

```c
for (j = 0; j < 100; j = j + 1)
    for (i = 0; i < 5000; i = i + 1)
        x[i][j] = 2 * x[i][j];
```

Swapping the two loops makes the inner loop stride through consecutive elements, greatly improving spatial locality.

2.2 Data Lookup
When the CPU requests data, the cache uses the address to extract three fields: tag, index (set number), and block offset. The index selects the set, and the tags of the lines in that set are compared in parallel. A tag match indicates a cache hit; otherwise a miss occurs.
2.3 Replacement Policies
If a miss occurs and the set is full, the cache must evict a line. Common policies are:
Random replacement.
Least Recently Used (LRU).
First‑In‑First‑Out (FIFO).
2.4 Write Handling
Writes can be managed by three strategies:
Write‑through: data is written to both cache and main memory simultaneously.
Write‑back: data is written only to the cache and flushed to memory when the line is evicted.
Write‑buffer (write‑through + buffer): writes are first placed in a buffer, allowing the cache to continue operating while the buffer drains to memory.
3. Cache Coherence
In multi‑core systems, caches can become inconsistent when different cores read and write the same memory location. Coherence protocols ensure that all cores see a consistent view.
3.1 Snooping-Based Protocols
Every cache monitors (snoops) the shared bus for writes by other cores. Two variants exist:
Write‑update: a write updates all other caches.
Write‑invalidate: a write invalidates the corresponding line in other caches.
3.2 Directory‑Based Protocols
The memory controller maintains a directory that records which caches hold a copy of each block. Coherence protocols are conventionally named after the per-line states they support:
SI: Shared or Invalid.
MSI: Modified, Shared, Invalid.
MESI: Modified, Exclusive, Shared, Invalid.
The MESI protocol is widely used. For example, a line in the Exclusive state can be written without notifying other caches, transitioning to Modified.
4. Summary
Cache plays a crucial role in modern computer architecture by providing fast access to frequently used data, reducing the performance gap between CPU and main memory. Understanding its placement, lookup, replacement, write policies, and coherence mechanisms is essential for designing efficient systems.
Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.