Understanding Cache Mapping: Direct, Fully‑Associative, and Set‑Associative Explained
This article explores the role of cache in computer systems and provides a detailed comparison of the three main cache mapping techniques—direct mapping, fully associative mapping, and set‑associative mapping—covering their mechanisms, advantages, disadvantages, and practical selection guidelines.
In computer systems, the cache is a small, high‑speed buffer between the CPU and main memory that shortens data‑access latency and dramatically improves overall performance.
1. Direct Mapping: Simple Locator
Direct mapping assigns each main‑memory block to a fixed cache line using the modulo operation (block number % number of cache lines). For example, with 8 cache lines, blocks 0, 8, 16,… map to line 0, blocks 1, 9, 17,… map to line 1, and so on.
1.1 Detailed Workflow
When the CPU issues an access, the main‑memory address is split into three fields: tag, index (the block number modulo the number of cache lines), and block offset. The index selects one cache line; the tag in the address is compared with the tag stored in that line. If they match and the line is valid, the access is a cache hit; otherwise the block is fetched from main memory and the cache line is refilled.
1.2 Main‑Memory and Cache Address Formats
Main‑memory address fields: block number (s bits) and block offset (w bits). Cache address fields: line number (m bits) and line offset (w bits). In direct mapping, the tag stored with each cache line is the high‑order s − m bits of the block number, i.e. the part not already implied by the line index.
1.3 Advantages and Disadvantages
Advantages: simple hardware, low cost, fast address translation. Disadvantages: high conflict rate because each memory block has only one possible cache line, leading to reduced hit rate and poor cache utilization.
1.4 Example Case
Given a 1 MB main memory (20‑bit address), a 16 KB cache (14‑bit cache address), and 512 B blocks (9‑bit block offset), the cache address leaves 14 − 9 = 5 bits for the line number, i.e. 32 lines, and the main‑memory address leaves 20 − 9 = 11 bits for the block number, i.e. 2048 blocks. Under direct mapping the memory address therefore splits into a 6‑bit tag, a 5‑bit line index, and a 9‑bit offset; under fully‑associative mapping (the variant this example is usually quoted for) the full 11‑bit block number serves as the tag.
2. Fully‑Associative Mapping: Flexible Data Box
Fully‑associative mapping allows any memory block to be placed in any cache line, eliminating fixed positions and reducing conflict probability.
2.1 Lookup Process
The CPU compares the tag of the requested address with the tags of all cache lines simultaneously. If a matching valid line is found, the data is returned (cache hit); otherwise the block is fetched from main memory (cache miss).
2.2 Main‑Memory Address Format
If the main memory has 2ⁿ units divided into 2ˢ blocks of 2ʷ units each, the address consists of s + w bits (s bits for block number, w bits for offset). The cache address consists of m + w bits (m bits for line index, w bits for offset).
2.3 Overall Evaluation
Pros: the highest possible hit rate and cache utilization, since a block can go anywhere. Cons: high hardware complexity, because the associative search needs one comparator per cache line, which makes access slower and cost higher; in practice it suits only small‑capacity caches.
3. Set‑Associative Mapping: The Harmonizer
Set‑associative mapping combines the simplicity of direct mapping with the flexibility of fully‑associative mapping. The cache is divided into several sets; each set contains multiple ways (cache lines). A memory block first maps to a specific set (using modulo), then can occupy any way within that set.
3.1 Working Mechanism
The CPU extracts the set index from the address, locates the corresponding set, and then compares the tag with each way inside the set. A matching valid line yields a hit; otherwise the block is fetched from memory and placed into one of the ways, possibly evicting an existing line.
3.2 Performance Assessment
Set‑associative caches achieve a good balance: they reduce conflict misses compared to direct mapping while keeping hardware complexity lower than fully‑associative caches. They are widely used in modern CPUs for L1 and L2 caches.
4. Choosing a Mapping Strategy
Selection depends on cost, performance, and cache size. Direct mapping is cheapest and fastest for simple, low‑cost systems. Fully‑associative mapping offers the highest hit rate for high‑performance, cost‑insensitive systems. Set‑associative mapping provides a balanced solution for most desktop, server, and mobile processors.
5. Cache Mapping Case Study
Modern Intel CPUs use a set‑associative hierarchy. For example, a 32 KB L1 data cache may have 64 sets with 8 ways each (512 lines of 64 B). Physical memory is divided into 4 KB pages; each page contains 64 cache lines.
In a fully‑associative cache, any line can be stored in any slot, but the search cost is high. Intel CPUs therefore use set‑associative caches where each line’s tag identifies its originating page, and the set index (bits 6‑11 of the address) selects the set. Tag comparison within a set is performed in parallel, yielding fast hits.
Cache lines also carry coherence state (the MESI protocol). Intel's cache hierarchies have traditionally been inclusive: a line present in L1 is also held in the larger outer‑level caches.
Deepin Linux
Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and Linux kernel, etc.