Swiss Tables in Go 1.24: Open Addressing, SIMD, and Metadata Secrets
Go 1.24 replaces the traditional bucket‑based map with a Swiss Tables implementation built on open addressing, SIMD‑accelerated probing, and metadata separation. This article covers the underlying principles, the performance advantages, how clustering and deletions are handled, and how the new design compares with previous Go maps and Java’s HashMap.
What is Swiss Tables
Swiss Tables is an efficient hash‑table implementation introduced by Google’s Abseil library. It uses open addressing, SIMD‑accelerated probing, and separates metadata from key‑value storage to achieve high cache‑friendliness and low memory overhead.
How Open Addressing Works
Open addressing stores all key‑value pairs directly in the table and resolves collisions by probing for the next free slot. The typical steps are:
Compute hash
index = hash(key) % table_size

Check slot
If the slot is empty, insert the pair.
If occupied, follow the probe sequence.
Probe sequence
Common probing methods include linear probing, quadratic probing, and double hashing.
index = (index + i) % table_size              // linear probing
index = (index + c1*i + c2*i*i) % table_size  // quadratic probing
index = (index + i * hash2(key)) % table_size // double hashing

Advantages of Open Addressing
High memory efficiency because no extra structures are needed.
Cache‑friendly due to contiguous storage.
Disadvantages of Open Addressing
Clustering can degrade performance when the table is heavily loaded.
Deletion is complex; a slot cannot be simply cleared without breaking the probe chain.
Mitigating Clustering
Clustering appears as primary (from linear probing) or secondary (from identical probe paths). Improvements include quadratic probing, double hashing, dynamic resizing when the load factor exceeds a threshold (e.g., 0.7), and using better hash functions.
Handling Deletions
Deletion is usually performed with a “tombstone” marker. Searches skip tombstones, and insertions can reuse them. When the number of tombstones grows, a full rehash cleans them.
What SIMD Optimization Means
SIMD (Single Instruction, Multiple Data) allows a single instruction to operate on multiple data elements simultaneously. Modern CPUs provide SIMD instruction sets such as SSE, AVX, and NEON.
SIMD Basics
Single‑instruction‑multiple‑data versus traditional single‑instruction‑single‑data.
Vectorization packs several values into a register for parallel processing.
SIMD Use Cases
Numerical kernels (matrix multiplication, dot product).
Image processing.
Hash‑table lookups, where SIMD can compare several slots at once.
Network packet handling.
SIMD in Swiss Tables
Swiss Tables load metadata for multiple slots into a SIMD register and compare them with the target hash in one instruction, reducing branch mispredictions and improving cache utilization.
// Example (pseudo-code, SSE intrinsics): compare four 32-bit slots at once
__m128i target = _mm_set1_epi32(target_key);
for (int i = 0; i < table_size; i += 4) {
    __m128i data = _mm_loadu_si128((__m128i*)&table[i]);
    __m128i cmp  = _mm_cmpeq_epi32(data, target);
    int mask     = _mm_movemask_epi8(cmp);             // 16-bit byte mask
    if (mask != 0) return i + __builtin_ctz(mask) / 4; // byte index -> slot index
}

Metadata Separation
Metadata stores a portion of the hash (e.g., the high bits) and status flags in a compact array separate from the actual key‑value array. This layout enables fast SIMD scans and reduces memory traffic.
Implementation Sketch
Metadata:  [M1,  M2,  M3,  ..., MN ]
Key‑Value: [KV1, KV2, KV3, ..., KVN]

During a lookup, the metadata array is scanned first; only matching slots trigger a full key comparison.
Performance Benefits
Better cache locality because metadata is tiny and contiguous.
SIMD can filter out non‑matching slots in a single instruction.
Deletion becomes cheap: marking a slot as deleted only updates its metadata byte, leaving the key‑value array untouched.
Go Maps Before 1.24
Earlier Go maps used a bucket‑based hash table with chaining. Each bucket held up to eight key‑value pairs and an overflow pointer for additional entries.
type bmap struct {
    tophash  [8]uint8 // high bits of each slot's hash
    keys     [8]Key
    values   [8]Value
    overflow *bmap    // next bucket when all 8 slots are full
}

Collision handling relied on scanning the tophash array, then the keys, and following overflow buckets if needed. This design suffered from lower cache efficiency and higher latency under heavy load.
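A lookup against this layout can be sketched as follows. This is a simplified, self-contained model of the old design for a string-to-int map; the real runtime code is type-generic and uses unsafe memory layout, and the bucket contents here are invented for the example.

```go
package main

import "fmt"

// Simplified model of a pre-1.24 Go map bucket.
type bucket struct {
	tophash  [8]uint8
	keys     [8]string
	values   [8]int
	overflow *bucket
}

// lookup scans tophash first (cheap byte compares), confirms with a
// full key comparison, and follows overflow buckets as needed.
func lookup(b *bucket, key string, top uint8) (int, bool) {
	for ; b != nil; b = b.overflow {
		for i := 0; i < 8; i++ {
			if b.tophash[i] != top {
				continue // fast reject on the hash byte
			}
			if b.keys[i] == key {
				return b.values[i], true
			}
		}
	}
	return 0, false
}

func main() {
	ovf := &bucket{}
	ovf.tophash[0], ovf.keys[0], ovf.values[0] = 0x5c, "spill", 99

	b := &bucket{overflow: ovf}
	b.tophash[0], b.keys[0], b.values[0] = 0x5c, "hit", 7

	v, ok := lookup(b, "spill", 0x5c) // found via the overflow chain
	fmt.Println(v, ok)                // 99 true
}
```

The pointer chase from bucket to overflow bucket is exactly the scattered memory access that the contiguous Swiss Tables layout avoids.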
Why Go Did Not Adopt Java’s HashMap
Java’s HashMap also uses chaining, optionally switching to a red‑black tree for long buckets. While simple, this approach incurs pointer chasing and poorer cache behavior, which conflicts with Go’s goals of low‑latency, high‑throughput system programming.
Feature Comparison
Collision handling: Java HashMap uses chaining (list or tree); Go pre‑1.24 uses chaining with overflow buckets; Go 1.24+ uses open addressing (Swiss Tables).
Memory access efficiency: Java and pre‑1.24 Go have scattered nodes and low cache hit rate; Go 1.24+ stores data contiguously, improving cache friendliness.
Implementation complexity: Java’s solution is higher (list + tree); pre‑1.24 Go is moderate; Go 1.24+ is higher due to open addressing and SIMD but yields better performance.
Lookup performance: Java is O(1) typical, O(log n) worst case with treeified buckets; pre‑1.24 Go is O(1) average but degrades to linear scans across overflow bucket chains under heavy collisions; Go 1.24+ achieves O(1) average with better constant factors.
Xiao Lou's Tech Notes
Backend technology sharing, architecture design, performance optimization, source code reading, troubleshooting, and pitfall practices