Unveiling Swift’s Dictionary: Inside the Hash Table and Memory Layout
This article dives deep into Swift's Dictionary implementation, explaining its underlying hash‑table structure, memory layout, variant storage, probing strategies, copy‑on‑write semantics, conflict resolution, load‑factor handling, and automatic resizing, all illustrated with real code and debugger screenshots.
Introduction
After covering Swift's Array, we explore the inner workings of Dictionary, whose design shares similarities with Array but centers on hash‑based storage.
What is Dictionary
In Swift, Dictionary is a generic object that stores key‑value pairs, where the key must conform to Hashable.
Memory Inspection
Running a simple test program and setting a breakpoint on print("end") reveals that a Dictionary instance lives on the heap (e.g., address 0x00000001005469b0) and shares the same HeapObject layout as an Array. The debugger shows the stored element count but not the actual keys or values.
Dictionary Structure
The source defines struct Dictionary<Key: Hashable, Value> with a single stored property _variant of type _Variant. The _Variant struct contains a _BridgeStorage<__RawDictionaryStorage> object, mirroring the storage pattern used by Array.
4.1 Dictionary Definition
@frozen
public struct Dictionary<Key: Hashable, Value> {
public typealias Element = (key: Key, value: Value)
internal var _variant: _Variant
internal init(_native: __owned _NativeDictionary<Key, Value>) {
_variant = _Variant(native: _native)
}
#if _runtime(_ObjC)
internal init(_cocoa: __owned __CocoaDictionary) {
_variant = _Variant(cocoa: _cocoa)
}
#endif
}The _Variant struct holds a single property object of type _BridgeStorage<__RawDictionaryStorage>, which in turn points to the raw storage used by the dictionary.
extension Dictionary {
@frozen internal struct _Variant {
internal var object: _BridgeStorage<__RawDictionaryStorage>
}
}4.2 Native Dictionary
The native storage is represented by _NativeDictionary, which contains a _storage field of type __RawDictionaryStorage. This raw storage holds the hash table metadata, keys, values, and auxiliary fields such as count, capacity, scale, and seed.
@frozen internal struct _NativeDictionary<Key: Hashable, Value> {
internal var _storage: __RawDictionaryStorage
internal init() { self._storage = __RawDictionaryStorage.empty }
internal init(_ storage: __owned __RawDictionaryStorage) { self._storage = storage }
internal init(capacity: Int) { /* allocate appropriate storage */ }
}4.3 Raw Dictionary Storage
The class __RawDictionaryStorage tail‑allocates memory for a bitmap, keys, and values. It tracks the number of occupied entries ( _count), capacity ( _capacity), scale ( _scale), and a hash seed ( _seed).
internal class __RawDictionaryStorage: __SwiftNativeNSDictionary {
internal var _count: Int
internal var _capacity: Int
internal var _scale: Int8
internal var _reservedScale: Int8
internal var _extra: Int16
internal var _age: Int32
internal var _seed: Int
internal var _rawKeys: UnsafeMutableRawPointer
internal var _rawValues: UnsafeMutableRawPointer
internal var _hashTable: _HashTable {
// computed property returning a HashTable view
}
}4.4 Hash Table
The hash table is a lightweight struct that stores an array of Word (a UInt) representing bucket occupancy and a mask used for fast modulo operations.
internal struct _HashTable {
internal typealias Word = _UnsafeBitset.Word
internal var words: UnsafeMutablePointer<Word>
internal let bucketMask: Int
internal init(words: UnsafeMutablePointer<Word>, bucketCount: Int) {
self.words = words
self.bucketMask = bucketCount - 1
}
internal func bucket(wrappedAfter bucket: Bucket) -> Bucket {
return Bucket(offset: (bucket.offset + 1) & bucketMask)
}
internal func idealBucket(forHashValue hash: Int) -> Bucket {
return Bucket(offset: hash & bucketMask)
}
internal func _isOccupied(_ bucket: Bucket) -> Bool { /* check bitmap */ }
}4.5 Probing and Collision Resolution
Swift uses open addressing with quadratic probing (referred to as “linear probing” in the source comments) to resolve collisions. The find method walks buckets until it finds an empty slot or the matching key.
internal final func find<Key: Hashable>(_ key: Key, hashValue: Int) -> (bucket: _HashTable.Bucket, found: Bool) {
var bucket = hashTable.idealBucket(forHashValue: hashValue)
while hashTable._isOccupied(bucket) {
if uncheckedKey(at: bucket) == key { return (bucket, true) }
bucket = hashTable.bucket(wrappedAfter: bucket)
}
return (bucket, false)
}4.6 Copy‑On‑Write
Although Dictionary is a struct, it employs reference semantics internally. Before a mutating operation, isUniquelyReferenced() checks whether the storage is shared; if not, a new copy of the storage is created.
internal mutating func setValue(_ value: __owned Value, forKey key: Key) {
let isUnique = self.isUniquelyReferenced()
asNative.setValue(value, forKey: key, isUnique: isUnique)
}4.7 Load Factor and Resizing
The dictionary monitors its load factor (ratio of occupied slots to total slots). When the factor exceeds roughly 0.75, it allocates a larger storage (typically double the size), recomputes the hash for each key, and re‑inserts them (rehash).
static internal func allocate(scale: Int8, age: Int32?, seed: Int?) -> _DictionaryStorage {
let bucketCount = 1 << scale
// allocate tail‑elements for bitmap, keys, values
// initialize metadata, seed, etc.
return storage
}4.8 Hash Function Basics
Swift’s built‑in hash function produces a 64‑bit value. The dictionary then maps it to a bucket using hashValue & bucketMask, which is equivalent to a modulo operation when the bucket count is a power of two. This fast bitwise masking replaces expensive division.
Conflict‑Resolution Strategies
Swift’s Dictionary uses open addressing (specifically quadratic probing) rather than separate chaining. This keeps all entries in a contiguous array, improving cache locality and maintaining average O(1) lookup and insertion.
Performance Optimizations
The implementation chooses bucket counts that are powers of two, enabling the cheap mask operation. It also pre‑computes a scale value based on the desired capacity and maximum load factor, ensuring the hash table grows efficiently.
Hash Functions in Practice
Beyond the built‑in hash, developers can use cryptographic hashes such as MD5, SHA‑1, or SHA‑256. When the hash output is larger than the table size, a simple modulo (or mask) reduces it to the appropriate range without losing the uniform distribution properties.
Conclusion
The article has walked through the essential components of Swift’s Dictionary: the public struct, its variant bridge, native storage, raw storage layout, hash table mechanics, probing, copy‑on‑write, load‑factor‑driven resizing, and the underlying hash function.
Sohu Tech Products
A knowledge-sharing platform for Sohu's technology products. As a leading Chinese internet brand with media, video, search, and gaming services and over 700 million users, Sohu continuously drives tech innovation and practice. We’ll share practical insights and tech news here.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
