Fundamentals 27 min read

Unveiling Swift’s Dictionary: Inside the Hash Table Mechanics

This article dives deep into Swift's Dictionary implementation, explaining its underlying hash‑table structure, memory layout, bucket calculation, probing strategies, copy‑on‑write semantics, performance optimizations, and conflict‑resolution techniques, all illustrated with real source code and debugging screenshots.

Sohu Smart Platform Tech Team
Sohu Smart Platform Tech Team
Sohu Smart Platform Tech Team
Unveiling Swift’s Dictionary: Inside the Hash Table Mechanics
Use the simplest language to describe the most difficult technology.

01 Previous Description

Previously we introduced Array; this article explains Dictionary. Although both share design ideas, the core of Dictionary lies in the Hash principle.

02 What Is Dictionary

Dictionary

is the Swift object that stores key‑value pairs.

03 Dictionary Memory View

Run test code in a test project:

var dict = ["1": "Dog", "2": "Cat", "3": "fish"]
withUnsafePointer(to: &dict) {
    print($0)
}
print("end")

Set a breakpoint at print("end") and run:

The memory address of the Dictionary is 0x00000001005469b0, which is on the heap. After inspecting the address we find that the underlying storage is a class structure similar to Array called HeapObject. Apart from the number of stored elements (3), the key and value cannot be directly found, so the following sections explore where they are stored and the underlying principles of Dictionary.

04 Dictionary Underlying Principle

4.1 Dictionary Structure

Looking at the source definition of Dictionary:

@frozen
public struct Dictionary<Key: Hashable, Value> {
    public typealias Element = (key: Key, value: Value)
    internal var _variant: _Variant
    internal init(_native: __owned _NativeDictionary<Key, Value>) {
        _variant = _Variant(native: _native)
    }
    #if _runtime(_ObjC)
    internal init(_cocoa: __owned __CocoaDictionary) {
        _variant = _Variant(cocoa: _cocoa)
    }
    #endif
    // ...
}

The only property is _variant, which is of type _Variant. The _Variant definition is in the Dictionary extension:

extension Dictionary {
    @usableFromInline
    @frozen
    internal struct _Variant {
        internal var object: _BridgeStorage<__RawDictionaryStorage>
        // ...
    }
}
_Variant

only has the object property, whose type is _BridgeStorage. This is similar to the logic used by Array.

4.2 rawKeys and rawValues

By inspecting the storage we can locate the keys and values pointers. Using lldb to print a larger address range reveals the bytes representing the strings "1", "2", "3", "4". Their order is not contiguous, which ties back to the hash principle described earlier.

4.3 Dictionary Initialization Functions

Generating SIL for Dictionary initialization shows:

public init(dictionaryLiteral elements: (Key, Value)...) {
    let native = _NativeDictionary<Key, Value>(capacity: elements.count)
    for (key, value) in elements {
        let (bucket, found) = native.find(key)
        _precondition(!found, "Dictionary literal contains duplicate keys")
        native._insert(at: bucket, key: key, value: value)
    }
    self.init(_native: native)
}

The Bucket structure is simple:

struct Bucket {
    internal var offset: Int
}

After finding the initialization method, we need to locate the core HashTable. The allocation function shows how the storage is laid out:

static internal func allocate(scale: Int8, age: Int32?, seed: Int?) -> _DictionaryStorage {
    let bucketCount = (1 as Int) << scale
    let wordCount = _UnsafeBitset.wordCount(forCapacity: bucketCount)
    let storage = Builtin.allocWithTailElems_3(
        _DictionaryStorage<Key, Value>.self,
        wordCount._builtinWordValue, _HashTable.Word.self,
        bucketCount._builtinWordValue, Key.self,
        bucketCount._builtinWordValue, Value.self)
    // ... initialize metadata ...
    return storage
}

The HashTable struct holds two properties:

struct _HashTable {
    internal var words: UnsafeMutablePointer<Word>
    internal let bucketMask: Int
    // ...
}

Each Word is a UInt that stores bits representing bucket occupancy. The bucket mask is calculated as bucketCount - 1, allowing fast modulo operation via bitwise AND.

4.4 HashTable

The bucket count is 1 << _scale. The scale is derived from the desired capacity using the following function:

static func scale(forCapacity capacity: Int) -> Int8 {
    let capacity = Swift.max(capacity, 1)
    let minimumEntries = Swift.max(
        Int((Double(capacity) / maxLoadFactor).rounded(.up)),
        capacity + 1)
    let exponent = (Swift.max(minimumEntries, 2) - 1)._binaryLogarithm() + 1
    let scale = Int8(truncatingIfNeeded: exponent)
    return scale
}

During insertion, the dictionary uses open addressing with linear probing (the code shows bucket = hashTable.bucket(wrappedAfter: bucket) which simply adds 1 and wraps around using bucketMask).

4.5 Linear Probing

internal func find<Key: Hashable>(_ key: Key, hashValue: Int) -> (bucket: _HashTable.Bucket, found: Bool) {
    let hashTable = _hashTable
    var bucket = hashTable.idealBucket(forHashValue: hashValue)
    while hashTable._isOccupied(bucket) {
        if uncheckedKey(at: bucket) == key {
            return (bucket, true)
        }
        bucket = hashTable.bucket(wrappedAfter: bucket)
    }
    return (bucket, false)
}

The next bucket is computed as:

internal func bucket(wrappedAfter bucket: Bucket) -> Bucket {
    return Bucket(offset: (bucket.offset + 1) & bucketMask)
}

This confirms the use of linear probing.

4.6 Copy‑On‑Write

Although Dictionary is a struct, it uses reference semantics internally. When multiple variables share the same storage, a real copy occurs only on write:

var dict1 = ["a": 1]
var dict2 = dict1
dict2["b"] = 2 // dict2 copies the storage here, dict1 remains unchanged

The implementation checks uniqueness before mutating:

internal mutating func setValue(_ value: __owned Value, forKey key: Key) {
    #if _runtime(_ObjC)
    if !isNative {
        let cocoa = asCocoa
        self = .init(native: _NativeDictionary<Key, Value>(cocoa, capacity: cocoa.count + 1))
    }
    #endif
    let isUnique = self.isUniquelyReferenced()
    asNative.setValue(value, forKey: key, isUnique: isUnique)
}

05 Extension

5.1 Conflict‑Resolution Strategies

Swift Dictionary uses open addressing (Open Addressing) rather than separate chaining. Common probing methods include linear probing, quadratic probing, and double hashing. Swift’s implementation primarily uses quadratic probing to balance performance and space utilization.

5.2 Performance Optimization and Resizing

The performance is closely tied to the load factor (ratio of occupied slots to total slots). When the load factor exceeds a threshold (≈0.75), the dictionary automatically resizes, typically doubling the capacity, rehashing all keys, and updating metadata. Swift also chooses a capacity large enough to avoid frequent resizing.

5.3 Hash Algorithm Basics

Hash functions map a key to an index. Simple methods include division (key % M) where M is a prime, and square‑mid method. In practice, Swift uses built‑in hash functions (e.g., MD5, SHA1) and then reduces the result with modulo to fit the table size.

5.4 Hash Table in Computing

A hash table stores data by mapping keys to positions via a hash function. The function must be surjective and uniformly distribute keys. Common hash functions (MD5, SHA1, SHA256) produce large outputs, which are then reduced with modulo to the table size.

06 Conclusion

This article has explained the core principles of Swift’s Dictionary, covering its memory layout, hash table mechanics, probing, copy‑on‑write, conflict resolution, and performance considerations. The private seed used for hashing can be thought of as similar to an MD5 seed.

07 Reference Documents

Hash Table Algorithm Explained and Implemented: https://blog.csdn.net/u013752202/article/details/51104156

Open Addressing: https://www.cnblogs.com/hongshijie/p/9419387.html

Hash Table (Separate Chaining) Detailed Explanation: https://blog.csdn.net/duan19920101/article/details/51579136/

SwiftData Structureshash tableCopy-on-Writedictionary
Sohu Smart Platform Tech Team
Written by

Sohu Smart Platform Tech Team

The Sohu News app's technical sharing hub, offering deep tech analyses, the latest industry news, and fun developer anecdotes. Follow us to discover the team's daily joys.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.