
How Python’s dict Uses Hash Tables and Open Addressing Explained

This article explains how Python implements dictionaries with hash tables, details the hash function for strings, demonstrates collision handling via open addressing and quadratic probing, and walks through the underlying C structures, initialization, insertion, resizing, and deletion processes.

MaGe Linux Operations

Dictionary Basics

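In Python, a dictionary is an associative array: values are stored and retrieved by key rather than by position. Conceptually it can be viewed as two linked arrays, one holding the keys and one holding the corresponding values. Inserting a few key/value pairs and then looking up a key that was never added, which raises KeyError, covers the basic behavior.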
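A minimal sketch of that basic behavior follows. The keys and values are illustrative choices for this sketch, not taken from the original article.

```python
# Illustrative keys and values (assumed for this sketch).
d = {}
d["a"] = 1
d["b"] = 2
d["z"] = 26

print(d["a"])  # look up an existing key

# Accessing a key that was never inserted raises KeyError.
try:
    d["missing"]
except KeyError as exc:
    missing_key = exc.args[0]
    print("KeyError for key:", missing_key)
```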

Hash Tables

Python dictionaries are built on hash tables: each key is passed through a hash function, and the resulting hash is reduced to an index into an underlying array. The hash function aims to distribute keys uniformly across the array, but collisions still occur when different keys map to the same index.

The details below follow the CPython 2.x implementation. For string keys, Python uses a simple hash function; for example, hash('a') on a 64‑bit platform yields 12416037344. With an array length of 8, the index is the low three bits of the hash: hash('a') & 7 = 0. Keys 'b' and 'z' both map to index 3, which is a collision.
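The numbers above can be reproduced with a small re-implementation of the classic string hash. This is a sketch under assumptions: it models a 64-bit CPython 2.x build with hash randomization disabled, and the function name py2_string_hash is made up for illustration. On Python 3 the built-in hash() of a string is randomized per process by default, so it will not return these values.

```python
def py2_string_hash(s):
    """Sketch of the classic CPython 2.x string hash (64-bit build).

    Assumes hash randomization is off, i.e. a prefix and suffix of 0.
    """
    data = s.encode("latin-1")      # treat the string as raw bytes
    if not data:
        return 0
    mask64 = (1 << 64) - 1          # emulate 64-bit wraparound
    x = (data[0] << 7) & mask64
    for byte in data:
        x = ((1000003 * x) ^ byte) & mask64
    x ^= len(data)
    if x >= 1 << 63:                # reinterpret as a signed 64-bit value
        x -= 1 << 64
    if x == -1:                     # -1 is reserved as an error code in C
        x = -2
    return x


table_size = 8
for key in ("a", "b", "z"):
    h = py2_string_hash(key)
    # The slot index is the hash masked to the table length.
    print(key, h, h & (table_size - 1))
```

With a table of 8 slots, 'a' lands in slot 0 while 'b' and 'z' both land in slot 3, matching the collision described above.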

Open Addressing

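Python resolves collisions with open addressing: when the target slot is already taken, other slots of the same array are probed, following a quadratic probing sequence. Each step computes j = 5*j + 1 + perturb and then shifts perturb right by 5 bits; the next candidate slot is j masked to the array length. Because perturb starts out as the full hash, the higher bits of the hash also influence the sequence.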

When the array length is 32, the probe sequence starting from j = 3 progresses as 3 → 11 → 19 → 29 → 5 → 6 → 16 → 31 → 28 → 13 → 2 …
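The recurrence can be sketched as a generator. The function name probe_sequence and the sample hash values are assumptions for illustration. The exact slots visited depend on the full hash, not only on the starting slot, so the output here is not expected to match the sequence quoted above.

```python
from itertools import islice

PERTURB_SHIFT = 5
MASK64 = (1 << 64) - 1


def probe_sequence(hash_value, table_size):
    """Yield candidate slots for hash_value in a table of table_size slots.

    table_size is assumed to be a power of two.
    """
    mask = table_size - 1
    perturb = hash_value & MASK64   # treat the hash as unsigned
    j = perturb & mask              # first candidate: hash masked to the table
    yield j
    while True:
        j = (5 * j + 1 + perturb) & MASK64
        perturb >>= PERTURB_SHIFT   # feed in higher hash bits, a few at a time
        yield j & mask


first = list(islice(probe_sequence(12416037344, 32), 11))
print(first)

# Once perturb reaches 0 the recurrence is j = 5*j + 1 modulo the table
# size, which cycles through every slot, so a free slot is always found.
visited = set(islice(probe_sequence(3, 32), 64))
print(len(visited))
```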

C Implementation of dict

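In the underlying C implementation, each entry in the table stores a hash, a key, and a value. The dictionary object itself carries these important fields: ma_fill (the number of active plus dummy slots), ma_used (the number of active slots), ma_mask (the array length minus 1, used to turn a hash into an index), ma_table (a pointer to the array), and ma_smalltable (the initial 8‑slot array).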

Dictionary Initialization

When a dict is first created, PyDict_New() sets up the initial table: ma_table points at the built‑in 8‑slot ma_smalltable, ma_mask is set to 7, and the ma_fill and ma_used counters start at 0.
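The fields and their initial state can be modeled in Python as follows. This is only a sketch: the class names Entry and DictModel and the entry attribute names are illustrative, and the real implementation is a C struct rather than Python objects.

```python
MINSIZE = 8  # length of the initial small table


class Entry:
    """One slot of the table (attribute names are illustrative)."""

    def __init__(self):
        self.hash = 0
        self.key = None     # None stands in for a NULL key: an unused slot
        self.value = None


class DictModel:
    """Python model of the dict object's bookkeeping fields."""

    def __init__(self):
        self.ma_smalltable = [Entry() for _ in range(MINSIZE)]
        self.ma_table = self.ma_smalltable  # start out on the small table
        self.ma_mask = MINSIZE - 1          # array length - 1
        self.ma_fill = 0                    # active + dummy slots
        self.ma_used = 0                    # active slots


d = DictModel()
print(d.ma_mask, d.ma_fill, d.ma_used, len(d.ma_table))
```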

Adding Items

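Insertion goes through PyDict_SetItem(), which obtains the key's hash (a string key can reuse its cached hash instead of recomputing it) and calls insertdict(). If adding a new key leaves at least two‑thirds of the array's slots filled, counting both active and dummy slots, the table is resized so that lookups stay O(1) on average.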

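The insertdict() function relies on lookdict_string() to locate the slot for the key, applying the same hash‑and‑mask calculation and, if that slot is taken by a different key, the probing sequence. If the key is already present, its value is replaced. Otherwise the entry goes into the first dummy slot met along the probe path, if there was one, or else into the unused slot that ended the search.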
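The lookup and insertion logic can be sketched in Python as follows. The names lookup, insert, UNUSED and DUMMY are illustrative, slots are modeled as (hash, key, value) tuples, and the actual resize is left out: insert() only reports whether the two-thirds threshold has been reached. Integer keys are used because their hashes are predictable in Python 3.

```python
UNUSED = None
DUMMY = object()            # marker for a deleted slot
MASK64 = (1 << 64) - 1


def lookup(table, key, hash_value):
    """Return the slot index holding key, or the slot where it should go.

    Assumes the table always has at least one unused slot.
    """
    mask = len(table) - 1
    perturb = hash_value & MASK64
    j = perturb & mask
    free_slot = None
    while True:
        index = j & mask
        slot = table[index]
        if slot is UNUSED:
            # Key is absent: prefer a dummy slot seen earlier on the path.
            return index if free_slot is None else free_slot
        if slot is DUMMY:
            if free_slot is None:
                free_slot = index
        elif slot[0] == hash_value and slot[1] == key:
            return index
        j = (5 * j + 1 + perturb) & MASK64
        perturb >>= 5


def insert(table, counts, key, value):
    """Insert or update key; return True if a resize would now be due."""
    h = hash(key)
    index = lookup(table, key, h)
    if table[index] is UNUSED:
        counts["fill"] += 1
        counts["used"] += 1
    elif table[index] is DUMMY:
        counts["used"] += 1     # reusing a dummy slot: fill is unchanged
    table[index] = (h, key, value)
    return counts["fill"] * 3 >= len(table) * 2


table = [UNUSED] * 8
counts = {"fill": 0, "used": 0}
insert(table, counts, 1, "one")
insert(table, counts, 9, "nine")        # collides with key 1 at slot 1
slot_of_9 = lookup(table, 9, hash(9))

table[1] = DUMMY                        # simulate deleting key 1
counts["used"] -= 1
insert(table, counts, 17, "seventeen")  # also starts at slot 1
slot_of_17 = lookup(table, 17, hash(17))
print(slot_of_9, slot_of_17, counts)
```

Key 9 is pushed to another slot by the collision, while key 17 reuses the dummy slot left behind by the deleted key.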

Resizing

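When at least 2/3 of the slots are filled, dictresize() allocates a new array sized from the number of active slots: the required minimum is four times that number for moderate sizes, or twice that number for very large tables, which keeps the table sparse and reduces future collisions. The new length starts at 8 and is doubled until it exceeds the required minimum (e.g., 8 → 16 → 32). The active entries are then reinserted into the new array, and dummy slots are left behind.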
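The sizing rule can be sketched as two small functions. The names needs_resize and new_table_size are illustrative, and the cutoff of 50000 active entries for "very large" is an assumption based on the CPython 2.x source rather than something stated in this article.

```python
MINSIZE = 8
LARGE_DICT = 50000  # assumed cutoff for "very large" tables


def needs_resize(fill, table_size):
    """True once active + dummy slots reach two-thirds of the table."""
    return fill * 3 >= table_size * 2


def new_table_size(used):
    """Smallest power-of-two length, starting at 8, above the minimum."""
    factor = 2 if used > LARGE_DICT else 4
    min_used = factor * used
    size = MINSIZE
    while size <= min_used:
        size <<= 1      # 8 -> 16 -> 32 -> ...
    return size


print(needs_resize(5, 8), needs_resize(6, 8))
print(new_table_size(6))
```

With 6 active slots in an 8-slot table, the minimum is 24, so the length doubles from 8 to 16 to 32.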

Deleting Items

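Deletion calls PyDict_DelItem(), which computes the key's hash, locates the entry via lookdict_string(), and marks the slot as a dummy rather than as unused, so that probe sequences passing through it still reach entries stored further along. ma_used decreases, but ma_fill does not. The table does not shrink immediately; however, a subsequent insertion may trigger a resize if the combined count of active and dummy slots crosses the threshold.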
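The reason for the dummy marker can be shown with a small sketch. The names find, delete, UNUSED and DUMMY are illustrative, slots are modeled as (key, value) tuples, and the table is pre-filled by hand with two integer keys that collide in an 8-slot table.

```python
UNUSED = None
DUMMY = object()            # marker left behind by a deletion
MASK64 = (1 << 64) - 1


def find(table, key):
    """Return the index holding key, or None if the key is absent.

    Assumes the table always has at least one unused slot.
    """
    mask = len(table) - 1
    perturb = hash(key) & MASK64
    j = perturb & mask
    while table[j & mask] is not UNUSED:
        slot = table[j & mask]
        if slot is not DUMMY and slot[0] == key:
            return j & mask
        j = (5 * j + 1 + perturb) & MASK64
        perturb >>= 5
    return None


def delete(table, counts, key):
    index = find(table, key)
    if index is None:
        raise KeyError(key)
    table[index] = DUMMY    # keep the probe chain intact
    counts["used"] -= 1     # fill is deliberately left unchanged


# Keys 1 and 9 both start at slot 1; key 9 was pushed on to slot 7.
table = [UNUSED] * 8
table[1] = (1, "one")
table[7] = (9, "nine")
counts = {"fill": 2, "used": 2}

delete(table, counts, 1)
found_with_dummy = find(table, 9)

# If the slot were reset to UNUSED instead, the search for 9 would stop early.
broken = list(table)
broken[1] = UNUSED
found_without_dummy = find(broken, 9)
print(found_with_dummy, found_without_dummy, counts)
```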

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
