How Does Java’s HashMap Resolve Collisions? From Linked Lists to Red‑Black Trees
This article explains the mechanisms Java’s HashMap uses to handle hash collisions, covering the initial perturbation function, the transition from linked‑list chaining to red‑black tree conversion, the differences between Java 7 and Java 8 implementations, and the resizing logic that mitigates conflict.
During a recent interview a candidate struggled with the question "How does HashMap resolve hash collisions?" This article breaks down the complete solution used by Java's HashMap, from the initial hash perturbation to the advanced red‑black tree conversion, and explains the differences between Java 7 and Java 8 implementations.
1. Solving Hash Collisions – Basic Idea
A hash collision occurs when two distinct keys produce the same hash value and therefore map to the same bucket in a HashMap.
2. Preventing Collisions
When a key is inserted, its hashCode is processed by a perturbation function (multiple XORs and right‑shifts) to spread the hash bits more uniformly before taking the modulo of the array length. This reduces the probability that different keys land in the same bucket.
3. Dealing with Collisions – Chaining (Linked List)
If two keys map to the same bucket, Java 7 stores them in a singly‑linked list attached to that bucket. New entries are added at the head of the list.
public V put(K key, V value) {
if (table == EMPTY_TABLE) {
inflateTable(threshold); // initialize array
}
if (key == null)
return putForNullKey(value); // handle null key
int hash = hash(key); // compute hash
int i = indexFor(hash, table.length); // array index
for (Entry<K,V> e = table[i]; e != null; e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
V oldValue = e.value;
e.value = value;
e.recordAccess(this);
return oldValue;
}
}
modCount++;
addEntry(hash, key, value, i); // head‑insert
return null;
}In multithreaded scenarios, head‑insertion can create circular linked lists during concurrent resizing.
4. Dealing with Collisions – Open Addressing
Another strategy is open addressing, where the map searches for the next free slot (e.g., linear probing) instead of creating a list. This saves pointer space but can cause clustering and requires special handling for deletions.
5. Java 7 vs. Java 8 – Data Structure Choice
Java 7 uses only linked lists for collision handling. Java 8 introduces a hybrid approach: when a bucket’s list exceeds a threshold (8 entries) and the table size is at least 64, the list is transformed into a red‑black tree, and new entries are appended at the tail.
6. Java 8 Put Implementation
public V put(K key, V value) {
return putVal(hash(key), key, value, false, true);
}
final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {
Node<K,V>[] tab; Node<K,V> p; int n, i;
if ((tab = table) == null || (n = tab.length) == 0)
n = (tab = resize()).length; // init or resize
if ((p = tab[i = (n - 1) & hash]) == null)
tab[i] = newNode(hash, key, value, null);
else {
Node<K,V> e; K k;
if (p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k))))
e = p; // same key
else if (p instanceof TreeNode)
e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
else {
for (int binCount = 0; ; ++binCount) {
if ((e = p.next) == null) {
p.next = newNode(hash, key, value, null); // tail‑insert
if (binCount >= TREEIFY_THRESHOLD - 1)
treeifyBin(tab, hash);
break;
}
if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k))))
break;
p = e;
}
}
if (e != null) {
V oldValue = e.value;
if (!onlyIfAbsent || oldValue == null)
e.value = value;
afterNodeAccess(e);
return oldValue;
}
}
++modCount;
if (++size > threshold)
resize();
afterNodeInsertion(evict);
return null;
}7. Treeification Threshold
When a bucket’s list length exceeds 8 and the table length is at least 64, the list is converted into a red‑black tree to guarantee O(log n) lookup time. The threshold of 8 is derived from empirical analysis of collision probability using Poisson distribution.
8. Why Red‑Black Tree Instead of AVL?
Red‑black trees provide a good balance between insertion/deletion cost and lookup performance. They require fewer rotations than AVL trees, making them more suitable for the mixed workload of a HashMap where insertions are frequent.
9. Resizing Logic
Resizing is triggered when the number of entries exceeds threshold = capacity × loadFactor (default 0.75). During resize the table size doubles (always a power of two), and each entry’s index is recomputed as hash & (newCapacity‑1). Because the capacity is a power of two, only the newly added high‑order bit of the hash determines whether an entry stays in its old bucket or moves to the new one, avoiding a full re‑hash.
Understanding these mechanisms helps answer interview questions confidently and write more efficient Java code.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
IT Services Circle
Delivering cutting-edge internet insights and practical learning resources. We're a passionate and principled IT media platform.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
