Understanding Java HashMap: Structure, Operations, and Internals
This article explains the internal structure and behavior of Java's HashMap, covering its node and tree node classes, default parameters, hash calculation, resizing mechanism, and the conditions under which linked lists are transformed into red‑black trees, along with code examples and performance considerations.
Introduction
HashMap first appeared in JDK 1.2 and is based on a hashing algorithm. It allows null keys and values, but it is not thread‑safe: concurrent structural modification can corrupt the map or lose updates, so concurrent code should use ConcurrentHashMap instead.
In JDK 1.8, HashMap stores entries in an array of buckets, where each bucket holds either a singly linked list of nodes or a red‑black tree.
Why are some buckets linked lists while others are red‑black trees?
When a bucket's linked list reaches TREEIFY_THRESHOLD (8) entries and the table capacity is at least MIN_TREEIFY_CAPACITY (64), the list is converted to a red‑black tree; if the table is still smaller than 64 buckets, the map resizes instead of treeifying.
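Collision handling can be observed from the outside with a deliberately bad key. The class below is hypothetical (not from the JDK): its hashCode is constant, so every instance lands in the same bucket, yet lookups still work because HashMap falls back on equals along the chain (or tree).

```java
import java.util.HashMap;
import java.util.Map;

public class CollisionDemo {
    // Hypothetical key whose hashCode is intentionally constant:
    // every instance collides into the same bucket.
    static final class BadKey {
        final int id;
        BadKey(int id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }
        @Override public int hashCode() { return 42; }
    }

    public static void main(String[] args) {
        Map<BadKey, Integer> map = new HashMap<>();
        for (int i = 0; i < 20; i++) {
            map.put(new BadKey(i), i); // all 20 keys share one bucket
        }
        System.out.println(map.size());             // 20
        System.out.println(map.get(new BadKey(7))); // 7
    }
}
```

Lookups in such a bucket degrade from O(1) to O(n) for a list, or O(log n) once the bucket has been treeified, which is exactly the degradation the tree conversion is designed to bound.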
Structure
Node is a static inner class of HashMap:
static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    V value;
    Node<K,V> next;

    Node(int hash, K key, V value, Node<K,V> next) {
        this.hash = hash;
        this.key = key;
        this.value = value;
        this.next = next;
    }

    public final K getKey()        { return key; }
    public final V getValue()      { return value; }
    public final String toString() { return key + "=" + value; }

    public final int hashCode() {
        return Objects.hashCode(key) ^ Objects.hashCode(value);
    }

    public final V setValue(V newValue) {
        V oldValue = value;
        value = newValue;
        return oldValue;
    }

    public final boolean equals(Object o) {
        if (o == this)
            return true;
        if (o instanceof Map.Entry) {
            Map.Entry<?,?> e = (Map.Entry<?,?>)o;
            if (Objects.equals(key, e.getKey()) &&
                Objects.equals(value, e.getValue()))
                return true;
        }
        return false;
    }
}

TreeNode is the data structure used for the red‑black tree:
static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
    TreeNode<K,V> parent;  // red‑black tree links
    TreeNode<K,V> left;
    TreeNode<K,V> right;
    TreeNode<K,V> prev;    // needed to unlink next upon deletion
    boolean red;

    TreeNode(int hash, K key, V val, Node<K,V> next) {
        super(hash, key, val, next);
    }

    /** Returns root of tree containing this node. */
    final TreeNode<K,V> root() {
        for (TreeNode<K,V> r = this, p;;) {
            if ((p = r.parent) == null)
                return r;
            r = p;
        }
    }
}

Class Definition
public class HashMap<K,V> extends AbstractMap<K,V>
    implements Map<K,V>, Cloneable, Serializable

Fields

/** Default initial capacity 16 (must be a power of two) */
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;

/** Maximum capacity 2^30 */
static final int MAXIMUM_CAPACITY = 1 << 30;

/** Default load factor */
static final float DEFAULT_LOAD_FACTOR = 0.75f;

/** Threshold for converting a bucket list to a tree */
static final int TREEIFY_THRESHOLD = 8;

/** Threshold for converting a tree back to a list */
static final int UNTREEIFY_THRESHOLD = 6;

/** Minimum table capacity for treeification */
static final int MIN_TREEIFY_CAPACITY = 64;

transient Node<K,V>[] table;            // bucket array
transient Set<Map.Entry<K,V>> entrySet;
transient int size;                     // number of key‑value pairs
transient int modCount;                 // structural modification count
int threshold;                          // resize threshold
final float loadFactor;                 // load factor

Constructors
/** Construct with initial capacity and default load factor */
public HashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}

/** Default constructor */
public HashMap() {
    this.loadFactor = DEFAULT_LOAD_FACTOR; // other fields defaulted
}

/** Construct with initial capacity and load factor */
public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " + initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " + loadFactor);
    this.loadFactor = loadFactor;
    this.threshold = tableSizeFor(initialCapacity);
}

/** Returns the smallest power‑of‑two >= cap */
static final int tableSizeFor(int cap) {
    int n = cap - 1;
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}

Detailed explanation of tableSizeFor
Uses bitwise operations to find the smallest power of two greater than or equal to the given capacity (e.g., 10 → 16).
Subtract 1 from cap first so that when cap is already a power of two, the result is cap itself rather than the next power of two.
Right‑shift and OR repeatedly (1, 2, 4, 8, 16 bits) to propagate the highest set bit downward, filling every lower bit with 1.
Add 1 to roll the all‑ones pattern over to the next power of two.
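The steps above can be checked with a standalone copy of the method (the class name here is just for illustration):

```java
public class TableSizeDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    /** Same bit trick as HashMap.tableSizeFor (JDK 8 variant). */
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(10)); // 16
        System.out.println(tableSizeFor(16)); // 16, not 32, thanks to cap - 1
        System.out.println(tableSizeFor(17)); // 32
    }
}
```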
Load Factor
The load factor determines how full the bucket array may become before resizing. Lowering it reduces collisions (more space, faster operations), while increasing it saves space at the cost of more collisions (slower operations).
In most cases the default 0.75 provides a good balance, so it is not recommended to change it.
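Rather than tuning the load factor, a common idiom when the expected entry count is known is to request a capacity large enough that the threshold exceeds that count, avoiding intermediate resizes. The helper below is a sketch of that idiom (the method name is ours, not a JDK API), assuming the default 0.75 load factor:

```java
import java.util.HashMap;
import java.util.Map;

public class PresizeDemo {
    /** Capacity to request so that `expected` entries fit without a resize
        under the default 0.75 load factor. */
    static int capacityFor(int expected) {
        return (int) (expected / 0.75f) + 1;
    }

    public static void main(String[] args) {
        int expected = 100;
        // 100 / 0.75 + 1 = 134; tableSizeFor rounds it up to 256,
        // giving threshold 192, so no resize happens while inserting 100 entries.
        Map<String, Integer> map = new HashMap<>(capacityFor(expected));
        for (int i = 0; i < expected; i++) {
            map.put("key" + i, i);
        }
        System.out.println(map.size()); // 100
    }
}
```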
Search (get)
public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}

static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        if (first.hash == hash &&
            ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        if ((e = first.next) != null) {
            if (first instanceof TreeNode)
                return ((TreeNode<K,V>)first).getTreeNode(hash, key);
            do {
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    return null;
}

Add (put)
public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}

Note: a newly created new HashMap() does not allocate the bucket array until the first put operation.
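The return values of putVal are observable through the public API: put returns the previous value (or null for a new key), and the very first put on a fresh map triggers the lazy table allocation described above. A short usage example:

```java
import java.util.HashMap;
import java.util.Map;

public class PutGetDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>(); // table not allocated yet
        System.out.println(map.put("a", 1));    // null: no previous mapping
        System.out.println(map.put("a", 2));    // 1: old value returned, value replaced
        System.out.println(map.get("a"));       // 2
        System.out.println(map.get("missing")); // null
        map.put(null, 0);                       // null keys hash to 0, so bucket 0
        System.out.println(map.get(null));      // 0
    }
}
```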
Resize Mechanism
HashMap buckets always have a length that is a power of two. When the number of entries exceeds threshold = capacity × loadFactor , the map is resized to double its current capacity, and all entries are rehashed into the new bucket array.
final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        } else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                   oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    } else if (oldThr > 0) {
        newCap = oldThr; // initial capacity was placed in threshold
    } else {
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY) ?
                 (int)ft : Integer.MAX_VALUE;
    }
    threshold = newThr;
    @SuppressWarnings({"rawtypes","unchecked"})
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;
    if (oldTab != null) {
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode)
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else { // preserve order
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null) loHead = e;
                            else loTail.next = e;
                            loTail = e;
                        } else {
                            if (hiTail == null) hiHead = e;
                            else hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}

Overall Steps of Resizing
Calculate new capacity (newCap) and new threshold (newThr).
Create a new bucket array of size newCap; this also initializes the table.
Rehash existing entries into the new array. TreeNode entries are split; ordinary nodes are grouped while preserving order.
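The lo/hi grouping works because after doubling, the new bucket index differs from the old one in exactly one bit: the bit selected by oldCap. If (hash & oldCap) == 0 the entry stays at index j; otherwise it moves to j + oldCap. A standalone arithmetic sketch (the helper name is ours):

```java
public class SplitDemo {
    /** New bucket index after the table doubles from oldCap to 2 * oldCap. */
    static int newIndex(int hash, int oldCap) {
        int j = hash & (oldCap - 1);                  // old index
        return (hash & oldCap) == 0 ? j : j + oldCap; // lo stays, hi moves
    }

    public static void main(String[] args) {
        int oldCap = 16;
        int h1 = 5;   // 0b00101: bit 16 clear -> stays at index 5
        int h2 = 21;  // 0b10101: bit 16 set   -> moves to 5 + 16 = 21
        System.out.println(newIndex(h1, oldCap)); // 5
        System.out.println(newIndex(h2, oldCap)); // 21
        // Sanity check: the shortcut agrees with recomputing hash & (newCap - 1).
        System.out.println(newIndex(h2, oldCap) == (h2 & (2 * oldCap - 1))); // true
    }
}
```

This is why JDK 1.8 never needs to recompute hashes during a resize: one bit test decides each entry's destination.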
Common Expansion Scenarios
Using the default constructor: the first resize creates a table of size DEFAULT_INITIAL_CAPACITY (16) with threshold = 16 × 0.75 = 12 .
Specifying an initial capacity: the constructor stores tableSizeFor(initialCapacity) in threshold; on the first put, the table is created with that size and threshold becomes tableSize × DEFAULT_LOAD_FACTOR.
Subsequent resizes: both the table size and threshold double each time.
Advanced Questions
1. Why does JDK 1.7 use an array plus singly linked list (not doubly linked list) for buckets?
Linked lists resolve hash collisions. A singly linked list uses less memory than a doubly linked list, which is sufficient for collision handling.
2. Why choose a red‑black tree instead of a generic balanced binary tree?
A red‑black tree guarantees O(log n) lookup where an unbalanced binary search tree can degrade to O(n). Compared with strictly balanced AVL trees, red‑black trees need fewer rotations on insertion and deletion, so they trade a slightly looser balance for cheaper updates, a good compromise for a bucket structure.
3. Why must equals and hashCode be overridden together when using objects as HashMap keys?
If two objects are equal according to equals , they must have the same hashCode . HashMap first compares hash codes to quickly eliminate non‑matching keys before invoking the potentially expensive equals method.
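As an illustration, consider a hypothetical Point key class where both methods are overridden together, derived from the same fields:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class KeyContractDemo {
    // Hypothetical key class: equals and hashCode are overridden together,
    // based on the same fields, so equal objects share a hash code.
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }

        @Override public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }

        @Override public int hashCode() {
            return Objects.hash(x, y);
        }
    }

    public static void main(String[] args) {
        Map<Point, String> map = new HashMap<>();
        map.put(new Point(1, 2), "marker");
        // A distinct but equal instance finds the entry, because hashCode
        // selects the right bucket and equals confirms the match.
        System.out.println(map.get(new Point(1, 2))); // marker
    }
}
```

If hashCode were left as the Object default (identity-based), the second Point(1, 2) would likely hash to a different bucket and the lookup would return null despite the keys being equals.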
4. Why does HashMap not use the raw hashCode() value directly?
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

The additional shift‑and‑xor mixes high‑order bits into the low‑order bits. Because the bucket index is computed as (n - 1) & hash and n is usually small, only the low bits would otherwise participate, so keys that differ only in their high bits would all collide; the mixing step lets those high bits influence bucket selection.
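The effect can be checked numerically with a copy of the spreading step (the class and method names here are ours):

```java
public class HashMixDemo {
    /** Same spreading step as HashMap.hash, minus the null check. */
    static int mix(int h) {
        return h ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int n = 16;      // table size: the index uses only the low 4 bits
        int a = 0x10000; // two hash codes differing only in high bits
        int b = 0x20000;
        // Without mixing, both collide into bucket 0.
        System.out.println(a & (n - 1));      // 0
        System.out.println(b & (n - 1));      // 0
        // With mixing, the high bits reach the index and the keys separate.
        System.out.println(mix(a) & (n - 1)); // 1
        System.out.println(mix(b) & (n - 1)); // 2
    }
}
```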
5. If red‑black trees are so efficient, why does HashMap only convert a bucket to a tree when it contains more than 8 entries?
Tree operations (rotations) add overhead, and tree nodes are considerably larger than ordinary list nodes. For short chains (fewer than 8 entries) a linked list is faster for insertion and lookup in practice. With a well‑distributed hash function, a bucket reaching 8 entries is extremely rare, so treeification is a defensive measure against pathological hashing rather than the common path; like the default load factor, the threshold of 8 is an empirically chosen compromise.