Fundamentals 30 min read

Uncovering Java’s HashMap: How Arrays, Linked Lists, and Red‑Black Trees Work Together

This article explains the internal structure and algorithms of Java’s HashMap, covering its array‑based storage, collision handling with linked lists and red‑black trees, key methods like put, get, resize, and thread‑safety considerations, complete with code examples and diagrams.

Programmer DD

Oct 10, 2020

Uncovering Java’s HashMap: How Arrays, Linked Lists, and Red‑Black Trees Work Together

1 Overview

HashMap is implemented on top of a hash table; each element is a key‑value pair. Collisions are resolved with a singly linked list. When the number of entries exceeds a threshold, the table automatically grows. HashMap is not thread‑safe and should be replaced by ConcurrentHashMap in multithreaded scenarios. It implements Serializable and Cloneable, allowing serialization and cloning, but it does not guarantee any ordering of the entries.

In Java 8 the implementation was optimized by adding a red‑black tree structure for buckets with many collisions.

2 HashMap Data Structure

In Java the two fundamental structures are arrays and references (pointers). HashMap combines an array with linked lists (and, from Java 8, with red‑black trees). The main array stores Node objects; each bucket may contain a linked list of entries, which can be transformed into a tree when the chain becomes long.

The basic layout can be visualized as an array where each slot holds a linked list of Entry objects. When a new HashMap is created, an array of default capacity 16 is allocated.

3 Three Views and Iterators

HashMap provides three collection views— keySet(), values(), and entrySet() —and corresponding iterators to traverse keys, values, or entries.

public class HashMapExam {
    public static void main(String[] args) {
        Map map = new HashMap(16);
        for (int i = 0; i < 15; i++) {
            map.put(i, new String(new char[]{(char) ('A' + i)}));
        }
        System.out.println("======keySet=======");
        Set set = map.keySet();
        Iterator iterator = set.iterator();
        while (iterator.hasNext()) {
            System.out.println(iterator.next());
        }
        System.out.println("======values=======");
        Collection values = map.values();
        Iterator stringIterator = values.iterator();
        while (stringIterator.hasNext()) {
            System.out.println(stringIterator.next());
        }
        System.out.println("======entrySet=======");
        for (Map.Entry entry : map.entrySet()) {
            System.out.println(entry);
        }
    }
}

4 Source Code Analysis

4.1 Constants

static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // 16
static final int MAXIMUM_CAPACITY = 1 << 30;      // 1,073,741,824
static final float DEFAULT_LOAD_FACTOR = 0.75f;
static final int TREEIFY_THRESHOLD = 8;          // when a bucket's list reaches 8, convert to tree
static final int UNTREEIFY_THRESHOLD = 6;        // when a tree's size falls to 6, convert back to list
static final int MIN_TREEIFY_CAPACITY = 64;      // treeification only occurs if table size >= 64

The load factor of 0.75 balances memory consumption and lookup cost. The thresholds 8 and 6 prevent frequent conversion between list and tree.

4.2 tableSizeFor

This method returns the smallest power‑of‑two value that is greater than or equal to the given capacity.

static final int tableSizeFor(int cap) {
    int n = cap - 1;
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}

Ensuring the capacity is a power of two allows the index calculation to use a bitwise & operation instead of a modulo, which is much faster.

4.3 Node class

static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    V value;
    Node<K,V> next;
    // getters, setters, hashCode, equals omitted for brevity
}

Each Node forms a singly linked list within a bucket.

4.4 TreeNode class

static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
    TreeNode<K,V> parent;
    TreeNode<K,V> left;
    TreeNode<K,V> right;
    TreeNode<K,V> prev; // needed to unlink on deletion
    boolean red;
    // tree‑specific methods omitted for brevity
}

From Java 8 onward, a bucket that exceeds TREEIFY_THRESHOLD is transformed into a red‑black tree to guarantee O(log n) lookup time.

4.5 hash method

Java 8 applies a supplemental hash function that mixes the high and low bits of the original hashCode():

static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

This improves the distribution of hash values when the table size is a power of two.

4.6 put method

public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {
    Node[] tab; Node p; int n, i;
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node e; K k;
        if (p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        else if (p instanceof TreeNode)
            e = ((TreeNode)p).putTreeVal(this, tab, hash, key, value);
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    if (binCount >= TREEIFY_THRESHOLD - 1)
                        treeifyBin(tab, hash);
                    break;
                }
                if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        if (e != null) {
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}

The method first checks for an empty bucket, then handles three cases: direct replacement, insertion into an existing tree, or insertion into a linked list (with possible treeification).

4.7 resize

final Node[] resize() {
    Node[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY && oldCap >= DEFAULT_INITIAL_CAPACITY) {
            newThr = oldThr << 1; // double threshold
        }
    } else if (oldThr > 0) {
        newCap = oldThr;
    } else {
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY) ? (int)ft : Integer.MAX_VALUE;
    }
    threshold = newThr;
    @SuppressWarnings({"rawtypes","unchecked"})
    Node[] newTab = (Node[])new Node[newCap];
    table = newTab;
    if (oldTab != null) {
        for (int j = 0; j < oldCap; ++j) {
            Node e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode)
                    ((TreeNode)e).split(this, newTab, j, oldCap);
                else {
                    Node loHead = null, loTail = null;
                    Node hiHead = null, hiTail = null;
                    Node next;
                    do {
                        next = e.next;
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null) loHead = e;
                            else loTail.next = e;
                            loTail = e;
                        } else {
                            if (hiTail == null) hiHead = e;
                            else hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}

Resize creates a new array (usually double the size) and re‑hashes existing entries. For linked‑list buckets it may split the list into two new buckets based on the new capacity.

4.8 remove method

final Node removeNode(int hash, Object key, Object value, boolean matchValue, boolean movable) {
    Node[] tab; Node p; int n, index;
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (p = tab[index = (n - 1) & hash]) != null) {
        Node node = null, e; K k; V v;
        if (p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k))))
            node = p;
        else if ((e = p.next) != null) {
            if (p instanceof TreeNode)
                node = ((TreeNode)p).getTreeNode(hash, key);
            else {
                do {
                    if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k)))) {
                        node = e;
                        break;
                    }
                    p = e;
                } while ((e = e.next) != null);
            }
        }
        if (node != null && (!matchValue || (v = node.value) == value || (value != null && value.equals(v)))) {
            if (node instanceof TreeNode)
                ((TreeNode)node).removeTreeNode(this, tab, movable);
            else if (node == p)
                tab[index] = node.next;
            else
                p.next = node.next;
            ++modCount;
            --size;
            afterNodeRemoval(node);
            return node;
        }
    }
    return null;
}

The method handles removal from both tree and list buckets, updating the size and modification count.

4.9 get method

public V get(Object key) {
    Node e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}

final Node getNode(int hash, Object key) {
    Node[] tab; Node first, e; int n; K k;
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        if (first.hash == hash && ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        if ((e = first.next) != null) {
            if (first instanceof TreeNode)
                return ((TreeNode)first).getTreeNode(hash, key);
            do {
                if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    return null;
}

Lookup first checks the bucket’s first node, then a possible tree, and finally traverses the linked list.

4.10 Fast‑fail behavior

Iterators capture the map’s modCount at creation. If the map is structurally modified after the iterator is obtained, the iterator’s next() or remove() methods throw ConcurrentModificationException, alerting the developer to a potential thread‑safety issue.

HashIterator() {
    expectedModCount = modCount;
    if (size > 0) {
        Entry[] t = table;
        while (index < t.length && (next = t[index++]) == null)
            ;
    }
}

4.11 Thread‑safety solutions

In single‑threaded code, avoid modifying the map directly while iterating; either use the iterator’s remove() method or finish iteration before calling map mutating methods. In multithreaded environments, wrap the map with Collections.synchronizedMap or use ConcurrentHashMap to obtain a thread‑safe implementation.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Java performance algorithm Data Structure HashMap

Written by

Programmer DD

A tinkering programmer and author of "Spring Cloud Microservices in Action"

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.