Fundamentals 23 min read

Why Java’s HashMap Combines Arrays, Linked Lists, and Red‑Black Trees

This article explains the fundamentals of hash tables, compares their performance with arrays, linked lists, and binary trees, describes how Java's HashMap handles collisions using chaining and treeification, and details the evolution from JDK 7 to JDK 8, including the design of ConcurrentHashMap for thread safety.

ITFLY8 Architecture Home
ITFLY8 Architecture Home
ITFLY8 Architecture Home
Why Java’s HashMap Combines Arrays, Linked Lists, and Red‑Black Trees

1. What Is a Hash Table

Before discussing hash tables, we compare the basic performance of other data structures for insertion and lookup.

Array : Direct indexing gives O(1) lookup, but insertion or deletion requires shifting elements, O(n). Searching by value is O(n) (or O(log n) for sorted arrays).

Linked List : Insertion and deletion after locating the position are O(1); searching is O(n).

Binary Tree : A balanced ordered binary tree provides O(log n) for insertion, lookup, and deletion.

Compared with these, a hash table stores entries in an array and, ignoring collisions, can perform add, delete, and lookup in O(1) time.

When adding or searching for an element, the key is passed through a hash function f(key) that maps it to an array index; the operation then completes with a single array access.

2. Hash Collisions

If two keys produce the same array index, a hash collision occurs. Java resolves collisions primarily with chaining (linked lists) and, in JDK 8, with red‑black trees when a bucket becomes too long.

JDK 7 uses an array of Entry objects linked together.

JDK 8 uses an array of Node objects that may form a linked list or a red‑black tree.

3. HashMap Implementation Details

HashMap’s core is an Entry array; each Entry holds a key‑value pair.

transient Entry<K,V>[] table = (Entry<K,V>[]) EMPTY_TABLE;

The Entry class is a static inner class.

The overall structure shows an array of buckets, each potentially holding a linked list.

Longer bucket chains degrade performance; JDK 7 inserts new entries at the head of the list, while JDK 8 inserts at the tail.

JDK 7 Insertion (head insertion)

void addEntry(int hash, K key, V value, int bucketIndex) {
    if ((size >= threshold) && (null != table[bucketIndex])) {
        resize(2 * table.length);
        hash = (null != key) ? hash(key) : 0;
        bucketIndex = indexFor(hash, table.length);
    }
    createEntry(hash, key, value, bucketIndex);
}

JDK 8 Insertion (tail insertion)

if (e == null) {
    // reached the end of the list, insert new node
    newNode(hash, key, value, null);
}

4. Default Parameters of HashMap

Why is the default entry array length 16 and why must it be a power of two?

HashMap computes the bucket index as hash & (length‑1). Using a length that is a power of two makes the mask length‑1 consist of all 1‑bits, ensuring the low bits of the hash are used directly, which promotes uniform distribution.

Example: key "book" has hashcode 3029737 (binary 1011100011101011101001). With default length 16, length‑1 = 1111. The index is hashcode & 1111 = 1001 (9).

If the length were 10 (binary 1010), many indices would never appear, breaking uniformity.

The load factor is 0.75; the resize threshold is capacity * loadFactor (e.g., 16 × 0.75 = 12).

threshold = (int)(capacity * loadFactor);

5. HashMap Changes in JDK 8

Replaces Entry with Node, which can be a linked list or a red‑black tree.

If a bucket’s chain exceeds 8 nodes, treeifyBin converts it to a red‑black tree, giving O(log n) lookup.

put operation

JDK 8 inserts first, then resizes if necessary, unlike JDK 7 which resizes before insertion.

get operation

Compute the hash, locate the bucket.

If the bucket contains a TreeNode, use tree search; otherwise traverse the linked list.

6. ConcurrentHashMap Principles

HashMap is not thread‑safe; ConcurrentHashMap adds synchronization to avoid race conditions during rehashing.

JDK 7 implements ConcurrentHashMap with a Segment array; each segment holds its own hash table and uses a lock (lock‑splitting) to reduce contention.

Each segment contains a HashEntry array; the segment size is always a power of two.

private static final int DEFAULT_CONCURRENCY_LEVEL = 16;

During put, the key is hashed twice: first to locate the segment, then to locate the entry within the segment. The segment’s lock (a ReentrantLock) is acquired before modifying the bucket.

int sshift = 0;
int ssize = 1;
while (ssize < DEFAULT_CONCURRENCY_LEVEL) {
    ++sshift;
    ssize <<= 1;
}
int segmentShift = 32 - sshift;
int segmentMask = ssize - 1;

Size calculation may be performed without locking (up to three attempts) or with segment locks if the values differ.

Node (used in JDK 8 ConcurrentHashMap)

static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    volatile V val;
    volatile Node<K,V> next;
    Node(int hash, K key, V val, Node<K,V> next) {
        this.hash = hash;
        this.key = key;
        this.val = val;
        this.next = next;
    }
    public final K getKey() { return key; }
    public final V getValue() { return val; }
    public final int hashCode() { return key.hashCode() ^ val.hashCode(); }
    public final String toString(){ return key + "=" + val; }
    public final V setValue(V value) { throw new UnsupportedOperationException(); }
    public final boolean equals(Object o) {
        Object k, v, u; Map.Entry<?,?> e;
        return (o instanceof Map.Entry) &&
               (k = (e = (Map.Entry<?,?>)o).getKey()) != null &&
               (v = e.getValue()) != null &&
               (k == key || k.equals(key)) &&
               (v == (u = val) || v.equals(u));
    }
    Node<K,V> find(int h, Object k) {
        Node<K,V> e = this;
        if (k != null) {
            do {
                K ek;
                if (e.hash == h && ((ek = e.key) == k || (ek != null && k.equals(ek))))
                    return e;
            } while ((e = e.next) != null);
        }
        return null;
    }
}

TreeNode (red‑black tree node)

static final class TreeNode<K,V> extends Node<K,V> {
    TreeNode<K,V> parent;  // red‑black tree links
    TreeNode<K,V> left;
    TreeNode<K,V> right;
    TreeNode<K,V> prev;    // needed to unlink next upon deletion
    boolean red;
    TreeNode(int hash, K key, V val, Node<K,V> next, TreeNode<K,V> parent) {
        super(hash, key, val, next);
        this.parent = parent;
    }
    Node<K,V> find(int h, Object k) { return findTreeNode(h, k, null); }
    final TreeNode<K,V> findTreeNode(int h, Object k, Class<?> kc) {
        if (k != null) {
            TreeNode<K,V> p = this;
            do {
                int ph, dir; K pk; TreeNode<K,V> q;
                TreeNode<K,V> pl = p.left, pr = p.right;
                if ((ph = p.hash) > h)
                    p = pl;
                else if (ph < h)
                    p = pr;
                else if ((pk = p.key) == k || (pk != null && k.equals(pk)))
                    return p;
                else if (pl == null)
                    p = pr;
                else if (pr == null)
                    p = pl;
                else if ((kc != null || (kc = comparableClassFor(k)) != null) &&
                         (dir = compareComparables(kc, k, pk)) != 0)
                    p = (dir < 0) ? pl : pr;
                else if ((q = pr.findTreeNode(h, k, kc)) != null)
                    return q;
                else
                    p = pl;
            } while (p != null);
        }
        return null;
    }
}

7. Summary and Reflections

JDK 8 reduces lock granularity from segment‑level to entry‑level, improving concurrency.

The data structure becomes simpler: no explicit Segment, just a Node array with optional red‑black trees.

Red‑black trees replace long linked lists, giving O(log n) lookup when a bucket grows beyond eight entries.

Using synchronized instead of ReentrantLock benefits from JVM optimizations and lower memory overhead in low‑granularity locking scenarios.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

HashMapConcurrentHashMapData Structureshash table
ITFLY8 Architecture Home
Written by

ITFLY8 Architecture Home

ITFLY8 Architecture Home - focused on architecture knowledge sharing and exchange, covering project management and product design. Includes large-scale distributed website architecture (high performance, high availability, caching, message queues...), design patterns, architecture patterns, big data, project management (SCRUM, PMP, Prince2), product design, and more.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.