Fundamentals 22 min read

Deep Dive into Java HashMap and ConcurrentHashMap: Implementation, Operations, and Concurrency Mechanics

This article provides a comprehensive, step‑by‑step explanation of Java 7 HashMap and ConcurrentHashMap internals, covering their data structures, put/get algorithms, array initialization, resizing, segment locking, rehashing, and the concurrency considerations that ensure thread‑safe access.

Big Data Technology & Architecture

Feb 15, 2019

Deep Dive into Java HashMap and ConcurrentHashMap: Implementation, Operations, and Concurrency Mechanics

This tutorial walks through the internal implementation of Java 7 HashMap and ConcurrentHashMap , illustrating how they store entries, compute indices, handle collisions, and manage resizing.

HashMap stores entries in an array of buckets, each bucket holding a singly‑linked list of Entry objects. The put method computes the key's hash, finds the bucket index with hash & (length‑1), scans the list for an existing key, updates the value if found, or inserts a new node at the head. Array initialization ensures the capacity is a power of two, and the load factor (default 0.75) determines the resize threshold.

public V put(K key, V value) {
    if (table == EMPTY_TABLE) inflateTable(threshold);
    if (key == null) return putForNullKey(value);
    int hash = hash(key);
    int i = indexFor(hash, table.length);
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        if (e.hash == hash && (e.key == key || key.equals(e.key))) {
            V old = e.value;
            e.value = value;
            return old;
        }
    }
    addEntry(hash, key, value, i);
    return null;
}

The addEntry method checks whether the map needs to resize, creates a new Entry, and inserts it at the head of the bucket's list.

private void addEntry(int hash, K key, V value, int bucketIndex) {
    if (size >= threshold && table[bucketIndex] != null) resize(2 * table.length);
    createEntry(hash, key, value, bucketIndex);
}

Resizing creates a new, larger array and transfers existing entries, splitting each old bucket's list between two new positions based on the additional high‑order bit of the hash.

void resize(int newCapacity) {
    Entry[] oldTable = table;
    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable, initHashSeedAsNeeded(newCapacity));
    table = newTable;
    threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
}

The ConcurrentHashMap builds on the same principles but partitions the map into a fixed array of Segment objects (default 16). Each segment acts as an independent hash table protected by its own lock, allowing concurrent writes to different segments.

public ConcurrentHashMap(int initialCapacity, float loadFactor, int concurrencyLevel) {
    // compute segmentShift and segmentMask based on concurrencyLevel
    // initialize segment[0] and lazily create other segments on first use
}

When inserting, the map computes the hash, determines the segment index with (hash >>> segmentShift) & segmentMask, ensures the segment is initialized, and then acquires the segment's lock before performing the standard put logic.

public V put(K key, V value) {
    int hash = hash(key);
    int j = (hash >>> segmentShift) & segmentMask;
    Segment<K,V> s = ensureSegment(j);
    return s.put(key, hash, value, false);
}

Segment locking uses a fast tryLock() path; if it fails, scanAndLockForPut spins, optionally creating a provisional node, and eventually blocks on lock() after a configurable number of retries.

private HashEntry<K,V> scanAndLockForPut(K key, int hash, V value) {
    // spin while trying to acquire the lock, creating a node if needed
    // fall back to blocking lock after MAX_SCAN_RETRIES
}

Resizing a segment doubles its internal table size and redistributes entries, preserving the order of nodes that remain in the same bucket while cloning only the necessary subset of entries.

private void rehash(HashEntry<K,V> node) {
    // double capacity, recompute indices, move entries, and insert the new node
}

Read operations ( get) are lock‑free: they compute the hash, locate the appropriate segment and bucket, and traverse the linked list. Visibility is guaranteed by the use of volatile fields and UNSAFE ordered writes for bucket updates.

public V get(Object key) {
    int h = hash(key);
    Segment<K,V> s = (Segment<K,V>)UNSAFE.getObjectVolatile(segments, u);
    if (s != null && s.table != null) {
        for (HashEntry<K,V> e = entryAt(s.table, indexFor(h, s.table.length)); e != null; e = e.next) {
            if (e.hash == h && (e.key == key || key.equals(e.key)))
                return e.value;
        }
    }
    return null;
}

The article also discusses concurrency pitfalls: how put and remove acquire exclusive segment locks, how lock‑free get can safely read while another thread modifies a bucket, and how the use of UNSAFE.putOrderedObject ensures the newly inserted head is visible to readers.

Overall, the guide provides a detailed, code‑rich exploration of the design choices behind Java’s core map implementations and the mechanisms that make them efficient and thread‑safe.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Java concurrency JDK HashMap concurrenthashmap Data Structures

Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.