Understanding the Underlying Data Structure of Java HashMap
This article explains the internal architecture of Java's HashMap, covering its array of buckets, linked‑list handling of hash collisions, and the conversion to red‑black trees for high‑collision scenarios, supplemented with code examples and visual diagrams.
HashMap is a hash‑table based data structure in Java whose internal implementation consists of three main components: an array of buckets, linked lists for handling collisions, and, when a bucket exceeds a threshold, a red‑black tree to improve lookup performance.
1. Array
The HashMap maintains an internal Node[] array where each slot (bucket) can hold one or more entries. Each entry is represented by a private inner class that implements Map.Entry and stores the key, value, hash, and a reference to the next node.
static class Node
implements Map.Entry
{
final int hash;
final K key;
V value;
Node
next;
}2. Linked List
When multiple keys produce the same hash, their entries are stored in the same bucket as a singly linked list. The following diagram illustrates the bucket layout with linked‑list entries.
HashMap 底层结构 (数组 + 链表)
--------------------------------
Bucket 0 -> null
Bucket 1 -> Entry1(key1, value1) -> Entry2(key2, value2) -> null
Bucket 2 -> null
Bucket 3 -> Entry3(key3, value3) -> null
Bucket 4 -> null
...3. Red‑Black Tree
If the number of entries in a bucket exceeds a threshold (default 8 in JDK 8+), the linked list is transformed into a red‑black tree, reducing lookup time from O(n) to O(log n). The diagram below shows the array combined with linked lists and red‑black trees.
HashMap 底层结构 (数组 + 链表 + 红黑树)
---------------------------------------
Bucket 0 -> null
Bucket 1 -> Entry1(key1, value1) -> Entry2(key2, value2) -> null
Bucket 2 -> null
Bucket 3 -> TreeNode(key3, value3)
/ \
TreeNode(key4, value4) TreeNode(key5, value5)
Bucket 4 -> null
...When the tree’s size drops below a certain limit, it may be converted back to a linked list. This adaptive design balances memory usage and performance, ensuring fast access for both low and high collision scenarios.
Summary
Array provides O(1) bucket indexing based on the hash of the key.
Linked lists resolve collisions but degrade to O(n) lookup when many keys share a bucket.
Red‑black trees replace long chains, offering O(log n) lookup for heavily colliding buckets.
Mike Chen's Internet Architecture
Over ten years of BAT architecture experience, shared generously!
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.