Fundamentals 20 min read

Deep Dive into Java HashMap: Structure, Implementation, and Performance in JDK 1.7 vs 1.8

This article explains Java’s HashMap internals, compares JDK 1.7 and 1.8 implementations, details the bucket array, linked‑list‑to‑red‑black‑tree conversion, resizing logic, performance gains of up to 100 % in 1.8, and warns against using HashMap in concurrent contexts.

Meituan Technology Team

Jun 24, 2016

Deep Dive into Java HashMap: Structure, Implementation, and Performance in JDK 1.7 vs 1.8

HashMap is the most frequently used map implementation for Java developers. With the evolution of the JDK, JDK 1.8 introduced significant optimizations such as the integration of a red‑black tree and improved resizing logic. This article compares JDK 1.7 and JDK 1.8, analyzes the internal structure of HashMap, and explains its core algorithms.

Map hierarchy

Java defines the java.util.Map interface. Its four common implementations are HashMap, Hashtable, LinkedHashMap, and TreeMap. The article briefly describes each: HashMap: stores entries based on the key’s hashCode(), offers fast access but no guaranteed iteration order, allows one null key and multiple null values, and is not thread‑safe. Hashtable: legacy class, thread‑safe, but slower than ConcurrentHashMap. LinkedHashMap: preserves insertion order (or access order when configured). TreeMap: implements SortedMap, orders keys according to their natural ordering or a supplied comparator.

All map keys should be immutable; otherwise, a changed hash code can make the entry unreachable.

Internal implementation

HashMap’s storage consists of an array of buckets, each bucket holding a linked list or, when the list becomes long (default > 8), a red‑black tree. The primary fields are:

Node[] table; // bucket array

int size; // number of key‑value pairs

int threshold; // resize trigger = table.length * loadFactor

float loadFactor = 0.75f; // default

int modCount; // structural modification count for fail‑fast iterators

The bucket array length is always a power of two, enabling the fast index calculation: int index = (hash) & (table.length - 1); The hash function in JDK 1.8 mixes high and low bits:

int h = key.hashCode();
int hash = h ^ (h >>> 16);

Key lookup

To locate a bucket, HashMap first computes the mixed hash, then applies the bit‑mask shown above. Because the array length is a power of two, the mask operation is faster than a modulo.

put(K,V) algorithm

The insertion process follows these steps:

Check if the target bucket table[i] is null; if not, a resize may be required.

Compute the bucket index from the key’s hash.

If the first node in the bucket matches the key (both hash and equals), replace the value.

If the bucket contains a tree node, insert into the red‑black tree.

Otherwise traverse the linked list; if the list length exceeds 8, convert it to a tree, then insert.

After insertion, increment size and, if size > threshold, trigger a resize.

The source code for put in JDK 1.8 is shown in the original article (images).

Resize mechanism

When the number of entries exceeds threshold, HashMap creates a new bucket array with double the capacity. Elements are rehashed and transferred to the new array. Because the new length is also a power of two, each entry either stays in its original index or moves to oldIndex + oldCapacity, which avoids recomputing the full hash.

JDK 1.7 performed a full rehash; JDK 1.8’s design reduces the work dramatically.

Thread‑safety issue

HashMap is not safe for concurrent modifications. The article presents a classic example where two threads simultaneously trigger a resize, leading to a corrupted bucket list that forms a circular linked list, causing an infinite loop during get(). The fix is to use ConcurrentHashMap in multithreaded contexts.

Performance comparison

Benchmarks on a 2.2 GHz i7 show that JDK 1.8 outperforms JDK 1.7 by at least 15 % and up to 100 % for certain map sizes. When the hash distribution is uniform, both versions perform well; with a pathological key that returns the same hash code, JDK 1.8’s red‑black tree conversion keeps the lookup cost near O(log n), whereas JDK 1.7 degrades to O(n).

Conclusion

Resize is costly; estimate map size and set an appropriate initial capacity.

Load factor can be tuned but should be left at the default unless you have a special need.

Never use HashMap in a concurrent environment; prefer ConcurrentHashMap.

JDK 1.8’s red‑black tree integration dramatically improves performance for heavily colliding buckets.

Upgrade to JDK 1.8 to benefit from these optimizations.

References (original Chinese sources and articles are listed in the source).

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Java Performance Concurrency HashMap JDK8 Data Structures algorithms

Written by

Meituan Technology Team

Over 10,000 engineers powering China’s leading lifestyle services e‑commerce platform. Supporting hundreds of millions of consumers, millions of merchants across 2,000+ industries. This is the public channel for the tech teams behind Meituan, Dianping, Meituan Waimai, Meituan Select, and related services.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.