Understanding HashMap and ConcurrentHashMap in Java (JDK 1.7 and 1.8)
This article explains the internal structures and core methods of Java's HashMap and ConcurrentHashMap across JDK 1.7 and 1.8, detailing array‑linked list implementations, load factor handling, resizing, treeification, concurrency mechanisms, and performance considerations with extensive code examples.
Introduction
Key‑value containers such as Map are fundamental in software development. Before diving into the concurrent version, it is useful to review the classic HashMap implementation.
HashMap (JDK 1.7)
In JDK 1.7 the underlying structure is an array + linked list. The main fields include the default bucket size, maximum capacity, load factor, the table array that stores entries, and the current size.
Key member variables:
Initial bucket size (array length).
Maximum bucket count.
Default load factor (0.75). table – the actual array that holds data. size – number of stored entries.
When the number of entries reaches capacity * loadFactor (e.g., 16 × 0.75 = 12), the map expands, which is costly because it requires re‑hashing and copying.
Entry internal class stores key, value, hash, next. The put method performs the following steps:
public V put(K key, V value) {
if (table == EMPTY_TABLE) {
inflateTable(threshold);
}
if (key == null)
return putForNullKey(value);
int hash = hash(key);
int i = indexFor(hash, table.length);
for (Entry<K,V> e = table[i]; e != null; e = e.next) {
if (e.hash == hash && (e.key == key || key.equals(e.key))) {
V oldValue = e.value;
e.value = value;
e.recordAccess(this);
return oldValue;
}
}
modCount++;
addEntry(hash, key, value, i);
return null;
}The get method locates the bucket, traverses the linked list, and returns the matching value or null if not found.
HashMap (JDK 1.8)
JDK 1.8 introduces treeification: when a bucket’s linked list exceeds TREEIFY_THRESHOLD (default 8), it is converted into a red‑black tree, reducing lookup complexity from O(N) to O(log N).
Key new fields include DEFAULT_INITIAL_CAPACITY, MAXIMUM_CAPACITY, DEFAULT_LOAD_FACTOR, and TREEIFY_THRESHOLD. The internal node class is now Node<K,V> but still stores key, value, hash, next.
Put operation (simplified):
public V put(K key, V value) {
if (value == null) throw new NullPointerException();
int hash = hash(key);
int i = (hash >>> segmentShift) & segmentMask;
Segment<K,V> s = (Segment<K,V>)UNSAFE.getObject(segments, (i << SSHIFT) + SBASE);
if (s == null) s = ensureSegment(i);
return s.put(key, hash, value, false);
}The get method is lock‑free and reads the volatile value directly.
ConcurrentHashMap (JDK 1.7)
JDK 1.7 uses a segmented lock design. The map consists of an array of Segment<K,V>, each extending ReentrantLock. A segment contains a volatile HashEntry<K,V>[] table, count, threshold, and load factor.
Put operation acquires the segment lock, then inserts or updates an entry in the segment’s table. The code ensures visibility with volatile fields but still requires locking for atomicity.
public V put(K key, V value) {
if (value == null) throw new NullPointerException();
int hash = hash(key);
int j = (hash >>> segmentShift) & segmentMask;
Segment<K,V> s = (Segment<K,V>)UNSAFE.getObject(segments, (j << SSHIFT) + SBASE);
if (s == null) s = ensureSegment(j);
return s.put(key, hash, value, false);
}Get operation reads the segment and entry without acquiring any lock, relying on the volatile value for visibility.
ConcurrentHashMap (JDK 1.8)
JDK 1.8 removes the segment array and replaces it with a single Node<K,V>[] table. Concurrency is achieved using CAS (compare‑and‑swap) combined with synchronized blocks for updates, while reads remain lock‑free.
Put flow:
Compute hash and locate the bucket.
If the bucket is empty, attempt to insert with CAS; on failure spin‑retry.
If the bucket contains a node with hash == MOVED, trigger resizing.
If the bucket holds a linked list, lock the bucket (via synchronized) and insert a new node.
If the list size exceeds TREEIFY_THRESHOLD, convert it to a red‑black tree.
Get flow simply hashes the key, reads the appropriate bucket, and traverses either a linked list, a tree, or returns the value directly.
Comparison and Takeaways
Both JDK 1.7 and 1.8 HashMap implementations share the same basic array‑linked list structure, but 1.8 adds treeification to improve worst‑case lookup time. ConcurrentHashMap evolves from a segmented lock model (1.7) to a CAS‑based, synchronized bucket model (1.8), eliminating the need for many locks and achieving higher concurrency.
Understanding these internals is valuable for interview preparation, performance tuning, and designing thread‑safe caches such as Guava’s Cache.
Conclusion
Mastering the differences between HashMap and ConcurrentHashMap across JDK versions equips developers with the knowledge to choose the right data structure, avoid concurrency pitfalls, and write high‑performance Java code.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Java Captain
Focused on Java technologies: SSM, the Spring ecosystem, microservices, MySQL, MyCat, clustering, distributed systems, middleware, Linux, networking, multithreading; occasionally covers DevOps tools like Jenkins, Nexus, Docker, ELK; shares practical tech insights and is dedicated to full‑stack Java development.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
