Mastering Java HashMap & ConcurrentHashMap: Internals, Performance, and Best Practices
This article explains the internal structure and working principles of Java's HashMap and ConcurrentHashMap, covering hash calculations, collision handling, resizing, the transition to red‑black trees, differences with TreeMap, LinkedHashMap, Hashtable, and performance‑related lock mechanisms in JDK 7 and JDK 8.
1. HashMap data structure
HashMap uses a hash table composed of an array of singly linked lists (or trees) – an array of Node<K,V>[] where each bucket holds a linked list of entries. When a bucket's list exceeds 8 elements it is converted into a red‑black tree.
transient Node<K,V>[] table;2. How HashMap works
When put(K key, V value) is called, the key's hash is computed, combined with the array length to find the bucket index, and then:
If the bucket is empty, the entry is inserted directly.
If a collision occurs, the entry is appended to the list (JDK 1.8 uses tail insertion; JDK 1.7 used head insertion).
If the list length exceeds TREEIFY_THRESHOLD = 8, it is transformed into a red‑black tree.
When get(K key) is called, the hash is recomputed, the bucket is located, and the list (or tree) is traversed using equals() to find the matching node.
3. hashCode collisions
Identical hashCode() values do not guarantee equality; they cause bucket index collisions, which are resolved by the linked‑list or tree structure.
4. Hash implementation in JDK 8
JDK 8 improves the hash function by XOR‑ing the high 16 bits with the low 16 bits: (h = k.hashCode()) ^ (h >>> 16), reducing collisions and improving distribution.
5. Why XOR is used
XOR ensures that any change in the 32‑bit hash value affects the final result, minimizing the chance of collisions.
6. Table capacity, load factor, and resizing
The default table size is 16 (configurable up to 1 << 30). The load factor (default 0.75) determines when resizing occurs: when the number of entries exceeds capacity * loadFactor, the table is doubled. Resizing is costly and can impact performance.
7. put() process summary
Compute hash, locate bucket, handle collisions (list or tree), possibly resize, and finally update the size counter.
8. Array resizing process
A new array twice the size of the old one is allocated; each existing node is rehashed to either stay at its original index or move to index + oldCapacity.
9. Why red‑black tree instead of binary search tree
Binary search trees can degenerate into linear structures in worst‑case scenarios, leading to poor performance. Red‑black trees remain balanced, providing O(log n) lookup while incurring acceptable overhead.
10. Red‑black tree properties
Each node is either red or black.
The root is always black.
Red nodes cannot have red children.
All leaves (NIL nodes) are black.
Every path from a node to its descendant leaves contains the same number of black nodes.
11. HashMap vs TreeMap vs LinkedHashMap
HashMap offers constant‑time insert, delete, and lookup. TreeMap maintains sorted order of keys (default ascending). LinkedHashMap preserves insertion order, at the cost of slower iteration.
12. HashMap vs Hashtable
Hashtable is synchronized (thread‑safe) and slower; it does not allow null keys or values. HashMap is unsynchronized, permits one null key and multiple null values, and uses a different resizing strategy.
13. ConcurrentHashMap overview
ConcurrentHashMap provides a thread‑safe, high‑performance map. Key differences from Hashtable include lock‑striping (JDK 1.7) and CAS‑based lock‑free updates (JDK 1.8). It disallows null keys/values.
14. Important fields
private transient volatile int sizeCtl;controls initialization and resizing state.
15. Data structures in ConcurrentHashMap
Nodes store key‑value pairs; TreeNode extends Node for red‑black tree storage; TreeBin wraps TreeNode and manages tree conversion and locking.
16. put() in ConcurrentHashMap
If the map is uninitialized, initTable() runs. Without hash collisions, CAS inserts the node. On collision, the bucket is locked; if it is a list, it is traversed; if it is a tree, tree insertion rules apply. When a list exceeds the threshold, it is transformed into a red‑black tree.
17. get() in ConcurrentHashMap
Compute the hash, locate the bucket, return the first matching node; if resizing is in progress, the forwarding node redirects the lookup.
18. Concurrency level
The maximum number of threads that can update the map without contention; default is 16 and is rounded up to the nearest power of two.
19. Lock mechanisms (JDK 7 vs JDK 8)
JDK 7 uses segment‑level ReentrantLock (one lock per bucket group). JDK 8 replaces segments with CAS and synchronized blocks on individual nodes, reducing lock granularity and memory overhead.
20. Why synchronized replaces ReentrantLock in JDK 8
Synchronized offers better JVM‑level optimizations, lower memory cost, and sufficient performance for high‑throughput scenarios.
21. Visual illustrations
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
