Understanding Java HashMap and ConcurrentHashMap: Structure, Operations, and Performance
This article explains the internal data structures, hashing mechanisms, resizing behavior, and concurrency strategies of Java's HashMap and ConcurrentHashMap, covering their evolution from JDK 1.7 to 1.8, the use of linked lists, red‑black trees, and lock implementations.
1: HashMap Data Structure
HashMap uses a hash table composed of an array of buckets where each bucket holds a singly‑linked list of Node<K,V> entries; when a bucket's list exceeds eight elements it is transformed into a red‑black tree.
2: How HashMap Works
Operations are performed via put and get . When storing a key, the key's hash is computed, the array index is derived, and the entry is placed in the corresponding bucket, handling collisions by chaining or tree conversion.
3: Hash Collisions
If two distinct keys produce the same hash code, a collision occurs; the entries share the same bucket and are linked together (or stored in a tree if the bucket is large).
4: Hash Function Implementation
Since JDK 1.8, the hash is calculated as (h = k.hashCode()) ^ (h >>> 16) , mixing high and low bits to improve distribution and reduce collisions.
5: Why XOR Is Used
The XOR ensures that any single‑bit change in the original 32‑bit hash code results in a different final hash value, minimizing collision probability.
6: Table Capacity, Load Factor, and Resizing
The default table size is 16 with a load factor of 0.75; when the number of entries exceeds capacity * loadFactor (e.g., 12 for the default), the table is resized to double its current length, which can impact performance under heavy load.
7: put() Process
1) Compute the hash and index. 2) If no collision, insert directly; otherwise, append to the list. 3) If the list length exceeds TREEIFY_THRESHOLD (8), convert it to a red‑black tree; if it shrinks below 6, convert back. 4) If the map size exceeds the threshold, trigger resize() .
8: Array Resizing
A new array twice the size of the old one is allocated and each existing node is rehashed to either its original index or index plus the old capacity.
9: Why Use Red‑Black Trees Instead of Binary Search Trees
Red‑black trees provide balanced search performance, avoiding the linear‑time worst case of unbalanced binary trees, while still incurring less overhead than maintaining a tree for small bucket sizes.
10: Red‑Black Tree Basics
Each node is either red or black; the root is black; red nodes cannot have red children; all paths from root to leaves contain the same number of black nodes; leaf nodes are represented by black NIL nodes.
11: Changes in JDK 1.8
1) Buckets with >8 entries become red‑black trees. 2) Insertion order changed from head‑insertion (JDK 1.7) to tail‑insertion. 3) Entry was replaced by Node .
12: HashMap vs. LinkedHashMap vs. TreeMap
HashMap offers fastest basic operations; LinkedHashMap preserves insertion order at a slight performance cost; TreeMap maintains sorted order based on keys.
13: Usage Scenarios
Use HashMap for general key/value storage, TreeMap when ordered traversal is required, and LinkedHashMap when iteration order must match insertion order.
14: HashMap vs. Hashtable
Hashtable is synchronized (thread‑safe) and slower; HashMap allows one null key and multiple null values, starts with capacity 16, and uses the key's hashCode() directly.
15: Thread‑Safe Alternatives
ConcurrentHashMap provides high‑performance thread safety using segment locks (JDK 1.7) or CAS + synchronized (JDK 1.8), unlike Hashtable's single lock.
16: HashMap vs. ConcurrentHashMap
Both share similar structures, but ConcurrentHashMap forbids null keys/values and employs fine‑grained locking for concurrency.
17: Why ConcurrentHashMap Is Faster Than Hashtable
Hashtable uses a single lock for the entire map, causing contention; ConcurrentHashMap reduces contention via segment locks (JDK 1.7) or node‑level locks (JDK 1.8).
18: Lock Mechanism Details (JDK 1.7 vs. JDK 1.8)
JDK 1.7 uses Segment (extends ReentrantLock ) and HashEntry objects; JDK 1.8 replaces segments with an array of Node objects, using CAS and synchronized for updates, and converts long chains to red‑black trees.
19: Why JDK 1.8 Uses synchronized Instead of ReentrantLock
Synchronization granularity is finer, JVM optimizations for synchronized are stronger, and it reduces memory overhead compared to explicit lock objects.
20: Overview of ConcurrentHashMap
Key fields include private transient volatile int sizeCtl which controls initialization and resizing. The map stores data in Node (hash table entry) and TreeNode (red‑black tree node) structures.
21: Concurrency Level
The concurrency level defines the maximum number of threads that can modify the map simultaneously without contention; default is 16 and is rounded up to the nearest power of two.
Architect's Tech Stack
Java backend, microservices, distributed systems, containerized programming, and more.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.