Understanding Java HashMap and ConcurrentHashMap: Structure, Operations, and Performance
This article explains the internal structure of Java's HashMap and ConcurrentHashMap, covering hash calculations, bucket arrays, linked‑list and red‑black tree conversions, load factor and resizing behavior, as well as the differences in concurrency control between JDK 1.7 and 1.8 implementations.
1: HashMap data structure – HashMap uses an array of buckets where each bucket holds a linked list of Node objects; when a bucket’s list exceeds eight elements it is transformed into a red‑black tree. The internal field is declared as transient Node[] table;.
2: How HashMap works – Storing a key/value pair calls hash(K) to compute a hash, combines it with the array length to find an index, and then inserts the entry. If the index already contains a node, a collision occurs and the entry is added to the list (or tree) after checking equals(). Retrieval recomputes the hash and traverses the bucket to find the matching key.
3: Collision handling – Identical hash codes place entries in the same bucket, causing a collision; the bucket stores colliding entries in a linked list or, if the list grows beyond the threshold, in a red‑black tree.
4: Hash implementation – JDK 1.8 improves the hash function to (h = k.hashCode()) ^ (h >>> 16), mixing high and low bits to reduce collisions.
5: Use of XOR – XOR ensures that any change in the 32‑bit hash code produces a different result, minimizing collisions.
6: Table capacity and load factor – The default capacity is 16 (max 1<<30) and the default load factor is 0.75. When the number of entries exceeds capacity * loadFactor, the table is resized (typically doubled), which can impact performance for large data sets.
7: HashMap put process – The put operation computes the hash, determines the bucket index, inserts directly if no collision, otherwise appends to the list or tree, converts the list to a tree when it exceeds the threshold, replaces the value if the key already exists, and triggers a resize when the size surpasses the threshold.
8: Array resizing – Resizing creates a new array twice the size and rehashes existing nodes; each node ends up either at its original index or at index + old capacity.
9: Why red‑black trees instead of binary search trees – Red‑black trees remain balanced, avoiding the linear‑time worst case of unbalanced binary search trees, while still being faster than deep linked lists for large buckets.
10: Red‑black tree properties – Nodes are red or black, the root is black, red nodes have black children, all leaves are black NIL nodes, and every path from root to leaf contains the same number of black nodes.
11: Changes in JDK 1.8 – Chains longer than eight become red‑black trees, insertion moves to the tail of the list, and Entry is replaced by Node.
12: HashMap vs LinkedHashMap vs TreeMap – HashMap offers fast unordered access; LinkedHashMap preserves insertion order at a slight performance cost; TreeMap maintains sorted order based on keys.
13: Typical usage scenarios – Use HashMap for general key/value storage, TreeMap when ordered traversal is required, and LinkedHashMap when iteration order must match insertion order.
14: HashMap vs Hashtable – Hashtable is synchronized (thread‑safe) and disallows null keys/values, while HashMap is unsynchronized, allows one null key and multiple null values, and has different default capacities and resizing strategies.
15: Thread‑safe alternative – ConcurrentHashMap – Provides high‑performance concurrency; unlike Hashtable it uses finer‑grained locking (segment locks in JDK 1.7, CAS + synchronized in JDK 1.8).
16: Differences between HashMap and ConcurrentHashMap – ConcurrentHashMap forbids null keys/values and employs lock‑striping or CAS for thread safety; HashMap allows nulls and is not thread‑safe.
17: Why ConcurrentHashMap outperforms Hashtable – It avoids a single global lock, reducing contention and improving scalability.
18: ConcurrentHashMap lock mechanisms (JDK 1.7 vs 1.8) – JDK 1.7 uses segmented ReentrantLock per bucket group; JDK 1.8 replaces segments with Node structures, using CAS for updates and synchronized blocks only when necessary, and converts long chains to red‑black trees.
19: Use of synchronized in JDK 1.8 – Synchronized offers better JVM‑level optimizations and lower memory overhead compared to explicit ReentrantLock under heavy load.
20: Key fields of ConcurrentHashMap – private transient volatile int sizeCtl; controls initialization and resizing; the core storage units are Node, TreeNode, and TreeBin.
21: Concurrency level – Defines the maximum number of threads that can update the map without contention; default is 16 and is rounded up to the nearest power of two.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Architect's Tech Stack
Java backend, microservices, distributed systems, containerized programming, and more.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
