Overview of Java Collection Framework and Core Concepts
This article provides a comprehensive overview of Java collections, covering the definition, characteristics, differences from arrays, advantages, common collection classes, underlying data structures, fail‑fast mechanism, detailed List, Set, Queue, and Map interfaces, as well as HashMap and ConcurrentHashMap implementations and best practices.
Collection Container Overview
What is a Collection?
A collection is a container used to store data.
The collection framework defines a unified architecture for representing and operating on collections, consisting of three major parts: external interfaces, interface implementations, and algorithms for collection operations.
Characteristics of Collections
Encapsulate data; multiple objects often need a collection for storage.
Collections have variable length, unlike fixed‑size arrays.
Difference Between Collections and Arrays
Arrays have fixed length; collections are dynamically sized.
Arrays can store primitive types and reference types; collections store only reference types.
All elements in an array must be of the same type; a collection can hold objects of different types.
Each collection type has its own internal data structure, forming a hierarchy of collection systems.
Advantages of the Collection Framework
Automatic capacity growth.
High‑performance data structures and algorithms.
Easy extensibility and code reuse.
Using JDK‑provided collection classes reduces learning cost and maintenance effort.
Common Collection Classes
Java containers are divided into Collection and Map . The Collection sub‑interfaces include Set , List , and Queue . The most frequently used are Set and List; Map is not a sub‑interface of Collection.
List : ordered container, allows duplicates and nulls, implementations include ArrayList, LinkedList, Vector.
Set : unordered container, no duplicates, allows at most one null, implementations include HashSet, LinkedHashSet, TreeSet.
Map : key‑value pair container, keys are unique, values may duplicate; common implementations are HashMap, TreeMap, Hashtable, LinkedHashMap, ConcurrentHashMap.
Underlying Data Structures
Collection
List ArrayList – backed by an Object array. Vector – backed by an Object array. LinkedList – doubly‑linked circular list.
Set HashSet – based on HashMap, stores elements as keys with a constant dummy value. LinkedHashSet – extends HashSet, maintains insertion order via a linked list. TreeSet – ordered, unique elements stored in a red‑black tree.
Map
HashMap : before JDK 1.8, an array + linked list; from JDK 1.8, array + linked list + red‑black tree when a bucket exceeds a threshold.
LinkedHashMap : extends HashMap, adds a doubly‑linked list to preserve insertion or access order.
Hashtable : array + linked list, synchronized.
TreeMap : red‑black tree implementation.
Fail‑Fast Mechanism
When multiple threads modify a collection structurally during iteration, the iterator detects a change in the internal modCount and throws ConcurrentModificationException . Solutions include synchronizing modifications or using CopyOnWriteArrayList .
List Interface
What is an Iterator?
The Iterator interface provides a way to traverse any Collection, allowing element removal during iteration.
List<String> list = new ArrayList<>();
Iterator<String> it = list.iterator();
while (it.hasNext()) {
String obj = it.next();
System.out.println(obj);
}How to Remove Elements While Traversing?
Iterator<Integer> it = list.iterator();
while (it.hasNext()) {
// do something
it.remove();
}A common mistake is using a for‑each loop and calling list.remove(i) , which triggers ConcurrentModificationException .
Array ↔ List Conversion
Array to List: Arrays.asList(array) .
List to Array: list.toArray() .
ArrayList vs LinkedList
ArrayList uses a dynamic array; LinkedList uses a doubly‑linked list.
Random access is faster in ArrayList; insert/delete in the middle is faster in LinkedList.
LinkedList consumes more memory due to node overhead.
Both are not thread‑safe.
Use ArrayList for frequent reads, LinkedList for frequent insertions/deletions.
Why is elementData in ArrayList marked transient ?
ArrayList implements Serializable . The transient modifier prevents the internal array from being serialized directly; only the actual stored elements are written, reducing size and improving speed.
CopyOnWriteArrayList
A thread‑safe variant of ArrayList that copies the underlying array on each write, suitable for read‑heavy, write‑light scenarios.
List, Set, Map Differences
List : ordered, allows duplicates.
Set : unordered, no duplicates.
Map : key‑value pairs, keys unique.
Set Interface
HashSet Implementation
HashSet is a thin wrapper around a HashMap . Uniqueness relies on proper hashCode() and equals() implementations.
Queue Interface
What is BlockingQueue?
java.util.concurrent.BlockingQueue blocks on retrieval when empty and blocks on insertion when full, supporting producer‑consumer patterns. Implementations include ArrayBlockingQueue , LinkedBlockingQueue , PriorityBlockingQueue , SynchronousQueue , etc.
Difference Between poll() and remove()
Both return and remove the head element.
poll() returns null if the queue is empty; remove() throws NoSuchElementException .
Map Interface
The overall structure of a Map is illustrated below:
HashMap Implementation
In JDK 1.7, HashMap uses an array + linked list. Since JDK 1.8, when a bucket’s chain exceeds 8 nodes, it is transformed into a red‑black tree.
Key insertion steps:
Compute the hash of the key.
Determine the bucket index.
If the bucket is empty, create a new node.
If the bucket contains a node with the same key, replace the value.
If a collision occurs, add the node to the linked list or tree.
After insertion, check if resizing is needed.
JDK 1.7 vs JDK 1.8
Resize optimization.
Introduction of red‑black trees to limit chain length.
Improved handling of concurrent resize loops (still not thread‑safe).
Aspect
JDK 1.7
JDK 1.8
Storage structure
Array + Linked List
Array + Linked List + Red‑Black Tree
Initialization
Separate
inflateTable()Integrated into
resize()Hash calculation
9 perturbations
2 perturbations
Insertion method
Head insertion
Tail insertion
How HashMap Avoids Collisions
HashMap applies a perturbation function: h ^ (h >>> 16) , mixing high and low bits to produce a more uniform distribution.
static final int hash(Object key) {
int h;
return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}HashMap put Process
The put method computes the key’s hash, finds the bucket, handles existing keys, converts long chains to trees, and triggers resize when the load factor is exceeded.
Resize Operation
Resize doubles the capacity; in JDK 1.8 the new position is either the original index or original index + old capacity, avoiding full rehash.
Can Any Class Be Used as a Key?
Yes, but the class should override equals() and hashCode() , preferably be immutable to ensure stable hash values.
Why String and Integer Are Good Keys
They are final and immutable, and provide reliable equals() and hashCode() implementations.
Why HashMap Does Not Use hashCode() Directly as Index
Because hashCode() can produce values outside the array bounds; the hash is further processed and masked with length‑1 (length is a power of two) to obtain a valid index.
Why HashMap Length Is a Power of Two
Using a power‑of‑two length allows the index calculation to be performed with a fast bitwise AND operation instead of modulo.
HashMap vs ConcurrentHashMap
ConcurrentHashMap partitions the bucket array into segments (JDK 1.7) or uses fine‑grained locking with CAS (JDK 1.8), providing higher concurrency.
ConcurrentHashMap does not allow null keys or values.
ConcurrentHashMap Implementation (JDK 1.7)
Data is divided into Segment objects, each protecting a portion of the hash table with its own lock.
ConcurrentHashMap Implementation (JDK 1.8)
Segment design is removed; concurrency is achieved with Node level CAS, synchronized on the first node of a bin, and volatile fields for visibility.
Common Pitfalls with ConcurrentHashMap
Methods like containsKey followed by put are not atomic; use computeIfAbsent or LongAdder for thread‑safe counting.
// Thread‑safe counting example
ConcurrentHashMap
freqs = new ConcurrentHashMap<>();
freqs.computeIfAbsent(key, k -> new LongAdder()).increment();Recommended Reading
High‑Frequency Java Fundamentals (Core Volume 1)
MySQL Essentials
Redis Caching Insights
Follow the public WeChat account Internet Full‑Stack Architecture for more valuable information.
Full-Stack Internet Architecture
Introducing full-stack Internet architecture technologies centered on Java
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.