Fundamentals 18 min read

Mastering Java HashMap: From Basics to Interview Mastery

This article explains the fundamentals of Java's HashMap, covering its relationship to Set and Map families, internal implementation details, hashCode/equals contracts, collision resolution strategies, basic operations with code examples, and common interview questions such as Top‑K and LRU cache design.

macrozheng

Apr 29, 2020

Mastering Java HashMap: From Basics to Interview Mastery

Preface

HashMap is a data structure that appears frequently in both work and interview settings.

For example, the optimal solution to a variant of LeetCode's first problem "Two Sum" requires a HashMap, and the classic LRU Cache question relies on LinkedHashMap.

This article dives deep into HashMap, guiding you through its origins and helping you ace interview questions.

What is the difference between == and equals()? Why must hashCode() be overridden when equals() is overridden? Differences and relationships among Hashtable, HashSet, and HashMap How are hash collisions handled in Java? Which method is used and why? Implement a HashMap from scratch

The article is organized into the following sections:

Introduction to the Set and Map families

HashMap implementation principles

Discussion of hashCode() and equals()

Detailed analysis of hash collisions

Basic HashMap operations

Analysis of common interview questions

Set Family

Before discussing Map, let's look at Set.

A Set is a collection that does not allow duplicate elements, similar to the mathematical concept.

In Java, Set is an interface in the java.util package, with several implementations:

Common implementations include:

HashSet : Stores elements as keys in a HashMap; unordered with O(1) basic operations.

LinkedHashSet : Combines a HashSet with a linked list to preserve insertion order while retaining O(1) performance.

TreeSet : Uses a red‑black tree to maintain order; slower lookups compared to HashSet.

Map Family

A Map stores key‑value pairs, where keys must be unique.

Key implementations are:

HashMap : Unordered, O(1) average performance.

LinkedHashMap : A HashMap plus a doubly linked list, preserving insertion order.

TreeMap : Ordered implementation based on a binary search tree.

HashMap Implementation Principles

Each key is processed by a hash function to produce a hash value, which determines the bucket index in an internal array (buckets). The index is calculated as hash % array.length and the entry is stored at that position.

If different keys produce the same hash value, how are they stored?

Answer: This is a hash collision , where multiple keys map to the same bucket.

How does HashMap ensure element uniqueness? Could identical elements produce different hash values?

Answer: Uniqueness is guaranteed by the hashCode() and equals() methods.

If the number of pairs grows while the bucket count stays low, what happens?

Answer: Rehashing occurs—when collisions become excessive, the bucket array is resized (typically doubled), changing the index calculation and distributing entries more evenly.

When does rehashing trigger? How is bucket crowding measured?

Answer: By the load factor , defined as number_of_entries / number_of_buckets. Java's default load factor is 0.75f; exceeding this triggers rehashing.

About hashCode() and equals()

If two keys have the same hashCode(), a collision may occur; equals() is then used to determine actual equality.

hashCode() determines the bucket index; equals() compares two objects for equality.

Classic interview question: "Why must you override hashCode() when you override equals()?"

Because if two equal objects have different hash codes, they would be placed in different buckets, breaking the contract between the two methods.

Both methods are originally defined in java.lang.Object. The default equals() compares object references (similar to ==), but can be overridden to compare logical content, as shown with String examples.

str1 = "tianxiaoqi";
str2 = new String("tianxiaoqi");
str1 == str2; // false
str1.equals(str2); // true

Source code snippets from the JDK illustrate that equals() ultimately relies on reference comparison unless overridden.

Hash Collision Details

Two main collision‑resolution strategies exist:

Separate chaining Open addressing

Java uses separate chaining . In JDK 1.6/1.7, each bucket stores a linked list; in JDK 1.8, if a chain exceeds eight nodes, it is transformed into a red‑black tree for faster lookups.

Open addressing (e.g., linear probing) searches sequentially for the next free bucket when a collision occurs.

HashMap Basic Operations

All data structures support the four basic operations—Create, Read, Update, Delete (CRUD). For HashMap:

Add: put(K key, V value) Delete: remove(Object key) Update: also put(K key, V value) Read: get(Object key) / containsKey(Object key)

The put method works as follows:

Compute the bucket index using the hash of the key.

Traverse the linked list (or tree) at that bucket.

If the key exists, update its value; otherwise, insert a new node at the head of the list.

public V put(K key, V value) {
    int index = getIndex(key);
    Node<K, V> node = array[index];
    Node<K, V> head = node;
    while (node != null) {
        // key already exists, update value
        if (checkEquals(key, node)) {
            V prev = node.value;
            node.value = value;
            return prev;
        }
        node = node.next;
    }
    // key not found, insert new node at head
    Node<K, V> newNode = new Node(key, value);
    newNode.next = head;
    array[index] = newNode;
    return null;
}

Further details such as rehashing and load factor can be examined in the source code. Practicing with LeetCode problem 706 ("Design HashMap") is recommended.

Difference from Hashtable

Hashtable is an older, thread‑safe counterpart to HashMap. HashMap is not synchronized, offering better performance in single‑threaded contexts. Additionally, HashMap permits null keys, whereas Hashtable does not.

Top K Problem

A frequently asked interview question is to find the top k most frequent words in a list.

Given a list of words, return the k words with the highest frequency.

Solution outline:

Use a HashMap<String, Integer> to count word frequencies (O(n)).

Maintain a min‑heap of size k to keep the current top‑k entries, ordering first by frequency then lexicographically.

Iterate over the map, inserting into the heap and evicting lower‑frequency entries as needed (overall O(n log k)).

class Solution {
    public List<String> topKFrequent(String[] words, int k) {
        Map<String, Integer> map = new HashMap<>();
        for (String word : words) {
            map.put(word, map.getOrDefault(word, 0) + 1);
        }
        PriorityQueue<Map.Entry<String, Integer>> minHeap = new PriorityQueue<>(k + 1,
            (e1, e2) -> {
                if (e1.getValue().equals(e2.getValue())) {
                    return e2.getKey().compareTo(e1.getKey());
                }
                return e1.getValue().compareTo(e2.getValue());
            });
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            minHeap.offer(entry);
            if (minHeap.size() > k) {
                minHeap.poll();
            }
        }
        List<String> res = new ArrayList<>();
        while (!minHeap.isEmpty()) {
            res.add(minHeap.poll().getKey());
        }
        Collections.reverse(res);
        return res;
    }
}

LRU Cache

This classic interview problem also relies heavily on HashMap combined with a doubly linked list to achieve O(1) get and put operations.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Java interview HashMap Data Structures Hash Collision

Written by

macrozheng

Dedicated to Java tech sharing and dissecting top open-source projects. Topics include Spring Boot, Spring Cloud, Docker, Kubernetes and more. Author’s GitHub project “mall” has 50K+ stars.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.

Preface

Set Family

Map Family

HashMap Implementation Principles

About hashCode() and equals()

Hash Collision Details

HashMap Basic Operations

Difference from Hashtable

Top K Problem

LRU Cache

macrozheng

How this landed with the community

Was this worth your time?

0 Comments

Top K Problem