Fundamentals 14 min read

Why Does Java’s String.hashCode Use 31? Uncovering the Math and Performance Secrets

This article explains why Java’s String.hashCode method multiplies by 31, covering the mathematical derivation, prime‑number benefits, JVM optimization tricks, and experimental results that compare 31 with other multipliers on a large word list.

Programmer DD
Programmer DD
Programmer DD
Why Does Java’s String.hashCode Use 31? Uncovering the Math and Performance Secrets

While exploring the source of String.hashCode(), I noticed the mysterious constant 31 used as a multiplier and set out to understand its purpose.

How String.hashCode() Works

public int hashCode() {
    int h = hash;
    if (h == 0 && value.length > 0) {
        char val[] = value;
        for (int i = 0; i < value.length; i++) {
            h = 31 * h + val[i];
        }
        hash = h;
    }
    return h;
}

The core computation is the loop h = 31 * h + val[i], which can be expressed as the polynomial s[0]*31^(n-1) + s[1]*31^(n-2) + … + s[n-1], where s is the internal char array.

Why 31?

Two main reasons are commonly cited:

31 is an odd prime of moderate size, making it a good hash multiplier that reduces collisions without causing overflow as easily as larger primes.

The JVM can optimize the multiplication: 31 * i == (i << 5) - i, allowing the operation to be performed with a shift and a subtraction.

The value 31 was chosen because it is an odd prime. If it were even and the multiplication overflowed, information would be lost, as multiplication by 2 is equivalent to shifting. A nice property of 31 is that the multiplication can be replaced by a shift and a subtraction for better performance: 31 * i == (i &lt;&lt; 5) - i . Modern VMs do this automatically.

Experiments were conducted on over 230,000 English words from /usr/share/dict/words, testing multipliers such as 2, 3, 17, 31, 37, 41, 101, and others. The conflict rate was measured by counting duplicate hash values.

Experimental Results

Using small primes like 2 produced a conflict rate of 55.14% and most hash values fell into a single partition of the integer range. Multipliers 31, 37, and 41 yielded very low conflict counts (under 7). Larger primes like 101 gave even better distribution but risk integer overflow.

Visualization of hash‑value distribution (partitioned into 64 equal ranges) showed that 2 and 3 concentrate almost entirely in one partition, while 31 spreads more evenly across many partitions, confirming its suitability.

In summary, 31 is a well‑chosen multiplier for String.hashCode() because it balances prime‑based collision reduction with JVM‑friendly arithmetic, making it an effective default for Java’s hash algorithm.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

Javaperformancehash functionhashcodeprime multiplier
Programmer DD
Written by

Programmer DD

A tinkering programmer and author of "Spring Cloud Microservices in Action"

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.