
How JDK 9’s Compact Strings Slash Java Memory Usage

The article explains how Java's String class was re-engineered in JDK 9 to store its characters in a byte array with a coder flag. Strings that fit in Latin-1 take one byte per character instead of two, which can roughly halve their character storage, reduce GC pressure, and improve application performance.

Senior Brother's Insights

String's Underlying Storage

In Java, the String class is immutable: any operation that appears to modify a string returns a new object. In JDK 8 and earlier, the characters are stored internally in a char[] array, which takes two bytes per char regardless of content.

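public final class String implements java.io.Serializable, Comparable<String>, CharSequence {
    // Character storage: one 16-bit char (two bytes) per UTF-16 code unit.
    private final char value[];
    // Cached hash code; 0 until it is first computed.
    private int hash;
    // Creates an empty string that shares the backing array of the "" literal.
    public String() {
        this.value = "".value;
    }
    // Copies only the references; sharing the array is safe because it is never modified.
    public String(String original) {
        this.value = original.value;
        this.hash = original.hash;
    }
    // ...
}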

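Declaring the char[] field final only prevents it from being reassigned; the contents of the array could still be changed. Strings are immutable because the field is also private, String never modifies or exposes the array after construction, and the class itself is final, so no subclass can break that guarantee. Any change therefore requires a new array and a new String instance.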
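The difference between a final reference and immutable contents can be sketched as follows. FinalArrayDemo and its method names are hypothetical and exist only for this illustration; the only JDK calls used are the String(char[]) constructor and toCharArray().

```java
// Hypothetical demo class, not part of the JDK.
class FinalArrayDemo {

    // 'final' fixes the reference, not the contents of the array.
    static String mutateThroughFinalReference() {
        final char[] letters = {'a', 'b', 'c'};
        letters[0] = 'z';            // allowed: element writes still work
        // letters = new char[3];    // would not compile: the reference is final
        return new String(letters);
    }

    // String never hands out its internal array; toCharArray() returns a copy.
    static String copyIsolation() {
        String s = "abc";
        char[] copy = s.toCharArray();
        copy[0] = 'z';               // changes only the copy
        return s;                    // the original string is untouched
    }

    public static void main(String[] args) {
        System.out.println(mutateThroughFinalReference()); // zbc
        System.out.println(copyIsolation());               // abc
    }
}
```

The first method shows that final alone does not protect the array; the second shows the encapsulation that actually keeps a String's contents fixed.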

Storage Optimization in JDK 9

JDK 9 replaces the char[] with a byte[] and adds a coder field to indicate the encoding used (Latin‑1 or UTF‑16).

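public final class String implements java.io.Serializable, Comparable<String>, CharSequence {
    // Character storage: Latin-1 or UTF-16 encoded bytes, depending on coder.
    @Stable
    private final byte[] value;
    // Encoding of the bytes in value: LATIN1 or UTF16.
    private final byte coder;
    // Cached hash code; 0 until it is first computed.
    private int hash;
    @Native static final byte LATIN1 = 0;
    @Native static final byte UTF16  = 1;
    // Whether the compact representation is enabled; set by the JVM at startup.
    static final boolean COMPACT_STRINGS;
    // Creates an empty string that shares the backing array of the "" literal.
    public String() {
        this.value = "".value;
        this.coder = "".coder;
    }
    // Copies only the references and the coder; the byte array is shared, not duplicated.
    @HotSpotIntrinsicCandidate
    public String(String original) {
        this.value = original.value;
        this.coder = original.coder;
        this.hash = original.hash;
    }
    // ...
}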

Most strings in typical applications contain only ASCII characters, which fit in Latin-1 and therefore need only one byte per character. Storing them as char would take twice the memory they require. The encoding is chosen per string: if even one character falls outside the Latin-1 range (e.g., a Chinese character), the whole string is stored as UTF-16 with two bytes per char, matching the old representation.
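A minimal sketch of the per-string rule described above. CoderEstimate and its methods are hypothetical helpers written for this article, not JDK code; the real check lives inside java.lang.String. The count covers the character data only and ignores object and array headers.

```java
// Hypothetical helper, not JDK code: mirrors the Latin-1 / UTF-16 rule.
class CoderEstimate {

    // True when every char fits in a single byte (0x00 to 0xFF).
    static boolean fitsLatin1(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    // Bytes needed for the character data alone under compact strings.
    static int payloadBytes(String s) {
        return fitsLatin1(s) ? s.length() : s.length() * 2;
    }

    public static void main(String[] args) {
        System.out.println(payloadBytes("hello"));            // 5: pure ASCII
        System.out.println(payloadBytes("h\u00e9llo"));       // 5: U+00E9 is still Latin-1
        System.out.println(payloadBytes("hello\u4e16\u754c")); // 14: 7 chars, all stored as UTF-16
    }
}
```

The last case shows the cost of the fallback: two Chinese characters make the five ASCII characters in the same string take two bytes each as well.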

The coder field records which encoding a given string uses. The static COMPACT_STRINGS field reflects whether the compact representation is enabled; it is on by default and can be turned off with the JVM option -XX:-CompactStrings, in which case every string is stored as UTF-16.
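How a coder field lets one class serve both encodings can be sketched with a simplified model. MiniString is hypothetical and is not the real java.lang.String: it assumes big-endian UTF-16 for simplicity, and it leaves out the COMPACT_STRINGS switch, hashing, and the intrinsics the JDK relies on.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical, simplified model of the byte[] + coder layout.
final class MiniString {
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    private final byte[] value;
    private final byte coder;

    MiniString(String source) {
        boolean latin1 = true;
        for (int i = 0; i < source.length(); i++) {
            if (source.charAt(i) > 0xFF) {
                latin1 = false;
                break;
            }
        }
        if (latin1) {
            value = source.getBytes(StandardCharsets.ISO_8859_1); // 1 byte per char
            coder = LATIN1;
        } else {
            value = source.getBytes(StandardCharsets.UTF_16BE);   // 2 bytes per char, no BOM
            coder = UTF16;
        }
    }

    byte coder() { return coder; }

    int byteLength() { return value.length; }

    // Shifting by the coder divides by 1 (LATIN1) or by 2 (UTF16).
    int length() { return value.length >> coder; }

    char charAt(int index) {
        if (coder == LATIN1) {
            return (char) (value[index] & 0xFF);
        }
        int hi = value[2 * index] & 0xFF;
        int lo = value[2 * index + 1] & 0xFF;
        return (char) ((hi << 8) | lo);
    }

    public static void main(String[] args) {
        MiniString ascii = new MiniString("hello");
        MiniString mixed = new MiniString("hi\u4e16");
        System.out.println(ascii.length() + " chars in " + ascii.byteLength() + " bytes"); // 5 chars in 5 bytes
        System.out.println(mixed.length() + " chars in " + mixed.byteLength() + " bytes"); // 3 chars in 6 bytes
    }
}
```

Because LATIN1 is 0 and UTF16 is 1, the coder doubles as a shift amount, which keeps length() branch-free in this model.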

Benefits of the Compact Strings Feature

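When a program predominantly uses Latin-1 characters, the memory taken by string character data can be cut roughly in half. Object and array headers are unaffected, so the saving is smaller for short strings and approaches 50% only for long ones. The heap space freed this way lets the same hardware handle more workload.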
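A rough worked estimate of the saving per string. StringFootprint is a hypothetical estimator, and the two header sizes are assumptions for a 64-bit HotSpot JVM with compressed object pointers; actual sizes depend on the JVM and its flags, so treat the numbers as an illustration rather than a measurement.

```java
// Hypothetical estimator; header sizes are assumptions, not measured values.
class StringFootprint {
    static final int STRING_OBJECT_BYTES = 24; // assumed: object header plus fields, padded
    static final int ARRAY_HEADER_BYTES = 16;  // assumed: array header including length

    // Objects are assumed to be aligned to 8-byte boundaries.
    static long alignTo8(long n) {
        return (n + 7) / 8 * 8;
    }

    // Estimated heap bytes for one String and its backing array.
    static long estimate(int chars, int bytesPerChar) {
        return STRING_OBJECT_BYTES + alignTo8(ARRAY_HEADER_BYTES + (long) chars * bytesPerChar);
    }

    public static void main(String[] args) {
        // 20 characters: 64 bytes as Latin-1 versus 80 bytes as UTF-16 (20% smaller)
        System.out.println(estimate(20, 1) + " vs " + estimate(20, 2));
        // 1000 characters: 1040 bytes versus 2040 bytes (about 49% smaller)
        System.out.println(estimate(1000, 1) + " vs " + estimate(1000, 2));
    }
}
```

Under these assumptions the fixed overhead dominates short strings, which is why the overall heap reduction in a real application is usually well below one half.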

A smaller heap footprint means the garbage collector has less to do: collections tend to run less often and Stop-The-World pauses become less frequent, which can translate into noticeable performance improvements for string-heavy applications.

Conclusion

As the JVM evolves, the internal structure of String continues to be refined, because String instances often account for a large share of the heap. The JDK 9 compact-string redesign shows how a seemingly small change in representation can yield substantial gains in memory efficiency and, through reduced GC work, in overall application speed.
