
How to Achieve Zero‑Copy String Construction Across JDK Versions

This article explains the internal differences of Java's String implementation from JDK 8 to JDK 9+, demonstrates how to use sun.misc.Unsafe and trusted MethodHandles.Lookup to build zero‑copy String objects, and provides practical code examples for high‑performance string handling.

Alibaba Cloud Developer

1. JDK String Implementation

In JDK 8 a String stores its characters in a char[] value array and copies the array in its public constructor. The class looks like:

class String {
    char[] value;
    // Constructor copies the array
    public String(char[] value) {
        this.value = Arrays.copyOf(value, value.length);
    }
    // Non‑copying constructor used internally
    String(char[] value, boolean share) {
        this.value = value;
    }
}
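
The cost of the public constructor can be seen from outside the class. The following is a minimal sketch (the class name DefensiveCopyDemo is made up for illustration) showing that the String keeps its own copy of the array, which is exactly the copy the internal constructor avoids:

```java
final class DefensiveCopyDemo {
    /** Builds a String through the public constructor, then mutates the source array. */
    static String buildThenMutate(char[] source) {
        String s = new String(source); // the public constructor copies the array
        if (source.length > 0) {
            source[0] = '#';           // does not affect s, which owns its own copy
        }
        return s;
    }

    public static void main(String[] args) {
        char[] chars = {'a', 'b', 'c'};
        System.out.println(buildThenMutate(chars)); // the String still reads "abc"
        System.out.println(new String(chars));      // the array itself now reads "#bc"
    }
}
```

The copy is what keeps String immutable for ordinary callers. A zero-copy constructor gives that guarantee up, so the caller must never modify the array after handing it over.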

From JDK 9 onward the representation changes to a byte[] value plus a byte coder field that indicates LATIN1 (0) or UTF‑16 (1). A string whose characters all fit in one byte is stored as LATIN1, and because the internal constructor takes the byte[] as is, such strings can be built without copying.

class String {
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;
    byte coder;
    byte[] value;
    // Non‑copying constructor used internally
    String(byte[] value, byte coder) {
        this.value = value;
        this.coder = coder;
    }
}
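
To make the role of the coder concrete, here is a small sketch using only public APIs. The class CoderSketch and its methods are hypothetical helpers, not JDK code; they mirror the rule that a string can use LATIN1 only when every char is at most 0xFF:

```java
final class CoderSketch {
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    /** Returns LATIN1 when every char fits in one byte, otherwise UTF16. */
    static byte coderFor(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return UTF16;
            }
        }
        return LATIN1;
    }

    /** Length in bytes of the backing array such a string would need. */
    static int backingLength(String s) {
        return coderFor(s) == LATIN1 ? s.length() : s.length() * 2;
    }

    public static void main(String[] args) {
        System.out.println(coderFor("2024-01-05"));        // 0 (LATIN1)
        System.out.println(coderFor("\u4f60\u597d"));      // 1 (UTF16)
        System.out.println(backingLength("\u4f60\u597d")); // 4
    }
}
```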

2. Using sun.misc.Unsafe

Unsafe provides low‑level operations that can bypass normal Java safety checks. The following utility obtains the singleton Unsafe instance:

import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class UnsafeUtils {
    public static final Unsafe UNSAFE;
    static {
        Unsafe unsafe = null;
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            unsafe = (Unsafe) f.get(null);
        } catch (Throwable ignored) {}
        UNSAFE = unsafe;
    }
}
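
As an illustration of what such an instance can do, the sketch below reads the first element of an int[] through a raw offset. The class UnsafeArrayRead is an assumption made for this example; it falls back to normal indexing whenever Unsafe is missing or refuses the call, which may happen on newer JDKs that restrict these methods:

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class UnsafeArrayRead {
    private static final Unsafe UNSAFE = load();

    private static Unsafe load() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (Throwable t) {
            return null; // Unsafe unavailable: callers use the plain-Java path
        }
    }

    /** Reads arr[0] through Unsafe when possible, otherwise with normal indexing. */
    static int firstElement(int[] arr) {
        if (arr.length == 0) {
            throw new IllegalArgumentException("empty array");
        }
        if (UNSAFE != null) {
            try {
                // Offset of element 0, measured from the start of the array object
                long base = UNSAFE.arrayBaseOffset(int[].class);
                return UNSAFE.getInt(arr, base);
            } catch (Throwable t) {
                // fall through to the safe path
            }
        }
        return arr[0];
    }

    public static void main(String[] args) {
        System.out.println(firstElement(new int[] {42, 7}));
    }
}
```

Unlike arr[0], the Unsafe read performs no bounds check, which is why the length is validated before the call.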

3. Trusted MethodHandles.Lookup

To invoke private constructors or methods, a trusted MethodHandles.Lookup object is required. The code below extracts the internal IMPL_LOOKUP field via Unsafe and creates a lookup that can access any JDK class:

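static final MethodHandles.Lookup IMPL_LOOKUP;
static {
    MethodHandles.Lookup trusted = null;
    try {
        Class<?> lookupClass = MethodHandles.Lookup.class;
        Field f = lookupClass.getDeclaredField("IMPL_LOOKUP");
        // A static field is read relative to its base object, not the Class itself
        Object base = UNSAFE.staticFieldBase(f);
        long offset = UNSAFE.staticFieldOffset(f);
        trusted = (MethodHandles.Lookup) UNSAFE.getObject(base, offset);
    } catch (Throwable ignored) {}
    IMPL_LOOKUP = trusted;
}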

public static MethodHandles.Lookup trustedLookup(Class<?> cls) throws Exception {
    return IMPL_LOOKUP.in(cls);
}
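
For comparison, the supported API offers a narrower version of the same idea. The sketch below is not the IMPL_LOOKUP technique: it uses MethodHandles.privateLookupIn (Java 9+) on a made-up class, Vault, to show what a lookup with private access can resolve. Against JDK classes this call only succeeds if the package has been opened to the caller:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

/** Hypothetical class with private members, standing in for a JDK class. */
final class Vault {
    private final String secret;

    private Vault(String secret) {
        this.secret = secret;
    }

    private String reveal() {
        return "secret=" + secret;
    }
}

final class PrivateLookupSketch {
    /** Creates a Vault through its private constructor and calls its private method. */
    static String openVault(String secret) {
        try {
            MethodHandles.Lookup lookup =
                    MethodHandles.privateLookupIn(Vault.class, MethodHandles.lookup());
            MethodHandle ctor = lookup.findConstructor(
                    Vault.class, MethodType.methodType(void.class, String.class));
            MethodHandle reveal = lookup.findVirtual(
                    Vault.class, "reveal", MethodType.methodType(String.class));
            Vault vault = (Vault) ctor.invokeExact(secret);
            return (String) reveal.invokeExact(vault);
        } catch (Throwable t) {
            throw new IllegalStateException("lookup failed", t);
        }
    }

    public static void main(String[] args) {
        System.out.println(openVault("x"));
    }
}
```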

4. Zero‑Copy String Construction

Using the trusted lookup, a BiFunction that creates a String without copying can be built for each JDK version.

JDK 8

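static final BiFunction<char[], Boolean, String> STRING_CREATOR_JDK8 = jdk8Creator();
static BiFunction<char[], Boolean, String> jdk8Creator() {
    try {
        // Resolve the package-private String(char[], boolean) once, via the trusted lookup
        MethodHandle ctor = trustedLookup(String.class).findConstructor(String.class,
                MethodType.methodType(void.class, char[].class, boolean.class));
        return (chars, share) -> {
            try { return (String) ctor.invokeExact(chars, (boolean) share); }
            catch (Throwable e) { throw new IllegalStateException(e); }
        };
    } catch (Throwable e) { return null; } // constructor not available: not a JDK 8 runtime
}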

JDK 9‑15

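static final BiFunction<byte[], Byte, String> STRING_CREATOR_JDK11 = jdk11Creator();
static BiFunction<byte[], Byte, String> jdk11Creator() {
    try {
        // Resolve the package-private String(byte[], byte) once, via the trusted lookup
        MethodHandle ctor = trustedLookup(String.class).findConstructor(String.class,
                MethodType.methodType(void.class, byte[].class, byte.class));
        return (bytes, coder) -> {
            try { return (String) ctor.invokeExact(bytes, (byte) coder); }
            catch (Throwable e) { throw new IllegalStateException(e); }
        };
    } catch (Throwable e) { return null; } // constructor not available on this runtime
}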
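
Because the internal constructor may be unreachable on a given runtime, a creator is usually paired with a safe fallback. The sketch below is one possible shape, not the article's implementation: the class Latin1Strings is hypothetical, it obtains its lookup through the supported MethodHandles.privateLookupIn (Java 9+) instead of IMPL_LOOKUP, and it checks the constructor with a short probe before trusting it:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.nio.charset.StandardCharsets;

final class Latin1Strings {
    private static final byte LATIN1 = 0; // mirrors the LATIN1 coder value on JDK 9+
    private static final MethodHandle CTOR = resolve();

    private static MethodHandle resolve() {
        try {
            // Fails unless java.lang is opened to this code, e.g. via --add-opens
            MethodHandles.Lookup lookup =
                    MethodHandles.privateLookupIn(String.class, MethodHandles.lookup());
            MethodHandle ctor = lookup.findConstructor(String.class,
                    MethodType.methodType(void.class, byte[].class, byte.class));
            // Probe: accept the handle only if it yields the expected String
            String probe = (String) ctor.invokeExact(new byte[] {'o', 'k'}, LATIN1);
            return "ok".equals(probe) ? ctor : null;
        } catch (Throwable t) {
            return null;
        }
    }

    /** True when create() shares the array instead of copying it. */
    static boolean isZeroCopy() {
        return CTOR != null;
    }

    /** Builds a String from LATIN1 bytes; the caller must not modify the array afterwards. */
    static String create(byte[] latin1) {
        if (CTOR != null) {
            try {
                return (String) CTOR.invokeExact(latin1, LATIN1);
            } catch (Throwable t) {
                // fall through to the copying path
            }
        }
        return new String(latin1, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println(create(new byte[] {'2', '0'}) + " zeroCopy=" + isZeroCopy());
    }
}
```

Callers get the same result on either path; only the copy differs.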

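These creators assume compact strings are enabled, which is the default. When the JVM is started with -XX:-CompactStrings, every String is stored as UTF‑16, so a byte[] holding LATIN1 data can no longer be handed to the constructor as is, and the trick stops working.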
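
One way to notice this situation at startup is to inspect the JVM arguments. The sketch below is a heuristic only: the class CompactStringsCheck is hypothetical, and flags supplied through other channels (for example an options file) may not show up in this list:

```java
import java.lang.management.ManagementFactory;
import java.util.List;

final class CompactStringsCheck {
    /** True when the given JVM arguments explicitly turn compact strings off. */
    static boolean disablesCompactStrings(List<String> jvmArgs) {
        return jvmArgs.contains("-XX:-CompactStrings");
    }

    /** Applies the check to the arguments this JVM was started with. */
    static boolean disabledForThisJvm() {
        return disablesCompactStrings(
                ManagementFactory.getRuntimeMXBean().getInputArguments());
    }

    public static void main(String[] args) {
        System.out.println("compact strings disabled: " + disabledForThisJvm());
    }
}
```

If the check returns true, the safe choice is to skip the zero-copy creators and use the public String constructors.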

5. Direct Access to String Internals

For JDK 8 the internal char[] value field can be read via Unsafe:

static final Field FIELD_STRING_VALUE;
static final long FIELD_STRING_VALUE_OFFSET;
static {
    Field f = null;
    long offset = -1;
    try {
        f = String.class.getDeclaredField("value");
        offset = UNSAFE.objectFieldOffset(f);
    } catch (Throwable ignored) {}
    FIELD_STRING_VALUE = f;
    FIELD_STRING_VALUE_OFFSET = offset;
}

public static char[] getCharArray(String s) {
    try {
        return (char[]) UNSAFE.getObject(s, FIELD_STRING_VALUE_OFFSET);
    } catch (Exception e) {
        return s.toCharArray();
    }
}
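
The same offset-based read can be tried on an ordinary class without touching String at all. In the sketch below, Holder and FieldOffsetSketch are made-up names; Holder stands in for String and its data field for value. Reflection is used as a fallback if Unsafe is unavailable:

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

/** Hypothetical holder with a private array field, standing in for String.value. */
final class Holder {
    private final char[] data;

    Holder(char[] data) {
        this.data = data;
    }
}

final class FieldOffsetSketch {
    private static final Unsafe UNSAFE;
    private static final Field DATA_FIELD;
    private static final long DATA_OFFSET;

    static {
        Unsafe unsafe = null;
        Field field = null;
        long offset = -1;
        try {
            field = Holder.class.getDeclaredField("data");
            field.setAccessible(true);
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            unsafe = (Unsafe) f.get(null);
            offset = unsafe.objectFieldOffset(field);
        } catch (Throwable t) {
            unsafe = null; // fall back to reflection below
        }
        UNSAFE = unsafe;
        DATA_FIELD = field;
        DATA_OFFSET = offset;
    }

    /** Returns the holder's internal array itself, not a copy. */
    static char[] dataOf(Holder h) {
        if (UNSAFE != null) {
            try {
                return (char[]) UNSAFE.getObject(h, DATA_OFFSET);
            } catch (RuntimeException e) {
                // fall through to reflection
            }
        }
        try {
            return (char[]) DATA_FIELD.get(h);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        char[] chars = {'h', 'i'};
        System.out.println(dataOf(new Holder(chars)) == chars); // same array instance
    }
}
```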

For JDK 9+ the coder() and value() methods are not public either; they can be reached through the trusted lookup:

MethodHandles.Lookup lookup = trustedLookup(String.class);
MethodHandle coderHandle = lookup.findSpecial(String.class, "coder", MethodType.methodType(byte.class), String.class);
MethodHandle valueHandle = lookup.findSpecial(String.class, "value", MethodType.methodType(byte[].class), String.class);
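// invokeExact declares Throwable, so each lambda has to wrap the call
ToIntFunction<String> STRING_CODER = (String s) -> {
    try { return (byte) coderHandle.invokeExact(s); } catch (Throwable e) { throw new IllegalStateException(e); }
};
Function<String, byte[]> STRING_VALUE = (String s) -> {
    try { return (byte[]) valueHandle.invokeExact(s); } catch (Throwable e) { throw new IllegalStateException(e); }
};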

6. Practical Example: Fast Date Formatting

The following method formats a LocalDate to YYYY‑MM‑DD using the zero‑copy creators appropriate for the running JDK. The digit arithmetic assumes a year between 0 and 9999:

static String formatYYYYMMDD(LocalDate date) {
    int y = date.getYear();
    int m = date.getMonthValue();
    int d = date.getDayOfMonth();
    if (STRING_CREATOR_JDK11 != null) {
        byte[] bytes = new byte[10];
        bytes[0] = (byte) (y / 1000 + '0');
        bytes[1] = (byte) ((y / 100) % 10 + '0');
        bytes[2] = (byte) ((y / 10) % 10 + '0');
        bytes[3] = (byte) (y % 10 + '0');
        bytes[4] = '-';
        bytes[5] = (byte) (m / 10 + '0');
        bytes[6] = (byte) (m % 10 + '0');
        bytes[7] = '-';
        bytes[8] = (byte) (d / 10 + '0');
        bytes[9] = (byte) (d % 10 + '0');
        return STRING_CREATOR_JDK11.apply(bytes, (byte) 0); // LATIN1
    } else {
        char[] chars = new char[10];
        chars[0] = (char) (y / 1000 + '0');
        chars[1] = (char) ((y / 100) % 10 + '0');
        chars[2] = (char) ((y / 10) % 10 + '0');
        chars[3] = (char) (y % 10 + '0');
        chars[4] = '-';
        chars[5] = (char) (m / 10 + '0');
        chars[6] = (char) (m % 10 + '0');
        chars[7] = '-';
        chars[8] = (char) (d / 10 + '0');
        chars[9] = (char) (d % 10 + '0');
        return STRING_CREATOR_JDK8 != null ? STRING_CREATOR_JDK8.apply(chars, true) : new String(chars);
    }
}

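Because it writes the ten characters straight into the backing array and skips both pattern interpretation and the final array copy, this approach is considerably faster than using SimpleDateFormat or the standard java.time.format.DateTimeFormatter.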
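
Before relying on a hand-written formatter, its output should be checked against the standard one. The sketch below is a standalone variant for that purpose. The class DateFormatSketch is hypothetical, and it builds the result with the ordinary copying constructor as a stand-in for the zero-copy creator, so it runs on any JDK:

```java
import java.nio.charset.StandardCharsets;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

final class DateFormatSketch {
    static String formatYYYYMMDD(LocalDate date) {
        int y = date.getYear();
        if (y < 0 || y > 9999) {
            // The digit arithmetic below only covers four-digit years
            return DateTimeFormatter.ISO_LOCAL_DATE.format(date);
        }
        int m = date.getMonthValue();
        int d = date.getDayOfMonth();
        byte[] bytes = new byte[10];
        bytes[0] = (byte) (y / 1000 + '0');
        bytes[1] = (byte) ((y / 100) % 10 + '0');
        bytes[2] = (byte) ((y / 10) % 10 + '0');
        bytes[3] = (byte) (y % 10 + '0');
        bytes[4] = '-';
        bytes[5] = (byte) (m / 10 + '0');
        bytes[6] = (byte) (m % 10 + '0');
        bytes[7] = '-';
        bytes[8] = (byte) (d / 10 + '0');
        bytes[9] = (byte) (d % 10 + '0');
        // Stand-in for the zero-copy creator: this constructor copies the bytes
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2024, 1, 5);
        System.out.println(formatYYYYMMDD(date)); // 2024-01-05
        System.out.println(formatYYYYMMDD(date)
                .equals(DateTimeFormatter.ISO_LOCAL_DATE.format(date))); // true
    }
}
```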

7. Caveats

The techniques rely on internal APIs and unsafe operations; they should only be used by experienced developers who understand the risks, as incorrect usage can crash the JVM or break with future JDK releases.
