Boost Java Microservice Performance with GraalVM Native Image: A Step-by-Step Guide
This tutorial demonstrates how to create a Micronaut microservice, compile it into a GraalVM Native Image, and apply advanced optimizations such as G1 GC, profile‑guided compilation, and UPX packing to achieve faster startup, lower memory usage, and higher throughput in cloud environments.
1. Introduction
GraalVM Native Image can be an attractive platform for Java cloud applications. As I wrote in "GraalVM: Native images in containers", native images pre‑compile your application (AOT), eliminating the need for runtime compilation, so the app starts almost instantly and uses less memory, saving resources used by the JIT compiler and class metadata.
Beyond fast startup, developers use native images for cloud‑friendliness and code obfuscation to improve security.
Figure 1 often appears when discussing performance and the different ways GraalVM can run Java applications; it shows many axes labeled with what people mean by “better performance”.
Sometimes better performance means higher throughput (how many clients a service instance can handle); sometimes it means lower latency for a single response, lower memory usage, faster startup, or smaller deployment size, which can matter for cold‑start scenarios.
With a few simple tricks and some advanced GraalVM Native Image features, you can get all of these advantages for your application. This article shows how.
2. Create an Application
Assume you have a simple example app: a Micronaut microservice that responds to HTTP queries and computes prime numbers. It uses Java Stream API and creates temporary objects that generate GC pressure while inefficiently checking factors, including even numbers greater than 2.
If you have the Micronaut CLI installed, you can create the app as follows.
<code>mn create-app org.shelajev.primes
cd primes
cat <<'EOF' > src/main/java/org/shelajev/PrimesController.java
package org.shelajev;

import io.micronaut.http.annotation.Controller;
import io.micronaut.http.annotation.Get;
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;
import java.util.stream.LongStream;

@Controller("/primes")
public class PrimesController {
    private final Random r = new Random();

    // Returns the primes in a random sub-range below upperbound; upperbound must be at least 3.
    @Get("/random/{upperbound}")
    public List<Long> random(int upperbound) {
        int to = 2 + r.nextInt(upperbound - 2);
        int from = 1 + r.nextInt(to - 1);
        return primes(from, to);
    }

    // Deliberately naive trial division: tests every candidate factor up to sqrt(n).
    // The n > 1 guard keeps 1 from being reported as prime.
    public static boolean isPrime(long n) {
        return n > 1 && LongStream.rangeClosed(2, (long) Math.sqrt(n))
                .allMatch(i -> n % i != 0);
    }

    public static List<Long> primes(long min, long max) {
        return LongStream.range(min, max)
                .filter(PrimesController::isPrime)
                .boxed()
                .collect(Collectors.toList());
    }
}
EOF</code>Now you have the sample app. You can run it or immediately build a native executable.
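Before building, the prime-checking logic can be tried on its own, without Micronaut. The following is a minimal sketch, not part of the original project: the class name `PrimesSketch` is hypothetical, and it assumes the same naive trial division as the controller, plus an `n > 1` guard so that 1 is not counted as prime.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.LongStream;

// Hypothetical standalone class; mirrors the controller's static helpers.
class PrimesSketch {

    // Naive trial division over every candidate factor in [2, sqrt(n)].
    static boolean isPrime(long n) {
        return n > 1 && LongStream.rangeClosed(2, (long) Math.sqrt(n))
                .allMatch(i -> n % i != 0);
    }

    // Primes in the half-open range [min, max); boxing creates temporary objects.
    static List<Long> primes(long min, long max) {
        return LongStream.range(min, max)
                .filter(PrimesSketch::isPrime)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(primes(1, 20)); // prints [2, 3, 5, 7, 11, 13, 17, 19]
    }
}
```

With the logic checked, the Gradle commands that follow build the service and its native executable.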
<code>./gradlew build
./gradlew nativeImage</code>Then run the application.
<code>java -jar build/libs/primes-0.1-all.jar
./build/native-image/application</code>To test, you can open the endpoint in a browser or use the curl command, which returns a list of primes below 100.
<code>curl http://localhost:8080/primes/random/100</code>For later stages, download and install hey, a simple HTTP load generator, and place it in your $PATH (or obtain the appropriate binary for your OS).
<code>wget https://hey-release.s3.us-east-2.amazonaws.com/hey_linux_amd64
chmod u+x hey_linux_amd64
sudo mv hey_linux_amd64 /usr/local/bin/hey
hey --version</code>Verify it works:
<code>hey -z 15s http://localhost:8080/primes/random/100</code>The output includes a latency distribution and a summary such as:
<code>Summary:
Total: 15.0021 secs
Slowest: 0.1064 secs
Fastest: 0.0001 secs
Average: 0.0015 secs
Requests/sec: 33703.8539
Total data: 20062978 bytes
Size/request: 20 bytes</code>The most important metric is the Requests/sec line, which shows throughput. By default, a native image sets its maximum heap (-Xmx) to 80% of available memory; for this test you may want to cap the heap at 512 MB (for example, by starting the binary with -Xmx512m) instead of letting it grow that far.
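When comparing several builds, it can help to pull the Requests/sec value out of the hey output programmatically. The following is a minimal sketch under stated assumptions: the class name `HeySummary` is hypothetical, and the regular expression assumes the summary format shown above.

```java
import java.util.OptionalDouble;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; assumes hey prints a line such as "Requests/sec: 33703.8539".
class HeySummary {
    private static final Pattern REQUESTS_PER_SEC =
            Pattern.compile("Requests/sec:\\s*([0-9]+(?:\\.[0-9]+)?)");

    // Returns the throughput figure, or empty if the line is missing.
    static OptionalDouble requestsPerSecond(String heyOutput) {
        Matcher m = REQUESTS_PER_SEC.matcher(heyOutput);
        return m.find()
                ? OptionalDouble.of(Double.parseDouble(m.group(1)))
                : OptionalDouble.empty();
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
                "Summary:",
                "  Total: 15.0021 secs",
                "  Requests/sec: 33703.8539");
        System.out.println(requestsPerSecond(sample));
    }
}
```

Saving each run's output to a file and feeding it through a helper like this makes the per-build numbers easy to tabulate.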
3. Better Memory Management
Runtime memory usage is a key metric, and Native Image reduces it compared with a generic JDK.
The savings are mostly one‑time because the native executable contains all compiled code and analyzed classes, eliminating class metadata and JIT infrastructure.
However, the amount of data your application holds in memory is similar, because object layout in the JVM and native image is alike. If your app keeps several gigabytes of data, the native image will use a comparable amount, minus the 200‑300 MB saved by not having JIT and metadata.
Native Image includes a runtime that assumes managed memory and performs garbage collection. The runtime implementation comes from the GraalVM project.
The garbage collector exposes the same options as the JDK, such as -Xmx for maximum heap size and -Xmn for young generation size. You can also enable -XX:+PrintGC and -XX:+VerboseGC for detailed GC logs.
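To check which heap limit actually took effect, the application can report it through the standard `java.lang.Runtime` API. The following is a minimal sketch: the class name `HeapReport` is hypothetical, and it assumes these `Runtime` calls report the same way in a native image as they do on the JVM.

```java
// Hypothetical diagnostic class; uses only java.lang.Runtime.
class HeapReport {

    // Converts a byte count to whole mebibytes.
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        // maxMemory() reflects the configured maximum heap, for example -Xmx512m.
        System.out.println("max heap:  " + toMiB(rt.maxMemory()) + " MiB");
        System.out.println("used heap: " + toMiB(used) + " MiB");
    }
}
```

Running it once with and once without -Xmx512m should show whether the option was picked up.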
If you prefer a different collector, you can build the native image with the multithreaded G1 GC, which is a performance‑oriented feature included in GraalVM Enterprise. Enable it by passing --gc=G1 to the native‑image process, e.g., in build.gradle:
<code>nativeImage {
args("--gc=G1")
}</code>Rebuild the native image after adding the argument.
4. Better Overall Throughput
Throughput is affected by workload characteristics, code quality, data volume, and latency. A better runtime or compiler can significantly speed execution.
GraalVM Enterprise ships with a more powerful compiler that supports profile‑guided optimization (PGO): you collect a profile from an instrumented run of the application and feed it back into AOT compilation, bringing native image throughput closer to a warmed‑up JIT.
To collect a PGO profile, enable --pgo-instrument in build.gradle and build the image normally:
<code>nativeImage {
args("--gc=G1")
args("--pgo-instrument")
}</code>Start the instrumented binary and run the load generator against it; when the application shuts down, it writes a default.iprof file.
Then rebuild the final image using the profile:
<code>nativeImage {
args("--gc=G1")
args("--pgo=../../default.iprof")
}</code>The resulting binary (named app-ee-pgo) can be compared with other builds.
5. Smaller Binaries
Binary size can be large; you can shrink it.
Without any size optimizations, the example binaries are:
<code>$ ls -lah app*
-rwxrwxr-x. 1 opc opc 58M May 6 20:41 app-ce
-rwxrwxr-x. 1 opc opc 73M May 6 21:14 app-ee
-rwxrwxr-x. 1 opc opc 99M May 6 21:25 app-ee-g1
-rwxrwxr-x. 1 opc opc 80M May 6 21:47 app-ee-pgo
</code>The executable consists of two main parts: the compiled code and the “image heap” created during class initialization.
The code part contains all classes and methods reachable by static analysis or explicit configuration.
The image heap stores the initialized state so that the native image can start instantly.
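As an illustration of what "initialized state" can mean, consider a class whose static initializer precomputes a lookup table. The following is a sketch under stated assumptions: the class `PrimeTable` is hypothetical and not part of the sample app, and whether its initializer runs at image build time (so the array lands in the image heap) or at run time depends on the build-time class initialization settings, for example `--initialize-at-build-time`.

```java
import java.util.stream.LongStream;

// Hypothetical class: its static state is the kind of data the image heap can hold
// when the class is initialized during the native-image build (an assumption here).
class PrimeTable {

    // Computed once by the static initializer: all primes up to 100.
    static final long[] SMALL_PRIMES = LongStream.rangeClosed(2, 100)
            .filter(PrimeTable::isPrime)
            .toArray();

    static boolean isPrime(long n) {
        return n > 1 && LongStream.rangeClosed(2, (long) Math.sqrt(n))
                .allMatch(i -> n % i != 0);
    }

    public static void main(String[] args) {
        System.out.println(SMALL_PRIMES.length + " primes precomputed"); // prints "25 primes precomputed"
    }
}
```

State captured this way costs binary size but no startup time, which is the trade-off behind the two parts of the executable.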
You can inspect class contributions with the -H:+DashboardAll option and refactor accordingly.
Compressing the binary with UPX (e.g., upx -7 -k app-ee-pgo) reduces size from ~80 MB to ~23 MB while preserving performance.
6. How Far Can Native Image Take You?
The article demonstrated several optimizations: adaptive G1 GC, profile‑guided compilation, and UPX packing, resulting in a microservice that starts in ~20 ms, occupies ~20 MB, and outperforms OpenJDK on the first 1 M requests.
Running the three 15‑second tests with a 512 MB heap limit yields:
Default native image (app‑ee): 49 791 req/s
With G1 GC (app‑ee‑g1): 51 691 req/s
With G1 + PGO (app‑ee‑pgo): 73 392 req/s
Compared with the same application on OpenJDK 11, the best native image is about 16 % faster.
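The relative gains between the three native builds follow directly from the figures above. The following is a minimal sketch of that arithmetic; the class and method names are hypothetical, and the inputs are the req/s values quoted in this section.

```java
// Hypothetical helper for comparing the throughput figures quoted above.
class ThroughputComparison {

    // Percentage gain of candidate over baseline.
    static double percentGain(double baseline, double candidate) {
        return (candidate - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        double defaultImage = 49_791; // app-ee
        double withG1 = 51_691;       // app-ee-g1
        double withG1AndPgo = 73_392; // app-ee-pgo

        System.out.printf("G1 vs default:       %.1f%%%n", percentGain(defaultImage, withG1));
        System.out.printf("G1 + PGO vs default: %.1f%%%n", percentGain(defaultImage, withG1AndPgo));
        System.out.printf("G1 + PGO vs G1:      %.1f%%%n", percentGain(withG1, withG1AndPgo));
    }
}
```

On these numbers, G1 alone adds roughly 4%, while G1 combined with PGO adds roughly 47% over the default image, so most of the improvement comes from the profile.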
Overall, native images can match JIT‑based performance while offering faster startup and smaller footprints, making them suitable for constrained environments or microservices.
7. Conclusion
This guide presented various ways to improve native image performance without changing application code: using G1 GC, enabling profile‑guided optimization, and packing with UPX. The resulting microservice starts in ~20 ms, occupies ~20 MB, and delivers higher throughput than the equivalent OpenJDK build.
GraalVM Native Image is an exciting technology for Java workloads in cloud environments, and the techniques shown help you use it more effectively.
Translator’s Note
Hi, I’m Spring‑bro (Mica) and thanks to Zhang Yadong (JustAuth) for helping translate. We have translated several GraalVM and Spring Native articles.