How Xiaohongshu Boosted Java Performance by 10% with a RedJDK Upgrade
Xiaohongshu’s middleware team migrated thousands of Java services from JDK 8 to RedJDK 11/17, achieving over 10% performance gains, 50% GC pause reduction, and eliminating OOM crashes through systematic JDK upgrades, GC tuning, native‑memory improvements, and standardized deployment pipelines.
Background
Java is a mature platform whose GC, JIT compiler, and core libraries keep improving. Newer features such as ZGC and virtual threads attract developers, yet migration risk deters many teams from upgrading. Xiaohongshu’s middleware team migrated its Java services from JDK 8 to RedJDK 11/17 and completed the upgrade efficiently and stably in a complex production environment.
Value Drivers
Upgrading to a newer JDK pays off in several ways: direct cost savings (lower CPU usage and possible instance downsizing), indirect savings (smaller future resource quotas), compatibility with modern frameworks that require newer Java versions (Spring Boot 3.x, Kafka 4.0), better stability from bug fixes in GC, JIT, and core libraries, and lower operational overhead through standardized JDK management.
G1GC Optimizations
G1GC became the default GC in JDK 9, but JDK 8’s implementation suffered from inaccurate pause predictions and high Full GC overhead. Upgrading enabled parallel Full GC (JDK 10+), static IHOP tuning, and efficient handling of humongous objects, reducing GC‑related latency.
Parallel Full GC
In JDK 8, G1's Full GC runs on a single thread, so a Full GC on a large heap can pause the application for seconds. JDK 10 made G1's Full GC multi‑threaded (JEP 307), dramatically lowering these pause times.
Static IHOP
InitiatingHeapOccupancyPercent (IHOP) sets the heap occupancy at which concurrent marking starts. Tuned properly, it prevents long‑running concurrent phases and reduces Full GC frequency.
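As a rough illustration of what a static IHOP value means in practice, the sketch below converts a percentage into an occupancy threshold in bytes. The class and method names are hypothetical, the 8 GiB heap is an assumed example, and the model is deliberately simplified: it treats IHOP as a plain percentage of heap size and ignores the runtime adjustments G1 makes when adaptive IHOP is enabled. The value 45 is HotSpot's default for InitiatingHeapOccupancyPercent.

```java
// Hypothetical helper, not part of the JDK: a simplified model of static IHOP.
final class IhopSketch {

    /** Occupancy in bytes at which concurrent marking would start for a given static IHOP. */
    static long markingThresholdBytes(long heapBytes, int ihopPercent) {
        if (heapBytes < 0 || ihopPercent < 0 || ihopPercent > 100) {
            throw new IllegalArgumentException("heapBytes must be >= 0 and ihopPercent in [0, 100]");
        }
        // multiplyExact guards against silent overflow for absurdly large inputs.
        return Math.multiplyExact(heapBytes, (long) ihopPercent) / 100;
    }

    public static void main(String[] args) {
        long heapBytes = 8L * 1024 * 1024 * 1024; // assumed 8 GiB heap
        int ihopPercent = 45;                     // HotSpot default
        System.out.println("Marking starts near " + markingThresholdBytes(heapBytes, ihopPercent) + " bytes");
    }
}
```

Lowering the percentage starts marking earlier and leaves more headroom before the heap fills, at the cost of more frequent marking cycles.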
Humongous Object Handling
Objects larger than half a G1 region are allocated as humongous objects, which can be reclaimed efficiently in Young‑only GCs from JDK 9 onward.
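The half‑region rule above can be sketched as a small check. The class and method names are hypothetical, and the 4 MiB region size and 16‑byte array header are assumptions chosen for illustration; the actual region size depends on heap size or on -XX:G1HeapRegionSize. The example shows a common surprise: a byte array whose payload is exactly half a region still becomes humongous once the header is counted.

```java
// Hypothetical helper, not part of the JDK: models G1's humongous-allocation rule.
final class HumongousSketch {

    /** An allocation is humongous when it is strictly larger than half a region. */
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    /** Humongous objects occupy whole contiguous regions, so round up. */
    static long regionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }

    public static void main(String[] args) {
        long regionBytes = 4L * 1024 * 1024;   // assumed region size (4 MiB)
        long payloadBytes = 2L * 1024 * 1024;  // payload of a byte[2 MiB]
        long assumedHeaderBytes = 16;          // assumed array header size
        long arrayBytes = payloadBytes + assumedHeaderBytes;

        System.out.println("humongous: " + isHumongous(arrayBytes, regionBytes));
        System.out.println("regions:   " + regionsNeeded(arrayBytes, regionBytes));
    }
}
```

Because the last region of a humongous object is only partly used, keeping large buffers slightly under half a region, or raising the region size, can avoid the wasted space.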
Bug Fixes
Several JVM bugs were addressed:
ReentrantLock can deadlock after a StackOverflowError: if the error prevents the unlock in the finally block from completing, the lock is never released.
Heap dumps and AsyncGetCallTrace (AGCT) based profiling tools can trigger JVM crashes; upgrading resolves these issues.
G1GC mis‑estimates rs_length (the remembered‑set length), leading to longer‑than‑expected STW pauses.
Advanced VM Features
RedJDK 11 supports ZGC and ShenandoahGC, low‑pause concurrent collectors with sub‑10 ms pauses; later versions (JDK 16+, JDK 21) further improve pause times and throughput. Compact Strings (JEP 254) store strings that contain only Latin‑1 characters with one byte per character, roughly halving their memory usage. JEP 358 provides helpful NullPointerException messages that pinpoint the exact null reference.
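To make the Compact Strings saving concrete, the sketch below estimates the character storage of a string under that scheme. It is a simplified model with hypothetical class and method names: it counts only the backing character data (one byte per character when every character fits in Latin‑1, two bytes otherwise) and ignores object and array headers.

```java
// Hypothetical helper, not part of the JDK: a simplified model of Compact Strings storage.
final class CompactStringSketch {

    /** True if every char fits in a single byte, the case that can be stored compactly. */
    static boolean isLatin1(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    /** Estimated bytes of character data only; headers are not included. */
    static long estimatedCharBytes(String s) {
        return isLatin1(s) ? s.length() : 2L * s.length();
    }

    public static void main(String[] args) {
        String ascii = "GET /index";
        String mixed = "id=\u4f60\u597d"; // one non-Latin-1 char forces two bytes per char for the whole string
        System.out.println(ascii.length() + " chars -> " + estimatedCharBytes(ascii) + " bytes");
        System.out.println(mixed.length() + " chars -> " + estimatedCharBytes(mixed) + " bytes");
    }
}
```

Services whose strings are mostly ASCII keys, URLs, or JSON field names fall almost entirely into the one‑byte case.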
Parameter Tuning
Key G1GC parameters (MaxGCPauseMillis, G1NewSizePercent, InitiatingHeapOccupancyPercent, G1UseAdaptiveIHOP, G1ReservePercent) were tuned to eliminate GC jitter. Memory allocation parameters (InitialRAMPercentage, MaxRAMPercentage, ReservedCodeCacheSize, MaxDirectMemorySize) were standardized, and runtime flags (ParallelRefProcEnabled, UseStringDeduplication, UseBiasedLocking) were adjusted for optimal performance.
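When standardizing flags across many services, it helps to verify what a running JVM actually uses. The sketch below reads the effective values of the flags named above through HotSpotDiagnosticMXBean. The class name and the "unavailable" marker are hypothetical; the article does not publish the values Xiaohongshu chose, so none are assumed here. Flags that a given JDK build no longer defines (for example UseBiasedLocking on recent releases) are reported as unavailable rather than causing a failure.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: reports the effective value of selected HotSpot flags.
final class VmFlagReport {

    static Map<String, String> effectiveValues(String... flagNames) {
        HotSpotDiagnosticMXBean bean = null;
        try {
            bean = ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        } catch (RuntimeException e) {
            // Not a HotSpot-compatible JVM; every flag is reported as unavailable.
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (String name : flagNames) {
            String value = "unavailable";
            if (bean != null) {
                try {
                    value = bean.getVMOption(name).getValue();
                } catch (RuntimeException e) {
                    // getVMOption rejects flags this JVM build does not define.
                }
            }
            result.put(name, value);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> flags = effectiveValues(
                "MaxGCPauseMillis", "G1NewSizePercent", "InitiatingHeapOccupancyPercent",
                "G1UseAdaptiveIHOP", "G1ReservePercent", "ReservedCodeCacheSize",
                "MaxDirectMemorySize", "ParallelRefProcEnabled",
                "UseStringDeduplication", "UseBiasedLocking");
        flags.forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Running this at service startup and logging the result gives a simple audit trail for flag drift between deployments.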
Native Memory Management
jemalloc replaced the default allocator, providing 4 KB granularity, reduced fragmentation, timely return of memory to the OS, and better diagnostics (jeprof, malloc_stats_print, mallctl). This lowered RSS and eliminated memory‑leak‑related OOMs.
Results
After migration:
Average CPU utilization dropped ~10% while maintaining performance.
Flink and Spark workloads saw 15% and 8.6% compute improvements respectively.
GC STW pauses decreased by 50%, eliminating P99 latency spikes.
OOM and GC‑related crashes were largely eliminated.
Standardized JDK control enables one‑click upgrades to JDK 17, 21, or newer.
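A before‑and‑after comparison like the one above needs a consistent way to sample GC activity. The sketch below is one minimal approach using the standard GarbageCollectorMXBean API; the class name is hypothetical, and this is not the measurement method the article describes. Note that getCollectionTime reports approximate accumulated elapsed time per collector, which for concurrent collectors is not the same as stop‑the‑world pause time, so percentile pause figures still require GC logs.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: a point-in-time total of GC counts and accumulated GC time.
final class GcTimeSnapshot {
    final long collections;
    final long millis;

    GcTimeSnapshot(long collections, long millis) {
        this.collections = collections;
        this.millis = millis;
    }

    /** Sums counts and times over all collectors; -1 ("undefined") is treated as 0. */
    static GcTimeSnapshot capture() {
        long count = 0;
        long time = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            count += Math.max(0, gc.getCollectionCount());
            time += Math.max(0, gc.getCollectionTime());
        }
        return new GcTimeSnapshot(count, time);
    }

    /** Difference between this snapshot and an earlier one. */
    GcTimeSnapshot since(GcTimeSnapshot earlier) {
        return new GcTimeSnapshot(collections - earlier.collections, millis - earlier.millis);
    }

    public static void main(String[] args) {
        GcTimeSnapshot before = capture();
        byte[][] garbage = new byte[1024][];
        for (int i = 0; i < 100_000; i++) {
            garbage[i % garbage.length] = new byte[16 * 1024]; // short-lived allocations to provoke young GCs
        }
        GcTimeSnapshot delta = capture().since(before);
        System.out.println(delta.collections + " collections, ~" + delta.millis + " ms of GC time");
    }
}
```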
Future Plans
Xiaohongshu aims to adopt OpenJDK 21 as the baseline, leveraging virtual threads (production‑ready since JDK 21) and further GC advancements (ZGC, Shenandoah) to boost throughput for high‑concurrency, I/O‑bound services.
Sample SPECjbb2015 Results
Comparisons between JDK 21 and JDK 11 show higher Max‑JOPS, better Critical‑JOPS at various latency targets, and improved P99 throughput.
Virtual Threads
Since JDK 21, virtual threads simplify programming and increase throughput by decoupling task scheduling from OS threads, allowing millions of concurrent tasks to run on a few native threads. Ongoing work removes the pinning of virtual threads inside synchronized blocks (JEP 491) so that virtual threads can be fully supported in production.
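The thread‑per‑task style described above can be sketched as follows. The class and method names are hypothetical. On JDK 21 or newer one would normally call Executors.newVirtualThreadPerTaskExecutor() directly; this sketch looks the method up reflectively only so that the same source also compiles and runs on older JDKs, where it falls back to a cached pool of platform threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: one task per (virtual) thread for blocking, I/O-style work.
final class VirtualThreadSketch {

    /** Virtual-thread-per-task executor on JDK 21+, otherwise a cached platform-thread pool. */
    static ExecutorService newTaskExecutor() {
        try {
            return (ExecutorService) Executors.class
                    .getMethod("newVirtualThreadPerTaskExecutor")
                    .invoke(null);
        } catch (ReflectiveOperationException e) {
            return Executors.newCachedThreadPool(); // fallback for JDKs without virtual threads
        }
    }

    /** Submits blocking tasks, waits for all of them, and returns how many completed. */
    static int runBlockingTasks(int tasks, long sleepMillis) {
        ExecutorService executor = newTaskExecutor();
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                futures.add(executor.submit(() -> {
                    Thread.sleep(sleepMillis); // stands in for a blocking I/O call
                    return 1;
                }));
            }
            int completed = 0;
            for (Future<Integer> future : futures) {
                completed += future.get();
            }
            return completed;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("task failed", e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runBlockingTasks(1_000, 20) + " tasks completed");
    }
}
```

With virtual threads, each sleeping task releases its carrier thread while blocked, which is why the task count can grow far beyond what a platform‑thread pool would tolerate.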
Xiaohongshu Tech REDtech
Official account of the Xiaohongshu tech team, sharing tech innovations and problem insights, advancing together.
