
Systematic Approach to Online RSS Memory Usage Troubleshooting and Tool Automation

This article presents a structured method for diagnosing high RSS memory consumption in Java services, detailing step‑by‑step analysis of heap, off‑heap, and native memory, recommending specific commands and tools, and proposing an automated shell‑based workflow to streamline the troubleshooting process.

Ctrip Technology

Online problem troubleshooting, especially for high RSS (Resident Set Size) usage, is a low‑frequency but critical task that requires an efficient, systematic approach.

The article first explains the importance of a clear diagnostic mindset and lists commonly used tools: the open-source Arthas, Async-Profiler, JMC, and MAT, and the commercial profilers JProfiler and YourKit.

It then introduces a four-step troubleshooting framework, using high RSS memory usage as the example:

2.1 Heap memory too large – Verify heap usage with jcmd <pid> GC.heap_info and check that the heap does not exceed roughly 75% of physical memory; if it does, lower -Xmx, or find out what is filling the heap by analyzing a heap dump with MAT or a JFR recording.
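The 75% check in step 2.1 can be sketched as two small shell helpers. This is a minimal sketch, not the article's actual tooling: the function names heap_pct_of_physical and heap_too_large are hypothetical, both sizes are assumed to be given in KB, and a zero physical-memory value is not handled.

```shell
# Sketch only: helper names and the default limit of 75 are illustrative.

# Print the percentage of physical memory that a heap size represents.
# Both arguments are in KB; integer arithmetic rounds down.
heap_pct_of_physical() {
  echo $(( $1 * 100 / $2 ))
}

# Exit 0 when the heap exceeds the limit (third argument, default 75)
# expressed as a percentage of physical memory.
heap_too_large() {
  [ "$(heap_pct_of_physical "$1" "$2")" -gt "${3:-75}" ]
}
```

On Linux, physical memory in KB can be read with awk '/^MemTotal:/ {print $2}' /proc/meminfo, and the heap size can be taken from the -Xmx value or from the GC.heap_info output. For example, heap_too_large 7340032 8388608 succeeds because a 7 GB heap is 87% of 8 GB.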

2.2 Excessive ARENA regions – Run

sudo -u <user> pmap -x <pid> | sort -gr -k2 | less

to spot many regions of about 64 MB each (glibc malloc arenas), and limit them by setting export MALLOC_ARENA_MAX=1 in the environment before starting the JVM.
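Counting arena-like regions by eye is tedious, so the check can be sketched as a filter over pmap output. This is a rough heuristic under stated assumptions: 64-bit glibc, where each arena heap reserves 64 MB (65536 KB) and often appears as two adjacent mappings whose sizes add up to that value. The function name count_arena_regions is hypothetical, and the input must be the unsorted pmap -x output so that adjacent lines are adjacent in address order.

```shell
# Sketch only: counts mappings that look like glibc arena heaps.
# Reads `pmap -x <pid>` output (address order, NOT sorted) on stdin.
count_arena_regions() {
  awk '
    # Only consider rows whose second column (Kbytes) is numeric;
    # this skips the command line, the header, and the totals row.
    $2 ~ /^[0-9]+$/ {
      kb = $2 + 0
      # A single mapping of exactly 64 MB.
      if (kb == 65536) { n++; prev = 0; next }
      # Two adjacent mappings that together make up 64 MB.
      if (prev > 0 && prev + kb == 65536) { n++; prev = 0; next }
      prev = kb
    }
    END { print n + 0 }
  '
}
```

A typical invocation would be sudo -u <user> pmap -x <pid> | count_arena_regions. A count in the dozens or hundreds suggests that arenas are a significant part of RSS; the function only counts candidates and does not prove that they belong to malloc.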

2.3 Excessive JVM native memory (NMT) – Enable Native Memory Tracking by adding -XX:NativeMemoryTracking=detail to the JVM options and restarting the JVM, then run jcmd <pid> VM.native_memory detail to identify large JVM-internal allocations in categories such as Class, Thread, GC, and Code.
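To see at a glance which NMT category is largest, the per-category lines of the report can be ranked by committed size. The filter below is a sketch that assumes the default report layout, in which each category line starts with "-" and ends with "(reserved=...KB, committed=...KB)"; if the report is produced with a different scale (for example MB), the pattern would need adjusting. The function name nmt_top_committed is hypothetical.

```shell
# Sketch only: rank NMT categories by committed memory (KB), largest first.
# Reads `jcmd <pid> VM.native_memory summary` (or detail) output on stdin
# and prints lines of the form "<committed_kb> <category>".
nmt_top_committed() {
  # Match "-   <category> (reserved=<n>KB, committed=<m>KB)" and
  # rewrite it as "<m> <category>"; all other lines are dropped.
  sed -n 's/^-[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*(reserved=[0-9]*KB, committed=\([0-9]*\)KB).*/\2 \1/p' |
    sort -rn
}
```

Usage would be jcmd <pid> VM.native_memory summary | nmt_top_committed | head. Sorting on committed rather than reserved is a deliberate choice here, since committed memory is the figure more closely related to RSS.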

2.4 Off-heap memory excess – Check DirectByteBuffer and MappedByteBuffer usage via JMX or jmxterm; cap direct buffers with -XX:MaxDirectMemorySize, or use jemalloc for deeper analysis of native allocations.

The article then discusses automating these steps into a shell‑based toolchain, outlining the workflow for each sub‑step, required commands (e.g., jps -v, free -m, jstat -gcutil), and how to package the process into a script that can be executed with minimal manual intervention.
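The shape of such a script can be sketched as a small step runner plus one collection function. This is only an illustration of the idea, not the article's actual tool: run_step and collect_basics are hypothetical names, and the sketch assumes that jps, jstat, and free are on the PATH of the user running it.

```shell
# Sketch only: run one diagnostic step, label its output, and keep going
# even if the command fails, so a single missing tool does not abort the run.
run_step() {
  title=$1
  shift
  printf '== %s ==\n' "$title"
  if ! "$@"; then
    printf '[warn] step failed: %s\n' "$title"
  fi
  return 0
}

# Collect the baseline information mentioned above for one JVM process.
collect_basics() {
  pid=$1
  run_step "JVM processes and flags" jps -v
  run_step "System memory (MB)" free -m
  run_step "GC utilisation" jstat -gcutil "$pid"
}
```

Calling collect_basics <pid> > report.txt would then produce a single labelled report. Further steps, such as the pmap and NMT checks, could be added as extra run_step lines in the same style.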

Finally, it summarizes that the described methodology and tooling significantly improve troubleshooting efficiency, can be extended to other resource issues (CPU, disk, network, GC), and suggests evolving the solution into a client‑server‑browser system for environments with strict access controls.

