
Diagnosing and Resolving Extreme CPU Usage in a Java Data Platform

A data‑platform server suddenly showed CPU utilization near 99% despite modest traffic. This guide walks through pinpointing the offending Java process, tracing the busiest thread, uncovering a time‑conversion routine that formatted every second elapsed since midnight, and applying a lightweight fix that cut CPU load roughly 30‑fold.

Liangxu Linux

Incident Overview

An operations alert showed that a data‑platform server’s CPU usage spiked to 98.94% and stayed above 70% for an extended period. The application is not CPU‑intensive, so the alert suggested a software‑level problem rather than a hardware bottleneck.

Investigation Steps

2.1 Identify the High‑Load Process (PID)

Log in to the host and run top. On this 8‑core machine the load average was high, and process 682 showed the largest %CPU value.

2.2 Locate the Business Component

Execute pwdx 682 to reveal the working directory of the process. The path points to the data‑platform web service, confirming the offending service.

2.3 Find the Problematic Thread and Code Line

Typical manual debugging involves four steps:

1. Sort processes by CPU (top -b -o %CPU) and note the PID with the highest load.
2. List Java threads for that PID (top -Hp <PID>) and note the thread ID with the highest %CPU.
3. Convert the thread ID to hexadecimal (printf "0x%x" <tid>) because jstack reports thread IDs in hex.
4. Run jstack <PID> and search for the hex thread ID to obtain the stack trace.

To avoid the repetitive manual work, the show-busy-java-threads.sh script automates these steps: it extracts the top‑CPU Java threads, generates a temporary jstack dump (using sudo when necessary), and prints the stack for each busy thread.

Root‑Cause Analysis

The stack traces pointed to a utility method that converts a timestamp to a formatted date string. It was called inside a loop that enumerates every second from midnight to the current time, stores each formatted string in a Set, and then uses only the set’s size(). At 10:00, for example, one pass through the loop makes 10 × 60 × 60 = 36,000 formatting calls, so a real‑time reporting query that runs the loop n times makes 36,000 × n calls, and that query itself runs many times per minute. Because the loop length grows linearly with the time elapsed since midnight, CPU consumption climbs steadily through the day.
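A minimal sketch of the pattern described above may make the cost concrete. The original code is not shown in this article, so the class name SlowSecondCounter, the method name, the date pattern, and the use of java.time are all assumptions made for illustration; the real routine may have used a different formatter.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.HashSet;
import java.util.Set;

// Hypothetical reconstruction of the wasteful routine; names are illustrative only.
class SlowSecondCounter {
    // Assumed pattern; the article does not say which format was used.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** Counts the seconds since midnight by formatting every one of them. */
    static int secondsSinceMidnight(ZonedDateTime now) {
        ZoneId zone = now.getZone();
        long start = now.toLocalDate().atStartOfDay(zone).toEpochSecond();
        long end = now.toEpochSecond();

        Set<String> seen = new HashSet<>();
        for (long s = start; s < end; s++) {
            // One formatting call and one String allocation per elapsed second.
            seen.add(FORMAT.format(Instant.ofEpochSecond(s).atZone(zone)));
        }
        // Only the size is used; the strings themselves are thrown away.
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(secondsSinceMidnight(ZonedDateTime.now()));
    }
}
```

The work done per call is proportional to the number of seconds elapsed since midnight, which is why the load in this sketch would keep rising until the day rolls over.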

Solution

The unnecessary formatting was removed. Instead of converting each second to a string and collecting the results in a Set, the code now subtracts midnight’s epoch seconds from the current epoch seconds and returns that integer directly. After redeployment, CPU usage dropped by roughly a factor of 30 and returned to normal levels.
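The arithmetic replacement could look something like the sketch below. The class name FastSecondCounter, the method name, and the use of java.time are hypothetical; the article only states that the fix subtracts midnight’s epoch seconds from the current epoch seconds.

```java
import java.time.ZonedDateTime;

// Hypothetical sketch of the fix; names are illustrative only.
class FastSecondCounter {
    /** Seconds elapsed since local midnight, computed with a single subtraction. */
    static long secondsSinceMidnight(ZonedDateTime now) {
        long midnightEpoch = now.toLocalDate()
                .atStartOfDay(now.getZone())
                .toEpochSecond();
        // No loop, no formatting, no intermediate Set.
        return now.toEpochSecond() - midnightEpoch;
    }

    public static void main(String[] args) {
        System.out.println(secondsSinceMidnight(ZonedDateTime.now()));
    }
}
```

The cost of this version does not depend on the time of day, so the query costs the same at 23:59 as it does just after midnight.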

Script: show-busy-java-threads.sh

#!/bin/bash
# Find the highest‑CPU Java threads and print their stack traces.

readonly PROG="$(basename "$0")"
readonly -a COMMAND_LINE=("$0" "$@")

usage(){
cat <<EOF
Usage: ${PROG} [OPTION]...
Options:
  -p,--pid       Specify a Java PID (default: all Java processes)
  -c,--count     Number of top threads to display (default 5)
  -h,--help      Show this help message
EOF
exit $1
}

ARGS=$(getopt -n "${PROG}" -a -o c:p:h -l count:,pid:,help -- "$@")
[ $? -ne 0 ] && usage 1
eval set -- "${ARGS}"
while true; do
  case "$1" in
    -c|--count) count="$2"; shift 2;;
    -p|--pid)   pid="$2";   shift 2;;
    -h|--help)  usage 0;;
    --) shift; break;;
    *) break;;
  esac
done
count=${count:-5}

# Print in color when stdout is a terminal, plain text otherwise.
redEcho(){ [ -c /dev/stdout ] && echo -e "\033[1;31m$*\033[0m" || echo "$*"; }
yellowEcho(){ [ -c /dev/stdout ] && echo -e "\033[1;33m$*\033[0m" || echo "$*"; }
blueEcho(){ [ -c /dev/stdout ] && echo -e "\033[1;36m$*\033[0m" || echo "$*"; }

# Ensure jstack is available
if ! command -v jstack >/dev/null 2>&1; then
  [ -z "$JAVA_HOME" ] && { redEcho "Error: jstack not found on PATH!"; exit 1; }
  [ ! -x "$JAVA_HOME/bin/jstack" ] && { redEcho "Error: $JAVA_HOME/bin/jstack not executable!"; exit 1; }
  export PATH="$JAVA_HOME/bin:$PATH"
fi

readonly uuid=$(date +%s)_${RANDOM}_$$
cleanupWhenExit(){ rm -f /tmp/${uuid}_* >/dev/null 2>&1; }
trap "cleanupWhenExit" EXIT

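# Read "pid lwp user comm pcpu" lines from stdin and print each thread's stack.
printStackOfThreads(){
  local line idx=0 pid threadId threadId0x user pcpu jstackFile
  while IFS=" " read -r -a line; do
    idx=$((idx + 1))                        # running index of the reported thread
    pid=${line[0]}
    threadId=${line[1]}
    threadId0x="0x$(printf %x "$threadId")" # jstack prints thread IDs (nid) in hex
    user=${line[2]}
    pcpu=${line[4]}
    jstackFile=/tmp/${uuid}_${pid}          # one cached dump per process
    if [ ! -f "$jstackFile" ]; then
      if [ "$user" = "$USER" ]; then
        jstack "$pid" > "$jstackFile"
      elif [ "$UID" -eq 0 ]; then
        # root dumps another user's JVM by running jstack as that user
        sudo -u "$user" jstack "$pid" > "$jstackFile"
      else
        redEcho "[${idx}] Fail to jstack busy thread (${pcpu}%) under user $user."
        yellowEcho "    sudo ${COMMAND_LINE[*]}"
        continue
      fi
    fi
    blueEcho "[${idx}] Busy(${pcpu}%) thread(${threadId}/${threadId0x}) stack of java process($pid) under user($user):"
    # Print from the thread's header line up to the blank line that ends its stack.
    sed -n "/nid=${threadId0x} /,/^$/p" "$jstackFile"
  done
}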

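# List every thread with its CPU share, keep Java threads (only those of the
# PID given with -p, if any), and pass the busiest ones to printStackOfThreads.
ps -Leo pid,lwp,user,comm,pcpu --no-headers |
  awk -v pid="${pid}" '$4=="java" && (pid=="" || $1==pid)' |
  sort -k5 -r -n |
  head -n "${count}" |
  printStackOfThreads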

Key Takeaways

Investigate CPU spikes with system tools (top, pwdx) before assuming hardware limits.

Identify the offending process and then drill down to the specific Java thread using jstack or the provided show-busy-java-threads.sh script.

Automating thread‑stack extraction reduces mean‑time‑to‑recovery for production incidents.

Review business logic for unnecessary heavy computations; replace expensive formatting with simple arithmetic when possible.

