How to Quickly Diagnose and Fix 100% CPU Usage on Linux Servers

When a Linux server's CPU spikes toward 100%, this guide walks through a systematic investigation: identify the high-load process, pinpoint the offending Java thread (by hand or with a helper shell script), and fix the code responsible so the server returns to normal load.

Open Source Linux

1. Problem Overview

During a routine operation an alert reported that a data‑platform server’s CPU usage had risen to 98.94% and stayed above 70% for an extended period. Although the service is not CPU‑intensive, the unusually high utilization suggested a code‑level problem rather than a hardware bottleneck.

2. Investigation Steps

2.1 Locate High‑Load Process (PID)

Log in to the server and run top to view the current load. Compare the load average against the core count (8 cores on this server), then press P to sort processes by CPU usage. Here the process with PID 682 was consuming a large share of CPU.
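
As a rough non-interactive counterpart to reading top, the sketch below normalizes a load average by the core count and lists the heaviest processes. The helper names load_per_core and top_cpu_processes are hypothetical and not part of the original investigation; the sketch assumes a Linux host with /proc/loadavg, nproc, and the procps version of ps.

```shell
#!/bin/bash
# Hypothetical helpers for illustration; not taken from the original article.

# Divide a load average by the core count. Values near or above 1.00
# suggest the CPUs are saturated.
load_per_core() {
    awk -v avg="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", avg / cores }'
}

# List the N processes with the highest %CPU (default 5), plus the header line.
top_cpu_processes() {
    ps -eo pid,user,pcpu,comm --sort=-pcpu | head -n "$(( ${1:-5} + 1 ))"
}

# Example: 1-minute load average relative to the number of online cores.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null; then
    read -r load1 _ < /proc/loadavg
    echo "load per core: $(load_per_core "$load1" "$(nproc)")"
fi
```

Calling top_cpu_processes 5 prints the five heaviest processes. Note that ps reports CPU time averaged over each process's lifetime, so top remains the better tool for catching a short spike.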

2.2 Identify the Business Component

Use pwdx 682 to retrieve the working directory of the process, which reveals that the high‑load process belongs to the data‑platform web service.
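
The same information is available directly from /proc, which can help on hosts where pwdx is not installed. This is a minimal sketch assuming a Linux /proc filesystem; proc_cwd and proc_cmdline are hypothetical helper names. Reading another user's process usually requires root, as it does with pwdx.

```shell
#!/bin/bash
# Hypothetical helpers; assumes Linux with /proc mounted.

# Print the working directory of a process (the path pwdx reports).
proc_cwd() {
    readlink "/proc/$1/cwd"
}

# Print the full command line of a process, turning NUL separators into spaces.
proc_cmdline() {
    tr '\0' ' ' < "/proc/$1/cmdline"
    echo
}

# Example against the current shell; substitute the PID found with top.
if [ -d "/proc/$$" ]; then
    proc_cwd "$$"
    proc_cmdline "$$"
fi
```

The command line is often as informative as the working directory, since a Java process typically shows its main class or jar there.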

2.3 Locate the Problematic Thread and Code Line

Once the PID is known, the traditional manual method takes three further steps:

Sorting threads by CPU usage with top -Hp <PID> to obtain the thread ID.

Converting the thread ID to hexadecimal using printf "0x%x" <TID>.

Running jstack <PID> and searching the dump for the hex thread ID, which appears as nid=0x... on the thread's header line.

Because this process is time‑consuming, the show-busy-java-threads.sh script (provided below) automates these steps, quickly revealing the busy Java threads.
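
For reference, the manual steps above amount to roughly the following. The helper names tid_to_nid and extract_thread_stack are hypothetical, and the sketch assumes jstack prints native thread IDs as nid=0x... followed by a space, which is the format the script in section 6 relies on.

```shell
#!/bin/bash
# Sketch of the manual steps; helper names are hypothetical.

# Convert a decimal thread ID (as shown by top -H) to the hex form
# that jstack prints as nid=0x...
tid_to_nid() {
    printf '0x%x\n' "$1"
}

# Print one thread's stack from a saved jstack dump: from the line
# containing "nid=<hex> " through the next blank line.
extract_thread_stack() {
    local nid="$1" dump="$2"
    sed -n "/nid=${nid} /,/^\$/p" "${dump}"
}

# Typical use on a live system (the TID and file name are placeholders):
#   jstack 682 > /tmp/jstack.682
#   extract_thread_stack "$(tid_to_nid 4519)" /tmp/jstack.682
tid_to_nid 4519   # prints 0x11a7
```

Saving the dump to a file first means several busy threads can be looked up against the same snapshot.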

3. Root Cause Analysis

The investigation traced the high CPU consumption to a time‑utility method that converts timestamps to formatted dates. This method is invoked repeatedly by the real‑time reporting logic, calculating the number of seconds from midnight to the current time for each query. As the day progresses, the number of calculations grows linearly, leading to massive CPU usage.

Faulty method logic: Converts a timestamp to a date‑time string.

Upper‑level call: Computes seconds for every second of the day and stores results in a set.

Logic layer: Real‑time report queries repeatedly call the method, causing thousands of executions per query.

4. Solution

After identifying the method, the code was simplified to compute only the difference between the current second and midnight, eliminating the unnecessary set construction. The revised implementation reduced the per‑query computation dramatically; after deployment, CPU load dropped by a factor of 30, returning to normal levels.
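
The article does not show the Java code, so the following shell sketch only illustrates the contrast described: visiting every second since midnight versus a single subtraction. Both function names and the epoch values are hypothetical.

```shell
#!/bin/bash
# Illustration only: both functions are hypothetical stand-ins for the
# behaviour described in the text, not the original Java implementation.

# Naive: step through every second between midnight and now.
# The work grows linearly as the day progresses.
elapsed_by_enumeration() {
    local t="$1" now="$2" n=0
    while [ "$t" -lt "$now" ]; do
        n=$(( n + 1 ))
        t=$(( t + 1 ))
    done
    echo "$n"
}

# Direct: one subtraction, independent of the time of day.
elapsed_by_subtraction() {
    echo $(( $2 - $1 ))
}

# Example with placeholder epoch values ten hours apart.
midnight=1700000000
now=$(( midnight + 36000 ))
elapsed_by_subtraction "$midnight" "$now"   # prints 36000
```

Both functions return the same number; only the amount of work differs, which is why the cost of the naive version multiplies when it runs on every report query.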

5. Takeaways

Performance matters as much as functional correctness; efficient implementations are a core engineering skill.

Conduct thorough code reviews and consider alternative, more optimal solutions.

Never overlook small details in production incidents; meticulous investigation leads to faster resolution and continuous improvement.

6. Provided Script: show-busy-java-threads.sh

#!/bin/bash
# @Function
# Find the Java threads consuming the most CPU and print their stacks.
# @Usage
#   $ ./show-busy-java-threads.sh
# @author Jerry Lee

readonly PROG=`basename $0`
readonly -a COMMAND_LINE=("$0" "$@")

usage() {
    cat <<EOF
Usage: ${PROG} [OPTION]...
Find the Java threads consuming the most CPU and print their stacks.
Example: ${PROG} -c 10

Options:
    -p, --pid       find the busiest threads in the specified java process,
                    default is all java processes.
    -c, --count     set the thread count to show, default is 5
    -h, --help      display this help and exit
EOF
    exit $1
}

# Test getopt's own exit status; `readonly ARGS=...` would mask it with readonly's.
ARGS=`getopt -n "${PROG}" -a -o c:p:h -l count:,pid:,help -- "$@"` || usage 1
readonly ARGS
eval set -- "${ARGS}"

while true; do
    case "$1" in
        -c|--count)
            count="$2"
            shift 2
            ;;
        -p|--pid)
            pid="$2"
            shift 2
            ;;
        -h|--help)
            usage
            ;;
        --)
            shift
            break
            ;;
    esac
done
count=${count:-5}

redEcho() {
    [ -c /dev/stdout ] && { echo -ne "\033[1;31m"; echo -n "$@"; echo -e "\033[0m"; } || echo "$@"
}

yellowEcho() {
    [ -c /dev/stdout ] && { echo -ne "\033[1;33m"; echo -n "$@"; echo -e "\033[0m"; } || echo "$@"
}

blueEcho() {
    [ -c /dev/stdout ] && { echo -ne "\033[1;36m"; echo -n "$@"; echo -e "\033[0m"; } || echo "$@"
}

# Check the existence of jstack command!
if ! which jstack &>/dev/null; then
    [ -z "$JAVA_HOME" ] && { redEcho "Error: jstack not found on PATH!"; exit 1; }
    ! [ -f "$JAVA_HOME/bin/jstack" ] && { redEcho "Error: jstack not found on PATH and $JAVA_HOME/bin/jstack file does NOT exist!"; exit 1; }
    ! [ -x "$JAVA_HOME/bin/jstack" ] && { redEcho "Error: jstack not found on PATH and $JAVA_HOME/bin/jstack is NOT executable!"; exit 1; }
    export PATH="$JAVA_HOME/bin:$PATH"
fi

readonly uuid=`date +%s`_${RANDOM}_$$

cleanupWhenExit() {
    rm /tmp/${uuid}_* &>/dev/null
}
trap "cleanupWhenExit" EXIT

printStackOfThreads() {
    local line
    local count=1
    while IFS=" " read -a line ; do
        local pid=${line[0]}
        local threadId=${line[1]}
        local threadId0x="0x`printf %x ${threadId}`"
        local user=${line[2]}
        local pcpu=${line[4]}
        local jstackFile=/tmp/${uuid}_${pid}
        [ ! -f "${jstackFile}" ] && {
            if [ "${user}" == "${USER}" ]; then
                jstack ${pid} > ${jstackFile}
            else
                if [ $UID == 0 ]; then
                    sudo -u ${user} jstack ${pid} > ${jstackFile}
                else
                    redEcho "[${count}] Fail to jstack Busy(${pcpu}%) thread(${threadId}/${threadId0x}) stack of java process(${pid}) under user(${user})."
                    yellowEcho "    sudo ${COMMAND_LINE[@]}"
                    echo
                    continue
                fi
            fi
        }
        blueEcho "[${count}] Busy(${pcpu}%) thread(${threadId}/${threadId0x}) stack of java process(${pid}) under user(${user}):"
        sed "/nid=${threadId0x} /,/^$/p" -n ${jstackFile}
        count=$((count+1))
    done
}

ps -Leo pid,lwp,user,comm,pcpu --no-headers | {
    # Both conditions must hold on the same line; a comma here would form an awk range pattern.
    [ -z "${pid}" ] && awk '$4=="java"{print $0}' || awk -v pid="${pid}" '$1==pid && $4=="java"{print $0}'
} | sort -k5 -r -n | head -n "${count}" | printStackOfThreads
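
The selection stage of this pipeline can be exercised on its own against canned ps output, which is a convenient way to check the filtering and ordering without a running JVM. This is a minimal sketch: select_java_threads is a hypothetical name, the sample rows are invented, and the awk filter uses && so that the PID and command name must match on the same line.

```shell
#!/bin/bash
# Sketch of the selection stage only; select_java_threads is a hypothetical name.
# Input columns match: ps -Leo pid,lwp,user,comm,pcpu --no-headers

select_java_threads() {
    local pid="$1" count="${2:-5}"
    if [ -z "${pid}" ]; then
        awk '$4 == "java"'
    else
        awk -v pid="${pid}" '$1 == pid && $4 == "java"'
    fi | sort -k5,5 -r -n | head -n "${count}"
}

# Invented sample rows instead of live ps output, so the result is reproducible.
select_java_threads 682 2 <<'EOF'
682 682 app java 0.0
682 701 app java 97.3
682 702 app java 12.5
900 900 root sshd 0.1
EOF
```

With the sample above, the two rows printed are threads 701 and 702 of process 682, busiest first. Passing an empty first argument selects java threads from every process.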