
How to Diagnose and Fix 100% CPU on Database and Application Servers

This guide explains how to identify the root causes of a server's CPU hitting 100%—whether on a database or an application server—by using cloud monitoring, Linux top commands, thread analysis with jstack, and practical Java code fixes such as limiting loops, optimizing locks, and handling GC pressure.

Senior Tony

Database Server

CPU reaching 100% on a database server is usually caused by one of the following:

1. An overall increase in QPS/TPS.

2. Newly introduced slow SQL statements.

3. Both of the above combined.

Most cloud providers (Alibaba Cloud, Tencent Cloud, AWS, etc.) offer built‑in database monitoring. Check the current QPS/TPS on the monitoring console and compare it with normal periods.

If the cause is (1), the QPS/TPS trend makes it obvious: a spike relative to normal periods confirms a traffic increase, and throttling or scaling is the usual remedy.

If the cause is (2), locate the newly added slow queries and verify their execution time and frequency. Rolling back the recent release that introduced the slow SQL is often the fastest remedy.

For (3), apply the solutions for (1) and (2) together.

[Screenshot: database monitoring console]
[Screenshot: slow query analysis]

Application Server

Diagnosing CPU 100% on an application server requires a more detailed, step‑by‑step approach:

1. Run top to locate the process consuming the most CPU.

2. Run top -H -p <PID> to list that process's threads and find the one with the highest CPU usage.

3. Convert the thread ID (TID) to hexadecimal: printf "%x\n" <TID>.

4. Run jstack <PID> | grep <hex TID> (add -A50 to capture the following frames) to obtain the hot thread's stack trace; jstack prints thread IDs as hexadecimal nid values, which is why step 3 is needed.
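The steps above can be combined into a short shell sketch. The PID/TID value 12345 is a placeholder, and the ps flags assume GNU procps on Linux; the top and jstack invocations are left commented out since they need a live process.

```shell
# 1. Find the busiest processes (equivalent to sorting by CPU in top):
ps -eo pid,comm,%cpu --sort=-%cpu | head -n 5

# 2. List that process's threads sorted by CPU (replace 12345 with the real PID):
# top -H -p 12345

# 3. Convert the hottest thread ID to hex (jstack prints thread IDs as hex nid= values):
TID=12345
HEX_TID=$(printf "%x" "$TID")
echo "$HEX_TID"

# 4. Pull the matching stack frame plus context from the thread dump:
# jstack 12345 | grep -A 50 "nid=0x$HEX_TID"
```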

[Screenshot: top output]
[Screenshot: thread list from top -H]
[Screenshot: hex conversion]
[Screenshot: jstack output]

After obtaining the stack trace, consider the following typical patterns:

RUNNABLE : likely a compute‑intensive task, a heavy loop or deep recursion, or even an infinite loop.

GC Task Thread : frequent garbage collection; investigate GC logs.

BLOCKED : lock contention.

The ID of the CPU‑heavy thread keeps changing between samples: possibly a thread explosion, with too many threads competing for CPU.

Search for the keyword "deadlock" to detect deadlock issues.
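Besides grepping the jstack output, the JVM can report deadlocks programmatically via the standard java.lang.management API; a minimal sketch (class name DeadlockDetector is illustrative):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class DeadlockDetector {
    public static void main(String[] args) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        // Returns the IDs of threads deadlocked on monitors or ownable
        // synchronizers, or null when no deadlock exists.
        long[] ids = bean.findDeadlockedThreads();
        if (ids == null) {
            System.out.println("No deadlock detected");
        } else {
            for (ThreadInfo info : bean.getThreadInfo(ids)) {
                System.out.println("Deadlocked: " + info.getThreadName());
            }
        }
    }
}
```

Running such a check from a watchdog thread (or exposing it via an admin endpoint) lets you detect deadlocks without shelling out to jstack.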

Solution Ideas

Compute‑intensive code : Optimize algorithms, reduce data volume, or scale hardware.

Infinite loops : Add an interrupt flag, cap the maximum number of iterations, or insert sleeps to lower CPU usage.

// Sample Java loop with interrupt handling and sleep
private volatile boolean running = true;
public void startLoop() {
    new Thread(() -> {
        while (running) {
            processData();
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
    }).start();
}
public void stopLoop() { running = false; }

Limit maximum iterations :

final int MAX_ITERATIONS = 1000;
int count = 0;
while (count++ < MAX_ITERATIONS && !isTaskDone()) {
    executeStep();
}
if (count >= MAX_ITERATIONS) {
    logger.warn("Loop reached max iterations, exiting");
}

Frequent GC : Adjust heap size, avoid large object allocations, and check for memory leaks.
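One common source of GC pressure is per-iteration allocation in hot loops. A hedged sketch of the reuse pattern (the class and method names are illustrative; enable GC logging with -Xlog:gc* on JDK 9+ to measure the effect):

```java
public class GcPressureExample {
    // Reuses one builder across iterations instead of allocating a new
    // builder and intermediate strings per row, cutting the short-lived
    // garbage that drives young-generation collection frequency.
    static void buildRows(StringBuilder sb, int count) {
        for (int i = 0; i < count; i++) {
            sb.setLength(0); // reset in place rather than reallocating
            sb.append("id=").append(i).append(",name=user").append(i);
            // consume sb here (write to a stream, parse, etc.)
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder(64);
        buildRows(sb, 1000);
        System.out.println(sb); // last row built: id=999,name=user999
    }
}
```

The same idea applies to byte buffers and collections: sizing and reusing them outside the loop avoids both allocation cost and the GC cycles that reclaim the garbage.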

Lock contention : Reduce lock granularity, use non‑blocking data structures, or apply non‑blocking lock acquisition:

import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;
public class LockExample {
    private final ReentrantLock lock = new ReentrantLock();
    public void execute() throws Exception {
        if (lock.tryLock(100, TimeUnit.MILLISECONDS)) {
            try {
                System.out.println("Lock acquired, work performed here.");
            } finally {
                lock.unlock();
            }
        } else {
            System.out.println("Could not acquire lock, executing fallback logic.");
        }
    }
    public static void main(String[] args) throws Exception {
        new LockExample().execute();
    }
}

Deadlock : Resolve by acquiring multiple locks in a consistent global order.
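A minimal sketch of consistent lock ordering, using a hypothetical Account class: both transfer directions sort the two monitors by account id before acquiring them, so no two threads can hold the locks in opposite orders.

```java
public class Account {
    private final long id;
    private long balance;

    public Account(long id, long balance) {
        this.id = id;
        this.balance = balance;
    }

    public long getBalance() { return balance; }

    // Always lock the account with the smaller id first, so every thread
    // acquires the two monitors in the same global order regardless of
    // transfer direction.
    public static void transfer(Account from, Account to, long amount) {
        Account first  = from.id < to.id ? from : to;
        Account second = from.id < to.id ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }

    public static void main(String[] args) {
        Account a = new Account(1, 100);
        Account b = new Account(2, 100);
        transfer(a, b, 30);
        transfer(b, a, 10);
        System.out.println(a.getBalance() + " " + b.getBalance()); // 80 120
    }
}
```

Any total order works (id, identity hash with a tie-breaker lock, etc.); what matters is that every code path uses the same one.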

In many cases, once the root cause is identified—whether a heavy query, a runaway thread, excessive GC, or lock contention—the corresponding mitigation (query throttling, code refactor, hardware scaling, or lock redesign) resolves the 100% CPU issue.

Tags: Java, lock optimization, server performance, thread analysis, Linux commands, database monitoring, CPU troubleshooting
Written by

Senior Tony

Former senior tech manager at Meituan, ex‑tech director at New Oriental, with experience at JD.com and Qunar; specializes in Java interview coaching and regularly shares hardcore technical content. Runs a video channel of the same name.
