Why Did My CPU Hit 100%? Uncovering Hidden Linux Processes & Nice Value
An Alibaba tech support engineer investigates a server whose CPU appeared fully utilized, discovers the misleading 'ni' metric, reproduces the issue with high‑nice loops, uncovers hidden mining processes via ld.so.preload, and explains how to detect and mitigate such stealthy CPU hogs.
Case Overview
An Alibaba technical support engineer was asked to explain why a customer's server showed 100% CPU usage according to top. The top display lists eight CPU metrics: us, sy, ni, id, wa, hi, si, and st. The sum of all eight should be 100%.
The ni metric (CPU time spent running user‑space processes with a positive nice value, that is, lowered priority) was the dominant value, while id and wa (idle and I/O‑wait time) summed to zero, indicating a fully busy CPU.
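The eight percentages that top prints can be approximated from the aggregate cpu line of /proc/stat. The sketch below is a minimal illustration rather than top's actual implementation: it assumes the documented field order (user, nice, system, idle, iowait, irq, softirq, steal) and computes shares from the difference between two samples. The function names and the sample values are hypothetical.

```python
# Field order of the aggregate "cpu" line in /proc/stat, using top's labels.
FIELDS = ("us", "ni", "sy", "id", "wa", "hi", "si", "st")


def cpu_breakdown(before, after):
    """Percentage per field between two samples of the 'cpu' line."""
    first = [int(x) for x in before.split()[1:9]]
    second = [int(x) for x in after.split()[1:9]]
    delta = [b - a for a, b in zip(first, second)]
    total = sum(delta) or 1  # avoid dividing by zero if nothing changed
    return {name: 100.0 * d / total for name, d in zip(FIELDS, delta)}


def read_cpu_line(path="/proc/stat"):
    """Return the first line of /proc/stat (all cores combined)."""
    with open(path) as fh:
        return fh.readline()


# Hypothetical samples: 80 of 100 elapsed ticks went to niced processes.
before = "cpu 100 0 50 1000 0 0 0 0 0 0"
after = "cpu 110 80 60 1000 0 0 0 0 0 0"
usage = cpu_breakdown(before, after)
print(usage["ni"], usage["id"])  # prints: 80.0 0.0
```

On a live Linux system, calling read_cpu_line() twice about a second apart and passing both lines to cpu_breakdown should give values close to what top shows; a large ni share with id and wa at zero matches the symptom in this case.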
Understanding the ni Metric
In Linux, every process has a nice value that influences its scheduling priority. Nice values range from -20 (highest priority) to 19 (lowest priority), so a higher nice value means lower priority. The ni column aggregates the CPU time used by all user‑space processes whose nice value is greater than 0.
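To see which processes would be counted under ni, the nice value can be read from field 19 of /proc/<pid>/stat. The following is a minimal sketch under that assumption; the helper names and the sample line are made up for illustration. It splits after the last closing parenthesis because the command name in field 2 may itself contain spaces or parentheses.

```python
def nice_from_stat(stat_text):
    """Extract the nice value (field 19) from the text of /proc/<pid>/stat."""
    # Everything after the last ')' starts at field 3 (process state).
    rest = stat_text[stat_text.rindex(")") + 2:].split()
    return int(rest[19 - 3])


def counts_toward_ni(nice):
    """CPU time of a user-space process is reported as ni when nice > 0."""
    return nice > 0


# Hypothetical stat line for a process named "my loop" running at nice 19.
sample = "1234 (my loop) R 1 1234 1234 0 -1 4194304 100 0 0 0 5000 10 0 0 39 19 1 0"
nice = nice_from_stat(sample)
print(nice, counts_toward_ni(nice))  # prints: 19 True
```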
Reproducing the Symptom
The engineer wrote a simple infinite loop program and examined its compiled assembly:
00000000004004ed <main>:
4004ed: 55 push %rbp
4004ee: 48 89 e5 mov %rsp,%rbp
4004f1: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%rbp)
4004f8: 83 45 fc 01 addl $0x1,-0x4(%rbp)
4004fc: eb fa jmp 4004f8 <main+0xb>
4004fe: 66 90 xchg %ax,%ax
Running this program with a high nice value (e.g., 19) on a 16‑core server shows the process consuming a large portion of the ni column, while the total CPU usage reported by top remains 100%.
The per‑core breakdown confirms that the CPU is saturated even though the sum of the per‑process usages appears lower than the theoretical maximum (1600%, that is, 16 cores at 100% each).
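The same experiment can be approximated without compiling anything. The sketch below is not the engineer's original program; it is a Python stand‑in that raises its own nice value and then spins, so its CPU time should be reported under ni. The function name and default arguments are assumptions chosen for illustration.

```python
import os
import time


def burn(seconds, niceness=19):
    """Lower this process's priority, then spin until the deadline."""
    os.nice(niceness)  # adds to the current nice value; the kernel caps it at 19
    deadline = time.monotonic() + seconds
    counter = 0
    while time.monotonic() < deadline:
        counter += 1  # same kind of busy work as the addl/jmp loop
    return counter
```

Calling burn(60) in one interpreter per core, while top runs in another terminal, should show ni climbing toward 100% and id dropping to zero. An unprivileged process can raise its nice value but cannot lower it again, so it is best to run this in a throwaway shell.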
Client Disagreement and Further Investigation
The client insisted that the problem had been present before a reboot and demanded proof of what had been consuming the CPU. After the server was rebooted the symptom vanished, but the client remained unsatisfied.
Discovering Hidden Mining Processes
Using system logs, the engineer pinpointed the start time of the CPU spike (April 29, 06:40). Two configuration files created a minute earlier referenced the suspicious libraries libxmr-stak-c.a and libxmr-stak-backend.a, which are components of xmr-stak, a Monero mining program.
Using ld.so.preload to Hide Processes
The engineer found that at 06:39 a library had been added to /etc/ld.so.preload. The dynamic linker loads every library listed in that file into each dynamically linked process before any other library, which lets the injected code hook functions such as readdir and filter entries out of /proc. As a result, top and ps displayed sanitized process lists that omitted the mining processes.
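One way to look for processes hidden this way is to compare the directory listing of /proc, which a readdir hook can filter, against direct probes of /proc/<pid>/status for every possible PID. This is a rough sketch of that idea, not the method used in the case: the function names are hypothetical, processes that start or exit during the scan can cause false results, and a hook that also intercepts open would defeat it. Because Python calls the same libc functions, a statically linked tool is more trustworthy on a compromised host.

```python
import os


def listed_pids(proc_root="/proc"):
    """PIDs as the directory listing reports them (the view a hook can filter)."""
    return {int(name) for name in os.listdir(proc_root) if name.isdigit()}


def is_process(pid, proc_root="/proc"):
    """True if /proc/<pid>/status exists and describes a thread-group leader."""
    try:
        with open(os.path.join(proc_root, str(pid), "status")) as fh:
            for line in fh:
                if line.startswith("Tgid:"):
                    # Non-leader threads are reachable by path but never listed,
                    # so they must not be reported as hidden.
                    return int(line.split()[1]) == pid
    except (OSError, ValueError, IndexError):
        pass
    return False


def hidden_pids(listed, max_pid, proc_root="/proc"):
    """PIDs that answer a direct probe but are missing from the listing."""
    return [pid for pid in range(1, max_pid + 1)
            if pid not in listed and is_process(pid, proc_root)]
```

A possible invocation is hidden_pids(listed_pids(), max_pid), with max_pid read from /proc/sys/kernel/pid_max. An empty result does not prove the system is clean; a non‑empty one is a reason to inspect /etc/ld.so.preload.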
Evidence and Mitigation
Reading /proc/<pid>/maps of a bash process showed the injected libjdk library mapped into its address space, as expected for a library that ld.so.preload loads into every dynamically linked process. The library’s strings revealed hooks for directory reading, confirming the stealth technique.
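The maps check can be scripted by extracting the file paths from /proc/<pid>/maps and intersecting them with the entries of the preload file. The sketch below assumes the preload file is a whitespace‑separated list of paths and that each maps line has the form "address perms offset dev inode pathname". The library path in the sample is hypothetical, since the source names only libjdk, and the function names are illustrative.

```python
def mapped_files(maps_text):
    """Set of file paths that appear in the text of /proc/<pid>/maps."""
    paths = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)  # the pathname may contain spaces
        if len(fields) == 6 and fields[5].startswith("/"):
            paths.add(fields[5].strip())
    return paths


def preloaded_in(maps_text, preload_text):
    """Preload entries that are actually mapped into the process."""
    return sorted(set(preload_text.split()) & mapped_files(maps_text))


# Hypothetical data: the real location of the injected library is not given.
maps_sample = """\
00400000-004f4000 r-xp 00000000 fd:01 1234 /usr/bin/bash
7f1a00000000-7f1a00021000 r-xp 00000000 fd:01 5678 /usr/local/lib/libjdk.so
7f1b00000000-7f1b00001000 rw-p 00000000 00:00 0
7ffd10000000-7ffd10021000 rw-p 00000000 00:00 0 [stack]
"""
preload_sample = "/usr/local/lib/libjdk.so\n"
found = preloaded_in(maps_sample, preload_sample)
print(found)  # prints: ['/usr/local/lib/libjdk.so']
```

On most servers /etc/ld.so.preload is absent or empty, so any entry that shows up in the maps of unrelated processes deserves a closer look, for example by running strings on the library as the engineer did.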
Takeaways
The case illustrates how a symptom that can no longer be reproduced, combined with a demanding client, can complicate troubleshooting. Systematic analysis (examining the CPU metrics, reproducing the symptom with a controlled workload, and inspecting preload hooks) can still uncover sophisticated hidden workloads.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.