How to Quickly Identify and Fix High CPU Usage on Linux Servers
This guide explains the difference between CPU utilization and load average, walks through essential commands like top, sar, iostat, vmstat, uptime, and ps to pinpoint resource hogs, and offers practical steps to resolve high CPU consumption on Linux systems.
Understanding CPU Utilization vs. Load Average
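CPU utilization shows the percentage of CPU time currently in use, while load average is the average number of tasks that are running, waiting for a CPU, or (on Linux) blocked in uninterruptible sleep, typically on disk I/O. Knowing both helps determine if a server is truly overloaded: a load average of 4 leaves headroom on an 8‑core server, but on a 2‑core server it means tasks are queuing.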
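To make the utilization side concrete, here is a minimal sketch that derives a busy percentage from two readings of the aggregate cpu line in /proc/stat, which is roughly what monitoring tools do internally. The function name cpu_busy_pct is hypothetical, and counting iowait as idle time is an assumption of this sketch.

```shell
#!/bin/sh
# cpu_busy_pct: percentage of CPU time spent non-idle between two samples.
# Each argument is one aggregate "cpu ..." line taken from /proc/stat.
cpu_busy_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            # Fields 2-9: user nice system idle iowait irq softirq steal.
            # Guest time is already included in user, so later fields are skipped.
            total = 0
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            idle = $5 + $6      # idle + iowait (assumption: iowait counts as idle)
            if (NR == 1) { t0 = total; i0 = idle }
            else         { t1 = total; i1 = idle }
        }
        END {
            dt = t1 - t0
            if (dt <= 0) { print "0.0"; exit }
            printf "%.1f\n", 100 * (dt - (i1 - i0)) / dt
        }'
}

# Usage on a live Linux system: take two samples one second apart.
# a=$(head -n 1 /proc/stat); sleep 1; b=$(head -n 1 /proc/stat)
# cpu_busy_pct "$a" "$b"
```

A high result here with a low load average points to a few busy processes; a low result with a high load average suggests tasks are waiting on something other than the CPU.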
1. Update System (Optional)
Before troubleshooting, you may want to bring the system up to date.
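Pick the command that matches your distribution:
apt-get update && apt-get upgrade -y    # Debian, Ubuntu
dnf update -y    # Fedora, RHEL 8 and later
yum update -y    # older RHEL, CentOS
2. Identify High‑CPU Processes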
2.1 Use top
Run top to see a real‑time view of CPU usage. Inside the interface, these keys change the view:
Press P to sort by CPU usage.
Press M to sort by memory usage.
Press i to toggle the display of idle processes.
Press T to sort by accumulated CPU time.
Press U to view processes of a specific user.
2.2 Use sar for historical and real‑time monitoring
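Run sar -u 2 to display CPU utilization every 2 seconds. Run without an interval, sar -u prints the samples already recorded today, provided the sysstat data collector is enabled.
sar -u 2
2.3 Use iostat to view CPU and I/O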
iostat
For CPU only, add -c:
iostat -c
2.4 Use vmstat to see CPU, memory, and wait queues
vmstat
Run with an interval for live monitoring, e.g., every 2 seconds:
vmstat 2
2.5 Use uptime to quickly check system load
uptime
It shows the current time, uptime, logged‑in users, and the 1‑, 5‑, and 15‑minute load averages.
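Raw load averages are easier to judge once divided by the number of cores. The following is a small sketch of that arithmetic; the function names parse_uptime_load and load_per_core are hypothetical, and the parser assumes the C locale so that decimals use a dot.

```shell
#!/bin/sh
# parse_uptime_load: read uptime output on stdin and print the three
# load averages separated by spaces.
parse_uptime_load() {
    sed -n 's/.*load averages\{0,1\}: *//p' | tr -d ','
}

# load_per_core: divide a load average ($1) by the number of cores ($2).
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN {
        if (cores < 1) cores = 1
        printf "%.2f\n", load / cores
    }'
}

# Usage on a live system (nproc ships with GNU coreutils):
# set -- $(LC_ALL=C uptime | parse_uptime_load)
# load_per_core "$1" "$(nproc)"
```

As a rough reading, a per‑core value that stays above 1.00 means more tasks want to run than there are cores to run them.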
2.6 Use ps to list top CPU‑consuming processes
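ps -eo pcpu,pid,user,args | sort -k 1 -n -r | head -10
This command sorts processes numerically by CPU usage and shows the top ten. Without -n, sort compares the values as text and would rank 9.5 above 10.2. Note that ps reports CPU time divided by the lifetime of the process, so a recent spike is easier to spot in top.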
Identify which process is hogging CPU.
Check for runaway loops, zombie processes, or services that constantly restart.
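Zombie processes use no CPU themselves, but a growing number of them usually points to a parent process that is misbehaving. A minimal sketch for listing them follows; the function name list_zombies is hypothetical, and it expects input in the column order produced by the ps invocation shown in the usage comment.

```shell
#!/bin/sh
# list_zombies: read "STAT PID PPID COMMAND" rows on stdin and print the
# PID, parent PID, and command of every process whose state starts with Z.
list_zombies() {
    awk '$1 ~ /^Z/ { print $2, $3, $4 }'
}

# Usage on a live system:
# ps -eo stat,pid,ppid,comm | list_zombies
```

The second column of the output is the parent PID, which is the process worth inspecting or restarting.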
3. Resolving High CPU Usage
3.1 Kill or restart the offending process
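kill -9 <PID>
SIGKILL (-9) gives the process no chance to clean up, so try a plain kill <PID> (SIGTERM) first where possible. Alternatively, restart the related service.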
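One way to avoid reaching for kill -9 immediately is to send SIGTERM, wait briefly, and escalate only if the process is still there. The sketch below illustrates that sequence; the function name stop_process and the 10‑second default grace period are assumptions, not part of any standard tool.

```shell
#!/bin/sh
# stop_process: ask a process to exit, then force it if it does not.
#   $1 = PID, $2 = grace period in seconds (default 10, an arbitrary choice)
# Returns 1 if the initial signal could not be delivered, 0 otherwise.
stop_process() {
    pid=$1
    grace=${2:-10}
    kill -TERM "$pid" 2>/dev/null || return 1
    i=0
    while [ "$i" -lt "$grace" ]; do
        # kill -0 sends no signal; it only checks that the PID still exists.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        i=$((i + 1))
    done
    # Still present after the grace period: send SIGKILL as a last resort.
    kill -KILL "$pid" 2>/dev/null || true
    return 0
}

# Usage: stop_process <PID> 5
```

If the process belongs to a managed service, restarting the service through its init system is usually cleaner, because a supervisor may otherwise respawn the process right away.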
3.2 Update system and drivers
apt upgrade
yum update
Old software or drivers can cause abnormal CPU spikes.
3.3 Reinstall or downgrade the application
If a bug is suspected, try reinstalling, downgrading, or switching to a stable version.
3.4 Reboot the server (with caution)
reboot
Use only when other fixes fail and the environment permits downtime.
4. When No Single Process Is Responsible
If tools like top and ps don’t reveal a culprit, consider:
Insufficient hardware resources.
Multiple applications collectively raising CPU load.
High I/O or memory wait times inflating load average.
Overloaded services such as MySQL or Nginx.
Possible actions:
Separate high‑load services onto dedicated servers.
Upgrade CPU cores, add memory, or move to a higher‑performance instance.
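To check whether I/O wait is what inflates the load, it can help to pull just the relevant columns out of vmstat. The sketch below does that for the last sample; the function name vmstat_wait is hypothetical, and it locates columns by their header names (r, b, wa) so it does not depend on column positions.

```shell
#!/bin/sh
# vmstat_wait: read vmstat output on stdin and print, for the last sample,
# the run queue (r), blocked tasks (b), and I/O wait percentage (wa).
vmstat_wait() {
    awk '
        # The column header row starts with "r" and "b"; map names to positions.
        $1 == "r" && $2 == "b" { for (i = 1; i <= NF; i++) col[$i] = i; next }
        # Keep the most recent data row (rows that begin with a number).
        ("wa" in col) && $1 ~ /^[0-9]+$/ {
            r = $(col["r"]); b = $(col["b"]); wa = $(col["wa"]); seen = 1
        }
        END { if (seen) printf "r=%s b=%s wa=%s\n", r, b, wa }
    '
}

# Usage on a live system: take five one-second samples, summarize the last.
# vmstat 1 5 | vmstat_wait
```

A large b value together with a high wa percentage suggests the load comes from tasks blocked on I/O rather than from a shortage of CPU.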
5. Summary of the Troubleshooting Flow
Use top to see current CPU hogs.
Confirm specific processes with ps.
Leverage sar, iostat, or vmstat to detect I/O‑bound or waiting tasks.
Check load averages with uptime.
Apply appropriate remediation: kill processes, update software, reinstall, scale resources, or reboot.
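The flow above can be condensed into a rough decision helper. This is only a sketch: the function name triage is hypothetical, and the thresholds (80% busy, 20% I/O wait, 1.0 load per core) are illustrative assumptions that should be tuned for each environment.

```shell
#!/bin/sh
# triage: classify a host from three numbers gathered with the usual tools.
#   $1 = CPU busy percentage, $2 = I/O wait percentage, $3 = load per core
# The thresholds below are illustrative assumptions, not standards.
triage() {
    awk -v busy="$1" -v wa="$2" -v lpc="$3" 'BEGIN {
        if (busy >= 80) {
            print "cpu-bound"      # find the process with top or ps
        } else if (lpc > 1.0 && wa >= 20) {
            print "io-wait"        # inspect disks with iostat or vmstat
        } else if (lpc > 1.0) {
            print "queued"         # many runnable tasks; consider scaling
        } else {
            print "ok"
        }
    }'
}

# Usage: triage 92 1 2.5
```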
Full-Stack DevOps & Kubernetes
Focused on sharing DevOps, Kubernetes, Linux, Docker, Istio, microservices, Spring Cloud, Python, Go, databases, Nginx, Tomcat, cloud computing, and related technologies.
