Comprehensive Guide to Diagnosing Online Failures: CPU, Memory, Disk, GC, and Network Issues
This article presents a step-by-step methodology for troubleshooting failures in online services. It examines, in turn, CPU, disk, memory (heap usage, OOM errors, stack overflow, and off-heap memory), garbage collection, and network problems, using tools such as ps, top, jstack, jmap, jstat, iostat, vmstat, strace, and tcpdump.
