Master System Performance: Essential Tools & Techniques for Debugging Bottlenecks
This article consolidates practical knowledge on system performance optimization, covering key metrics, load‑testing utilities, Linux monitoring commands, and JVM profiling tricks to help engineers pinpoint and resolve throughput, latency, CPU, disk, and network bottlenecks.
System Performance Definition
Throughput – number of requests the system can handle per second.
Latency – time taken to process a single request.
Usage – overall resource utilization.
Throughput and Latency Relationship
Higher throughput usually leads to higher latency because the system becomes busier.
Lower latency allows higher throughput as the system can process requests faster.
Asynchrony can increase throughput flexibility but does not guarantee lower response time.
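One way to make this relationship concrete is Little's Law, which the text above does not name: in a steady state, requests in flight = throughput × latency. The sketch below assumes a fixed number of in-flight requests; the function name and the numbers are illustrative only.

```python
def max_throughput(concurrency: int, latency_seconds: float) -> float:
    """Steady-state throughput in requests/s, from Little's Law (N = X * R)."""
    if latency_seconds <= 0:
        raise ValueError("latency must be positive")
    return concurrency / latency_seconds


# With 100 requests in flight, halving latency doubles achievable throughput.
slow = max_throughput(100, 0.2)  # 200 ms per request
fast = max_throughput(100, 0.1)  # 100 ms per request
print(slow, fast)
```

Read the other way, the same formula shows why pushing more requests into a saturated system raises latency: once throughput stops growing, extra in-flight requests only wait longer.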
Common Load‑Testing Tools
tcpdump
-i : specify interface
-s : snapshot length, the number of bytes captured per packet (older versions default to 68 bytes; use -s 0 to capture full packets)
-w : write captured packets to a file
Examples
tcpdump -i eth1 host 10.1.1.1      # all packets on eth1 with source or destination 10.1.1.1
tcpdump -i eth1 src host 10.1.1.1  # source address only
tcpdump -i eth1 dst host 10.1.1.1  # destination address only
To analyze with Wireshark, add -s 0 to capture full packets:
tcpdump -i eth0 -s 0 -w traffic.pcap tcp and port 80
tcpcopy – online traffic replay
tcpcopy copies live traffic to a test machine for realistic load testing without deploying new code.
a. Record with tcpdump
tcpdump -i eth0 -w online.pcap tcp and port 80
b. Replay traffic
tcpcopy -x 80-10.1.x.x:80 -i online.pcap
tcpcopy -x 80-10.1.x.x:80 -a 2 -i online.pcap  # offline replay at 2× speed
c. Traffic diversion modes
tcpcopy -x 80-10.1.x.x:80 -r 20  # divert 20% of traffic
tcpcopy -x 80-10.1.x.x:80 -n 3   # amplify traffic 3×
wrk, ApacheBench, JMeter, webbench
wrk is lightweight and accurate; with Lua scripts it supports complex scenarios.
wrk -t4 -c1000 -d30s -T30s --latency http://www.example.com
The sample output shows the latency distribution, requests per second, and transfer rate.
Locating Performance Bottlenecks
Consider four layers:
Application layer
System layer
JVM layer
Profiler tools
Application Layer
QPS
Response time (95th/99th percentile)
Success rate
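The three application-layer metrics above can be computed from a list of per-request records. This is a minimal sketch: the Request type, the nearest-rank percentile method, and the summarize helper are assumptions made for illustration, not part of any specific monitoring tool.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    latency_ms: float
    ok: bool


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(requests, window_seconds):
    """QPS, tail latency, and success rate for requests seen in one time window."""
    latencies = [r.latency_ms for r in requests]
    return {
        "qps": len(requests) / window_seconds,
        "p95_ms": percentile(latencies, 95),
        "p99_ms": percentile(latencies, 99),
        "success_rate": sum(r.ok for r in requests) / len(requests),
    }
```

Percentiles are used instead of the mean because a small number of very slow requests can hide behind a healthy-looking average.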
System Layer
Key resources: CPU, memory, disk, network. A concise command to view overall status:
dstat -lcdngy
dstat provides real‑time monitoring of CPU, disk, network, I/O, and memory.
CPU
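Utilization = 1 – (idle CPU time / total CPU time).
User vs. kernel time: a high share of user time suggests a compute‑intensive workload, while a high share of kernel (system) time points to heavy system‑call or I/O activity.
Load average reflects the average number of processes that are running or waiting in the run queue (on Linux it also counts processes in uninterruptible I/O wait); ideally it stays at or below the number of CPU cores.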
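The utilization and load-average rules can be sketched as follows. This assumes Linux, where the aggregate "cpu" line of /proc/stat holds cumulative counters (user, nice, system, idle, iowait, irq, softirq, steal); the function names are illustrative, and treating iowait as idle time is a choice made here, not something the text prescribes.

```python
import os


def read_cpu_sample(path="/proc/stat"):
    """Return the first eight counters of the aggregate 'cpu' line (Linux only)."""
    with open(path) as f:
        fields = f.readline().split()
    # Skip the 'cpu' label; guest counters are already included in user/nice.
    return [int(x) for x in fields[1:9]]


def cpu_utilization(before, after):
    """Fraction of non-idle CPU time between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    idle = deltas[3] + deltas[4]  # idle + iowait
    return 1 - idle / total


def load_per_core(load_average, cores):
    """Load average normalized by core count; values above 1.0 mean queued work."""
    return load_average / cores
```

On a live Linux host, take two samples with read_cpu_sample() a second apart and pass them to cpu_utilization; os.getloadavg()[0] and os.cpu_count() supply the inputs for load_per_core.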
Disk
Check space and permissions; insufficient space or rights can cause failures.
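du -sh  # size of current directory
df -hl  # filesystem usage
Clear large logs quickly:
sudo truncate -s 0 /var/log/*.log  # empty the logs in place; a shell redirection after sudo would not run as root
sudo find /var/log/ -type f -mtime +30 -exec rm -f {} \;  # delete files last modified more than 30 days ago
Test disk speed:
dd if=/dev/zero of=output.file bs=10M count=1  # writes only 10 MB, so the page cache can inflate the result
Identify I/O bottlenecks with iostat, iotop, and ps: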
iostat -x 1
iotop -o
ps -eo state,pid,cmd | grep '^D'  # processes in uninterruptible sleep, usually waiting on I/O
Use iostat to view %util, r/s, and w/s and to identify busy disks.
Network
Common commands:
netstat -nt                 # TCP connections with their receive and send queue sizes
netstat -nap | grep <port>  # processes using a specific port
netstat -s                  # protocol statistics, useful for detecting retransmissions
Check the TCP state of each connection on both the client and the server side; typical warning signs are many connections stuck in SYN_SENT and large send or receive queues.
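Spotting excess SYN_SENT connections or large queues by eye gets tedious on a busy host. The sketch below counts states and flags backlogged connections from netstat -nt output, assuming the usual Linux column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State). The sample text and function names are invented for illustration.

```python
from collections import Counter


def parse_tcp_rows(netstat_output):
    """Yield (recv_q, send_q, state) for each TCP row of `netstat -nt` output."""
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows: Proto Recv-Q Send-Q Local-Address Foreign-Address State
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            yield int(fields[1]), int(fields[2]), fields[5]


def tcp_state_counts(netstat_output):
    """Number of connections per TCP state."""
    return Counter(state for _, _, state in parse_tcp_rows(netstat_output))


def backlogged(netstat_output, threshold=0):
    """Rows whose receive or send queue exceeds the threshold (in bytes)."""
    return [row for row in parse_tcp_rows(netstat_output)
            if row[0] > threshold or row[1] > threshold]


# Invented sample output, for illustration only.
SAMPLE = """\
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 10.0.0.5:8080           10.0.0.9:51514          ESTABLISHED
tcp        0   4096 10.0.0.5:8080           10.0.0.7:40022          ESTABLISHED
tcp        0      1 10.0.0.5:43210          10.0.0.8:3306           SYN_SENT
"""

print(tcp_state_counts(SAMPLE))
print(backlogged(SAMPLE, threshold=1024))
```

On a real host the input would come from capturing the command's output, for example with subprocess.run(["netstat", "-nt"], capture_output=True, text=True).stdout.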
JVM Layer
Thread Stack Analysis
Capture thread stacks:
ps -ef | grep java
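sudo -u nobody jstack <pid> > /tmp/jstack.<pid>  # run jstack as the user that owns the JVM process (nobody in this example)
jstack prints each native thread ID (nid) in hex, while top -H -p <pid> lists thread IDs in decimal, so convert between the two to match them:
printf "%d\n" 0x1b40  # hex to decimal: 6976
printf "0x%x\n" 6976  # decimal to hex: 0x1b40
High CPU Diagnosis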
ps -ef | grep java  # find the Java PID
top -H -p <pid>     # list the process's threads, hottest first
Convert the hot thread's ID to hex and look up the matching nid in the jstack output to find its stack.
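The lookup from a decimal thread ID to its stack can be automated. This sketch assumes the dump prints nid in hex, as described above, and that thread entries are separated by blank lines; the function name and the sample dump are hypothetical.

```python
def find_thread_stack(jstack_dump, tid):
    """Return the jstack entry whose nid matches the decimal thread ID shown by top."""
    needle = f"nid={tid:#x}"  # e.g. 6976 -> "nid=0x1b40"
    for entry in jstack_dump.split("\n\n"):
        lines = entry.strip().splitlines()
        # Match the whole token so that 0x1b4 does not match 0x1b40.
        if lines and needle in lines[0].split():
            return entry.strip()
    return None


# Invented two-thread dump, for illustration only.
SAMPLE_DUMP = '''"worker-1" #12 prio=5 os_prio=0 tid=0x00007f0a1c001000 nid=0x1b40 runnable [0x00007f0a0c1fe000]
   java.lang.Thread.State: RUNNABLE
\tat com.example.Hot.loop(Hot.java:42)

"worker-2" #13 prio=5 os_prio=0 tid=0x00007f0a1c002000 nid=0x1b41 waiting on condition [0x00007f0a0c0fd000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
\tat java.lang.Thread.sleep(Native Method)
'''

print(find_thread_stack(SAMPLE_DUMP, 6976))
```

In practice the dump would be read from the file written by jstack, such as /tmp/jstack.<pid>, and tid would be the thread ID of the busiest thread in top -H.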
GC Cause Inspection
jstat -gccause <pid>
This displays garbage‑collection statistics together with the cause of the last GC (LGCC column) and of the current one (GCC column).
Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"
