Complete Linux Server Performance Tuning Guide: From CPU to Filesystem with Real‑World Cases
This guide walks through diagnosing and tuning CPU, memory, network, disk I/O, and filesystem on Linux servers, showing how to use tools such as mpstat, pidstat, vmstat, ss, iostat, iotop and df, and provides concrete commands, parameter recommendations, and real‑world case studies.
CPU Optimization: Identify the Runaway Core
CPU bottlenecks are either a single process saturating a core or excessive context switches that reduce useful compute time. Start with data collection before terminating processes.
Per‑core view
mpstat -P ALL 1 3
The command prints each CPU's %usr, %sys, %iowait and %idle once per second for three samples. In the example, CPU 0 shows 58 % user and only 20 % idle while the other three CPUs stay around 45 % idle, which points to a single‑threaded workload concentrated on CPU 0.
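The per‑core reading can be automated along these lines. This is a minimal sketch, not a definitive tool: the function name busiest_core is hypothetical, and it assumes the sysstat output contains the CPU and %idle column headers and a closing Average: block.

```shell
#!/bin/sh
# Sketch: report the core with the least idle time from mpstat's Average: block.
# busiest_core is a hypothetical helper name, not part of sysstat.
busiest_core() {
    awk '
        # Any header line tells us where the CPU and %idle columns are.
        {
            for (i = 1; i <= NF; i++) {
                if ($i == "CPU")   c = i
                if ($i == "%idle") d = i
            }
        }
        # Only per-core rows of the Average: block (CPU column is a number).
        $1 == "Average:" && c && $c ~ /^[0-9]+$/ {
            if (best == "" || $d + 0 < min) { min = $d + 0; best = $c }
        }
        END { if (best != "") printf "CPU %s: %.1f%% idle\n", best, min }
    '
}

# Usage (illustrative):
#   mpstat -P ALL 1 3 | busiest_core
```

Locating the columns by header name rather than by position keeps the sketch independent of whether the timestamp prints an AM/PM field.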
Process‑level view
pidstat -u 1 2
pidstat lists CPU usage per process. The screenshot shows nginx at 33.5 %, mysqld at 18 % and java at 12 %.
Context‑switch impact
vmstat 1 5
Observe the cs column. Normal operation yields a few thousand switches per second; values in the tens or hundreds of thousands indicate rapid thread creation/destruction or severe lock contention. A real case: cs spiked to 200 k because a logging framework created a new thread per log entry; switching to a thread pool reduced cs to 3 k and cut response time from 200 ms to 15 ms.
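For a scripted check, the cumulative counter behind the cs column can be read from the ctxt line of /proc/stat. The following is a rough sketch; the helper names read_ctxt and ctxt_rate are hypothetical.

```shell
#!/bin/sh
# Sketch: estimate context switches per second from /proc/stat.

# read_ctxt [FILE] prints the cumulative context-switch count since boot.
read_ctxt() {
    awk '$1 == "ctxt" { print $2 }' "${1:-/proc/stat}"
}

# ctxt_rate BEFORE AFTER SECONDS prints the integer per-second rate.
ctxt_rate() {
    echo $(( ($2 - $1) / $3 ))
}

# Usage (illustrative):
#   a=$(read_ctxt); sleep 5; b=$(read_ctxt); ctxt_rate "$a" "$b" 5
```

Sampling over several seconds smooths out short bursts, which makes the result easier to compare with the "few thousand per second" baseline mentioned above.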
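Typical tuning actions (illustrative, not prescriptive): set CPU affinity with taskset, configure nginx worker_processes to auto, keep database connection pools moderate, pin busy IRQs to specific cores by writing a hexadecimal CPU bitmask to /proc/irq/N/smp_affinity (for example, echo 2 > /proc/irq/N/smp_affinity selects CPU 1; a mask of 0 selects no CPU) and stop irqbalance if it keeps moving them, and consider the performance CPU governor for latency‑sensitive or high‑I/O workloads.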
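Both taskset and /proc/irq/N/smp_affinity accept a hexadecimal CPU bitmask in which bit n selects CPU n. A small sketch of the conversion follows; cpu_mask is a hypothetical helper, and it is only meant for CPU indexes below 32 (larger systems use comma‑separated 32‑bit groups in smp_affinity).

```shell
#!/bin/sh
# Sketch: build the hexadecimal affinity mask for a list of CPU indexes.
# cpu_mask is a hypothetical helper name.
cpu_mask() {
    mask=0
    for cpu in "$@"; do
        # Set bit number $cpu in the mask.
        mask=$(( mask | (1 << cpu) ))
    done
    printf '%x\n' "$mask"
}

# Usage (illustrative, PID and IRQ number are placeholders):
#   taskset -p "$(cpu_mask 2 3)" PID
#   echo "$(cpu_mask 1)" > /proc/irq/N/smp_affinity
```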
Memory Optimization: Available Memory Matters
Linux treats idle RAM as waste and uses it for buffers and cache. The key metric is the available column, not used.
free -h
The example shows 15 GB total, 8.2 GB used, 3.1 GB free, but 6.5 GB available. Most of the 4 GB in buff/cache is reclaimable on demand, so dropping caches with echo 3 > /proc/sys/vm/drop_caches only causes a spike in disk I/O while the cache refills, with no performance gain.
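The available figure comes from the MemAvailable field in /proc/meminfo, so it is easy to turn into a single number for monitoring. A minimal sketch, with mem_available_pct as a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: print MemAvailable as a whole percentage of MemTotal.
# mem_available_pct [FILE] reads /proc/meminfo unless another file is given.
mem_available_pct() {
    awk '$1 == "MemTotal:"     { total = $2 }
         $1 == "MemAvailable:" { avail = $2 }
         END { if (total > 0) printf "%d\n", avail * 100 / total }' \
        "${1:-/proc/meminfo}"
}

# Usage (illustrative):
#   mem_available_pct
```

Alerting on this percentage rather than on "used" avoids false alarms caused by page cache that the kernel can give back on demand.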
Swap usage signals past memory pressure. The screenshot shows 1.2 GB swap in use, which incurs a steep latency penalty because disk access is orders of magnitude slower than RAM.
Monitor swap activity with vmstat columns si (swap‑in) and so (swap‑out). Persistent non‑zero values indicate frequent swapping and insufficient memory.
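One way to check for persistent swapping is to count the vmstat samples in which si or so is non‑zero. The sketch below assumes the standard vmstat header row (starting with r) and skips the first sample, which reports averages since boot; swap_samples is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: count vmstat samples that show swap-in or swap-out activity.
swap_samples() {
    awk '
        # Header row: remember which columns hold si and so.
        $1 == "r" {
            for (i = 1; i <= NF; i++) {
                if ($i == "si") si = i
                if ($i == "so") so = i
            }
            next
        }
        # Data rows start with a number; the first one is a since-boot average.
        si && $1 ~ /^[0-9]+$/ {
            n++
            if (n > 1 && ($si > 0 || $so > 0)) hits++
        }
        END { print hits + 0 }
    '
}

# Usage (illustrative):
#   vmstat 1 10 | swap_samples
```

A result of 0 over a representative window suggests swap is only holding old pages; a result close to the sample count suggests ongoing memory pressure.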
Typical adjustments: set vm.swappiness to 10–20 (default 60), configure Java -Xmx / -Xms to avoid exhausting RAM, set MySQL innodb_buffer_pool_size to 60–70 % of physical memory, use cgroups to cap container memory, and enable Transparent Huge Pages (THP) cautiously.
Network Optimization: Connection Count vs. Quality
Common symptoms include connection timeouts, slow responses, and high TCP retransmission rates. Begin with a global view then drill down.
ss -s
The output reports 1,523 TCP connections: 892 established, 128 in TIME_WAIT, and 128 in CLOSE_WAIT. Excessive TIME_WAIT indicates many short‑lived connections; on the side that opens the connections, each one holds a local port from the ephemeral range (default 32768‑60999). When the count reaches tens of thousands, new outbound connections may fail due to port exhaustion.
sysctl -w net.ipv4.tcp_tw_reuse=1
Enabling tcp_tw_reuse lets the kernel reuse TIME_WAIT sockets for new outgoing connections. Leave tcp_tw_recycle disabled: it breaks clients behind NAT and was removed in Linux 4.12.
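To put the port‑exhaustion risk in numbers, the size of the ephemeral range can be computed from /proc/sys/net/ipv4/ip_local_port_range. A minimal sketch, with ephemeral_port_count as a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: how many local ports are available for outbound connections?
# ephemeral_port_count [FILE] reads the "low high" pair and prints the count.
ephemeral_port_count() {
    awk '{ print $2 - $1 + 1 }' "${1:-/proc/sys/net/ipv4/ip_local_port_range}"
}

# Usage (illustrative):
#   ephemeral_port_count
```

With the default range this yields 28232 ports, which is why TIME_WAIT counts in the tens of thousands toward a single destination become a problem.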
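A high CLOSE_WAIT count usually means the application has not called close() after the remote side closed the connection. List the affected sockets and their owning processes with ss -tanp state close-wait (run as root to see processes of other users).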
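A per‑state breakdown makes TIME_WAIT and CLOSE_WAIT build‑up easy to spot. The sketch below tallies the first column of ss -tan output; tcp_state_counts is a hypothetical helper name, and it assumes the first line of input is the column header.

```shell
#!/bin/sh
# Sketch: count TCP sockets per state from "ss -tan" output on stdin.
tcp_state_counts() {
    awk 'NR > 1 { count[$1]++ }
         END { for (s in count) print s, count[s] }' | sort
}

# Usage (illustrative):
#   ss -tan | tcp_state_counts
```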
Production‑tested kernel parameters:
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
net.ipv4.tcp_fin_timeout = 15
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_intvl = 10
net.ipv4.tcp_keepalive_probes = 6
net.core.netdev_max_backlog = 65535
net.ipv4.ip_local_port_range = 1024 65535
somaxconn raises the listen queue length (default 128 before Linux 5.4, 4096 since). tcp_max_syn_backlog expands the half‑open queue, improving SYN handling. Reducing tcp_fin_timeout from 60 s to 15 s frees FIN_WAIT2 sockets faster.
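Before applying a list like this, it can help to see which values actually differ from what the kernel is using. The following read‑only sketch compares "key = value" lines against the files under /proc/sys. The names sysctl_path and audit_sysctl are hypothetical, as is the SYSCTL_ROOT override, which exists only so the logic can be exercised against a test directory.

```shell
#!/bin/sh
# Sketch: compare suggested sysctl values with the current kernel settings.
# Nothing is written; this only reads files.

# sysctl_path KEY maps net.core.somaxconn to /proc/sys/net/core/somaxconn.
sysctl_path() {
    printf '%s/%s\n' "${SYSCTL_ROOT:-/proc/sys}" "$(printf '%s' "$1" | tr . /)"
}

# audit_sysctl reads "key = value" lines on stdin and reports each key.
audit_sysctl() {
    while IFS='=' read -r key want; do
        key=$(printf '%s\n' "$key" | tr -d ' \t')
        case $key in ''|'#'*) continue ;; esac
        # Normalize whitespace so "1024<TAB>65535" equals "1024 65535".
        want=$(printf '%s\n' "$want" | awk '{ $1 = $1; print }')
        have=$(awk '{ $1 = $1; print }' "$(sysctl_path "$key")" 2>/dev/null)
        if [ "$have" = "$want" ]; then
            echo "ok $key = $have"
        else
            echo "differs $key: current '${have:-unreadable}', suggested '$want'"
        fi
    done
}

# Usage (illustrative, the file name is a placeholder):
#   audit_sysctl < tuning.conf
```

Once the differences look right, the same lines can be placed in a file under /etc/sysctl.d/ and loaded with sysctl --system.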
Disk I/O Optimization: Identify and Relieve Bottlenecks
Disk I/O latency often dominates overall system speed. Random access latency is roughly 10 ms on a mechanical HDD and roughly 0.1 ms on an SSD.
iostat -dx 1 3
Key columns:
%util – device utilization; above 80 % suggests saturation (example: nvme0n1 at 78.5 %), although devices that serve requests in parallel, such as NVMe SSDs, can still have headroom at a high %util.
r_await / w_await – average read/write wait time in ms. In the example, sda shows a w_await of 8.35 ms, which is slow in absolute terms but expected for an HDD; nvme0n1 shows 1.25 ms, typical for an SSD.
aqu-sz – average queue length; above 1 indicates queued requests (nvme0n1 at 1.05).
To find which processes generate the I/O:
iotop -b -n1 -o
The snapshot shows nginx reading/writing ~10 MiB/s and mysqld writing 15.6 MiB/s. For heavy logging, consider asynchronous logging, a less verbose log level, or moving logs to a dedicated disk.
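The %util threshold can be checked mechanically. The sketch below reads iostat -dx output on stdin and prints devices at or above a limit; busy_devices is a hypothetical helper name, and the column is located by its %util header because its position varies between sysstat versions.

```shell
#!/bin/sh
# Sketch: list devices whose %util is at or above a threshold (default 80).
busy_devices() {
    awk -v limit="${1:-80}" '
        # Header row: find the %util column by name.
        $1 ~ /^Device/ {
            for (i = 1; i <= NF; i++) if ($i == "%util") col = i
            next
        }
        # Device rows: print name and utilization when over the limit.
        col && NF >= col && $col + 0 >= limit { print $1, $col }
    '
}

# Usage (illustrative):
#   iostat -dx 1 3 | busy_devices 80
```

Note that the first iostat report covers the time since boot, so a device listed only in that first report is not necessarily busy right now.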
Typical actions: separate logs and data onto different disks, enable O_DIRECT for databases, select an I/O scheduler (none or mq-deadline for SSD and NVMe, mq-deadline or bfq for HDD), tune vm.dirty_ratio and vm.dirty_background_ratio (raising them batches more writes, at the cost of longer stalls when the flush finally happens), and for write‑intensive workloads consider RAID 10 or NVMe SSD replacement.
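Before changing a scheduler it is worth checking what each device currently uses. The kernel exposes this in /sys/block/DEVICE/queue/scheduler, with the active entry in square brackets. A minimal sketch, with active_scheduler and list_schedulers as hypothetical helper names:

```shell
#!/bin/sh
# Sketch: show which I/O scheduler each block device is using.

# active_scheduler reads a line such as "[mq-deadline] kyber bfq none"
# on stdin and prints the bracketed (active) entry.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# list_schedulers prints "device: scheduler" for every device in /sys/block.
list_schedulers() {
    for f in /sys/block/*/queue/scheduler; do
        [ -r "$f" ] || continue
        dev=${f#/sys/block/}
        dev=${dev%%/*}
        echo "$dev: $(active_scheduler < "$f")"
    done
}

# Usage (illustrative; changing the scheduler requires root):
#   list_schedulers
#   echo mq-deadline > /sys/block/sda/queue/scheduler
```

A change made through /sys lasts only until reboot; a udev rule or kernel parameter is needed to make it persistent.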
Filesystem Optimization: Choose the Right Format
Filesystem type influences baseline performance. ext4 is stable but average; XFS excels with large files and high‑concurrency writes; btrfs/ZFS add snapshots/compression at a performance cost.
df -Th
The example shows /var/log at 77 % usage; a full log partition can block services and even prevent boot.
ext4 reserves 5 % of blocks for root by default. On a 500 GB partition, that is 25 GB. Reduce the reserve when the partition stores only data:
tune2fs -m 1 /dev/sdb1
The noatime mount option stops access‑time updates on reads, eliminating unnecessary metadata writes; it applies to XFS and ext4 alike:
mount -o remount,noatime /data
Inode exhaustion can occur even with free space. Each file, directory or symlink consumes an inode. Small‑file‑heavy workloads (cache directories, session files) are prone to this. Check inode usage with:
df -i
If IUse% approaches 100 %, clean up small files. On ext4, inode density can be increased at format time with the mkfs.ext4 -i (bytes‑per‑inode) option, but cannot be changed later; XFS allocates inodes dynamically and is much less likely to hit this limit.
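The inode check lends itself to a simple alert. The sketch below reads df -iP output on stdin and prints mount points whose IUse% is at or above a limit; inode_hogs is a hypothetical helper name, and it assumes mount points without spaces.

```shell
#!/bin/sh
# Sketch: list mount points whose inode usage is at or above a threshold.
inode_hogs() {
    awk -v limit="${1:-90}" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)
        # Filesystems without inode accounting report "-" and are skipped.
        if (pct ~ /^[0-9]+$/ && pct + 0 >= limit) print $6, pct "%"
    }'
}

# Usage (illustrative):
#   df -iP | inode_hogs 90
```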
Typical recommendations: use XFS for databases and log partitions, add noatime to all mount options, check filesystems periodically with e2fsck (ext4, on an unmounted filesystem) or xfs_scrub (XFS, online), consider tmpfs for small‑file‑intensive temporary data, and monitor inode usage with df -i.
Summary
Performance tuning proceeds from measurement to analysis to targeted adjustment. Use mpstat and pidstat for CPU, free and vmstat for memory, ss (or netstat) for network, iostat and iotop for disk, and df plus tune2fs for filesystems. Correlate data across tools to pinpoint the true bottleneck before changing any parameter.
