Essential Linux Administration Tools for Troubleshooting and Monitoring

This guide compiles a comprehensive set of Linux command‑line utilities—including shell quoting rules, environment handling, text processing, process management, system monitoring, networking, and /proc filesystem exploration—to help sysadmins diagnose, locate, and resolve a wide range of system issues efficiently.

Full-Stack DevOps & Kubernetes

This article gathers a personal collection of Linux operations tools for diagnosing, locating, and resolving problems, such as analyzing network connections and identifying processes with high memory usage.

Shell Script Basics

Quotes: Single quotes enforce strong quoting (no expansion). Double quotes enable weak quoting, allowing variable expansion ($VAR), command substitution ($(cmd)), and backslash escapes.
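The difference between the two quoting styles is easiest to see side by side. The following is a minimal sketch; the variable name and the strings are made up for the demonstration.

```shell
# "name" is a hypothetical variable used only for this demonstration.
name="world"

single='hello $name'                      # single quotes: $name stays literal
double="hello $name"                      # double quotes: $name is expanded
subst="today is $(printf '%s' Monday)"    # command substitution inside double quotes
escaped="a \"quoted\" word"               # backslash escapes inside double quotes

printf '%s\n' "$single" "$double" "$subst" "$escaped"
```

Quoting every expansion ("$var" rather than $var) also prevents word splitting when a value contains spaces.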

Script Path: Retrieve the full script path with FILEPATH="$(readlink -f "$0")" and its directory with BASEDIR="$(dirname "$(readlink -f "$0")")". Quoting "$0" keeps paths that contain spaces intact.
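As a small sketch of how this might look at the top of a script, assuming GNU coreutils readlink (the -f option is not available in every readlink implementation):

```shell
# Resolve the absolute path of the running script, following symlinks.
# "--" stops option parsing in case $0 begins with a dash.
FILEPATH="$(readlink -f -- "$0")"
BASEDIR="$(dirname "$FILEPATH")"

printf 'script: %s\ndir:    %s\n' "$FILEPATH" "$BASEDIR"
```

BASEDIR can then be used to locate files that ship next to the script, regardless of the directory the script was started from.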

Environment Variables: ENV=debug sets a variable in the current shell only; export ENV=debug also passes it to child processes.
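The effect of export can be checked by asking a child shell what it sees. This is a sketch only; STAGE and MODE are hypothetical names chosen for the demonstration.

```shell
# Start from a clean state so inherited values do not skew the result.
unset STAGE MODE

STAGE=debug            # plain shell variable: visible only in this shell
export MODE=debug      # exported: copied into the environment of child processes

child_stage="$(sh -c 'printf "%s" "${STAGE:-unset}"')"
child_mode="$(sh -c 'printf "%s" "${MODE:-unset}"')"

# A prefix assignment exports the variable for that single command only.
one_shot="$(STAGE=trace sh -c 'printf "%s" "$STAGE"')"

printf 'child sees STAGE=%s MODE=%s, one-shot STAGE=%s\n' \
    "$child_stage" "$child_mode" "$one_shot"
```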

Text Processing

sed 's/old/new/' replaces the first match of old on each line; append the g flag (sed 's/old/new/g') to replace every match.

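Pass external variables to sed and awk using shell expansion, e.g., x=MM; sed "s/AB/$x/g" filename or awk -v A="$a" -v B="$b" 'BEGIN{printf("%d,%d\n",A,B)}'. A prefix assignment (x=MM sed ...) does not work for this, because the shell expands $x before the assignment takes effect.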
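A short sketch of both techniques, with made-up values for x, a, and b:

```shell
# x, a, and b are hypothetical values for this demonstration.
x="MM"
a=3
b=4

# Double quotes let the shell expand $x before sed sees the script.
sed_out="$(printf 'AB-AB\n' | sed "s/AB/$x/g")"

# awk -v copies shell values into awk variables, so the awk program
# itself can stay inside single quotes.
awk_out="$(awk -v A="$a" -v B="$b" 'BEGIN { printf("%d,%d\n", A, B) }')"

printf '%s\n%s\n' "$sed_out" "$awk_out"
```

If the value may contain / or &, it has to be escaped first or a different sed delimiter chosen, since both characters are special in the replacement part.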

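Floating-point calculations with bc (echo "scale=2;$a/$b" | bc) or awk (awk -v a="$a" -v b="$b" 'BEGIN{printf("%.2f\n", a/b)}'). The operands must be passed to awk with -v, because $a and $b are not expanded by the shell inside single quotes.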
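Both approaches can be compared with a small sketch. The operands are arbitrary, and the bc part is guarded because bc is not installed everywhere.

```shell
a=10
b=3

# awk: hand the operands over with -v and do the division inside awk.
ratio_awk="$(awk -v a="$a" -v b="$b" 'BEGIN { printf("%.2f", a / b) }')"
printf 'awk: %s\n' "$ratio_awk"

# bc: scale=2 keeps two digits after the decimal point (truncated, not rounded).
if command -v bc >/dev/null 2>&1; then
    ratio_bc="$(echo "scale=2; $a / $b" | bc)"
    printf 'bc:  %s\n' "$ratio_bc"
fi
```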

String operations: trim spaces, change case (${var,,} for lowercase and ${var^^} for uppercase, both Bash 4+), case-insensitive compare, length (${#str}), iterate characters, numeric check via expr.
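Several of these operations can be sketched in portable shell. The example below deliberately uses tr and a case pattern instead of the Bash-only expansions and expr, so it is an alternative formulation rather than the exact commands named above; the sample strings are made up.

```shell
str="  Hello World  "

# Trim leading and trailing whitespace with sed.
trimmed="$(printf '%s' "$str" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')"

# Change case portably with tr (Bash 4+ offers ${var,,} and ${var^^} instead).
lower="$(printf '%s' "$trimmed" | tr '[:upper:]' '[:lower:]')"
upper="$(printf '%s' "$trimmed" | tr '[:lower:]' '[:upper:]')"

# Length of a string.
len="${#trimmed}"

# Case-insensitive comparison: normalise both sides first.
other="HELLO WORLD"
other_lower="$(printf '%s' "$other" | tr '[:upper:]' '[:lower:]')"
if [ "$lower" = "$other_lower" ]; then same=yes; else same=no; fi

# Numeric check: succeed only for a non-empty string of digits.
is_integer() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

printf '[%s] lower=%s upper=%s len=%s same=%s\n' \
    "$trimmed" "$lower" "$upper" "$len" "$same"
```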

Search Operations

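OR/AND regex: grep -E 'abc|xyz' matches lines containing either pattern. grep 'abc.*xyz' matches lines containing both, but only when abc comes first; to match both in any order, chain two greps (grep 'abc' | grep 'xyz').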
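The three variants behave differently on the same input, as the following sketch shows. The sample lines are invented for the demonstration.

```shell
# Hypothetical sample lines used only for this demonstration.
sample='abc only
xyz only
abc then xyz
xyz then abc
neither'

# OR: a line matches if it contains abc or xyz.
or_count="$(printf '%s\n' "$sample" | grep -c -E 'abc|xyz')"

# AND, fixed order: abc must appear before xyz on the same line.
ordered_count="$(printf '%s\n' "$sample" | grep -c 'abc.*xyz')"

# AND, any order: chain two greps so each pattern is tested independently.
both_count="$(printf '%s\n' "$sample" | grep 'abc' | grep -c 'xyz')"

printf 'or=%s ordered=%s both=%s\n' "$or_count" "$ordered_count" "$both_count"
```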

Process lookup using ps aux, ps auxww (untruncated command lines), ps -o ppid= -p PID (parent PID of a process), tree view (ps f), and filtering with grep.

Awk Examples

Extract IP and port:

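node="127.0.0.1:2019"; eval "$(echo "$node" | awk -F: '{printf("ip=%s\nport=%s\n",$1,$2)}')"

This sets ip=127.0.0.1 and port=2019. The field separator must be the colon alone; splitting on [:.] would also break the address at its dots.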
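The same host:port split can also be done with POSIX parameter expansion, which avoids eval altogether. This is a swapped-in alternative to the awk approach, shown here as a minimal sketch:

```shell
node="127.0.0.1:2019"

# ${node%:*}  removes the shortest suffix matching ":*"  (the port).
# ${node##*:} removes the longest prefix matching "*:"   (everything up to the last colon).
ip="${node%:*}"
port="${node##*:}"

printf 'ip=%s port=%s\n' "$ip" "$port"
```

Avoiding eval matters when the input is not trusted, because eval executes whatever text the pipeline produces.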

Get IP of a specific NIC via netstat -ie | awk ....

Log Rotation

Use the built‑in logrotate command with configuration in /etc/logrotate.conf (global defaults) and per‑service files in /etc/logrotate.d/. Example for Redis logs:

# cat /etc/logrotate.d/redis
/usr/local/redis/log/redis-*.log {
    rotate 2
    minsize 100M
    nocompress
    missingok
    create 0664 redis redis
    notifempty
}

Device and System Tools

Hardware info: lspci | grep -i ethernet, dmidecode, lscpu, lsscsi.

Systemd utilities: systemctl poweroff, systemctl reboot, systemctl rescue, systemd-analyze, loginctl list-users, systemctl status, systemctl list-dependencies SERVICE.

Journal logs: journalctl, journalctl -k -o json --no-pager, journalctl -b, time‑range queries, PID or UID filtering.

Service restart differences: legacy service nginx restart vs. modern systemctl restart nginx.

Performance Monitoring

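sar (System Activity Reporter) for comprehensive system metrics.

vmstat 2 5 for virtual memory snapshots (five samples, two seconds apart).

iostat -m for CPU and disk I/O in megabytes per second, iostat -xN sda for extended per-device stats.

mpstat 2 for CPU usage every two seconds, or mpstat -P 0 2 5 for five samples of CPU 0 only.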

Interrupt inspection via cat /proc/interrupts and affinity files.

Open file descriptors with lsof (e.g., lsof -i :PORT, lsof -p PID).

Process ownership of sockets with fuser -v -n tcp 2019.

Network Utilities

Replace netstat with ss for socket statistics.

IP management: ip addr add, ip addr del, ip route show.

Traffic capture with tcpdump (list interfaces, capture on interface, filter by host/port, save to file, read from file).
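The operations listed above map onto a handful of tcpdump options. The sketch below wraps each one in a small helper function so that nothing runs until a helper is called, since capturing normally needs root or CAP_NET_RAW. The function names and the defaults (eth0, 10.0.0.5, 443, capture.pcap) are placeholders, not values from this article.

```shell
# List the interfaces tcpdump can capture on.
list_interfaces() {
    tcpdump -D
}

# Capture on one interface; -nn disables name and port resolution.
capture_iface() {
    tcpdump -nn -i "${1:-eth0}"
}

# Capture only traffic to or from a given host and port.
capture_host_port() {
    tcpdump -nn -i "${1:-eth0}" host "${2:-10.0.0.5}" and port "${3:-443}"
}

# Save a limited number of packets (-c) to a file (-w) for later analysis.
save_capture() {
    tcpdump -nn -i "${1:-eth0}" -c "${3:-100}" -w "${2:-capture.pcap}"
}

# Read a previously saved capture file.
read_capture() {
    tcpdump -nn -r "${1:-capture.pcap}"
}
```

A typical session might be save_capture eth0 /tmp/web.pcap 500 followed by read_capture /tmp/web.pcap.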

Bandwidth monitoring with iftop.

Port forwarding and multiplexing with socat (Socket CAT).

/proc Filesystem

Memory info: /proc/meminfo.

CPU info: /proc/cpuinfo.

Process details: /proc/PID/ (maps, fd, etc.).

IRQ handling: /proc/irq/ directories and smp_affinity files.

Network stats: /proc/net/dev, /proc/net/sockstat.

Kernel limits and tunables: /proc/sys/fs/file-max, /proc/sys/vm/drop_caches, etc.
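Because these entries are plain text, they can be read with ordinary tools. The following sketch pulls one figure from several of the files above; it assumes a Linux kernel recent enough to report MemAvailable (3.14 or later), and the variable names are made up.

```shell
# Memory: MemTotal and MemAvailable are reported in kB.
mem_used_pct="$(awk '/^MemTotal:/ {t=$2} /^MemAvailable:/ {a=$2}
                     END { if (t > 0) printf("%.1f", (t - a) * 100 / t) }' /proc/meminfo)"

# CPU: count logical processors.
cpu_count="$(grep -c '^processor' /proc/cpuinfo)"

# Process details: number of file descriptors open in the current shell.
fd_count="$(ls /proc/$$/fd | wc -l)"

printf 'memory used: %s%%  cpus: %s  open fds: %s\n' \
    "$mem_used_pct" "$cpu_count" "$fd_count"

# System limit: maximum number of file handles the kernel will allocate.
if [ -r /proc/sys/fs/file-max ]; then
    printf 'file-max: %s\n' "$(cat /proc/sys/fs/file-max)"
fi

# Network stats: received and transmitted bytes per interface.
# The first two lines of /proc/net/dev are headers.
awk 'NR > 2 { sub(/:/, " "); printf("%-10s rx=%s tx=%s\n", $1, $2, $10) }' /proc/net/dev
```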

Miscellaneous Commands

History: history.

Identify Linux distribution: cat /etc/*-release.

Get host IP via netstat -ie | awk ....

Clear caches: sync; echo 3 > /proc/sys/vm/drop_caches (run as root; the value 3 drops the page cache plus dentries and inodes).

Find top CPU threads: ps -mp PID -o THREAD,tid,time | sort -k2 -rn (sorts on the %CPU column, which is the second field).

Detect high‑IO processes using iotop, dmesg, or iostat -x 1.

Configure DNS client by editing /etc/resolv.conf and /etc/nsswitch.conf.

Change hostname temporarily (hostname NEW_HOST) or permanently by editing /etc/hostname; on systemd hosts, hostnamectl set-hostname NEW_HOST does both in one step.

Written by

Full-Stack DevOps & Kubernetes
