Top 10 Linux Ops Troubleshooting Tips Every Sysadmin Should Know
An experienced Linux sysadmin shares a curated list of common operational issues—from shell script execution failures and cron output overload to disk space leaks, MySQL storage pitfalls, and network latency—detailing root causes, step‑by‑step diagnostics, and practical solutions to keep servers running smoothly.
Linux operations engineers run into all kinds of issues; recording what happened, how it was diagnosed, and what the root cause turned out to be is a good habit.
Below is a collection of frequent faults encountered during projects and their solutions.
Common Issue Solutions
1. Shell script does not execute
Problem: A colleague reports a script fails with “:bad interpreter: No such file or directory”.
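Cause: The script was edited on Windows, introducing CRLF line endings (^M). The kernel then looks for an interpreter whose name ends in a carriage return (for example /bin/sh^M), which does not exist.
Solution: Rewrite the script on Linux or remove the Windows line endings, e.g. in vi with :%s/\r//g or :%s/^M//g (enter ^M with Ctrl-V Ctrl-M). Use sh -x to trace a script's execution command by command.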
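For scripts that have to be fixed non-interactively, the same cleanup can be sketched in shell. This is a minimal sketch, not the author's method: the function name strip_crlf and the file name deploy.sh are hypothetical, and sed -i assumes GNU sed as shipped on most Linux systems.

```shell
# Minimal sketch: detect and strip Windows line endings from a script.
# strip_crlf is a hypothetical helper name, not a standard tool.
strip_crlf() {
    file=$1
    cr=$(printf '\r')
    # Only rewrite the file if it actually contains a carriage return.
    if grep -q "$cr" "$file"; then
        # Drop the trailing CR on every line, editing in place (GNU sed).
        sed -i "s/$cr\$//" "$file"
    fi
}

# Usage (file name is an example):
#   strip_crlf deploy.sh && sh -x deploy.sh
```

Checking with grep first avoids touching the modification time of files that are already clean.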
2. Controlling crontab output
Problem: /var/spool/clientmqueue exceeds 100 GB.
Cause: Cron jobs produce output sent via email; sendmail is not running, causing files to accumulate.
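Solution: Delete the accumulated files manually (cd /var/spool/clientmqueue && ls | xargs rm -f), then stop new mail from being generated by redirecting each cron job's output to /dev/null: >/dev/null 2>&1.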
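The cleanup and the redirect can be sketched as follows. The helper name purge_queue and the job path are hypothetical; find -delete is assumed to be available (GNU findutils). As a side note, cronie and Vixie cron also honor MAILTO="" at the top of a crontab, which suppresses the mail altogether.

```shell
# Minimal sketch: empty a mail queue directory without building a huge
# argument list. purge_queue is a hypothetical helper name.
purge_queue() {
    dir=$1
    # Refuse to run on an empty or missing directory argument.
    [ -n "$dir" ] && [ -d "$dir" ] || return 1
    # -type f keeps the directory itself; -delete avoids spawning rm.
    find "$dir" -type f -delete
}

# Usage:
#   purge_queue /var/spool/clientmqueue
#
# Crontab entry that discards both stdout and stderr (job path is an example):
#   0 2 * * * /usr/local/bin/nightly-job.sh >/dev/null 2>&1
```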
3. Slow telnet/ssh
Problem: Telnet to a host is very slow, and reverse DNS lookup fails.
Cause: The client IP has no reverse DNS entry, so the server waits for the lookup to time out before continuing the login.
Solution: Add the hostname‑IP mapping to /etc/hosts and ensure a functional nameserver in /etc/resolv.conf.
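To confirm that a client IP really is covered by the hosts file, a lookup along these lines may help. It is only a sketch: hosts_entry_for is a hypothetical helper, the IP is an example, and it inspects the file directly rather than the full resolver path (getent hosts <ip> exercises the latter).

```shell
# Minimal sketch: print the first hostname mapped to an IP in a hosts file.
# hosts_entry_for is a hypothetical helper; the file defaults to /etc/hosts.
hosts_entry_for() {
    ip=$1
    file=${2:-/etc/hosts}
    # Comment lines never match because their first field is not the IP.
    # Exit status is 0 when a mapping exists, 1 otherwise.
    awk -v ip="$ip" '
        $1 == ip { print $2; found = 1; exit }
        END      { exit !found }
    ' "$file"
}

# Usage (IP is an example):
#   hosts_entry_for 192.168.3.10 || echo "no /etc/hosts entry for client"
```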
4. Read‑only file system error
Problem: MySQL cannot create a table, error “ERROR 1005 (HY000): Can't create table … (errno: 30)”.
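Cause: The underlying file system is read-only (errno 30 is EROFS, "Read-only file system"), typically because of corruption, bad sectors, or an fstab misconfiguration.
Solution: On a test machine, a reboot is the quickest fix; otherwise remount the file system with write permission or fix the fstab entry. In some cases re-running mount (for example mount -o remount,rw <mountpoint>) is enough; if the kernel switched the file system to read-only because of corruption, it generally needs an fsck first.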
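Before remounting, it helps to see which mount points are currently read-only. The following is a sketch under the assumption of a Linux /proc/mounts layout; list_ro_mounts is a hypothetical helper and /data is an example mount point.

```shell
# Minimal sketch: list mount points whose option list contains "ro".
# list_ro_mounts is a hypothetical helper; it reads /proc/mounts by default.
list_ro_mounts() {
    file=${1:-/proc/mounts}
    # Field 2 is the mount point, field 4 the comma-separated options.
    # Comparing whole options avoids false hits such as errors=remount-ro.
    awk '{
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "ro") { print $2; break }
    }' "$file"
}

# Usage (mount point is an example; remounting requires root):
#   list_ro_mounts
#   mount -o remount,rw /data
```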
5. Deleted files not freeing disk space
Problem: df -h shows 90 GB used, but du -sh /* accounts for only 30 GB.
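Cause: A process still holds an open file descriptor to a deleted file; the kernel frees the blocks only when the last descriptor is closed, and du cannot see a file that no longer has a name.
Solution: Identify the process with lsof | grep deleted and terminate it, or truncate the file through its descriptor: echo > /proc/<pid>/fd/<fd>. Restarting the service or the system also releases the space.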
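Where lsof is not installed, the same information can be read from /proc. This is a sketch that assumes a Linux /proc file system and sufficient permissions to read the target process's fd directory; list_deleted_fds is a hypothetical helper name.

```shell
# Minimal sketch: list descriptors of one process that point at deleted files.
# list_deleted_fds is a hypothetical helper; it prints "<fd> <target>" lines.
list_deleted_fds() {
    pid=$1
    for fd in /proc/"$pid"/fd/*; do
        # readlink fails for unreadable or vanished entries; skip those.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            *' (deleted)') printf '%s %s\n' "${fd##*/}" "$target" ;;
        esac
    done
}

# Usage (pid and fd are placeholders):
#   list_deleted_fds <pid>
#   : > /proc/<pid>/fd/<fd>    # truncate the file to zero bytes
```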
6. Inefficient find cleanup
Problem: A nightly find /tmp -name "picture_*" -mtime +1 -exec rm -rf {} \; script causes high load.
Cause: Scanning a directory with many files is resource‑intensive.
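Part of the load also comes from -exec rm -rf {} \;, which starts one rm process per matching file. As a sketch of a lower-overhead variant that keeps the same match criteria, find can batch the paths with the + terminator. The function name cleanup_pictures is hypothetical.

```shell
# Minimal sketch: same match criteria as the nightly job, but rm is invoked
# once per batch of paths instead of once per file.
# cleanup_pictures is a hypothetical helper name.
cleanup_pictures() {
    dir=$1
    [ -d "$dir" ] || return 1
    # "+" makes find append many paths to a single rm command line.
    find "$dir" -name 'picture_*' -mtime +1 -exec rm -rf {} +
}

# Usage:
#   cleanup_pictures /tmp
```

Running the job under nice or ionice is another option if the directory scan itself is what hurts.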
Solution: Change to a faster approach, e.g.:
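cd /tmp || exit 1
# Match the date column of ls -l (for example "Dec 20"); assumes the default English locale
day=$(date -d "2 days ago" "+%b %e")
ls -l | grep "picture" | grep "$day" | awk '{print $NF}' | xargs rm -rf
7. Unable to obtain gateway MAC address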
Problem: ARP fails to resolve the gateway MAC.
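Solution: Manually bind the gateway's MAC address with a static ARP entry, for example arp -s 192.168.3.254 00:00:5e:00:01:64. A MAC address has six octets; substitute the gateway's real address.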
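Because arp -s needs a full six-octet hardware address, a quick format check before binding can catch truncated values. This is a minimal sketch; is_valid_mac is a hypothetical helper name and the addresses are examples only.

```shell
# Minimal sketch: accept only colon-separated MAC addresses with six octets.
# is_valid_mac is a hypothetical helper name.
is_valid_mac() {
    printf '%s\n' "$1" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'
}

# Usage (requires root; IP and MAC are examples):
#   mac=00:00:5e:00:01:64
#   is_valid_mac "$mac" && arp -s 192.168.3.254 "$mac"
```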
8. HTTP service fails to start
Problem: Apache fails with “Address already in use” errors on port 7080.
Cause: Either another process already holds the port, or the Listen 7080 directive is duplicated, here in both /etc/httpd/conf/httpd.conf and /etc/httpd/conf.d/t.10086.cn.conf.
Solution: Comment out the extra Listen 7080 line, then restart Apache.
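Duplicate directives can be located with a recursive grep. The sketch below assumes the /etc/httpd layout named above; find_listen is a hypothetical helper, and its pattern covers the common forms Listen 7080, Listen <address>:7080, and an optional trailing protocol word.

```shell
# Minimal sketch: show every active (uncommented) Listen directive for a port.
# find_listen is a hypothetical helper; output is file:line:directive.
find_listen() {
    port=$1
    shift
    grep -rnE "^[[:space:]]*Listen[[:space:]]+([^[:space:]]*:)?${port}([[:space:]]+[[:alnum:]]+)?[[:space:]]*\$" "$@"
}

# Usage:
#   find_listen 7080 /etc/httpd/conf /etc/httpd/conf.d
#
# To see whether another process already holds the port:
#   ss -ltnp | grep ':7080 '
```

More than one line of output from find_listen means Apache tries to bind the same port twice.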
9. “Too many open files” error
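Before raising any limits, it can be useful to compare how many descriptors a process actually has open with its current soft limit. The following is a sketch that assumes a Linux /proc file system; fd_usage is a hypothetical helper name.

```shell
# Minimal sketch: print "<open descriptors> <soft nofile limit>" for a process.
# fd_usage is a hypothetical helper name.
fd_usage() {
    pid=$1
    # Each entry in /proc/<pid>/fd is one open descriptor.
    used=$(ls /proc/"$pid"/fd 2>/dev/null | wc -l)
    # In /proc/<pid>/limits the soft limit is the fourth field of this row.
    limit=$(awk '/^Max open files/ { print $4 }' /proc/"$pid"/limits)
    printf '%s %s\n' "$used" "$limit"
}

# Usage (pid is a placeholder):
#   fd_usage <pid>
#   ulimit -n        # soft limit of the current shell
```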
Solution: Raise the limits. Add the following to /etc/security/limits.conf:
* soft nofile 65535
* hard nofile 65535
* soft nproc 65535
* hard nproc 65535
Then add the matching ulimit calls to /root/.bash_profile:
ulimit -n 65535
ulimit -u 65535
10. ibdata1 and mysql-bin consuming disk space
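Problem: ibdata1 has grown past 120 GB and the mysql-bin files past 80 GB.
Cause: The InnoDB shared tablespace grows but never shrinks, and binary logs accumulate until they are purged.
Solution: Dump and reload the database to shrink ibdata1; enabling innodb_file_per_table before the reload keeps table data out of the shared tablespace. Purge old binary logs with PURGE MASTER LOGS TO 'mysql-bin.010'; or PURGE MASTER LOGS BEFORE '2020-12-22 13:00:00';. Set expire_logs_days=30 in my.cnf to automate log removal (MySQL 8.0 deprecates this variable in favor of binlog_expire_logs_seconds).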
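To judge how much space a purge would recover, the binary logs can be totalled first. This is a sketch only: binlog_bytes is a hypothetical helper, /var/lib/mysql is an assumed data directory, the mysql-bin prefix is taken from the file names above, and -printf requires GNU find.

```shell
# Minimal sketch: sum the size in bytes of the binary logs in a directory.
# binlog_bytes is a hypothetical helper name.
binlog_bytes() {
    dir=$1
    # The [0-9]* suffix skips mysql-bin.index, which only lists the logs.
    find "$dir" -maxdepth 1 -type f -name 'mysql-bin.[0-9]*' -printf '%s\n' |
        awk '{ total += $1 } END { print total + 0 }'
}

# Usage (data directory is an assumption):
#   binlog_bytes /var/lib/mysql
#   ls -lh /var/lib/mysql/ibdata1
```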
These examples illustrate typical Linux operations troubleshooting steps and best practices.
MaGe Linux Operations
Founded in 2009, MaGe Education is a top Chinese high‑end IT training brand. Its graduates earn 12K+ RMB salaries, and the school has trained tens of thousands of students. It offers high‑pay courses in Linux cloud operations, Python full‑stack, automation, data analysis, AI, and Go high‑concurrency architecture. Thanks to quality courses and a solid reputation, it has talent partnerships with numerous internet firms.
