Essential Linux Commands for Analyzing Web Server Logs
A collection of practical Linux one‑liners built from awk, grep, sort, uniq, sed, ps, and netstat for counting unique IPs, tracking page visits, ranking URLs, monitoring connection states, and measuring traffic from Apache access logs.
Overview
This collection provides Linux command‑line one‑liners for analysing Apache (or generic) access logs. The commands enable counting unique visitors, ranking page requests, filtering by time or URL, measuring bandwidth, and inspecting TCP connection states, offering a quick toolbox for server‑side monitoring and troubleshooting.
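The field numbers used throughout ($1, $4, $7, $9, $10, $11, $12) assume Apache's combined log format split on whitespace. A minimal sketch to confirm that mapping on one line before relying on it; the sample entry and the helper name `show_fields` are made up for illustration:

```shell
# Hypothetical helper: print the whitespace-separated fields the one-liners rely on.
show_fields() {
  awk '{
    printf "ip=%s time=%s url=%s status=%s bytes=%s referer=%s agent=%s\n",
           $1, $4, $7, $9, $10, $11, $12
  }'
}

# Made-up entry in combined log format.
sample='203.0.113.7 - - [16/Aug/2015:14:02:31 +0800] "GET /index.php HTTP/1.1" 200 5120 "http://www.example.com/" "Mozilla/5.0 (X11; Linux x86_64)"'

printf '%s\n' "$sample" | show_fields
```

The referer and user-agent fields keep their leading double quote, and the user agent is cut at its first space, which matters for any pattern anchored with `^`.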
1. Count unique IP addresses
awk '{print $1}' log_file | sort | uniq | wc -l
2. Count accesses to a specific page
grep "/index.php" log_file | wc -l
3. Count how many pages each IP accessed
awk '{++S[$1]} END {for (a in S) print a, S[a]}' log_file > log.txt
sort -n -t ' ' -k2 log.txt
4. Sort IP page counts from smallest to largest
awk '{++S[$1]} END {for (a in S) print S[a], a}' log_file | sort -n
5. List pages visited by a particular IP
grep '^111\.111\.111\.111 ' log_file | awk '{print $1, $7}'
6. Exclude search‑engine crawlers from statistics
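Many crawlers also send a user agent that begins with Mozilla, so a prefix test on the user agent does not reliably separate them from browsers. As a hedged alternative to the one-liner in this step, the sketch below skips requests whose user agent contains bot, spider, or crawl. That keyword list, the sample lines, and the helper name `count_human_ips` are assumptions, not part of the original commands:

```shell
# Hypothetical helper: count unique client IPs, skipping crawler-like user agents.
# Splitting on double quotes makes $6 the full user agent in combined log format;
# $1 is everything before the request line, so its first word is the client IP.
count_human_ips() {
  awk -F'"' 'tolower($6) !~ /bot|spider|crawl/ { split($1, head, " "); print head[1] }' |
    sort -u | wc -l | tr -d ' '
}

# Invented sample: one browser (two requests) and one crawler.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
203.0.113.7 - - [16/Aug/2015:14:00:01 +0800] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; Linux x86_64)"
198.51.100.9 - - [16/Aug/2015:14:00:02 +0800] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
203.0.113.7 - - [16/Aug/2015:14:00:03 +0800] "GET /a HTTP/1.1" 200 100 "-" "Mozilla/5.0 (X11; Linux x86_64)"
EOF

count_human_ips < "$sample_log"
```

On the invented sample this counts only the browser's address; a real deployment would need a keyword list tuned to the crawlers it actually sees.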
awk '$12 ~ /^"Mozilla/ {print $1}' log_file | sort | uniq | wc -l
7. Count IPs that accessed the site during a specific hour
awk '{print $4,$1}' log_file | grep 16/Aug/2015:14 | awk '{print $2}' | sort | uniq | wc -l
8. Top 10 IP addresses by request count
awk '{print $1}' access_log | sort | uniq -c | sort -nr | head -10
9. Top 10 most requested URLs
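A ranking like the one in this step counts each distinct request string separately, so /search?q=awk and /search?q=sed appear as two URLs. If grouping by path is wanted, a hedged variant strips the query string first. It assumes $7 is the request target, and `top_paths` and the sample lines are invented for illustration:

```shell
# Hypothetical helper: rank request paths, ignoring query strings.
top_paths() {
  awk '{ sub(/\?.*/, "", $7); print $7 }' |
    sort | uniq -c | sort -nr | head -10 |
    awk '{ print $1, $2 }'   # normalise the padding that uniq -c adds
}

# Invented sample: two searches with different queries, one index hit.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
203.0.113.7 - - [16/Aug/2015:14:00:01 +0800] "GET /search?q=awk HTTP/1.1" 200 512
203.0.113.7 - - [16/Aug/2015:14:00:02 +0800] "GET /search?q=sed HTTP/1.1" 200 512
198.51.100.9 - - [16/Aug/2015:14:00:03 +0800] "GET /index.php HTTP/1.1" 200 2048
EOF

top_paths < "$sample_log"
```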
cat log_file | awk '{print $7}' | sort | uniq -c | sort -nr | head -10
10. Rank sub‑domains by request volume (based on referer)
cat access.log | awk '{print $11}' | sed -E -e 's/^"//' -e 's#^https?://##' -e 's#[/"].*##' | sort | uniq -c | sort -rn | head -20
11. Largest transferred files
cat www.access.log | awk '($7~/\.php/){print $10, $1, $4, $7}' | sort -nr | head -100
12. Pages larger than 200 KB and their request counts
cat www.access.log | awk '($10 > 200000 && $7~/\.php/){print $7}' | sort | uniq -c | sort -nr | head -100
13. Slowest PHP pages (by response time)
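The response-time commands in this and the next two steps read the last field (`$NF`). That field is only a duration if the server's log format was extended to append one, for example with Apache's `%T` (seconds) or `%D` (microseconds); the default combined format ends with the user agent. A minimal sketch, under that assumption, that reports how many lines actually end in a number; `check_last_field` and the sample lines are hypothetical:

```shell
# Hypothetical helper: report how many log lines end in a numeric field.
check_last_field() {
  awk '$NF ~ /^[0-9]+(\.[0-9]+)?$/ { ok++ }
       END { printf "%d of %d lines end in a numeric field\n", ok, NR }'
}

# Invented sample: two lines with a trailing duration, one without.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
203.0.113.7 - - [16/Aug/2015:14:00:01 +0800] "GET /report.php HTTP/1.1" 200 512 "-" "curl/7.40" 72
203.0.113.7 - - [16/Aug/2015:14:00:09 +0800] "GET /index.php HTTP/1.1" 200 512 "-" "curl/7.40" 0.4
198.51.100.9 - - [16/Aug/2015:14:00:10 +0800] "GET /index.php HTTP/1.1" 200 512 "-" "curl/7.40"
EOF

check_last_field < "$sample_log"
```

If the duration is logged in microseconds, thresholds such as 60 or 30 need scaling accordingly.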
cat www.access.log | awk '($7~/\.php/){print $NF, $1, $4, $7}' | sort -nr | head -100
14. PHP pages taking more than 60 seconds
cat www.access.log | awk '($NF > 60 && $7~/\.php/){print $7}' | sort -n | uniq -c | sort -nr | head -100
15. Requests with response time > 30 seconds
cat www.access.log | awk '($NF > 30){print $7}' | sort -n | uniq -c | sort -nr | head -20
16. List running processes and their frequencies
ps -ef | awk -F ' ' '{print $8, $9}' | sort | uniq -c | sort -nr | head -20
17. Current Apache concurrent connections
netstat -an | grep ESTABLISHED | wc -l
18. Count Apache processes (one per request)
ps -ef | grep '[h]ttpd' | wc -l
19. Total connections on port 80
netstat -nat | awk '$4 ~ /[:.]80$/' | wc -l
20. Detailed TCP state statistics
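The state tally in this step keys on the last column of each `tcp` line. To see the output shape without touching a live server, the same awk program can be run on captured text. The sample lines and the helper name `tcp_states` are illustrative only, and the trailing `sort` is added just to make the order predictable:

```shell
# Hypothetical helper: count TCP connections per state from netstat-style text.
tcp_states() {
  awk '/^tcp/ { ++S[$NF] } END { for (a in S) print a, S[a] }' | sort
}

# Invented capture in the layout printed by Linux netstat -n.
sample_netstat=$(mktemp)
cat > "$sample_netstat" <<'EOF'
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 192.0.2.10:80           203.0.113.7:51514       ESTABLISHED
tcp        0      0 192.0.2.10:80           203.0.113.7:51520       TIME_WAIT
tcp        0      0 192.0.2.10:80           198.51.100.9:40022      TIME_WAIT
EOF

tcp_states < "$sample_netstat"
```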
netstat -n | awk '/^tcp/ {++S[$NF]} END {for (a in S) print a, S[a]}'
21. Top 20 IPs by total connections
netstat -n | awk '/^tcp/ {sub(/:[0-9]+$/, "", $5); ++S[$5]} END {for (a in S) print S[a], a}' | sort -rn | head -20
22. Identify IPs with many TIME_WAIT sockets
netstat -n | grep TIME_WAIT | awk '{print $5}' | awk -F: '{print $1}' | sort | uniq -c | sort -rn | head -20
23. Identify IPs with many SYN sockets
netstat -an | grep SYN | awk '{print $5}' | awk -F: '{print $1}' | sort | uniq -c | sort -nr | head -20
24. Calculate total traffic in gigabytes
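Summing `$10` treats the `-` that Apache logs for a response with no body as zero, because awk converts non-numeric strings to 0 in arithmetic. A small sketch on made-up lines that also formats the result to three decimals; `total_gb` is a hypothetical name:

```shell
# Hypothetical helper: sum the bytes-sent field ($10) and print it in GB.
total_gb() {
  awk '{ sum += $10 } END { printf "%.3f GB\n", sum / 1024 / 1024 / 1024 }'
}

# Invented sample: 1 GiB, an empty response logged as "-", and 0.5 GiB.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
203.0.113.7 - - [16/Aug/2015:14:00:01 +0800] "GET /big.iso HTTP/1.1" 200 1073741824
203.0.113.7 - - [16/Aug/2015:14:00:02 +0800] "GET /index.php HTTP/1.1" 304 -
198.51.100.9 - - [16/Aug/2015:14:00:03 +0800] "GET /half.iso HTTP/1.1" 200 536870912
EOF

total_gb < "$sample_log"
```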
cat access.log | awk '{sum+=$10} END {print sum/1024/1024/1024}'
25. Count 404 responses
awk '$9 == 404 {print $9, $7}' access.log | sort | uniq -c | sort -nr
26. Summarise HTTP status codes
cat access.log | awk '{counts[$9]++} END {for (code in counts) print code, counts[code]}'
27. Per‑second request rate for specific status codes
watch "awk '{if(\$9~/200|30|404/) COUNT[\$4]++} END {for(a in COUNT) print a, COUNT[a]}' log_file | sort -k2 -nr | head -10"
Source: https://segmentfault.com/a/1190000009745139
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge—fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc. (Reply “Linux” to receive essential resources.)
