
Essential Linux Commands for Analyzing Web Server Logs

A comprehensive collection of practical Linux one‑liners—using awk, grep, sort, uniq, and netstat—to count unique IPs, track page visits, rank URLs, monitor connection states, and measure traffic from Apache access logs for effective server operations.

Liangxu Linux

Overview

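This collection provides Linux command-line one-liners for analysing Apache (or similar) access logs: counting unique visitors, ranking page requests, filtering by time or URL, measuring bandwidth, and inspecting TCP connection states. The field numbers assume Apache's combined log format, where $1 is the client IP, $4 the timestamp, $7 the request path, $9 the status code, $10 the response size in bytes, $11 the referer, and $12 the start of the user agent. Adjust them if your LogFormat differs.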

1. Count unique IP addresses

awk '{print $1}' log_file | sort | uniq | wc -l
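As a minimal sketch of the same count, the block below builds a small sample log (hypothetical data, not from the original article) and uses sort -u, which merges the sort and uniq steps into one.

```shell
# Hypothetical sample data in Apache combined log format (an assumption for illustration).
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
10.0.0.1 - - [16/Aug/2015:14:00:01 +0800] "GET /index.php HTTP/1.1" 200 512 "-" "Mozilla/5.0"
10.0.0.2 - - [16/Aug/2015:14:00:02 +0800] "GET /about.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
10.0.0.1 - - [16/Aug/2015:15:10:09 +0800] "GET /index.php HTTP/1.1" 404 0 "-" "curl/7.29.0"
EOF

# sort -u deduplicates while sorting; tr strips the padding some wc versions add.
unique_ips=$(awk '{print $1}' "$sample_log" | sort -u | wc -l | tr -d ' ')
echo "unique IPs: $unique_ips"
```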

2. Count accesses to a specific page

grep "/index.php" log_file | wc -l
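One caveat is that grep matches the string anywhere in the line, so a referer that contains /index.php is counted too. The sketch below, run on hypothetical sample data, compares that with an exact match on field 7 (the request path in the combined log format).

```shell
# Hypothetical sample data: the second request is for /about.php,
# but its referer contains /index.php.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
10.0.0.1 - - [16/Aug/2015:14:00:01 +0800] "GET /index.php HTTP/1.1" 200 512 "-" "Mozilla/5.0"
10.0.0.2 - - [16/Aug/2015:14:00:02 +0800] "GET /about.php HTTP/1.1" 200 1024 "http://example.com/index.php" "Mozilla/5.0"
EOF

# grep -c counts every line containing the string, referer included.
grep_hits=$(grep -c "/index.php" "$sample_log")

# awk compares only field 7, the request path.
path_hits=$(awk '$7 == "/index.php"' "$sample_log" | wc -l | tr -d ' ')

echo "grep: $grep_hits, exact path match: $path_hits"
```

The exact match ignores requests with a query string such as /index.php?id=1, so which variant fits depends on what should count as a visit.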

3. Count how many pages each IP accessed

awk '{++S[$1]} END {for (a in S) print a, S[a]}' log_file > log.txt
sort -n -t ' ' -k2 log.txt

4. Sort IP page counts from smallest to largest

awk '{++S[$1]} END {for (a in S) print S[a], a}' log_file | sort -n

5. List pages visited by a particular IP

grep '^111\.111\.111\.111 ' log_file | awk '{print $1, $7}'

6. Exclude search‑engine crawlers from statistics (approximate: counts unique IPs whose user agent starts with Mozilla)

awk '{print $12, $1}' log_file | grep '^"Mozilla' | awk '{print $2}' | sort | uniq | wc -l
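Filtering on a Mozilla prefix is only a rough heuristic, because some crawlers (Googlebot, for example) also send a user agent that begins with Mozilla. The sketch below shows an alternative that drops user agents containing a few crawler keywords. The keyword list (bot, spider, crawl) and the sample data are assumptions for illustration, not part of the original article.

```shell
# Hypothetical sample data: the second line is a crawler with a Mozilla-style user agent.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
10.0.0.1 - - [16/Aug/2015:14:00:01 +0800] "GET /index.php HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; Linux x86_64)"
66.249.66.1 - - [16/Aug/2015:14:00:02 +0800] "GET /index.php HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
10.0.0.2 - - [16/Aug/2015:14:00:03 +0800] "GET /about.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (Windows NT 6.1)"
EOF

# Prefix filter: every line passes, including the crawler.
mozilla_ips=$(awk '$12 ~ /^"Mozilla/ {print $1}' "$sample_log" | sort -u | wc -l | tr -d ' ')

# Keyword filter: split on double quotes so $6 is the whole user agent,
# then skip agents that contain an assumed crawler keyword.
human_ips=$(awk -F'"' 'tolower($6) !~ /bot|spider|crawl/ {split($1, a, " "); print a[1]}' "$sample_log" \
  | sort -u | wc -l | tr -d ' ')

echo "Mozilla prefix: $mozilla_ips, keyword filter: $human_ips"
```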

7. Count IPs that accessed the site during a specific hour

awk '{print $4,$1}' log_file | grep 16/Aug/2015:14 | awk '{print $2}' | sort | uniq | wc -l
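The same hourly count can be sketched in a single awk pass that tests the timestamp field directly. The sample data and the variable name hour are illustrative assumptions.

```shell
# Hypothetical sample data: three requests in hour 14, one in hour 15.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
10.0.0.1 - - [16/Aug/2015:14:00:01 +0800] "GET /index.php HTTP/1.1" 200 512 "-" "Mozilla/5.0"
10.0.0.2 - - [16/Aug/2015:14:30:02 +0800] "GET /about.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
10.0.0.1 - - [16/Aug/2015:14:59:59 +0800] "GET /about.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
10.0.0.3 - - [16/Aug/2015:15:10:09 +0800] "GET /index.php HTTP/1.1" 200 512 "-" "Mozilla/5.0"
EOF

hour='16/Aug/2015:14'

# index() is a plain substring search, so the date needs no regex escaping.
# Requiring position 1 anchors the match to the start of the timestamp field.
ips_in_hour=$(awk -v h="$hour" 'index($4, "[" h ":") == 1 {print $1}' "$sample_log" \
  | sort -u | wc -l | tr -d ' ')
echo "unique IPs in hour: $ips_in_hour"
```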

8. Top 10 IP addresses by request count

awk '{print $1}' access_log | sort | uniq -c | sort -nr | head -10

9. Top 10 most requested URLs

awk '{print $7}' log_file | sort | uniq -c | sort -nr | head -10
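When URLs carry query strings, each distinct query is ranked as a separate page. The sketch below, on hypothetical sample data, strips everything from the first question mark before counting. Whether to merge queries this way is a judgment call that depends on the site.

```shell
# Hypothetical sample data: three requests hit /index.php with different query strings.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
10.0.0.1 - - [16/Aug/2015:14:00:01 +0800] "GET /index.php?id=1 HTTP/1.1" 200 512 "-" "Mozilla/5.0"
10.0.0.2 - - [16/Aug/2015:14:00:02 +0800] "GET /index.php?id=2 HTTP/1.1" 200 512 "-" "Mozilla/5.0"
10.0.0.3 - - [16/Aug/2015:14:00:03 +0800] "GET /about.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
10.0.0.1 - - [16/Aug/2015:14:00:04 +0800] "GET /index.php HTTP/1.1" 200 512 "-" "Mozilla/5.0"
EOF

# Remove the query string from the path, then rank as usual.
# The trailing awk only normalises the padding that uniq -c adds.
top_url=$(awk '{path = $7; sub(/\?.*$/, "", path); print path}' "$sample_log" \
  | sort | uniq -c | sort -nr | head -1 | awk '{print $1, $2}')
echo "$top_url"
```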

10. Rank sub‑domains by request volume (based on referer)

awk '{print $11}' access.log | sed -E -e 's/"//g' -e 's#^https?://##' -e 's#/.*##' | sort | uniq -c | sort -rn | head -20

11. Largest PHP responses by bytes transferred

cat www.access.log | awk '($7~/\.php/){print $10, $1, $4, $7}' | sort -nr | head -100

12. Pages larger than 200 KB and their request counts

cat www.access.log | awk '($10 > 200000 && $7~/\.php/){print $7}' | sort | uniq -c | sort -nr | head -100

13. Slowest PHP pages (by response time; assumes the log format appends the time taken, in seconds, as the last field)

cat www.access.log | awk '($7~/\.php/){print $NF, $1, $4, $7}' | sort -nr | head -100

14. PHP pages taking more than 60 seconds

cat www.access.log | awk '($NF > 60 && $7~/\.php/){print $7}' | sort -n | uniq -c | sort -nr | head -100

15. Requests with response time > 30 seconds

cat www.access.log | awk '($NF > 30){print $7}' | sort -n | uniq -c | sort -nr | head -20
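The response-time commands above read the time from the last field, which the default combined format does not include. The sketch below assumes a LogFormat with Apache's %T (seconds taken to serve the request) appended at the end, and computes an average per URL instead of listing individual requests. The sample data is hypothetical.

```shell
# Hypothetical sample data: combined log format plus a trailing response time in seconds.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
10.0.0.1 - - [16/Aug/2015:14:00:01 +0800] "GET /slow.php HTTP/1.1" 200 512 "-" "Mozilla/5.0" 70
10.0.0.2 - - [16/Aug/2015:14:00:02 +0800] "GET /slow.php HTTP/1.1" 200 512 "-" "Mozilla/5.0" 50
10.0.0.3 - - [16/Aug/2015:14:00:03 +0800] "GET /fast.php HTTP/1.1" 200 512 "-" "Mozilla/5.0" 1
EOF

# Accumulate total time and request count per URL, then print
# "average count url" sorted with the slowest average first.
slowest=$(awk '{sum[$7] += $NF; n[$7]++}
               END {for (u in sum) printf "%.2f %d %s\n", sum[u] / n[u], n[u], u}' "$sample_log" \
  | sort -nr)
echo "$slowest"
```

If the log records %D (microseconds) instead, the thresholds and averages need scaling accordingly.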

16. List running processes and their frequencies

ps -ef | awk -F ' ' '{print $8, $9}' | sort | uniq -c | sort -nr | head -20

17. Current established connections to Apache (assuming it listens on port 80)

netstat -ant | awk '$4 ~ /:80$/ && $6 == "ESTABLISHED"' | wc -l

18. Count Apache processes (with the prefork MPM, each child serves one request at a time)

ps -ef | grep '[h]ttpd' | wc -l

19. Total connections on port 80

netstat -nat | awk '$4 ~ /:80$/' | wc -l

20. Detailed TCP state statistics

netstat -n | awk '/^tcp/ {++S[$NF]} END {for (a in S) print a, S[a]}'

21. Top 20 IPs by total connections

netstat -n | awk '/^tcp/ {ip = $5; sub(/:[^:]*$/, "", ip); ++S[ip]} END {for (a in S) print S[a], a}' | sort -rn | head -20

22. Identify IPs with many TIME_WAIT sockets

netstat -n | grep TIME_WAIT | awk '{sub(/:[^:]*$/, "", $5); print $5}' | sort | uniq -c | sort -rn | head -20

23. Identify IPs with many SYN sockets

netstat -an | grep SYN | awk '{print $5}' | awk -F: '{print $1}' | sort | uniq -c | sort -nr | head -20
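The per-state rankings share one pattern: filter on the state column, strip the port from the remote address, and count. A minimal sketch of that pattern as a reusable function follows. The function name ips_in_state and the captured sample output are hypothetical, and the column positions assume Linux netstat output.

```shell
# Reads netstat-style lines on stdin and prints "count ip" for one TCP state.
# Intended usage on a live host: netstat -ant | ips_in_state TIME_WAIT
ips_in_state() {
  awk -v state="$1" '
    /^tcp/ && $6 == state {
      ip = $5
      sub(/:[^:]*$/, "", ip)   # drop the trailing :port
      ++n[ip]
    }
    END {for (ip in n) print n[ip], ip}' | sort -rn
}

# Hypothetical captured output, so the sketch runs without netstat.
sample='tcp        0      0 192.168.1.10:80    203.0.113.5:51000    TIME_WAIT
tcp        0      0 192.168.1.10:80    203.0.113.5:51001    TIME_WAIT
tcp        0      0 192.168.1.10:80    198.51.100.7:40000   TIME_WAIT
tcp        0      0 192.168.1.10:80    198.51.100.7:40001   ESTABLISHED'

printf '%s\n' "$sample" | ips_in_state TIME_WAIT
```

Matching the state column exactly means SYN_RECV and SYN_SENT have to be queried separately, which is a deliberate difference from grepping for SYN.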

24. Calculate total traffic in gigabytes

cat access.log | awk '{sum+=$10} END {print sum/1024/1024/1024}'
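Apache writes a dash in the size field when no bytes were sent. awk already treats that as zero, but an explicit numeric check makes the intent clearer. The sketch below sums only numeric sizes on hypothetical sample data and prints the total with fixed precision.

```shell
# Hypothetical sample data: 1 GiB, 0.5 GiB, and one response with no body ("-").
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
10.0.0.1 - - [16/Aug/2015:14:00:01 +0800] "GET /big.iso HTTP/1.1" 200 1073741824 "-" "Mozilla/5.0"
10.0.0.2 - - [16/Aug/2015:14:00:02 +0800] "GET /half.iso HTTP/1.1" 200 536870912 "-" "Mozilla/5.0"
10.0.0.3 - - [16/Aug/2015:14:00:03 +0800] "GET /index.php HTTP/1.1" 304 - "-" "Mozilla/5.0"
EOF

# Sum field 10 only when it is a number, then convert bytes to GiB.
total_gib=$(awk '$10 ~ /^[0-9]+$/ {sum += $10}
                 END {printf "%.3f\n", sum / 1024 / 1024 / 1024}' "$sample_log")
echo "total: $total_gib GiB"
```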

25. List requests that returned 404

awk '$9 == 404 {print $9, $7}' access.log | sort

26. Summarise HTTP status codes

cat access.log | awk '{counts[$9]++} END {for (code in counts) print code, counts[code]}'
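A small extension of the status summary, sketched on hypothetical sample data, adds each code's share of all requests and sorts the output by status code.

```shell
# Hypothetical sample data: three 200 responses and one 404.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
10.0.0.1 - - [16/Aug/2015:14:00:01 +0800] "GET /index.php HTTP/1.1" 200 512 "-" "Mozilla/5.0"
10.0.0.2 - - [16/Aug/2015:14:00:02 +0800] "GET /about.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
10.0.0.3 - - [16/Aug/2015:14:00:03 +0800] "GET /index.php HTTP/1.1" 200 512 "-" "Mozilla/5.0"
10.0.0.1 - - [16/Aug/2015:14:00:04 +0800] "GET /missing.php HTTP/1.1" 404 0 "-" "Mozilla/5.0"
EOF

# NR in the END block holds the total number of lines read.
status_summary=$(awk '{counts[$9]++}
                      END {for (code in counts)
                             printf "%s %d %.1f%%\n", code, counts[code], 100 * counts[code] / NR}' \
                     "$sample_log" | sort)
echo "$status_summary"
```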

27. Per‑second request count for specific status codes

watch "awk '{if(\$9~/200|30|404/) COUNT[\$4]++} END {for(a in COUNT) print a, COUNT[a]}' log_file | sort -k2 -nr | head -10"
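Nested quoting inside watch is easy to get wrong, since the shell expands any unescaped $9 or $4 inside double quotes before awk runs. One way around it, sketched below, is to keep the pipeline in a function and repeat it with a plain loop. The function name busiest_seconds and the sample data are hypothetical, and the status filter here is anchored to 200, 30x, and 404.

```shell
# Prints the ten busiest seconds as "timestamp count" for the selected status codes.
busiest_seconds() {
  awk '$9 ~ /^(200|30[0-9]|404)$/ {n[$4]++}
       END {for (t in n) print t, n[t]}' "$1" | sort -k2,2nr | head -10
}

# Hypothetical sample data: two matching requests in one second,
# one matching and one 500 response in the next.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
10.0.0.1 - - [16/Aug/2015:14:00:01 +0800] "GET /index.php HTTP/1.1" 200 512 "-" "Mozilla/5.0"
10.0.0.2 - - [16/Aug/2015:14:00:01 +0800] "GET /old.php HTTP/1.1" 302 0 "-" "Mozilla/5.0"
10.0.0.3 - - [16/Aug/2015:14:00:02 +0800] "GET /index.php HTTP/1.1" 200 512 "-" "Mozilla/5.0"
10.0.0.4 - - [16/Aug/2015:14:00:02 +0800] "GET /broken.php HTTP/1.1" 500 0 "-" "Mozilla/5.0"
EOF

busiest_seconds "$sample_log"

# To refresh the view on a live log, a simple loop avoids nested quoting:
#   while sleep 2; do clear; busiest_seconds log_file; done
```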

Source: https://segmentfault.com/a/1190000009745139
