Master Log Analysis: 20+ Essential Linux Commands to Uncover Web Traffic Insights
This guide collects Linux one‑liners built from awk, grep, sort, uniq, netstat, and tcpdump. They count unique IPs, rank pages, filter bots, measure bandwidth, and extract time‑based statistics from Apache access logs, and they summarize live TCP connection states on the server.
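To try the log commands without touching a production file, a small sample can be generated first. The sketch below writes four made-up lines in Apache combined format to a temporary file, prints a few whitespace-split fields from the first line, and counts unique client IPs. Every IP, path, size, and variable name here is hypothetical.

```shell
# Write a tiny sample access log in combined format (all values are made up).
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
10.0.0.1 - - [16/Aug/2015:14:00:01 +0800] "GET /index.php HTTP/1.1" 200 5120 "-" "Mozilla/5.0"
10.0.0.2 - - [16/Aug/2015:14:00:01 +0800] "GET /about.php HTTP/1.1" 200 2048 "http://example.com/" "Mozilla/5.0"
10.0.0.1 - - [16/Aug/2015:14:00:02 +0800] "POST /login.php HTTP/1.1" 302 0 "-" "Mozilla/5.0"
10.0.0.3 - - [16/Aug/2015:15:10:09 +0800] "GET /missing HTTP/1.1" 404 312 "-" "curl/7.43.0"
EOF

# Whitespace-split field positions in the combined format:
# $1 client IP, $4 timestamp, $6 method (with a leading quote),
# $7 path, $9 status, $10 response bytes, $11 referer, $12 start of user agent.
first=$(awk 'NR == 1 {print "ip=" $1, "path=" $7, "status=" $9, "bytes=" $10}' "$log_file")
echo "$first"

# Unique client IPs in the sample.
unique_ips=$(awk '{print $1}' "$log_file" | sort -u | wc -l)
echo "unique IPs: $unique_ips"
echo "sample log written to $log_file"
```

The path held in `$log_file` can then stand in for the literal `log_file` name in the commands that follow.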
1. Count unique IP addresses
awk '{print $1}' log_file | sort | uniq | wc -l
2. Count visits to a specific page
grep "/index.php" log_file | wc -l
3. Count pages visited per IP
awk '{++S[$1]} END {for (a in S) print a, S[a]}' log_file > log.txt
Then sort the result:
sort -n -t ' ' -k2 log.txt
4. Sort IP visit counts from smallest to largest
awk '{++S[$1]} END {for (a in S) print S[a], a}' log_file | sort -n
5. List pages visited by a specific IP
grep '^111\.111\.111\.111 ' log_file | awk '{print $1, $7}'
6. Exclude search‑engine crawlers
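The one-liner for this step keeps only requests whose user agent starts with `"Mozilla`. Many crawlers also send a Mozilla-prefixed user agent (Googlebot's typically begins with `Mozilla/5.0 (compatible; Googlebot/2.1; ...`), so that filter tends to count them as visitors. A stricter sketch, run here on three made-up log lines, splits each line on double quotes so the whole user agent becomes one field, then drops anything containing bot, crawler, or spider. The sample data, the keyword list, and the variable names are assumptions for illustration.

```shell
# Build a throwaway sample log (all IPs, paths, and agents are made up).
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [16/Aug/2015:14:01:02 +0800] "GET /index.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (X11; Linux x86_64) Firefox/40.0"
10.0.0.2 - - [16/Aug/2015:14:01:03 +0800] "GET /index.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
10.0.0.3 - - [16/Aug/2015:14:01:04 +0800] "GET /robots.txt HTTP/1.1" 200 64 "-" "curl/7.43.0"
EOF

# Prefix-only filter, as in the one-liner: the Googlebot line still passes.
prefix_only=$(awk '{print $12, $1}' "$log" | grep '^"Mozilla' | awk '{print $2}' | sort -u | wc -l)

# Split on double quotes: $6 is the full user agent, $1 starts with the client IP.
humans=$(awk -F'"' '$6 ~ /^Mozilla/ && tolower($6) !~ /bot|crawler|spider/ {split($1, f, " "); print f[1]}' "$log" | sort -u | wc -l)

echo "prefix filter: $prefix_only unique IPs, keyword filter: $humans unique IPs"
rm -f "$log"
```

On this sample the prefix filter counts two IPs and the keyword filter counts one. A keyword list like this is only a heuristic, since a crawler can send any user agent it likes.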
awk '{print $12, $1}' log_file | grep '^"Mozilla' | awk '{print $2}' | sort | uniq | wc -l
7. Count IPs that accessed the site during a specific hour
awk '{print $4, $1}' log_file | grep '16/Aug/2015:14' | awk '{print $2}' | sort | uniq | wc -l
8. Top 10 IP addresses by request count
awk '{print $1}' log_file | sort | uniq -c | sort -nr | head -10
9. Top 10 most requested files or pages
awk '{print $7}' log_file | sort | uniq -c | sort -nr | head -10
10. Top 20 IPs by request volume
awk '{print $1}' log_file | sort | uniq -c | sort -nr | head -20
11. Identify the largest transferred files
awk '($7~/\.php/){print $10, $1, $4, $7}' www.access.log | sort -nr | head -100
12. Files larger than 200 KB and their hit counts
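awk '($10 > 200000 && $7~/\.php/){print $7}' www.access.log | sort | uniq -c | sort -nr | head -100
13. Slowest pages (by response time)
This command and the next read the response time from the last field ($NF), which the stock combined format does not log; they assume the LogFormat ends with a time directive such as %T (seconds).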
awk '($7~/\.php/){print $NF, $1, $4, $7}' www.access.log | sort -nr | head -100
14. Pages taking more than 60 seconds
awk '($NF > 60 && $7~/\.php/){print $7}' www.access.log | sort | uniq -c | sort -nr | head -100
15. Requests per second (watch)
watch "awk '{if(\$9~/200|30|404/)COUNT[\$4]++} END {for(a in COUNT) print a, COUNT[a]}' log_file | sort -k2 -nr | head -10"
16. Bandwidth statistics
Request count:
awk '{if($6~/GET/) count++} END {print "client_request=" count}' apache.log
Data transferred (KB):
awk '{BYTE+=$10} END {print "client_kbyte_out=" BYTE/1024 "KB"}' apache.log
17. TCP connection state summary
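Live netstat output differs from host to host, so the tallying logic in this step is easier to check against canned input. A sketch, with made-up connection lines standing in for `netstat -n` output (the addresses and the `sample` and `states` names are assumptions for illustration):

```shell
# Canned lines in the shape of `netstat -n` output (addresses are made up).
sample='tcp        0      0 10.0.0.5:80    10.0.0.1:51000   ESTABLISHED
tcp        0      0 10.0.0.5:80    10.0.0.2:51001   TIME_WAIT
tcp        0      0 10.0.0.5:80    10.0.0.3:51002   TIME_WAIT
udp        0      0 0.0.0.0:68     0.0.0.0:*'

# Tally the last field (the TCP state) for lines starting with "tcp".
# awk's for-in order is unspecified, so sort makes the output stable.
states=$(printf '%s\n' "$sample" | awk '/^tcp/ {++S[$NF]} END {for (a in S) print a, S[a]}' | sort)
echo "$states"
```

The udp line is skipped by the `/^tcp/` pattern, so only the three TCP connections are counted.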
netstat -n | awk '/^tcp/ {++S[$NF]} END {for (a in S) print a, S[a]}'
18. Count connections per state
netstat -n | awk '/^tcp/ {++state[$NF]} END {for (k in state) print k, "\t", state[k]}'
19. Top 20 IPs by number of connections (often used for attack source hunting)
netstat -ant | awk '/:80/ {split($5,ip,":"); ++A[ip[1]]} END {for (i in A) print A[i], i}' | sort -rn | head -20
20. Capture the top 20 source IPs on port 80 with tcpdump
tcpdump -i eth0 -tnn dst port 80 -c 1000 | awk -F"." '{print $1"."$2"."$3"."$4}' | sort | uniq -c | sort -nr | head -20
21. Identify many TIME_WAIT connections
netstat -n | grep TIME_WAIT | awk '{print $5}' | sort | uniq -c | sort -rn | head -20
22. Identify many SYN connections
netstat -an | grep SYN | awk '{print $5}' | awk -F: '{print $1}' | sort | uniq -c | sort -nr | head
23. Map processes to ports
netstat -ntlp | grep 80 | awk '{print $7}' | cut -d/ -f1
The log-parsing commands assume the standard Apache combined log format and can be combined or adapted to fit custom log layouts.
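One way to make that adaptation less error-prone is to pass the field numbers into awk as variables, so a changed LogFormat means editing one line rather than every one-liner. The sketch below totals response bytes per client IP on three made-up lines. The names `ip_col` and `bytes_col` and the sample data are assumptions for illustration.

```shell
# Sample lines in combined format (values are made up).
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [16/Aug/2015:14:00:01 +0800] "GET /a.php HTTP/1.1" 200 4096 "-" "Mozilla/5.0"
10.0.0.2 - - [16/Aug/2015:14:00:02 +0800] "GET /b.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
10.0.0.1 - - [16/Aug/2015:14:00:03 +0800] "GET /c.php HTTP/1.1" 304 - "-" "Mozilla/5.0"
EOF

# Field positions for this layout; change these if the LogFormat differs.
ip_col=1
bytes_col=10

# $ip and $bytes dereference the field whose number is held in the variable.
# A "-" byte count (no body sent) evaluates to 0 in awk arithmetic.
per_ip=$(awk -v ip="$ip_col" -v bytes="$bytes_col" \
  '{sum[$ip] += $bytes} END {for (a in sum) print sum[a], a}' "$log" | sort -nr)
echo "$per_ip"
rm -f "$log"
```

The same pattern applies to any of the log commands: replace a hard-coded `$7` or `$10` with a variable passed through `-v`.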
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge: fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc.
