Master Log Analysis: 20+ Essential Linux Commands to Uncover Web Traffic Insights

This guide presents a comprehensive collection of Linux one‑liners—using awk, grep, sort, uniq, netstat, and other tools—to count unique IPs, rank pages, filter bots, measure bandwidth, monitor connection states, and extract time‑based statistics from Apache access logs.

Liangxu Linux

1. Count unique IP addresses

awk '{print $1}' log_file | sort | uniq | wc -l
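To check the count against known input, the sketch below builds a small throwaway log and runs the same pipeline on it. The log entries and the 192.0.2.x addresses are made up for illustration, and `sort -u` is used as an equivalent shorthand for `sort | uniq`.

```shell
# Hypothetical sample data in Apache combined log format.
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
192.0.2.1 - - [16/Aug/2015:14:01:02 +0800] "GET /index.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (X11; Linux x86_64)"
192.0.2.2 - - [16/Aug/2015:14:01:05 +0800] "GET /about.php HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (Windows NT 10.0)"
192.0.2.1 - - [16/Aug/2015:15:10:00 +0800] "GET /index.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (X11; Linux x86_64)"
192.0.2.3 - - [16/Aug/2015:15:11:30 +0800] "GET /robots.txt HTTP/1.1" 404 0 "-" "curl/7.88.1"
EOF

# Field 1 is the client IP; sort -u removes duplicates before counting.
# tr strips the padding that some wc implementations add.
unique_ips=$(awk '{print $1}' "$log_file" | sort -u | wc -l | tr -d ' ')
echo "unique IPs: $unique_ips"

rm -f "$log_file"
```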

2. Count visits to a specific page

grep "/index.php" log_file | wc -l

3. Count pages visited per IP

awk '{++S[$1]} END {for (a in S) print a, S[a]}' log_file > log.txt

Then sort the result:

sort -n -t ' ' -k2 log.txt

4. Sort IP visit counts from smallest to largest

awk '{++S[$1]} END {for (a in S) print S[a], a}' log_file | sort -n

5. List pages visited by a specific IP

grep '^111\.111\.111\.111 ' log_file | awk '{print $1, $7}'
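A regular-expression match on an IP address needs escaped dots and a trailing delimiter, otherwise a search for 192.0.2.1 can also pick up neighbours such as 192.0.2.10. An exact string comparison in awk sidesteps both concerns. The following is a minimal sketch on made-up sample data; the variable name `ip` is an illustrative choice.

```shell
# Hypothetical sample data in Apache combined log format.
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
192.0.2.1 - - [16/Aug/2015:14:01:02 +0800] "GET /index.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
192.0.2.10 - - [16/Aug/2015:14:01:03 +0800] "GET /admin.php HTTP/1.1" 200 512 "-" "Mozilla/5.0"
192.0.2.1 - - [16/Aug/2015:14:02:10 +0800] "GET /about.php HTTP/1.1" 200 2048 "-" "Mozilla/5.0"
192.0.2.1 - - [16/Aug/2015:14:03:45 +0800] "GET /index.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
EOF

# $1 == ip is a plain string comparison, so 192.0.2.10 does not match 192.0.2.1.
pages=$(awk -v ip="192.0.2.1" '$1 == ip {print $7}' "$log_file" | sort -u)
echo "$pages"

rm -f "$log_file"
```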

6. Count unique IPs, excluding search‑engine crawlers

awk '{print $12, $1}' log_file | grep '^"Mozilla' | awk '{print $2}' | sort | uniq | wc -l
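Many crawlers send a user agent that also begins with "Mozilla", so the prefix test alone can let them through. The sketch below compares the prefix test with a filter on the full user-agent string. The keyword list (`bot`, `crawl`, `spider`, `slurp`) is an assumption rather than a complete catalogue, and the sample log is made up.

```shell
# Hypothetical sample data: two browsers and two crawlers, all with a "Mozilla" prefix.
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
192.0.2.1 - - [16/Aug/2015:14:01:02 +0800] "GET /index.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (X11; Linux x86_64)"
192.0.2.2 - - [16/Aug/2015:14:01:03 +0800] "GET /index.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
192.0.2.3 - - [16/Aug/2015:14:01:04 +0800] "GET /about.php HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
192.0.2.4 - - [16/Aug/2015:14:01:05 +0800] "GET /about.php HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
EOF

# Prefix test only: every line passes because each user agent starts with "Mozilla".
prefix_count=$(awk '{print $12, $1}' "$log_file" | grep '^"Mozilla' \
  | awk '{print $2}' | sort -u | wc -l | tr -d ' ')

# Splitting on double quotes makes field 6 the whole user-agent string.
# Lines whose user agent contains a crawler keyword are dropped.
human_count=$(awk -F'"' 'tolower($6) !~ /bot|crawl|spider|slurp/ {split($1, f, " "); print f[1]}' "$log_file" \
  | sort -u | wc -l | tr -d ' ')

echo "prefix test: $prefix_count, keyword filter: $human_count"
rm -f "$log_file"
```

Command-line clients such as curl do not match these keywords, so they would still be counted; extend the pattern if that matters for your traffic.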

7. Count IPs that accessed the site during a specific hour

awk '{print $4, $1}' log_file | grep '16/Aug/2015:14' | awk '{print $2}' | sort | uniq | wc -l
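The same idea can be extended to report every hour in one pass instead of grepping for a single one. This is a sketch on made-up sample data; it assumes the timestamp in field 4 has the usual `[dd/Mon/yyyy:HH:MM:SS` layout, so the date and hour occupy characters 2 through 15.

```shell
# Hypothetical sample data in Apache combined log format.
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
192.0.2.1 - - [16/Aug/2015:14:01:02 +0800] "GET /index.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
192.0.2.2 - - [16/Aug/2015:14:01:05 +0800] "GET /about.php HTTP/1.1" 200 2048 "-" "Mozilla/5.0"
192.0.2.1 - - [16/Aug/2015:14:30:00 +0800] "GET /index.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
192.0.2.3 - - [16/Aug/2015:15:11:30 +0800] "GET /index.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
EOF

# substr($4, 2, 14) skips the leading "[" and keeps "dd/Mon/yyyy:HH".
# seen[] ensures each (hour, IP) pair is counted only once.
per_hour=$(awk '{
  hour = substr($4, 2, 14)
  if (!seen[hour, $1]++) n[hour]++
} END {
  for (h in n) print h, n[h]
}' "$log_file" | sort)
echo "$per_hour"

rm -f "$log_file"
```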

8. Top 10 IP addresses by request count

awk '{print $1}' log_file | sort | uniq -c | sort -nr | head -10

9. Top 10 most requested files or pages

awk '{print $7}' log_file | sort | uniq -c | sort -nr | head -10
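When pages are requested with varying query strings, each variant is ranked separately. A possible refinement, sketched here on made-up sample data, takes the request path from field 7 of the combined format and drops everything after the first `?` before counting.

```shell
# Hypothetical sample data in Apache combined log format.
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
192.0.2.1 - - [16/Aug/2015:14:01:02 +0800] "GET /index.php?id=1 HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
192.0.2.2 - - [16/Aug/2015:14:01:03 +0800] "GET /index.php?id=2 HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
192.0.2.3 - - [16/Aug/2015:14:01:04 +0800] "GET /about.php HTTP/1.1" 200 2048 "-" "Mozilla/5.0"
192.0.2.1 - - [16/Aug/2015:14:01:05 +0800] "GET /index.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
EOF

# split() on "?" leaves the bare path in u[1]; the final awk only
# normalizes the padding that uniq -c puts in front of each count.
top_pages=$(awk '{split($7, u, "?"); print u[1]}' "$log_file" \
  | sort | uniq -c | sort -nr | head -10 | awk '{print $1, $2}')
echo "$top_pages"

rm -f "$log_file"
```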

10. Top 20 referrers by request count

awk '{print $11}' log_file | sort | uniq -c | sort -nr | head -20

11. Identify the largest transferred files

awk '($7~/\.php/){print $10, $1, $4, $7}' www.access.log | sort -nr | head -100

12. Files larger than 200 KB and their hit counts

awk '($10 > 200000 && $7~/\.php/){print $7}' www.access.log | sort | uniq -c | sort -nr | head -100

13. Slowest pages (by response time)

awk '($7~/\.php/){print $NF, $1, $4, $7}' www.access.log | sort -nr | head -100

14. Pages taking more than 60 seconds

awk '($NF > 60 && $7~/\.php/){print $7}' www.access.log | sort | uniq -c | sort -nr | head -100

15. Requests per second (watch)

watch "awk '{if(\$9~/200|30|404/)COUNT[\$4]++} END {for(a in COUNT) print a, COUNT[a]}' log_file | sort -k2 -nr | head -10"
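For a one-off check rather than a continuously refreshed view, the busiest second can be computed directly. The sketch below runs on made-up sample data and anchors the status-code pattern so that only 200, 30x, and 404 responses are counted, which is slightly stricter than an unanchored match.

```shell
# Hypothetical sample data: three requests in one second (one of them a 500), one in the next.
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
192.0.2.1 - - [16/Aug/2015:14:01:02 +0800] "GET /index.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
192.0.2.2 - - [16/Aug/2015:14:01:02 +0800] "GET /index.php HTTP/1.1" 304 0 "-" "Mozilla/5.0"
192.0.2.3 - - [16/Aug/2015:14:01:02 +0800] "GET /broken.php HTTP/1.1" 500 0 "-" "Mozilla/5.0"
192.0.2.1 - - [16/Aug/2015:14:01:03 +0800] "GET /about.php HTTP/1.1" 200 2048 "-" "Mozilla/5.0"
EOF

# Field 9 is the status code; substr($4, 2) is the timestamp without the leading "[".
busiest=$(awk '$9 ~ /^(200|30[0-9]|404)$/ {count[substr($4, 2)]++}
  END {for (t in count) print count[t], t}' "$log_file" | sort -nr | head -1)
echo "$busiest"

rm -f "$log_file"
```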

16. Bandwidth statistics

Request count:

awk '{if($6~/GET/) count++} END {print "client_request=" count}' apache.log

Data transferred (KB):

awk '{BYTE+=$10} END {print "client_kbyte_out=" BYTE/1024 "KB"}' apache.log
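Apache can log `-` instead of a number when a response has no body. The following sketch, run on made-up sample data, assumes the request method is in field 6 and the byte count in field 10 of the combined format, and skips non-numeric byte values explicitly.

```shell
# Hypothetical sample data: three GET requests and one POST with no body logged.
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
192.0.2.1 - - [16/Aug/2015:14:01:02 +0800] "GET /index.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
192.0.2.2 - - [16/Aug/2015:14:01:03 +0800] "GET /about.php HTTP/1.1" 200 2048 "-" "Mozilla/5.0"
192.0.2.3 - - [16/Aug/2015:14:01:04 +0800] "POST /login.php HTTP/1.1" 302 - "-" "Mozilla/5.0"
192.0.2.1 - - [16/Aug/2015:14:01:05 +0800] "GET /logo.png HTTP/1.1" 200 512 "-" "Mozilla/5.0"
EOF

# Field 6 carries the opening quote of the request line, hence "\"GET".
# count+0 prints 0 instead of an empty string when nothing matched.
requests=$(awk '$6 == "\"GET" {count++} END {print "client_request=" count+0}' "$log_file")

# Only purely numeric byte counts are summed.
kbytes=$(awk '$10 ~ /^[0-9]+$/ {bytes += $10}
  END {printf "client_kbyte_out=%.1fKB\n", bytes/1024}' "$log_file")

echo "$requests"
echo "$kbytes"
rm -f "$log_file"
```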

17. TCP connection state summary

netstat -n | awk '/^tcp/ {++S[$NF]} END {for (a in S) print a, S[a]}'

18. Count connections per state

netstat -n | awk '/^tcp/ {++state[$NF]} END {for (k in state) print k, "\t", state[k]}'
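To try the state count without a live system, the awk program can be wrapped in a function and fed captured output. This is a sketch only: the function name `count_tcp_states` and the sample connection table are made up for illustration.

```shell
# Reads netstat -n style output on stdin and prints "STATE COUNT", sorted by state name.
count_tcp_states() {
  awk '/^tcp/ {++state[$NF]} END {for (k in state) print k, state[k]}' | sort
}

# Hypothetical captured output; the udp line is ignored by the /^tcp/ pattern.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 192.0.2.10:80           198.51.100.7:51234      ESTABLISHED
tcp        0      0 192.0.2.10:80           198.51.100.8:40022      TIME_WAIT
tcp        0      0 192.0.2.10:80           198.51.100.7:51240      ESTABLISHED
tcp6       0      0 2001:db8::1:443         2001:db8::2:50000       ESTABLISHED
udp        0      0 192.0.2.10:68           198.51.100.1:67         ESTABLISHED
EOF

states=$(count_tcp_states < "$sample")
echo "$states"
rm -f "$sample"
```

On a live host the same function would be used as `netstat -n | count_tcp_states`.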

19. Top 20 IPs by number of connections (often used for attack source hunting)

netstat -ant | awk '/:80/ {split($5,ip,":"); ++A[ip[1]]} END {for (i in A) print A[i], i}' | sort -rn | head -20
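The pattern `/:80/` matches anywhere on the line, so it also catches ports such as 8080 and remote ports beginning with 80. A stricter variant, sketched below with a made-up function name `top_port80_peers` and made-up sample data, tests only the local address column and strips the port from the peer address.

```shell
# Reads netstat -ant style output on stdin; prints "COUNT PEER_IP" for local port 80 only.
top_port80_peers() {
  awk '/^tcp/ && $4 ~ /:80$/ {
    sub(/:[0-9]+$/, "", $5)   # drop the trailing :port from the foreign address
    n[$5]++
  } END {
    for (ip in n) print n[ip], ip
  }' | sort -rn | head -20
}

# Hypothetical captured output; the connection to local port 8080 must not be counted.
sample=$(mktemp)
cat > "$sample" <<'EOF'
tcp        0      0 192.0.2.10:80           198.51.100.7:51234      ESTABLISHED
tcp        0      0 192.0.2.10:80           198.51.100.7:51240      ESTABLISHED
tcp        0      0 192.0.2.10:8080         198.51.100.9:40000      ESTABLISHED
tcp        0      0 192.0.2.10:80           198.51.100.8:40022      TIME_WAIT
EOF

peers=$(top_port80_peers < "$sample")
echo "$peers"
rm -f "$sample"
```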

20. Capture the top 20 source IPs on port 80 with tcpdump

tcpdump -i eth0 -tnn dst port 80 -c 1000 | awk -F"." '{print $1"."$2"."$3"."$4}' | sort | uniq -c | sort -nr | head -20

21. Identify many TIME_WAIT connections

netstat -n | grep TIME_WAIT | awk '{print $5}' | sort | uniq -c | sort -rn | head -20

22. Identify many SYN connections

netstat -an | grep SYN | awk '{print $5}' | awk -F: '{print $1}' | sort | uniq -c | sort -nr | head

23. Find the PID of the process listening on port 80

netstat -ntlp | grep ':80 ' | awk '{print $7}' | cut -d/ -f1

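The log commands assume the standard Apache combined log format. Items 13 and 14 additionally assume that the response time is logged as the last field, which the combined format does not include by default. Adjust the field numbers to fit custom log layouts.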

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
