
Master Server Log Analysis with Essential Linux Commands

This guide compiles a comprehensive set of Linux commands—using awk, grep, netstat, and more—to help you analyze web server logs, track traffic, identify top IPs, monitor connection states, and detect performance bottlenecks on an Alibaba Cloud ECS instance.

Efficient Ops

I run a personal website on an Alibaba Cloud ECS server and occasionally analyze its access logs to monitor traffic and spot potential attacks. Below is a curated list of useful Linux commands for log analysis.
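These recipes were collected from several servers, so field numbers occasionally differ between items; most assume the common Apache/Nginx combined log format. Here is a quick reference showing which awk field maps to which part of an entry, using a fabricated sample line (not from any real log):

```shell
# A made-up combined-format access log line, used only to show
# which awk field number corresponds to which part of the entry.
line='203.0.113.7 - - [16/Aug/2015:14:02:11 +0800] "GET /index.php HTTP/1.1" 200 5120 "http://www.example.com/" "Mozilla/5.0"'
printf '%s\n' "$line" | awk '{
  print "client IP     ($1): " $1
  print "timestamp     ($4): " $4
  print "request path  ($7): " $7
  print "status code   ($9): " $9
  print "bytes sent   ($10): " $10
}'
```

If your LogFormat adds or removes fields, adjust the column numbers in the recipes below accordingly.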

1. Count unique visitor IPs:

<code>awk '{print $1}' log_file | sort | uniq | wc -l</code>

2. Count visits to a specific page (e.g., /index.php):

<code>grep "/index.php" log_file | wc -l</code>

3. Show how many pages each IP accessed:

<code>awk '{++S[$1]} END {for (a in S) print a,S[a]}' log_file > log.txt
sort -n -t ' ' -k 2 log.txt</code>

4. Sort IPs by the number of pages they accessed (ascending):

<code>awk '{++S[$1]} END {for (a in S) print S[a],a}' log_file | sort -n</code>

5. List pages visited by a specific IP (replace with the target IP):

<code>grep '^111\.111\.111\.111' log_file | awk '{print $1,$7}'</code>
Note: escape the dots, otherwise each unescaped . matches any character and the pattern can match unrelated IPs.

6. Exclude search‑engine crawlers from the unique‑IP count (keeps only user agents beginning with "Mozilla"; note the pattern must include the leading quote of the User-Agent field):

<code>awk '{print $12,$1}' log_file | grep '^"Mozilla' | awk '{print $2}' | sort | uniq | wc -l</code>

7. Count unique IPs within a specific hour (e.g., 14:00 on 16 Aug 2015):

<code>awk '{print $4,$1}' log_file | grep 16/Aug/2015:14 | awk '{print $2}' | sort | uniq | wc -l</code>

8. Show the top 10 IP addresses by request count:

<code>awk '{print $1}' log_file | sort | uniq -c | sort -nr | head -10</code>
Note: uniq -c prefixes each line with its occurrence count.
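The sort | uniq -c | sort -nr idiom recurs throughout this list; here it is on a few made-up IPs:

```shell
# Count occurrences of each value. uniq -c only merges *adjacent*
# duplicates, which is why the input must be sorted first.
printf '%s\n' 10.0.0.1 10.0.0.2 10.0.0.1 10.0.0.1 10.0.0.2 10.0.0.3 \
  | sort | uniq -c | sort -nr
# prints (count, value), most frequent first:
#   3 10.0.0.1
#   2 10.0.0.2
#   1 10.0.0.3
```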

9. List the 10 most requested files or pages:

<code>cat log_file | awk '{print $11}' | sort | uniq -c | sort -nr | head -10</code>

10. Count requests per sub‑domain (using the Referer header):

<code>cat access.log | awk '{print $11}' | sed -e 's/http:\/\///' -e 's/\/.*//' | sort | uniq -c | sort -rn | head -20</code>
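The sed stage strips the scheme and path, leaving only the host part of the referer URL; for example (hypothetical URL):

```shell
# 's/http:\/\///' removes the leading "http://",
# 's/\/.*//' removes everything from the first remaining "/" onward.
printf 'http://blog.example.com/post/1\n' | sed -e 's/http:\/\///' -e 's/\/.*//'
# → blog.example.com
```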

11. Find the largest transferred files:

<code>cat www.access.log | awk '($7~/\.php/){print $10 " " $1 " " $4 " " $7}' | sort -nr | head -100</code>

12. List pages larger than 200 KB and their request counts:

<code>cat www.access.log | awk '($10 > 200000 && $7~/\.php/){print $7}' | sort -n | uniq -c | sort -nr | head -100</code>

13. Identify the slowest pages (by transfer time) if the last column records duration:

<code>cat www.access.log | awk '($7~/\.php/){print $NF " " $1 " " $4 " " $7}' | sort -nr | head -100</code>

14. Show pages taking more than 60 seconds and their frequencies:

<code>cat www.access.log | awk '($NF > 60 && $7~/\.php/){print $7}' | sort -n | uniq -c | sort -nr | head -100</code>

15. Show pages taking more than 30 seconds:

<code>cat www.access.log | awk '($NF > 30){print $7}' | sort -n | uniq -c | sort -nr | head -20</code>

16. List the number of processes per command (sorted descending):

<code>ps -ef | awk -F ' ' '{print $8 " " $9}' | sort | uniq -c | sort -nr | head -20</code>

17. Check current Apache concurrent connections:

<code>netstat -an | grep ESTABLISHED | wc -l</code>

18. Count Apache processes (each request may spawn a process):

<code>ps -ef | grep httpd | wc -l</code>

19. Show total connections per IP and overall connection states:

<code>netstat -n | awk '/^tcp/ {n=split($(NF-1),array,":");if(n<=2)++S[array[1]];else++S[array[4]];++s[$NF];++N} END {for(a in S){printf("%-20s %s\n", a, S[a]);++I}printf("%-20s %s\n","TOTAL_IP",I);for(a in s) printf("%-20s %s\n",a, s[a]);printf("%-20s %s\n","TOTAL_LINK",N);}'</code>
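The one-liner is dense, so here it is fed three fabricated netstat lines: it prints per-peer-IP totals, a TOTAL_IP count, per-state totals, and TOTAL_LINK (the for-in iteration order is unspecified, so line order may vary):

```shell
# Same awk program as above, applied to canned netstat -n output
# (addresses are made up).
printf '%s\n' \
  'tcp 0 0 10.0.0.5:80 203.0.113.7:51324 ESTABLISHED' \
  'tcp 0 0 10.0.0.5:80 203.0.113.7:51330 ESTABLISHED' \
  'tcp 0 0 10.0.0.5:80 198.51.100.2:40000 TIME_WAIT' \
| awk '/^tcp/ {n=split($(NF-1),array,":");if(n<=2)++S[array[1]];else++S[array[4]];++s[$NF];++N} END {for(a in S){printf("%-20s %s\n",a,S[a]);++I}printf("%-20s %s\n","TOTAL_IP",I);for(a in s)printf("%-20s %s\n",a,s[a]);printf("%-20s %s\n","TOTAL_LINK",N)}'
# expected totals: 203.0.113.7 → 2, 198.51.100.2 → 1,
# TOTAL_IP 2, ESTABLISHED 2, TIME_WAIT 1, TOTAL_LINK 3
```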

20. Find the top 20 URLs on a specific date (e.g., 04 May 2012):

<code>cat access.log | grep '04/May/2012' | awk '{print $11}' | sort | uniq -c | sort -nr | head -20</code>

21. List IPs that accessed a particular domain (e.g., www.abc.com):

<code>cat access_log | awk '($11~/www\.abc\.com/){print $1}' | sort | uniq -c | sort -nr</code>

22. Show the top 10 IPs for given days (edit the egrep pattern to narrow the date or time range):

<code>cat log_file | egrep '15/Aug/2015|16/Aug/2015' | awk '{print $1}' | sort | uniq -c | sort -nr | head -10</code>

23. Calculate total traffic in gigabytes:

<code>cat access.log | awk '{sum+=$10} END {print sum/1024/1024/1024}'</code>
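awk arithmetic is floating point, so the chained division yields fractional gigabytes directly. A sanity check on two fabricated entries totaling 1.5 GB:

```shell
# Two made-up log lines whose byte columns ($10) sum to
# 1073741824 + 536870912 = 1610612736 bytes = 1.5 GB.
printf '%s\n' \
  '1.2.3.4 - - [16/Aug/2015:14:00:00 +0800] "GET /a HTTP/1.1" 200 1073741824' \
  '1.2.3.4 - - [16/Aug/2015:14:00:01 +0800] "GET /b HTTP/1.1" 200 536870912' \
| awk '{sum+=$10} END {print sum/1024/1024/1024}'
# → 1.5
```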

24. List all 404 responses:

<code>awk '($9 ~/404/)' access.log | awk '{print $9,$7}' | sort</code>

25. Summarize HTTP status codes:

<code>cat access.log | awk '{counts[$9]++} END {for(code in counts) print code, counts[code]}'
cat access.log | awk '{print $9}' | sort | uniq -c | sort -rn</code>
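Both variants tally the status-code column; a quick check on three fabricated entries:

```shell
# Tally made-up status codes ($9) with an awk associative array.
printf '%s\n' \
  '1.2.3.4 - - [16/Aug/2015:14:00:00 +0800] "GET /a HTTP/1.1" 200 512' \
  '1.2.3.4 - - [16/Aug/2015:14:00:01 +0800] "GET /b HTTP/1.1" 404 0' \
  '5.6.7.8 - - [16/Aug/2015:14:00:02 +0800] "GET /a HTTP/1.1" 200 512' \
| awk '{counts[$9]++} END {for(code in counts) print code, counts[code]}'
# prints (in unspecified order): 200 2 and 404 1
```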

26. Show per‑second request rates for specific status codes:

<code>watch "awk '{if(\$9~/200|30|404/)COUNT[\$4]++} END {for(a in COUNT) print a,COUNT[a]}' log_file | sort -k 2 -nr | head -n10"</code>
Note: inside the double quotes passed to watch, $ must be escaped (\$9, \$4) so the shell does not expand awk's field variables before awk sees them.

27. Estimate client request count (simple bandwidth metric):

<code>cat apache.log | awk '{if($6~/GET/) count++} END {print "client_request="count}'</code>
Note: in the combined log format the request method is in field $6 (as "GET); adjust if your format differs.

28. Identify the 10 IPs with the most accesses on a particular day:

<code>cat /tmp/access.log | grep "20/Mar/2011" | awk '{print $3}' | sort | uniq -c | sort -nr | head</code>

29. Find what the most active IP was doing on a given day:

<code>cat access.log | grep "10.0.21.17" | awk '{print $8}' | sort | uniq -c | sort -nr | head -n10</code>

30. Determine the 10 busiest hour‑long intervals (by IP connections):

<code>awk -vFS=":" '{gsub("-.*","",$1); num[$2" "$1]++} END {for(i in num) print i,num[i]}' log_file | sort -n -k 3 -r | head -10</code>

31. Find the minutes with the highest request volume:

<code>awk '{print $4}' access.log | grep "20/Mar/2011" | cut -c 14-18 | sort | uniq -c | sort -nr | head</code>
Note: $4 is the timestamp field ([20/Mar/2011:12:34:56), so cut -c 14-18 extracts the HH:MM portion.

32. Monitor TCP connection states in real time:

<code>watch "netstat -n | awk '/^tcp/ {++S[\$NF]} END {for(a in S) print a, S[a]}'"</code>
Common TCP states: ESTABLISHED (active connection transferring data), SYN_RECV (SYN received, handshake in progress), FIN_WAIT1 (this side sent a FIN and awaits its ACK), FIN_WAIT2 (FIN acknowledged, waiting for the peer's FIN), LAST_ACK (passive closer waiting for the final ACK of its own FIN), TIME_WAIT (closed, waiting out the 2MSL timeout).
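On many modern distributions netstat is deprecated in favor of ss from iproute2, which reports the same connection table (but abbreviates state names, e.g. ESTAB, TIME-WAIT). A sketch of the equivalent state summary, demonstrated on canned output since live results vary per host:

```shell
# Live usage (ss prints the TCP state in column 1; NR>1 skips the header):
#   ss -tan | awk 'NR>1 {++S[$1]} END {for(a in S) print a, S[a]}'
# Demo on fabricated ss -tan output:
printf '%s\n' \
  'State Recv-Q Send-Q Local-Address:Port Peer-Address:Port' \
  'ESTAB 0 0 10.0.0.5:80 203.0.113.7:51324' \
  'TIME-WAIT 0 0 10.0.0.5:80 198.51.100.2:40000' \
  'ESTAB 0 0 10.0.0.5:80 192.0.2.9:33000' \
| awk 'NR>1 {++S[$1]} END {for(a in S) print a, S[a]}'
# prints (in unspecified order): ESTAB 2 and TIME-WAIT 1
```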

Source: segmentfault.com

Written by Efficient Ops

This public account is maintained by Xiaotianguo and friends, regularly publishing widely read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.