Master Server Log Analysis: 30 Essential Linux Commands to Uncover Traffic Insights
This guide presents a comprehensive collection of Linux command‑line techniques—including awk, grep, and netstat—to help you analyze web server logs, identify unique visitors, track page popularity, monitor connection states, and detect performance bottlenecks in a systematic way.
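The author runs a personal website on an Alibaba Cloud ECS instance and regularly inspects its server logs to monitor traffic and detect potential attacks. The snippets below are a curated set of command‑line tools for that kind of log analysis. The awk field numbers assume the Apache/Nginx combined log format ($1 client IP, $4 timestamp, $7 request path, $9 status code, $10 response size in bytes, $11 referrer, $12 start of the User-Agent); adjust them if your format differs.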
Basic Log Queries
<code>awk '{print $1}' log_file | sort | uniq | wc -l</code>Counts the number of distinct IP addresses that accessed the server.
<code>grep "/index.php" log_file | wc -l</code>Shows how many times a specific page was requested.
<code>awk '{++S[$1]} END {for (a in S) print a, S[a]}' log_file > log.txt</code>Lists each IP address together with the number of pages it requested.
<code>awk '{++S[$1]} END {for (a in S) print S[a], a}' log_file | sort -n</code>Sorts IPs by the number of pages accessed, from fewest to most.
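<code>grep '^111\.111\.111\.111 ' log_file | awk '{print $1,$7}'</code>Shows all pages requested by a particular IP. The dots are escaped and a trailing space is matched so that other addresses sharing the same prefix are not included.
<code>awk '{print $12,$1}' log_file | grep '^"Mozilla' | awk '{print $2}' | sort | uniq | wc -l</code>Counts unique IPs whose User-Agent starts with "Mozilla", a rough way to exclude search engine crawlers. In the combined log format $12 begins with the opening double quote of the User-Agent, so the pattern has to include it.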
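Many crawlers also send a User-Agent that starts with "Mozilla" (Googlebot's begins with "Mozilla/5.0 (compatible; Googlebot/2.1; ..."), so a keyword filter can be a closer approximation. The following is a minimal sketch, assuming the combined log format; `human_ips` is a hypothetical helper name and the keyword list is an illustrative assumption, not a complete crawler list.

```shell
# human_ips FILE: count distinct client IPs whose User-Agent does not
# contain a common crawler keyword (bot, spider, crawl, slurp).
# Assumes combined log format, where splitting a line on double quotes
# makes the User-Agent the 6th field.
human_ips() {
    awk -F'"' '
        {
            ua = tolower($6)            # User-Agent, lower-cased for matching
            split($1, head, " ")        # text before the request line; head[1] = client IP
            if (ua !~ /bot|spider|crawl|slurp/) seen[head[1]] = 1
        }
        END { n = 0; for (ip in seen) n++; print n }
    ' "$1"
}
```

For example, `human_ips log_file` prints a single number that can be compared with the plain distinct-IP count.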
<code>awk '{print $4,$1}' log_file | grep 16/Aug/2015:14 | awk '{print $2}' | sort | uniq | wc -l</code>Counts unique IPs that accessed the site during a specific hour.
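The same idea extends from one hour to every hour of a day. The following is a minimal sketch, assuming field 1 is the client IP and field 4 looks like `[16/Aug/2015:14:05:32`; `unique_ips_by_hour` is a hypothetical helper name.

```shell
# unique_ips_by_hour FILE DAY: print "HH unique_ip_count" for each hour of DAY.
# DAY uses the log's own date form, for example 16/Aug/2015.
unique_ips_by_hour() {
    awk -v day="$2" '
        index($4, "[" day ":") == 1 {
            hour = substr($4, length(day) + 3, 2)   # two digits after "[DAY:"
            if (!((hour, $1) in seen)) {            # first time this IP appears in this hour
                seen[hour, $1] = 1
                uniq[hour]++
            }
        }
        END { for (h in uniq) print h, uniq[h] }
    ' "$1" | sort
}
```

Calling `unique_ips_by_hour log_file 16/Aug/2015` prints one line per hour that has traffic, which makes daily peaks easy to spot.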
<code>awk '{print $1}' access_log | sort | uniq -c | sort -nr | head -10</code>Displays the top ten IP addresses by request count.
<code>cat access.log | awk '{print $1}' | sort | uniq -c | sort -nr | head -10</code>Another form of the top‑IP query.
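The top-IP pipeline can be wrapped in a small shell function so that the log path and the number of rows become parameters. This is a minimal sketch, assuming the client IP is the first whitespace-separated field; `top_ips` is a hypothetical helper name.

```shell
# top_ips FILE [N]: print the N busiest client IPs as "count ip".
# N defaults to 10. Ties are broken by IP so the output is stable.
top_ips() {
    file=$1
    n=${2:-10}
    awk '{ hits[$1]++ } END { for (ip in hits) print hits[ip], ip }' "$file" |
        sort -k1,1nr -k2,2 |
        head -n "$n"
}
```

For example, `top_ips access.log 20` would list the twenty busiest addresses.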
<code>cat log_file | awk '{print $7}' | sort | uniq -c | sort -nr | head -10</code>Shows the ten most requested files or pages. In the combined log format the request path is $7; $11 is the referrer.
<code>cat access.log | awk '{print $7}' | sort | uniq -c | sort -nr | head -20</code>Lists the top twenty accessed resources.
<code>cat www.access.log | awk '($7~/\.php/){print $10 " " $1 " " $4 " " $7}' | sort -nr | head -100</code>Finds the largest transferred PHP files.
<code>cat www.access.log | awk '($10 > 200000 && $7~/\.php/){print $7}' | sort -n | uniq -c | sort -nr | head -100</code>Shows PHP pages larger than ~200 KB and their request frequencies.
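<code>cat www.access.log | awk '($7~/\.php/){print $NF " " $1 " " $4 " " $7}' | sort -nr | head -100</code>Lists the PHP requests with the longest transfer times. This and the next two commands assume that the last field ($NF) of each log line is the request time in seconds, which requires a custom log format.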
<code>cat www.access.log | awk '($NF > 60 && $7~/\.php/){print $7}' | sort -n | uniq -c | sort -nr | head -100</code>Identifies PHP pages that took more than 60 seconds to serve.
<code>cat www.access.log | awk '($NF > 30){print $7}' | sort -n | uniq -c | sort -nr | head -20</code>Shows pages with transfer times over 30 seconds.
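Thresholds such as 30 or 60 seconds surface outliers; an average per URL can show which pages are slow consistently. The following is a minimal sketch, assuming field 7 is the request path and the last field is the request time in seconds (a custom log format, not the default combined one); `avg_time_by_url` is a hypothetical helper name.

```shell
# avg_time_by_url FILE: print "avg_seconds hits url", slowest average first.
avg_time_by_url() {
    awk '
        $NF ~ /^[0-9.]+$/ {          # skip lines whose last field is not numeric
            total[$7] += $NF
            hits[$7]++
        }
        END {
            for (u in hits) printf "%.3f %d %s\n", total[u] / hits[u], hits[u], u
        }
    ' "$1" | sort -k1,1nr -k3,3
}
```

Piping the result through `head -20` keeps only the twenty slowest pages.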
Process and Connection Monitoring
<code>ps -ef | awk -F ' ' '{print $8 " " $9}' | sort | uniq -c | sort -nr | head -20</code>Counts running processes by name.
<code>netstat -an | grep ESTABLISHED | wc -l</code>Counts current established connections (useful for Apache concurrency).
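<code>ps -ef | grep '[h]ttpd' | wc -l</code>Shows the number of Apache (httpd) processes. The bracketed pattern keeps the grep process itself out of the count.
<code>netstat -nat | awk '$4 ~ /:80$/' | wc -l</code>Totals all TCP sockets whose local port is 80. Matching on the local address column avoids counting unrelated lines that merely contain "80".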
<code>netstat -na | grep ESTABLISHED | wc -l</code>Counts established TCP connections.
<code>netstat -n | awk '/^tcp/ {n=split($(NF-1),array,":");if(n<=2)++S[array[1]];else++S[array[4]];++s[$NF];++N} END {for(a in S){printf("%-20s %s\n", a, S[a]);++I} printf("%-20s %s\n","TOTAL_IP",I); for(a in s) printf("%-20s %s\n",a, s[a]); printf("%-20s %s\n","TOTAL_LINK",N);}'</code>Outputs per‑IP connection counts, the number of distinct remote IPs, per-state totals, and the overall number of TCP connections.
<code>cat access.log | grep '04/May/2012' | awk '{print $7}' | sort | uniq -c | sort -nr | head -20</code>Finds the top 20 URLs on a specific date.
<code>cat access_log | awk '($11~/www\.abc\.com/){print $1}' | sort | uniq -c | sort -nr</code>Lists IPs whose requests carry a referrer ($11) containing "www.abc.com", with their request counts.
<code>cat access.log | grep "20/Mar/2011" | awk '{print $1}' | sort | uniq -c | sort -nr | head</code>Shows the IPs with the most visits on a given day.
<code>awk '{print $4}' access.log | grep "20/Mar/2011" | cut -c 14-18 | sort | uniq -c | sort -nr | head</code>Identifies the ten busiest minutes of a given day. Characters 14 to 18 of the timestamp field are the HH:MM part.
<code>netstat -nat | awk '{print $6}' | sort | uniq -c | sort -rn</code>Summarizes TCP connection states (e.g., ESTABLISHED, TIME_WAIT).
<code>netstat -n | awk '/^tcp/ {++state[$NF]}; END {for(key in state) print key, "\t", state[key]}'</code>Another view of TCP state distribution.
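The state tally can also be written as a filter that reads netstat output from standard input, which makes it easy to reuse on saved output. This is a minimal sketch, assuming the state is the last column as in `netstat -n` on Linux (with `-p` the last column is the PID/program name instead); `tcp_state_counts` is a hypothetical helper name.

```shell
# tcp_state_counts: read `netstat -n`-style output on stdin and print
# "count state", most common state first. Only lines starting with "tcp"
# are counted, so header lines are ignored.
tcp_state_counts() {
    awk '/^tcp/ { state[$NF]++ } END { for (s in state) print state[s], s }' |
        sort -k1,1nr -k2,2
}
```

Typical use would be `netstat -n | tcp_state_counts`.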
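<code>netstat -ant | awk '/^tcp/ && $4 ~ /:80$/ {print $5}' | awk -F: '{print $1}' | sort | uniq -c | sort -nr | head -n20</code>Finds the top 20 source IPs connecting to local port 80. Filtering on the local address column avoids matching unrelated lines that merely contain "80".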
<code>tcpdump -i eth0 -tnn dst port 80 -c 1000 | awk -F"." '{print $1"."$2"."$3"."$4}' | sort | uniq -c | sort -nr | head -20</code>Captures 1000 packets destined for port 80 on eth0 and lists the most frequent source IPs. Packet capture usually requires root privileges.
Additional Metrics
<code>cat access.log | awk '{sum+=$10} END {print sum/1024/1024/1024}'</code>Calculates total traffic volume in gigabytes.
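The total can be broken down per client to see who is consuming the bandwidth. The following is a minimal sketch, assuming field 1 is the client IP and field 10 is the response size in bytes; `traffic_by_ip` is a hypothetical helper name.

```shell
# traffic_by_ip FILE [N]: print "megabytes ip" for the N clients that
# received the most data (N defaults to 10). A "-" size, which some
# servers log for empty responses, is counted as 0.
traffic_by_ip() {
    awk '
        { bytes[$1] += ($10 == "-" ? 0 : $10) }
        END { for (ip in bytes) printf "%.2f %s\n", bytes[ip] / 1024 / 1024, ip }
    ' "$1" | sort -k1,1nr -k2,2 | head -n "${2:-10}"
}
```

For example, `traffic_by_ip access.log 5` would list the five heaviest consumers.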
<code>awk '($9 ~/404/)' access.log | awk '{print $9,$7}' | sort</code>Lists all 404 error requests.
<code>cat access.log | awk '{counts[$9]++} END {for(code in counts) print code, counts[code]}'</code>Shows the distribution of HTTP status codes.
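Raw counts are easier to compare across days when each status code is also shown as a share of all requests. This is a minimal sketch, assuming field 9 is the HTTP status code; `status_share` is a hypothetical helper name.

```shell
# status_share FILE: print "status count percent", sorted by status code.
# Lines whose 9th field is not a three-digit number are ignored.
status_share() {
    awk '
        $9 ~ /^[0-9][0-9][0-9]$/ { counts[$9]++; total++ }
        END {
            for (code in counts)
                printf "%s %d %.1f%%\n", code, counts[code], 100 * counts[code] / total
        }
    ' "$1" | sort
}
```

A sudden rise in the share of 404 or 5xx responses is usually worth a closer look with the commands above.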
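<code>watch "awk '{if(\$9~/200|30|404/)COUNT[\$4]++}END{for(a in COUNT) print a,COUNT[a]}' log_file | sort -k 2 -nr | head -n10"</code>Refreshes every two seconds with the ten busiest timestamps for requests whose status matches 200, 30x, or 404. The dollar signs are escaped so that the shell does not expand $9 and $4 inside the double quotes. Because $4 includes seconds, the counts are per second.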
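Field 4 carries a timestamp down to the second, so keying on it directly groups requests per second. To group per minute, the key can be truncated to its first 17 characters after the opening bracket. This is a minimal sketch, assuming field 4 looks like `[16/Aug/2015:14:05:32`; `requests_per_minute` is a hypothetical helper name.

```shell
# requests_per_minute FILE [N]: print "count DD/Mon/YYYY:HH:MM" for the N
# busiest minutes (N defaults to 10).
requests_per_minute() {
    awk '
        { minute[substr($4, 2, 17)]++ }     # drop "[" and the ":SS" suffix
        END { for (m in minute) print minute[m], m }
    ' "$1" | sort -k1,1nr -k2,2 | head -n "${2:-10}"
}
```

Because the function is defined in the current shell, `watch` cannot call it directly; one option is to save it in a script file and pass that script to `watch`.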
Source: SegmentFault article