Master Server Log Analysis: 30 Essential Linux Commands to Uncover Traffic Insights
This guide presents a collection of Linux command-line techniques, built on awk, grep, and netstat, for analyzing web server logs: identifying unique visitors, tracking page popularity, monitoring connection states, and detecting performance bottlenecks.
The author runs a personal website on an Alibaba Cloud ECS instance and regularly inspects the server logs to monitor traffic and detect potential attacks. The snippets below are a curated set of one-liners for that kind of analysis. The field numbers they use ($1 client IP, $4 timestamp, $7 request path, $9 status code, $10 bytes sent) match the common Apache/Nginx combined log format, so adjust them if your log format differs.
Basic Log Queries
awk '{print $1}' log_file | sort | uniq | wc -l
Counts the number of distinct IP addresses that accessed the server.
grep "/index.php" log_file | wc -l
Shows how many times a specific page was requested.
awk '{++S[$1]} END {for (a in S) print a, S[a]}' log_file > log.txt
Lists each IP address together with the number of pages it requested.
awk '{++S[$1]} END {for (a in S) print S[a], a}' log_file | sort -n
Sorts IPs by the number of pages accessed, from fewest to most.
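grep '^111\.111\.111\.111 ' log_file | awk '{print $1,$7}'
Shows all pages requested by a particular IP. The dots are escaped and a trailing space is included so that the pattern matches only that exact address.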
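awk '{print $12,$1}' log_file | grep '^"Mozilla' | awk '{print $2}' | sort | uniq | wc -l
Counts unique IPs whose user agent begins with "Mozilla", a rough way to exclude search engine crawlers and scripts. The user agent field starts with a literal double quote in the log, so the pattern has to include it. Many crawlers also send a Mozilla-prefixed user agent, so treat the result as an approximation.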
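A stricter variation on the crawler filter above is to look at the whole user agent string rather than only its first word. The following is a minimal sketch, assuming the combined log format (where the user agent is the sixth field when the line is split on double quotes). The function name unique_human_ips and the keyword list are illustrative assumptions, not part of the original article.

```shell
# Count distinct client IPs whose user agent does not look like a crawler.
# Assumes the combined log format; the keyword list is a hypothetical example.
# Usage: unique_human_ips log_file
unique_human_ips() {
    awk -F'"' 'tolower($6) !~ /bot|crawl|spider|slurp/ {
                   split($1, parts, " ")         # parts[1] is the client IP
                   if (!seen[parts[1]]++) n++    # count each IP only once
               }
               END { print n + 0 }' "$1"
}
```

Splitting on the double quote keeps multi-word user agents in one piece, which field-by-whitespace splitting cannot do.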
awk '{print $4,$1}' log_file | grep 16/Aug/2015:14 | awk '{print $2}' | sort | uniq | wc -l
Counts unique IPs that accessed the site during a specific hour.
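The single-hour query generalizes to a breakdown for every hour of a day. This sketch assumes the timestamp in $4 has the form [16/Aug/2015:14:05:33; the function name unique_ips_per_hour is a hypothetical helper.

```shell
# Print "HH unique_ip_count" for each hour of the given day.
# Usage: unique_ips_per_hour 16/Aug/2015 log_file
unique_ips_per_hour() {
    awk -v day="$1" '
        # keep only lines whose timestamp starts with "[<day>:"
        index($4, "[" day ":") == 1 {
            hour = substr($4, length(day) + 3, 2)    # the HH part
            if (!seen[hour, $1]++) count[hour]++     # count each IP once per hour
        }
        END { for (h in count) print h, count[h] }
    ' "$2" | sort
}
```

Running it once per day avoids re-scanning the log 24 times with a different grep pattern.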
awk '{print $1}' access_log | sort | uniq -c | sort -nr | head -10
Displays the top ten IP addresses by request count. The log file must be passed to awk, the first command in the pipeline, not to head.
cat access.log | awk '{print $1}' | sort | uniq -c | sort -nr | head -10
Another form of the top-IP query.
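cat log_file | awk '{print $7}' | sort | uniq -c | sort -nr | head -10
Shows the ten most requested files or pages. In the combined log format the request path is $7; $11 holds the referer, so printing $11 would rank referring pages instead.
cat access.log | awk '{print $7}' | sort | uniq -c | sort -nr | head -20
Lists the top twenty accessed resources.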
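When pages take query strings, the same page can appear under many different URLs in a top-N list. The sketch below strips everything from the first "?" before counting. It assumes the request path is in $7 (combined log format), and top_paths is a hypothetical helper name.

```shell
# Rank request paths by hit count, ignoring query strings.
# Usage: top_paths log_file [count]   (count defaults to 10)
top_paths() {
    awk '{ sub(/\?.*/, "", $7); print $7 }' "$1" |
        sort | uniq -c | sort -nr | head -n "${2:-10}"
}
```

For example, requests for /index.php?id=1 and /index.php?id=2 are both counted under /index.php.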
cat www.access.log | awk '($7~/\.php/){print $10 " " $1 " " $4 " " $7}' | sort -nr | head -100
Lists the 100 PHP requests with the largest response sizes ($10 is the number of bytes sent).
cat www.access.log | awk '($10 > 200000 && $7~/\.php/){print $7}' | sort -n | uniq -c | sort -nr | head -100
Shows PHP pages whose responses exceed 200000 bytes (about 200 KB), with their request counts.
cat www.access.log | awk '($7~/\.php/){print $NF " " $1 " " $4 " " $7}' | sort -nr | head -100
Lists the PHP requests with the longest transfer times. This and the next two commands assume the log format records the request time in seconds as the last field ($NF), which the default combined format does not do.
cat www.access.log | awk '($NF > 60 && $7~/\.php/){print $7}' | sort -n | uniq -c | sort -nr | head -100
Identifies PHP pages that took more than 60 seconds to serve, with their request counts.
cat www.access.log | awk '($NF > 30){print $7}' | sort -n | uniq -c | sort -nr | head -20
Shows the 20 pages that most often took more than 30 seconds to transfer.
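Fixed thresholds such as 30 or 60 seconds can miss pages that are consistently slow without ever crossing the limit. A per-page average and maximum gives a fuller picture. This is a sketch under the same assumption as the commands above, namely that the request time in seconds is the last field of each line; slowest_paths is a hypothetical helper name.

```shell
# Print "average max hits path" per request path, slowest average first.
# Assumes the last field ($NF) is the request time in seconds.
# Usage: slowest_paths log_file [count]   (count defaults to 10)
slowest_paths() {
    awk '{
             hits[$7]++
             total[$7] += $NF
             if ($NF + 0 > max[$7] + 0) max[$7] = $NF
         }
         END {
             for (p in hits)
                 printf "%.3f %.3f %d %s\n", total[p] / hits[p], max[p], hits[p], p
         }' "$1" |
        sort -nr | head -n "${2:-10}"
}
```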
Process and Connection Monitoring
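ps -ef | awk '{print $8 " " $9}' | sort | uniq -c | sort -nr | head -20
Counts running processes by command (the first two words of each command line).
netstat -an | grep ESTABLISHED | wc -l
Counts currently established connections, a useful indicator of Apache concurrency.
ps -ef | grep '[h]ttpd' | wc -l
Shows the number of Apache (httpd) processes. The [h] bracket keeps the grep process itself out of the count.
netstat -nat | grep ':80 ' | wc -l
Totals all connections involving port 80. Matching ':80 ' rather than a bare "80" avoids counting unrelated lines that merely contain those digits.
netstat -na | grep ESTABLISHED | wc -l
Counts established TCP connections (equivalent to the second command in this group).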
netstat -n | awk '/^tcp/ {n=split($(NF-1),array,":"); if(n<=2) ++S[array[1]]; else ++S[array[4]]; ++s[$NF]; ++N} END {for(a in S){printf("%-20s %s\n", a, S[a]); ++I} printf("%-20s %s\n","TOTAL_IP",I); for(a in s) printf("%-20s %s\n",a, s[a]); printf("%-20s %s\n","TOTAL_LINK",N);}'
Outputs per-IP connection counts, the number of distinct remote IPs (TOTAL_IP), the count for each TCP state, and the total number of TCP connections (TOTAL_LINK).
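cat access.log | grep '04/May/2012' | awk '{print $7}' | sort | uniq -c | sort -nr | head -20
Finds the top 20 URLs on a specific date ($7 is the request path in the combined log format).
cat access_log | awk '($11~/www\.abc\.com/){print $1}' | sort | uniq -c | sort -nr
Lists the IPs, with request counts, whose log entries contain "www.abc.com" in field $11. In the combined log format that field is the referer, so adjust the field number if your format logs the requested host or URL elsewhere.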
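cat access.log | grep "20/Mar/2011" | awk '{print $1}' | sort | uniq -c | sort -nr | head
Shows the IPs with the most visits on a given day. The client IP is $1; $3 is the authenticated user name, which is usually just "-".
awk '{print $4}' access.log | grep "20/Mar/2011" | cut -c 14-18 | sort | uniq -c | sort -nr | head
Identifies the ten busiest minutes of a given day. The timestamp is $4, and characters 14 to 18 of it are the HH:MM part.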
netstat -nat | awk '{print $6}' | sort | uniq -c | sort -rn
Summarizes TCP connection states (e.g., ESTABLISHED, TIME_WAIT).
netstat -n | awk '/^tcp/ {++state[$NF]}; END {for(key in state) print key, "\t", state[key]}'
Another view of TCP state distribution.
netstat -anlp | grep 80 | grep tcp | awk '{print $5}' | awk -F: '{print $1}' | sort | uniq -c | sort -nr | head -n20
Finds the top 20 source IPs connecting to port 80.
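Because grep 80 matches any line containing those digits (port 8080, a PID, an address octet), the top-IP query can be tightened by comparing the local port exactly. This sketch reads netstat -ant style output from standard input, which assumes the usual column order (local address in $4, remote address in $5, state in $6); top_remote_ips is a hypothetical helper name.

```shell
# Rank remote addresses connected to one local TCP port.
# Usage: netstat -ant | top_remote_ips 80 [count]   (count defaults to 20)
top_remote_ips() {
    awk -v port="$1" '
        /^tcp/ {
            if ($6 == "LISTEN") next          # listening sockets have no peer
            n = split($4, local, ":")         # the port follows the last ":"
            if (local[n] != port) next
            remote = $5
            sub(/:[^:]*$/, "", remote)        # drop the remote port
            print remote
        }' | sort | uniq -c | sort -nr | head -n "${2:-20}"
}
```

Stripping only the final ":port" also keeps IPv6 remote addresses intact, which splitting on every colon does not.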
tcpdump -i eth0 -tnn dst port 80 -c 1000 | awk -F"." '{print $1"."$2"."$3"."$4}' | sort | uniq -c | sort -nr | head -20
Captures 1000 packets destined for port 80 on eth0 and lists the 20 most frequent source IPs.
Additional Metrics
cat access.log | awk '{sum+=$10} END {print sum/1024/1024/1024}'
Calculates total traffic volume in gigabytes.
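The same sum over $10 can be grouped by client to see who consumes the bandwidth. A minimal sketch, assuming $10 is the response size in bytes and that a "-" in that field means no body was sent; traffic_by_ip is a hypothetical helper name.

```shell
# Print "megabytes ip" for the clients that received the most data.
# Usage: traffic_by_ip log_file [count]   (count defaults to 10)
traffic_by_ip() {
    awk '{ bytes[$1] += ($10 == "-" ? 0 : $10) }
         END { for (ip in bytes) printf "%.2f %s\n", bytes[ip] / 1048576, ip }' "$1" |
        sort -nr | head -n "${2:-10}"
}
```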
awk '($9 ~/404/)' access.log | awk '{print $9,$7}' | sort
Lists all 404 error requests.
cat access.log | awk '{counts[$9]++} END {for(code in counts) print code, counts[code]}'
Shows the distribution of HTTP status codes.
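Raw counts per status code are easier to compare across days when shown as a share of all requests. The following sketch extends the status code tally with a percentage column; it assumes the status code is in $9, and status_distribution is a hypothetical helper name.

```shell
# Print "status count percent" sorted by count, highest first.
# Usage: status_distribution log_file
status_distribution() {
    awk '{ count[$9]++; total++ }
         END {
             for (code in count)
                 printf "%s %d %.1f%%\n", code, count[code], 100 * count[code] / total
         }' "$1" |
        sort -k2,2nr
}
```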
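watch "awk '{if(\$9~/200|30|404/)COUNT[\$4]++}END{for(a in COUNT) print a,COUNT[a]}' log_file | sort -k 2 -nr | head -n10"
Periodically refreshes a list of the ten busiest timestamps for requests whose status matches 200, 30x, or 404. Because $4 includes seconds, the counts are per second. The backslashes stop the shell from expanding $9 and $4 inside the double quotes before awk sees them.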
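To group the counts by minute rather than by the full timestamp, the seconds can be cut off before counting. This sketch assumes $4 has the form [16/Aug/2015:14:05:33, so that characters 2 to 18 are the date, hour, and minute; requests_per_minute is a hypothetical helper name.

```shell
# Print "minute count" for requests with status 200, 30x, or 404, busiest first.
# Usage: requests_per_minute log_file [count]   (count defaults to 10)
requests_per_minute() {
    awk '$9 ~ /^(200|30[0-9]|404)$/ {
             minute = substr($4, 2, 17)    # "[16/Aug/2015:14:05:33" -> "16/Aug/2015:14:05"
             count[minute]++
         }
         END { for (m in count) print m, count[m] }' "$1" |
        sort -k2,2nr | head -n "${2:-10}"
}
```

Keeping the pipeline in a function in a script file also sidesteps the quoting needed when the whole command is passed to watch as one string.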
Source: SegmentFault article
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.