Master Apache Log Analysis with Powerful AWK and Netstat Commands
This guide presents a comprehensive collection of AWK, grep, sed, and netstat one‑liners that let you count unique IPs, rank page visits, measure bandwidth, monitor TCP states, and extract detailed traffic patterns from Apache access logs on Linux systems.
This article compiles a series of practical Linux command‑line recipes for deep analysis of Apache access logs and network connections, using tools such as awk, grep, sed, and netstat.
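
The recipes that follow address log fields by position. As a minimal sketch, assuming Apache's combined log format and awk's default whitespace splitting, the block below prints the fields the later commands rely on. The sample line, the `sample_line` variable, and the `show_fields` helper are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample entry in the combined log format (not from a real log).
sample_line='203.0.113.7 - - [16/Aug/2015:14:02:11 +0800] "GET /index.php HTTP/1.1" 200 5120 "http://example.com/" "Mozilla/5.0 (X11; Linux x86_64)"'

# awk splits on whitespace by default, so quoted strings are broken into pieces.
show_fields() {
    printf '%s\n' "$1" | awk '{
        print "ip=" $1        # client address
        print "time=" $4      # timestamp, with a leading "["
        print "url=" $7       # requested path
        print "status=" $9    # HTTP status code
        print "bytes=" $10    # response size in bytes, or "-"
        print "referer=" $11  # referer, still wrapped in quotes
        print "agent=" $12    # first token of the user agent, with a leading quote
    }'
}

show_fields "$sample_line"
```

If a server uses a custom `LogFormat`, the field numbers shift and every recipe needs adjusting accordingly.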
IP and Page Statistics
Basic queries include counting distinct IP addresses, determining how many times a specific page was requested, and listing the number of pages each IP accessed:
    awk '{print $1}' log_file | sort | uniq | wc -l
    grep "/index.php" log_file | wc -l
    awk '{++S[$1]} END {for (a in S) print a, S[a]}' log_file > log.txt
These results can be further sorted to rank IPs by request count:
    awk '{++S[$1]} END {for (a in S) print S[a], a}' log_file | sort -n
Advanced Log Filtering
Examples show how to isolate requests from a particular IP, exclude search‑engine crawlers, or focus on a specific time window:
    grep '^111\.111\.111\.111 ' log_file | awk '{print $1,$7}'
    awk '{print $12,$1}' log_file | grep '^"Mozilla' | awk '{print $2}' | sort | uniq | wc -l
    awk '{print $4,$1}' log_file | grep 16/Aug/2015:14 | awk '{print $2}' | sort | uniq | wc -l
Top IPs and URLs
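
The ranking recipes in this section share one idiom: count duplicates with `sort | uniq -c`, order by count with `sort -nr`, and truncate with `head`. The sketch below wraps that idiom in a hypothetical helper, `top_field`, and runs it on made-up sample data. It assumes the combined log format, where `$1` is the client IP and `$7` the requested path.

```shell
#!/bin/sh
# Made-up sample log for illustration only.
log=$(mktemp)
trap 'rm -f "$log"' EXIT
cat > "$log" <<'EOF'
198.51.100.1 - - [16/Aug/2015:14:00:01 +0800] "GET /index.php HTTP/1.1" 200 1000 "-" "Mozilla/5.0"
198.51.100.1 - - [16/Aug/2015:14:00:02 +0800] "GET /about.php HTTP/1.1" 200 2000 "-" "Mozilla/5.0"
198.51.100.2 - - [16/Aug/2015:14:00:03 +0800] "GET /index.php HTTP/1.1" 200 1000 "-" "Mozilla/5.0"
198.51.100.1 - - [16/Aug/2015:15:00:04 +0800] "GET /index.php HTTP/1.1" 200 1000 "-" "Mozilla/5.0"
EOF

# top_field FIELD N FILE: print the N most frequent values of one field as "count value".
top_field() {
    awk -v f="$1" '{print $f}' "$3" | sort | uniq -c | sort -nr | head -n "$2" |
        awk '{print $1, $2}'   # normalize the padding that uniq -c adds
}

top_field 1 2 "$log"   # most active client IPs
top_field 7 1 "$log"   # most requested path
```

Passing the field number as an awk variable keeps one helper usable for IPs, paths, or status codes.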
Identify the most active IPs, the most requested URLs, or the largest transferred files:
    awk '{print $1}' access_log | sort | uniq -c | sort -nr | head -10
    awk '{print $7}' access.log | sort | uniq -c | sort -nr | head -20
    awk '($7~/\.php/){print $10 " " $1 " " $4 " " $7}' www.access.log | sort -nr | head -100
Bandwidth and Transfer Size
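
Both commands in this section read the response size from field `$10`. A minimal sketch on made-up data, assuming the combined log format, shows the total in bytes and in MiB. Apache writes `-` in that field when no body was sent, and awk treats that string as 0 in arithmetic, so such lines do not distort the sum. The helper names `total_bytes` and `total_mib` are hypothetical.

```shell
#!/bin/sh
# Made-up sample log for illustration only.
log=$(mktemp)
trap 'rm -f "$log"' EXIT
cat > "$log" <<'EOF'
198.51.100.1 - - [16/Aug/2015:14:00:01 +0800] "GET /a.php HTTP/1.1" 200 1048576 "-" "Mozilla/5.0"
198.51.100.2 - - [16/Aug/2015:14:00:02 +0800] "GET /b.php HTTP/1.1" 200 524288 "-" "Mozilla/5.0"
198.51.100.2 - - [16/Aug/2015:14:00:03 +0800] "GET /b.php HTTP/1.1" 304 - "-" "Mozilla/5.0"
EOF

# total_bytes FILE: sum of field 10; a "-" entry counts as 0.
total_bytes() {
    awk '{sum += $10} END {printf "%d\n", sum}' "$1"
}

# total_mib FILE: the same total expressed in MiB with two decimals.
total_mib() {
    awk '{sum += $10} END {printf "%.2f\n", sum / 1024 / 1024}' "$1"
}

total_bytes "$log"
total_mib "$log"
```

Dividing once more by 1024 gives the gigabyte figure used in the article's command.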
Calculate total traffic in gigabytes and list files exceeding a size threshold:
    awk '{sum+=$10} END {print sum/1024/1024/1024}' access.log
    awk '($10 > 200000 && $7~/\.php/){print $7}' www.access.log | sort -n | uniq -c | sort -nr | head -100
TCP Connection Monitoring
Use netstat combined with awk to count established connections, tally connections by TCP state, and count connections per remote address and port:
    netstat -an | grep ESTABLISHED | wc -l
    netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
    netstat -ant | awk '{print $5}' | grep -v '[a-z]' | sort | uniq -c
Additional snippets show how to extract the top IPs by connection count, detect TIME_WAIT or SYN states, and map ports to processes.
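
One of the additional snippets mentioned above, ranking remote IPs by connection count, could be sketched as follows. To stay runnable on machines without `netstat`, the example feeds awk a made-up capture in the shape that `netstat -ant` prints on Linux (IPv4 only). On a live system the same functions would read from `netstat -ant` instead. The helper names are hypothetical.

```shell
#!/bin/sh
# Made-up sample shaped like Linux `netstat -ant` output (IPv4 only).
sample_netstat() {
    cat <<'EOF'
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN
tcp        0      0 192.0.2.10:80           198.51.100.1:51000      ESTABLISHED
tcp        0      0 192.0.2.10:80           198.51.100.1:51001      ESTABLISHED
tcp        0      0 192.0.2.10:80           198.51.100.2:40000      TIME_WAIT
tcp        0      0 192.0.2.10:80           198.51.100.1:51002      TIME_WAIT
EOF
}

# Count connections per TCP state; the state is the last field of each tcp line.
count_states() {
    awk '/^tcp/ {++S[$NF]} END {for (a in S) print a, S[a]}' | sort
}

# Rank remote IPs by connection count, skipping listening sockets.
top_remote_ips() {
    awk '/^tcp/ && $NF != "LISTEN" {split($5, p, ":"); print p[1]}' |
        sort | uniq -c | sort -nr | awk '{print $1, $2}'
}

sample_netstat | count_states
sample_netstat | top_remote_ips
# On a live Linux host: netstat -ant | top_remote_ips
```

Splitting the foreign address on `:` only works for IPv4 entries; IPv6 addresses contain colons themselves and would need different handling.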
Putting It All Together
By chaining these one‑liners, administrators can quickly answer questions such as: which IPs generated the most traffic, which pages are the most bandwidth‑heavy, how many concurrent Apache workers are running, and whether the server is experiencing abnormal TCP states.
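
A minimal sketch of such chaining, assuming the combined log format and using a made-up sample log: the hypothetical `log_report` function prints the unique IP count, the IP with the most requests, and the path that transferred the most bytes.

```shell
#!/bin/sh
# Made-up sample log for illustration only.
log=$(mktemp)
trap 'rm -f "$log"' EXIT
cat > "$log" <<'EOF'
198.51.100.1 - - [16/Aug/2015:14:00:01 +0800] "GET /index.php HTTP/1.1" 200 1000 "-" "Mozilla/5.0"
198.51.100.1 - - [16/Aug/2015:14:00:02 +0800] "GET /video.php HTTP/1.1" 200 900000 "-" "Mozilla/5.0"
198.51.100.2 - - [16/Aug/2015:14:00:03 +0800] "GET /index.php HTTP/1.1" 200 1000 "-" "Mozilla/5.0"
198.51.100.3 - - [16/Aug/2015:14:00:04 +0800] "GET /index.php HTTP/1.1" 200 1000 "-" "Mozilla/5.0"
EOF

# log_report FILE: three summary lines built from the one-liner patterns.
log_report() {
    # Distinct client addresses (field 1).
    printf 'unique_ips=%s\n' "$(awk '{print $1}' "$1" | sort -u | wc -l | tr -d ' ')"
    # Requests per IP, highest first; keep only the address.
    printf 'busiest_ip=%s\n' "$(awk '{++S[$1]} END {for (a in S) print S[a], a}' "$1" |
        sort -nr | head -n 1 | awk '{print $2}')"
    # Bytes per path (field 10 summed by field 7), highest first.
    printf 'heaviest_path=%s\n' "$(awk '{B[$7] += $10} END {for (u in B) print B[u], u}' "$1" |
        sort -nr | head -n 1 | awk '{print $2}')"
}

log_report "$log"
```

Each line of the report is one of the article's patterns with a final `head -n 1`, so extending it to a top-ten view only means changing that count.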
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge—fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc. (Reply “Linux” to receive essential resources.)
