Speed Up Log Searching: A Practical awk, tail, grep, and sed Toolkit
When a colleague struggles with a 2 GB log file, the author demonstrates how to combine tail, less, grep, sed, and awk commands to quickly locate errors, extract time windows, count occurrences, and analyze traffic, turning cumbersome log inspection into an efficient, repeatable workflow.
In a production incident where a 2 GB log file flooded the terminal, the author shows why command-line log analysis is an essential skill for backend developers. The guide covers five core tools (tail, less, grep, sed, and awk) and presents concrete scenarios for each.
tail
Newcomers often use cat on large files, which can freeze the terminal. tail -f logs/application.log follows the file in real time, ideal for monitoring service start‑up logs during a deployment.
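The difference from cat can be tried safely on a throwaway file before touching a real log. The sketch below is only an illustration: the temporary file and its "line N" contents are made up, and -f is left out so the snippet terminates on its own.

```shell
# Throwaway file standing in for a large log (path and contents are illustrative).
log=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 100000; i++) print "line " i }' > "$log"

# Only the end of the file is printed; nothing scrolls past the terminal.
last_three=$(tail -n 3 "$log")
echo "$last_three"

# During a deployment one would add -f to keep following new lines:
#   tail -f "$log"
rm -f "$log"
```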
When only the latest 200 lines are needed, use:
# Show the last 200 lines and keep updating
tail -n 200 -f logs/application.log
less
less loads files on demand, making it suitable for browsing multi-gigabyte logs without exhausting memory, unlike vim. To investigate a specific order failure, open the log with less and:
Press Shift+G to jump to the end.
Enter ?ORD12345678 to search backward for the order ID.
Press n to repeat the search and move to the next earlier occurrence; press Shift+N to move in the opposite direction.
Use Shift+F for a live‑follow mode similar to tail -f, and Ctrl+C to return to normal browsing.
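When the interactive backward search is slow on a very large file, one possible shortcut is to let grep locate the last occurrence first and then open less at that line. The sketch below assumes a made-up log format and a temporary file; only the order ID comes from the steps above. The less call is left as a comment because it is interactive.

```shell
# Hypothetical sample log; the path and line format are assumptions for illustration.
log=$(mktemp)
printf '%s\n' \
  '2025-12-19 14:00:01 INFO  order ORD12345678 created' \
  '2025-12-19 14:00:02 INFO  order ORD99999999 created' \
  '2025-12-19 14:00:03 ERROR order ORD12345678 payment failed' \
  '2025-12-19 14:00:04 INFO  HealthCheck ok' > "$log"

# grep -n prefixes each match with its line number; tail -n 1 keeps the last match.
last_line=$(grep -n 'ORD12345678' "$log" | tail -n 1 | cut -d: -f1)
echo "last occurrence at line $last_line"

# To open the pager at that line interactively, one would run:
#   less +"$last_line" "$log"
rm -f "$log"
```

less treats +<number> as "start at this line", so the pager opens directly on the most recent mention of the order.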
grep
grep is the go-to search tool, but simple keyword matches often miss context. To view the 20 lines surrounding a NullPointerException:
# Show the matching line plus 20 lines before and after
grep -C 20 "NullPointerException" logs/application.log
For tracing a specific TraceId across rotated logs:
# Search all files starting with app.log for the TraceId
grep "TraceId-20251219001" logs/app.log*
To count how many times a Redis timeout occurred:
# Count matching lines only
grep -c "RedisConnectionException" logs/application.log
To exclude noisy health-check lines:
# Show all lines that do NOT contain "HealthCheck"
grep -v "HealthCheck" logs/application.log
sed
When a log is huge (e.g., 10 GB) but the incident window is known, sed can extract that time slice:
# Extract logs between the start and end timestamps
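sed -n '/2025-12-19 14:00/,/2025-12-19 14:05/p' logs/application.log > error_segment.log
The resulting error_segment.log is a small, manageable file for further analysis or sharing. The range runs from the first line matching the start pattern through the first line matching the end pattern, so only the first 14:05 entry is included; if no line matches the end pattern, sed prints through the end of the file.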
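As a rough sketch of that further analysis, the snippet below extracts a window with the same sed range and then runs the grep counts from the previous section on the smaller file. The sample log lines and temporary paths are invented for illustration; only the timestamps, RedisConnectionException, and HealthCheck keywords come from the article.

```shell
# Made-up sample log and output file (both are temporary, illustrative paths).
log=$(mktemp)
segment=$(mktemp)
printf '%s\n' \
  '2025-12-19 13:59:58 INFO  request ok' \
  '2025-12-19 14:00:10 ERROR RedisConnectionException: timeout' \
  '2025-12-19 14:02:31 INFO  HealthCheck ok' \
  '2025-12-19 14:03:05 ERROR RedisConnectionException: timeout' \
  '2025-12-19 14:05:00 INFO  request ok' \
  '2025-12-19 14:05:40 ERROR RedisConnectionException: timeout' \
  '2025-12-19 14:06:12 INFO  request ok' > "$log"

# Print from the first 14:00 line through the first 14:05 line, inclusive.
# The 14:05:40 entry falls after the range has already closed.
sed -n '/2025-12-19 14:00/,/2025-12-19 14:05/p' "$log" > "$segment"

# Follow-up questions are now cheap to answer on the small file.
segment_lines=$(wc -l < "$segment" | tr -d ' ')
redis_errors=$(grep -c 'RedisConnectionException' "$segment")
non_healthcheck=$(grep -v 'HealthCheck' "$segment" | wc -l | tr -d ' ')

echo "lines=$segment_lines redis_errors=$redis_errors non_healthcheck=$non_healthcheck"
rm -f "$log" "$segment"
```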
awk
awk excels at column-based processing. To find the top 10 IPs generating the most requests (useful during a suspected CC attack, i.e. an application-layer HTTP flood):
# Extract IP column, sort, count, and list top 10
awk '{print $1}' access.log | sort | uniq -c | sort -nr | head -n 10
To locate URLs with response times over 1 second (assuming response time is the last field and URL is column 7):
# Print URL and response time for slow requests
awk '$NF > 1.000 {print $7, $NF}' access.log
These examples constitute a ready-to-copy toolkit that can be applied directly to production log-analysis tasks.
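Before pointing the two awk pipelines at a production access log, it may help to check the column assumptions against known input. The sketch below uses an invented access log in which the IP is field 1, the URL is field 7, and the response time in seconds is the last field; a real log format may differ, so the field numbers should be verified first.

```shell
# Invented access log; the layout (IP first, URL seventh, time last) is an assumption.
access=$(mktemp)
printf '%s\n' \
  '10.0.0.1 - - [19/Dec/2025:14:00:01 +0800] "GET /api/orders HTTP/1.1" 200 512 0.120' \
  '10.0.0.2 - - [19/Dec/2025:14:00:02 +0800] "GET /api/report HTTP/1.1" 200 2048 1.850' \
  '10.0.0.1 - - [19/Dec/2025:14:00:03 +0800] "GET /api/orders HTTP/1.1" 200 512 0.095' \
  '10.0.0.1 - - [19/Dec/2025:14:00:04 +0800] "POST /api/pay HTTP/1.1" 500 128 2.310' \
  '10.0.0.3 - - [19/Dec/2025:14:00:05 +0800] "GET /api/orders HTTP/1.1" 200 512 0.101' > "$access"

# Busiest client: count requests per IP, highest first, keep the IP of the top row.
top_ip=$(awk '{print $1}' "$access" | sort | uniq -c | sort -nr | head -n 1 | awk '{print $2}')

# Same count done inside awk with an associative array, avoiding the first sort.
top_ip_single=$(awk '{count[$1]++} END {for (ip in count) print count[ip], ip}' "$access" \
  | sort -nr | head -n 1 | awk '{print $2}')

# Slow requests: URL and response time where the last field exceeds 1 second.
slow=$(awk '$NF > 1.000 {print $7, $NF}' "$access")

echo "top ip: $top_ip"
echo "$slow"
rm -f "$access"
```

The associative-array variant counts in a single pass over the file, which can matter on very large logs where sorting every line is the slow step.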
Conclusion
The author recommends memorizing or bookmarking these command combinations so that, when a production issue arises, engineers can instantly apply the appropriate tool without reinventing the wheel.
Programmer XiaoFu
xiaofucode.com – a programmer learning guide driven by the pursuit of profit
