How to Efficiently Split and Merge Large Log Files on Linux with the split Command
This guide explains why traditional tools struggle with massive log files and demonstrates how to use Linux's split command to divide logs by line count or size, then recombine them, making analysis faster, less memory‑intensive, and easier to share.
Analyzing huge log files on Linux with tools like vim, cat, grep or awk can be painfully slow, consume excessive memory, and make file transfer and reuse difficult.
Why split large logs?
Splitting a large log into smaller chunks means each analysis pass reads only the chunk it needs, which cuts I/O and memory use and lets chunks be processed in parallel or incrementally. Hadoop can also process data at this scale, but it requires writing MapReduce jobs and adds long processing times, so the simple split utility is a practical alternative.
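As a rough illustration of the parallel-analysis point, the sketch below builds a small synthetic log, splits it, and counts matching lines in each chunk with several grep processes at once. The file names, the chunk- prefix, the ERROR pattern, and the worker count of 4 are all assumptions made for the demo. Note that xargs -P is a GNU/BSD extension rather than POSIX.

```shell
#!/bin/sh
# Hypothetical demo: every file name and pattern here is made up for illustration.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a 10,000-line synthetic log; every 10th line is marked ERROR.
awk 'BEGIN { for (i = 1; i <= 10000; i++) printf "%d %s request %d\n", i, (i % 10 == 0 ? "ERROR" : "INFO"), i }' > sample.log

# Split into 2,500-line chunks: chunk-00 .. chunk-03
split -l 2500 -d sample.log chunk-

# Count ERROR lines per chunk, running up to 4 greps at a time, then sum the counts.
parallel_total=$(ls chunk-* | xargs -P 4 -n 1 grep -c ERROR | awk '{ s += $1 } END { print s }')

# Same count on the unsplit file, for comparison.
serial_total=$(grep -c ERROR sample.log)

echo "parallel=$parallel_total serial=$serial_total"
```

The two totals should agree, because a line-based split never cuts a line in half. Whether the parallel run is actually faster depends on disk throughput and core count.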
Using split by line count
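Example: a 3.4 GB log file is split into pieces of 50,000 lines each. The -l option sets the number of lines per piece, -d switches the output suffixes from letters (aa, ab, ...) to numbers (00, 01, ...), and --verbose prints each file name as it is created.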
# Check source file size
ls -lh 2020011702-www.happylauliu.cn.gz
# Split by lines
split -l 50000 -d --verbose 2020011702-www.happylauliu.cn.gz split-line
# Verify line count of a piece
wc -l split-line00
# List piece sizes
ls -lh split-line0[0-9]
Each resulting file is about 14 MB, making it much easier to open, search, or process individually.
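The line-count workflow above can be tried on a small scale before touching a multi-gigabyte file. The sketch below uses a synthetic 12,000-line log and a hypothetical part- prefix. It also covers one point worth checking: -l counts newline characters, so it is only meaningful on plain text. If a file is actually gzip-compressed, as a .gz suffix suggests, one option is to decompress it as a stream and let split read standard input (the - argument).

```shell
#!/bin/sh
# Hypothetical demo: sample.log and the part- prefix are made-up names.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a 12,000-line plain-text log, then keep only a gzip-compressed copy.
seq 1 12000 | sed 's/^/GET \/index.html line /' > sample.log
gzip sample.log            # replaces sample.log with sample.log.gz

# Decompress to stdout and split the text stream into 5,000-line pieces.
# "-" tells split to read from standard input.
gzip -dc sample.log.gz | split -l 5000 -d - part-

# Three pieces are expected: 5,000 + 5,000 + 2,000 lines.
wc -l part-00 part-01 part-02
```

Streaming this way avoids writing a full decompressed copy to disk, at the cost of decompressing again if the split has to be repeated.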
Using split by file size
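The -b option splits by size in bytes and accepts unit suffixes such as K, M, G, and T (powers of 1024; KB, MB, and so on are powers of 1000). Because -b cuts at exact byte offsets, a log line can end up divided between two pieces; GNU split's -C option caps the piece size while keeping lines whole. The example below creates 500 MiB chunks.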
# Split by size
split -b 500M -d --verbose 2020011702-www.happylauliu.cn.gz split-size
# List created files
ls -lh split-size0*
Merging split files back together
To recombine pieces, use shell redirection with cat:
# Concatenate two pieces
cat split-size01 split-size02 > two-file-merge
# Verify merged size
ls -lh two-file-merge
Note that merging large files still incurs I/O overhead, so use it only when necessary.
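To restore the complete original rather than a subset, all pieces can be concatenated in suffix order. The shell expands a glob such as piece-* in sorted order, which matches the order in which split wrote the pieces. The following sketch uses a hypothetical 1 MiB random file and piece- prefix, and checks the round trip with cmp.

```shell
#!/bin/sh
# Hypothetical demo: original.bin, piece- and restored.bin are made-up names.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a 1 MiB file of random bytes to stand in for a large log.
head -c 1048576 /dev/urandom > original.bin

# Split into 300 KiB pieces: three full pieces plus a smaller remainder.
split -b 300K -d original.bin piece-

# Concatenate every piece in glob (sorted) order.
cat piece-* > restored.bin

# cmp -s exits 0 when the two files are byte-for-byte identical.
if cmp -s original.bin restored.bin; then
    echo "round trip OK"
else
    echo "round trip FAILED" >&2
fi
```

When the original lives on another machine, comparing a checksum such as sha256sum on both sides would serve the same purpose as cmp.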
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge: fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc.
