
How to Efficiently Split and Merge Large Log Files on Linux

When log files grow massive, traditional tools like vim, cat, grep, and awk become slow and memory‑hungry, but Linux’s split command lets you divide a huge file by line count or size, process the pieces individually, and later recombine them, dramatically improving analysis efficiency.

MaGe Linux Operations

In daily work, analyzing very large log files with tools like vim, cat, grep, and awk quickly becomes painful. The main problems are:

Slow execution: every pass has to read the whole file from disk, and editors such as vim try to load all of it before you can work.

Heavy resource usage: opening a 4 GB log in an editor that buffers the entire file can take roughly that much memory, and larger files need more. Streaming tools such as grep and awk need far less memory but still have to scan every byte.

Filtered output is hard to reuse: each new pipeline has to rescan the full file.

File transfer is awkward: moving the full-size file consumes a lot of bandwidth.

Offline big-data frameworks such as Hadoop can handle these scenarios, but they need long-running jobs and custom MapReduce code, which adds complexity. Hadoop's approach is to split large files into many small ones and process them in parallel. On a single Linux machine, the split utility covers the splitting step: it divides a file into pieces that you can then process one at a time or in parallel.

The split command supports two ways to divide a file:

By line count using the -l option.

By size using the -b option.
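
The two modes can be tried safely on a small generated file before touching a real log. The sketch below is only illustrative: sample.log, the by-line- and by-size- prefixes, and the temporary directory are hypothetical names, and GNU split is assumed for the -d option.

```shell
#!/bin/sh
set -eu

# Work in a throwaway directory so no real file is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample: 10 lines of 16 bytes each (15 characters + newline).
for i in 1 2 3 4 5 6 7 8 9 10; do
  printf 'line-%010d\n' "$i"
done > sample.log

# Split by line count: 4 lines per piece gives pieces of 4, 4, and 2 lines.
split -l 4 -d sample.log by-line-

# Split by size: 64 bytes per piece gives pieces of 64, 64, and 32 bytes.
split -b 64 -d sample.log by-size-

line_pieces=$(ls by-line-* | wc -l | tr -d ' ')
size_pieces=$(ls by-size-* | wc -l | tr -d ' ')
echo "line pieces: $line_pieces, size pieces: $size_pieces"
```

Both modes produce three pieces here; the last piece simply holds whatever is left over.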

2.1 Split by line count

Example: split a 3.4 GB log file into pieces of 50,000 lines each. The -d option gives the pieces numeric suffixes after the split-line prefix, and --verbose prints each file name as it is created.

# source file size
[root@VM_3_50_centos split]# ls -l 2020011702-www.happylauliu.cn.gz -h
-rw-r--r-- 1 root root 3.4G 1月 17 09:42 2020011702-www.happylauliu.cn.gz

# split by lines
[root@~]# split -l 50000 -d --verbose 2020011702-www.happylauliu.cn.gz split-line
Creating file "split-line00"
Creating file "split-line01"
...
Creating file "split-line9171"

# verify line count
[root@VM_3_50_centos split]# wc -l split-line00
50000 split-line00
...
[root@VM_3_50_centos split]# wc -l split-line9171
1020 split-line9171

# check file size
[root@VM_3_50_centos split]# ls -lh split-line0[0-9]
-rw-r--r-- 1 root root 14M 1月 17 16:54 split-line00
...

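Each piece is about 14 MB, which is much easier to analyze, at the cost of having many more files to manage. The suffix grows from two digits to four (split-line89 is followed by split-line9000) because, when no suffix length is set with -a, GNU split lengthens the suffix automatically as the shorter names run out, so the file names still sort in the order they were written.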
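
Once a log is split by lines, each piece can be analyzed independently, which also makes it easy to use several CPU cores. The following is a minimal sketch under stated assumptions: access.log, its contents, and the piece- prefix are made up for illustration, and xargs -P (parallel execution) is a GNU/BSD extension rather than a POSIX feature.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample log: 1000 lines, every 10th line contains "ERROR".
awk 'BEGIN {
  for (i = 1; i <= 1000; i++)
    printf "2020-01-17 request %d %s\n", i, (i % 10 == 0 ? "ERROR upstream timeout" : "OK")
}' > access.log

# Split into pieces of 100 lines each: piece-00 ... piece-09.
split -l 100 -d access.log piece-

# Run one grep per piece (-n 1), up to 4 at a time (-P 4).
# grep -c prints one count per piece; awk adds the counts together.
split_total=$(ls piece-* | xargs -n 1 -P 4 grep -c 'ERROR' | awk '{ sum += $1 } END { print sum }')

# Same count on the unsplit file, for comparison.
whole_total=$(grep -c 'ERROR' access.log)

echo "pieces: $split_total, whole file: $whole_total"
```

Because -l never cuts a line in half, line-oriented counts on the pieces add up to the count on the original file.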

2.2 Split by size

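Alternatively, split by size with -b. The following example splits the same file into 500 MiB chunks (in GNU split, the M suffix means 1024 × 1024 bytes). Unlike -l, -b cuts at exact byte offsets, so a piece can end in the middle of a line.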

# split by size
[root@~]# split -b 500M -d --verbose 2020011702-www.happylauliu.cn.gz split-size
Creating file "split-size00"
Creating file "split-size01"
...
# list resulting files
[root@VM_3_50_centos split]# ls -lh split-size0*
-rw-r--r-- 1 root root 500M 1月 17 17:03 split-size00
-rw-r--r-- 1 root root 500M 1月 17 17:03 split-size01
...
-rw-r--r-- 1 root root 444M 1月 17 17:04 split-size06
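
For line-oriented logs, GNU split also offers -C (--line-bytes), which caps the size of each piece but only breaks at line boundaries. The sketch below contrasts it with -b on a small generated file; sample.log and the bytes- and lines- prefixes are hypothetical names chosen for the example.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample: 10 lines of 16 bytes each (160 bytes in total).
for i in 1 2 3 4 5 6 7 8 9 10; do
  printf 'line-%010d\n' "$i"
done > sample.log

# -b 40 cuts strictly every 40 bytes, so a piece can end mid-line.
split -b 40 -d sample.log bytes-

# -C 40 puts at most 40 bytes of whole lines into each piece
# (two 16-byte lines fit, a third would not).
split -C 40 -d sample.log lines-

# tail -c 1 | wc -l prints 1 if the last byte is a newline, 0 otherwise.
bytes_last_nl=$(tail -c 1 bytes-00 | wc -l | tr -d ' ')
lines_last_nl=$(tail -c 1 lines-00 | wc -l | tr -d ' ')
echo "bytes-00 ends with newline: $bytes_last_nl; lines-00 ends with newline: $lines_last_nl"
```

With -C the pieces are slightly uneven in size, but every piece can be fed to grep or awk without worrying about a broken first or last line.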

2.3 Merge multiple files

To recombine split parts, concatenate them in order with cat and redirect the output to a new file:

# merge two parts
[root@VM_3_50_centos split]# cat split-size01 split-size02 >two-file-merge
# check merged size
[root@VM_3_50_centos split]# ls -lh two-file-merge
-rw-r--r-- 1 root root 1000M 1月 17 17:20 two-file-merge

Merging large files still costs disk I/O and disk space, so do it only when you actually need the combined file.
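
To rebuild the whole file rather than just two parts, a glob can stand in for the list of pieces, and cmp can confirm that nothing was lost. This is a minimal sketch: big.log, merged.log, and the part- prefix are hypothetical, and a small generated file stands in for a real log.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical stand-in for a large log: 2500 numbered lines.
seq 1 2500 > big.log

# Split into 100-line pieces: part-00 ... part-24.
split -l 100 -d big.log part-

# The shell expands part-* in sorted order, which matches the order
# in which split wrote the pieces.
cat part-* > merged.log

# cmp exits non-zero if the files differ; with set -e that aborts the script.
cmp big.log merged.log

merged_lines=$(wc -l < merged.log | tr -d ' ')
echo "merged.log is identical to big.log ($merged_lines lines)"
```

The same check works for pieces created with -b, since cat simply joins the bytes back together.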

Source: https://cloud.tencent.com/developer/article/1576576