Why Redirecting Large MySQL Dumps Is Up to 3× Faster Than Piping – A SystemTap Analysis
A detailed performance comparison shows that using input redirection to load a huge MySQL dump is roughly three times faster than piping the file through cat, because the redirection reads the data only once while the pipe incurs extra reads and a context switch, as demonstrated with SystemTap tracing.
Background
The article investigates the efficiency of two common ways to import a large huge_dump.sql file into MySQL on Linux: piping the file through cat and feeding it to mysql, versus redirecting the file directly into mysql with <.
Test Setup
A dummy program b.out is compiled to simulate MySQL’s consumption of data from stdin:
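#include <stdio.h>
static char buf[4096]; /* size is an assumption; the original listing omits the declaration */
int main(int argc, char *argv[])
{
    /* Drain stdin in buffer-sized chunks until EOF, discarding the data. */
    while (fread(buf, sizeof(buf), 1, stdin) > 0)
        ;
    return 0;
}
The program is built with:
gcc -o b.out b.c
A 419 MB file huge_dump.sql is generated (and left in the page cache) using:
sudo dd if=/dev/urandom of=huge_dump.sql bs=4096 count=102400
SystemTap Tracing Script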
A SystemTap script test.stp records the sequence of system calls for the processes involved (bash, b.out, and cat). The script logs open, close, read, write, pipe, fork, execve, dup, and wait4 events, and also probes the kernel's pipe read and write functions.
function should_log(){
return (execname() == "cat" || execname() == "b.out" || execname() == "bash");
}
probe syscall.open, syscall.close, syscall.read, syscall.write, syscall.pipe, syscall.fork, syscall.execve, syscall.dup, syscall.wait4 {
if (!should_log()) next;
printf("%s -> %s\n", thread_indent(0), probefunc());
}
probe kernel.function("pipe_read"), kernel.function("pipe_readv"), kernel.function("pipe_write"), kernel.function("pipe_writev") {
if (!should_log()) next;
printf("%s -> %s: file ino %d\n", thread_indent(0), probefunc(), __file_ino($filp));
}
probe begin { println(":~") }
Performance Measurements
Timing the two methods with time (cat piped into b.out versus b.out with its stdin redirected from the file) yields:
# Pipe method
real 0m0.596s
user 0m0.001s
sys 0m0.919s
# Redirection method
real 0m0.151s
user 0m0.000s
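sys 0m0.147s
The redirection method is about three times faster, finishing in 0.151 s of wall-clock time against 0.596 s for the pipe. In the pipe run, sys time exceeds real time because cat and b.out run concurrently on different CPUs.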
SystemTap Observations – Pipe
bash forks two processes (cat and b.out).
Both processes communicate via a pipe.
Data is read from huge_dump.sql by cat, written to the pipe, then read again by b.out.
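The sequence above can be sketched in C. This is a minimal illustration, not what bash literally executes: the helper name count_bytes_via_pipe and the 4096-byte buffers are assumptions, and the parent process stands in for b.out instead of forking a second child. The child plays the role of cat.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: pushes the file at `path` through a pipe the way
 * `cat file | consumer` does. Returns the number of bytes the consumer
 * side received, or -1 on error. */
long long count_bytes_via_pipe(const char *path)
{
    assert(path != NULL);
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child acts as cat: first read (from the file), then a write (to the pipe). */
        close(fds[0]);
        int in = open(path, O_RDONLY);
        if (in == -1)
            _exit(1);
        char buf[4096];
        ssize_t n;
        while ((n = read(in, buf, sizeof buf)) > 0) {
            ssize_t off = 0;
            while (off < n) {
                ssize_t w = write(fds[1], buf + off, (size_t)(n - off));
                if (w <= 0)
                    _exit(1);
                off += w;
            }
        }
        _exit(n == 0 ? 0 : 1);
    }

    /* Parent acts as b.out: second read of the same bytes, this time from the pipe. */
    close(fds[1]);
    long long total = 0;
    char buf[4096];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        total += n;
    close(fds[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (n < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return total;
}
```

Calling count_bytes_via_pipe("huge_dump.sql") should return the file size. Every byte passes through read, write, and read again before the consumer sees it.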
SystemTap Observations – Redirection
bash forks a single process that opens the dump file.
The file descriptor is duplicated to stdin (fd 0) before execve runs b.out.
b.out reads the data directly from the file, without an intermediate pipe.
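A rough C equivalent of the redirection steps is sketched below. The helper names redirect_stdin_from and drain_stdin are hypothetical, and dup2 is an assumption: the trace only shows that a dup-family call is made. In a real shell the redirect happens in the forked child just before execve.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: makes fd 0 refer to the file at `path`, which is
 * what a shell does for `prog < file` between fork and execve.
 * Returns 0 on success, -1 on error. */
int redirect_stdin_from(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    if (fd != STDIN_FILENO) {
        if (dup2(fd, STDIN_FILENO) == -1) {
            close(fd);
            return -1;
        }
        close(fd); /* fd 0 now refers to the file; the extra descriptor is not needed */
    }
    return 0;
}

/* Hypothetical helper: reads fd 0 to EOF and returns the byte count,
 * mirroring b.out's loop. Returns -1 on a read error. */
long long drain_stdin(void)
{
    long long total = 0;
    char buf[4096];
    ssize_t n;
    while ((n = read(STDIN_FILENO, buf, sizeof buf)) > 0)
        total += n;
    return n < 0 ? -1 : total;
}
```

After redirect_stdin_from("huge_dump.sql") succeeds, drain_stdin() reads straight from the file. No second process and no pipe buffer are involved.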
Root Cause Analysis
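In the pipe scenario every byte is read twice (once by cat from the file, once by b.out from the pipe) and written once (by cat into the pipe), and the kernel must switch between the two processes as the pipe fills and drains. The redirection scenario reads each byte only once, in a single process.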
Conclusion
On Linux, for large files, input redirection (< huge_dump.sql) is significantly more efficient than piping the file through cat.
ITPUB