How Zero‑Copy and PageCache Supercharge File Transfer Performance
This article explains why a naïve 32 KB‑chunk file transfer incurs excessive context switches and memory copies, and how zero‑copy, PageCache, asynchronous I/O, and direct I/O techniques dramatically reduce overhead and boost throughput for large‑scale data transfers.
When a server needs to send a file to a client, a straightforward implementation reads the file into a small user‑space buffer (e.g., 32 KB) and issues a separate read and write system call for each chunk. For a 320 MB file this means 10 000 iterations and 20 000 system calls; since every system call enters and leaves the kernel, that costs roughly 40 000 user‑kernel context switches, and about four times the original data volume (roughly 1.28 GB) is copied in memory.
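A minimal sketch of this naive loop (file_fd and sock_fd are assumed to be already‑open descriptors; error handling is trimmed for brevity):

```c
#include <unistd.h>

/* Naive transfer: read()/write() in 32 KB chunks.
 * Assumes file_fd and sock_fd are already-open descriptors. */
#define CHUNK_SIZE (32 * 1024)

ssize_t naive_transfer(int file_fd, int sock_fd)
{
    char buf[CHUNK_SIZE];
    ssize_t n, total = 0;

    /* Each iteration costs two system calls (four context switches)
     * and two CPU copies through the user-space buffer. */
    while ((n = read(file_fd, buf, sizeof(buf))) > 0) {
        ssize_t off = 0;
        while (off < n) {                  /* write() may be partial */
            ssize_t w = write(sock_fd, buf + off, n - off);
            if (w < 0)
                return -1;
            off += w;
        }
        total += n;
    }
    return n < 0 ? -1 : total;
}
```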
Why the naïve approach is inefficient
Each 32 KB chunk triggers two system calls (read and write), and each call causes one user‑to‑kernel and one kernel‑to‑user transition — four switches per chunk. Although a single switch costs only tens of nanoseconds to a few microseconds, the cumulative cost becomes significant under high concurrency. The chunk is also copied four times along the way: a DMA copy from disk into the kernel’s PageCache, a CPU copy from PageCache into the user buffer, a CPU copy from the user buffer into the kernel’s socket buffer, and a DMA copy from the socket buffer to the network card. These repeated copies waste CPU cycles and increase latency.
Zero‑copy: merging operations inside the kernel
Zero‑copy eliminates the user‑space buffer by passing the file descriptor and the TCP socket directly to a kernel routine — on Linux, the sendfile() system call. The kernel moves data from the PageCache to the socket buffer, so each operation needs only one system call (two context switches instead of four) and three memory copies instead of four. If the network card supports SG‑DMA (scatter‑gather DMA), even the CPU copy into the socket buffer can be omitted, leaving only the two DMA copies.
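A hedged sketch of the same transfer using sendfile(2), again assuming already‑open descriptors; the loop handles partial sends:

```c
#include <sys/sendfile.h>
#include <sys/stat.h>

/* Zero-copy transfer: one sendfile() call replaces each read()/write()
 * pair, so the data never crosses into user space. Assumes file_fd is
 * a regular file and sock_fd is a connected TCP socket. */
ssize_t zero_copy_transfer(int file_fd, int sock_fd)
{
    struct stat st;
    if (fstat(file_fd, &st) < 0)
        return -1;

    off_t offset = 0;
    while (offset < st.st_size) {
        /* The kernel copies PageCache -> socket buffer (or, with
         * SG-DMA, just hands the NIC descriptors into PageCache). */
        ssize_t n = sendfile(sock_fd, file_fd, &offset,
                             st.st_size - offset);
        if (n <= 0)
            return -1;                /* sendfile() advances offset */
    }
    return offset;
}
```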
For the same 320 MB transfer with a 1.4 MB socket buffer, zero‑copy needs one sendfile() call per buffer‑sized chunk — about 230 calls, or only a few hundred context switches — and with SG‑DMA the data is copied just twice, about 640 MB in total. In practice this more than doubles throughput while lowering CPU usage.
PageCache: the OS’s disk‑to‑memory cache
PageCache stores recently accessed disk blocks in RAM, using LRU eviction and read‑ahead prefetching to accelerate subsequent reads. While it improves read latency for most workloads, large files can monopolize the cache, evicting hot small files and causing unnecessary copies.
In high‑concurrency scenarios where large files dominate, it is better to bypass PageCache and use direct I/O for those files.
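Short of bypassing the cache entirely, an application can also steer PageCache behavior with posix_fadvise(2). The sketch below shows two such hints; they are advisory only, and the kernel is free to ignore them:

```c
#include <fcntl.h>

/* Hint the kernel about access patterns (advisory; may be ignored). */
void tune_pagecache(int fd)
{
    /* Sequential scan ahead: encourages more aggressive read-ahead
     * (offset 0, length 0 means "the whole file"). */
    posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
}

void drop_after_send(int fd)
{
    /* Done with this large file: ask the kernel to release its clean
     * cached pages so it does not crowd out small, hot files. */
    posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
}
```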
Asynchronous I/O and Direct I/O
Asynchronous I/O splits a read into a request phase (a non‑blocking submission) and a completion‑notification phase, allowing the process to perform other work while the kernel fetches the data. However, asynchronous I/O does not go through PageCache — on Linux, native AIO is only truly asynchronous when paired with direct I/O — so it cannot benefit from cache‑based optimizations.
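A minimal sketch of the two phases using the Linux‑native AIO interface via libaio (link with -laio); the buffer must satisfy the O_DIRECT alignment rules discussed next:

```c
#include <libaio.h>

/* Phase 1: submit the read and return immediately.
 * Phase 2: reap the completion later with io_getevents().
 * buf must meet O_DIRECT alignment (see the next sketch). */
int async_read(int fd, void *buf, size_t len, long long offset)
{
    io_context_t ctx = 0;
    if (io_queue_init(1, &ctx) < 0)       /* set up a kernel AIO context */
        return -1;

    struct iocb cb, *cbs[1] = { &cb };
    io_prep_pread(&cb, fd, buf, len, offset);
    if (io_submit(ctx, 1, cbs) != 1)      /* phase 1: non-blocking request */
        goto fail;

    /* ... the process is free to do other work here ... */

    struct io_event ev;
    if (io_getevents(ctx, 1, 1, &ev, NULL) != 1)  /* phase 2: completion */
        goto fail;

    io_queue_release(ctx);
    return (long)ev.res >= 0 ? 0 : -1;
fail:
    io_queue_release(ctx);
    return -1;
}
```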
Direct I/O explicitly bypasses PageCache, sending data straight from disk to user buffers. It is useful when the application already implements its own caching (e.g., databases) or when transferring very large files that would otherwise pollute the cache.
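A minimal direct‑I/O sketch. O_DIRECT imposes alignment rules: the user buffer, file offset, and transfer length generally must be multiples of the device’s logical block size; 4096 bytes is assumed here:

```c
#define _GNU_SOURCE            /* for O_DIRECT */
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

#define ALIGN 4096             /* assumed logical block size */

/* Read one aligned block straight from disk, bypassing PageCache.
 * On success, *out receives a buffer the caller must free(). */
ssize_t direct_read_block(const char *path, void **out)
{
    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0)
        return -1;

    void *buf;
    if (posix_memalign(&buf, ALIGN, ALIGN) != 0) {  /* aligned buffer */
        close(fd);
        return -1;
    }

    ssize_t n = pread(fd, buf, ALIGN, 0);  /* offset and length aligned */
    close(fd);
    *out = buf;
    return n;
}
```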
Combining async I/O with direct I/O lets large files be transferred without blocking and without cache interference, while small files can still profit from zero‑copy.
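This split is essentially what nginx offers through its directio directive: files above a configured size are read with direct I/O (optionally asynchronously, with aio on), while smaller files keep using sendfile. A hypothetical dispatcher along those lines — transfer_async_direct() stands in for a routine combining the two sketches above, and the 1 GB threshold is an arbitrary illustration, not a tuned value:

```c
#include <sys/stat.h>
#include <sys/types.h>

/* zero_copy_transfer() is the sendfile sketch above;
 * transfer_async_direct() is a hypothetical routine combining the
 * AIO and O_DIRECT sketches. */
ssize_t zero_copy_transfer(int file_fd, int sock_fd);
int transfer_async_direct(int file_fd, int sock_fd);

#define BIG_FILE_BYTES (1024L * 1024 * 1024)   /* assumed 1 GB cutoff */

int send_file_smart(int file_fd, int sock_fd)
{
    struct stat st;
    if (fstat(file_fd, &st) < 0)
        return -1;

    if (st.st_size >= BIG_FILE_BYTES)
        /* Large file: async + direct I/O, no cache pollution. */
        return transfer_async_direct(file_fd, sock_fd);

    /* Small or hot file: zero-copy via sendfile() and PageCache. */
    return zero_copy_transfer(file_fd, sock_fd) < 0 ? -1 : 0;
}
```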