How Zero‑Copy and DMA Supercharge Data Transfer Performance
This article explains the fundamentals of zero‑copy, DMA, PageCache and RDMA, compares them with traditional I/O, describes Linux implementations such as sendfile, mmap+write, splice and Java NIO APIs, and shows practical use‑cases that dramatically reduce CPU load and latency in high‑throughput networking and file handling.
1. Zero‑Copy Overview
Traditional file transfer involves multiple copies and context switches:
User‑space → kernel‑space (read syscall)
Disk → kernel buffer (DMA)
Kernel buffer → user buffer (CPU copy)
User‑space → kernel‑space (write syscall)
User buffer → socket buffer (CPU copy)
Socket buffer → NIC (DMA)
Zero‑copy aims to minimise copies between user and kernel space and reduce system‑call overhead. The classic Linux example is sendfile, which moves data directly from the kernel’s page cache to the socket buffer.
Benefits
Fewer CPU copies (a traditional read/write cycle needs two; sendfile with scatter‑gather DMA needs zero)
Reduced context‑switch overhead
Lower memory usage in user space
Higher throughput on high‑speed networks (10 Gbit/s+)
2. DMA and RDMA
2.1 Direct Memory Access (DMA)
DMA lets hardware (disk controllers, NICs) transfer data between device and main memory without CPU intervention. Typical DMA workflow:
CPU programs the DMA controller (source, destination, length)
Device issues a DMA request
DMA controller gains bus ownership
Data is moved autonomously
DMA controller raises an interrupt on completion
2.2 Remote Direct Memory Access (RDMA)
RDMA extends DMA across the network, allowing a host to read/write remote memory without involving the local CPU. It eliminates the kernel‑space copy path for network packets, dramatically lowering latency and CPU load. Common RDMA transports are InfiniBand, RoCE (RDMA over Converged Ethernet) and iWARP.
3. PageCache
Linux’s page cache stores disk pages (typically 4 KB) in RAM. On a read, the kernel first checks the cache; a miss triggers a DMA‑driven disk read into the cache. When data is already cached, sendfile can copy it directly from the page cache to the socket buffer, avoiding any user‑space copy.
4. Implementation Techniques
4.1 mmap + write
mmap maps a file into the process address space; the application then passes the mapped region directly to write, sending it to a socket. This removes the copy into a user‑space buffer, leaving a single CPU copy from the page cache to the socket buffer. (The C listing at the end of this subsection is actually a splice‑based echo server rather than an mmap example; splice is covered in section 4.3.)
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <fcntl.h>
#include <assert.h>
#include <errno.h>
int main(int argc, char **argv) {
if (argc != 3) {
fprintf(stderr, "usage: %s ip port\n", argv[0]);
return 1;
}
const char *ip = argv[1];
int port = atoi(argv[2]);
int sock = socket(AF_INET, SOCK_STREAM, 0);
assert(sock >= 0);
int reuse = 1;
setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof(reuse));
struct sockaddr_in addr = {0};
addr.sin_family = AF_INET;
addr.sin_port = htons(port);
inet_pton(AF_INET, ip, &addr.sin_addr);
assert(bind(sock, (struct sockaddr*)&addr, sizeof(addr)) != -1);
assert(listen(sock, 5) != -1);
struct sockaddr_in client;
socklen_t client_len = sizeof(client);
int conn = accept(sock, (struct sockaddr*)&client, &client_len);
assert(conn >= 0);
int pipefd[2];
assert(pipe(pipefd) != -1);
// splice from socket to pipe and back (zero‑copy echo)
assert(splice(conn, NULL, pipefd[1], NULL, 32768, SPLICE_F_MORE|SPLICE_F_MOVE) != -1);
assert(splice(pipefd[0], NULL, conn, NULL, 32768, SPLICE_F_MORE|SPLICE_F_MOVE) != -1);
close(conn);
close(sock);
return 0;
}
4.2 sendfile
Introduced in Linux 2.2, sendfile transfers data from a file descriptor to a socket descriptor in a single system call. Since Linux 2.4 it can use scatter‑gather DMA, so the NIC reads payload data directly from the page cache and the CPU copies nothing.
4.3 splice and tee
splice moves data between a pipe and another file descriptor entirely inside the kernel, avoiding user‑space copies; chaining two splice calls through a pipe forwards data from one socket to another, which makes it useful for building high‑performance proxies. tee duplicates data from one pipe to another without consuming the source, enabling simultaneous logging or fan‑out.
4.4 FileChannel.transferTo / transferFrom (Java)
Java NIO’s FileChannel.transferTo and transferFrom are thin wrappers around the Linux sendfile system call, providing zero‑copy file transmission in Java.
5. Application Scenarios
Network servers: Web servers such as Nginx use sendfile to serve static files directly from the page cache to the network, reducing CPU load and increasing concurrent throughput.
CDN edge nodes: Zero‑copy speeds up content delivery by avoiding user‑space copies when streaming cached objects.
Large‑file processing: Tools that copy or back up big files (e.g., database backups) benefit from mmap, sendfile, or FileChannel.transferTo to minimise I/O overhead.
High‑performance proxies: splice enables zero‑copy forwarding of traffic between sockets.
6. Java Zero‑Copy APIs
6.1 Memory‑mapped I/O (MappedByteBuffer)
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class MmapTest {
public static void main(String[] args) {
try {
FileChannel readChannel = FileChannel.open(Paths.get("./jay.txt"), StandardOpenOption.READ);
MappedByteBuffer data = readChannel.map(FileChannel.MapMode.READ_ONLY, 0, readChannel.size());
FileChannel writeChannel = FileChannel.open(Paths.get("./siting.txt"),
StandardOpenOption.WRITE, StandardOpenOption.CREATE);
writeChannel.write(data);
readChannel.close();
writeChannel.close();
} catch (Exception e) {
System.out.println(e.getMessage());
}
}
}
6.2 Zero‑copy file transfer with transferTo
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class SendFileTest {
public static void main(String[] args) {
try {
FileChannel src = FileChannel.open(Paths.get("./jay.txt"), StandardOpenOption.READ);
long size = src.size();
FileChannel dst = FileChannel.open(Paths.get("./siting.txt"),
StandardOpenOption.WRITE, StandardOpenOption.CREATE);
src.transferTo(0, size, dst);
src.close();
dst.close();
} catch (Exception e) {
System.out.println(e.getMessage());
}
}
}
6.3 Zero‑Copy File Server Example
The server below listens on port 8888 and streams a file to each client using FileChannel.transferTo, achieving true zero‑copy transmission.
import java.io.FileInputStream;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.FileChannel;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
public class ZeroCopyFileServer {
public static void main(String[] args) throws IOException {
int port = 8888;
ServerSocketChannel server = ServerSocketChannel.open();
server.bind(new InetSocketAddress(port));
while (true) {
SocketChannel client = server.accept();
String filePath = "your_file_path_here"; // replace with actual path
try (FileInputStream fis = new FileInputStream(filePath);
FileChannel fc = fis.getChannel()) {
long pos = 0, size = fc.size();
while (pos < size) {   // transferTo may send fewer bytes than requested
    pos += fc.transferTo(pos, size - pos, client);
}
System.out.println("Transferred " + pos + " bytes.");
}
client.close();
}
}
}
Deepin Linux
Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and Linux kernel, etc.