
How Zero‑Copy and DMA Supercharge Data Transfer Performance

This article explains the fundamentals of zero‑copy, DMA, PageCache and RDMA, compares them with traditional I/O, describes Linux implementations such as sendfile, mmap+write, splice and Java NIO APIs, and shows practical use‑cases that dramatically reduce CPU load and latency in high‑throughput networking and file handling.


1. Zero‑Copy Overview

Traditional file transfer (a read() followed by a write()) involves four copies and four context switches:

read() syscall: context switch from user space to kernel space

Disk → kernel read buffer (DMA copy)

Kernel buffer → user buffer (CPU copy; read() returns, switching back to user space)

write() syscall: context switch from user space to kernel space

User buffer → socket buffer (CPU copy)

Socket buffer → NIC (DMA copy; write() returns, switching back to user space)

Zero‑copy aims to minimise copies between user and kernel space and reduce system‑call overhead. The classic Linux example is sendfile, which moves data directly from the kernel’s page cache to the socket buffer.

Benefits

Fewer data copies (from two CPU copies down to zero; only the two DMA copies remain)

Reduced context‑switch overhead

Lower memory usage in user space

Higher throughput on high‑speed networks (10 Gbit/s+)

2. DMA and RDMA

2.1 Direct Memory Access (DMA)

DMA lets hardware (disk controllers, NICs) transfer data between device and main memory without CPU intervention. Typical DMA workflow:

CPU programs the DMA controller (source, destination, length)

Device issues a DMA request

DMA controller gains bus ownership

Data is moved autonomously

DMA controller raises an interrupt on completion

2.2 Remote Direct Memory Access (RDMA)

RDMA extends DMA across the network, allowing a host to read/write remote memory without involving the local CPU. It eliminates the kernel‑space copy path for network packets, dramatically lowering latency and CPU load. Common RDMA transports are InfiniBand, RoCE (RDMA over Converged Ethernet) and iWARP.

3. PageCache

Linux’s page cache stores disk pages (typically 4 KB) in RAM. On a read, the kernel first checks the cache; a miss triggers a DMA‑driven disk read into the cache. When data is already cached, sendfile can copy it directly from the page cache to the socket buffer, avoiding any user‑space copy.

4. Implementation Techniques

4.1 mmap + write

mmap maps a file into the process address space. The application then writes the mapped memory to a socket, so only one CPU copy occurs, from the mapped page-cache pages into the socket buffer. (The C listing in this section actually demonstrates the related splice mechanism, an in-kernel echo between a socket and a pipe; splice is covered in Section 4.3.)

#define _GNU_SOURCE /* for splice(2) */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <fcntl.h>
#include <assert.h>
#include <errno.h>

int main(int argc, char **argv) {
    if (argc != 3) {
        fprintf(stderr, "usage: %s ip port\n", argv[0]);
        return 1;
    }
    const char *ip = argv[1];
    int port = atoi(argv[2]);
    int sock = socket(AF_INET, SOCK_STREAM, 0);
    assert(sock >= 0);
    int reuse = 1;
    setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof(reuse));
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    inet_pton(AF_INET, ip, &addr.sin_addr);
    assert(bind(sock, (struct sockaddr*)&addr, sizeof(addr)) != -1);
    assert(listen(sock, 5) != -1);
    struct sockaddr_in client;
    socklen_t client_len = sizeof(client);
    int conn = accept(sock, (struct sockaddr*)&client, &client_len);
    assert(conn >= 0);
    int pipefd[2];
    assert(pipe(pipefd) != -1);
    // echo up to 32 KiB once: socket → pipe → socket, entirely in kernel space
    assert(splice(conn, NULL, pipefd[1], NULL, 32768, SPLICE_F_MORE|SPLICE_F_MOVE) != -1);
    assert(splice(pipefd[0], NULL, conn, NULL, 32768, SPLICE_F_MORE|SPLICE_F_MOVE) != -1);
    close(conn);
    close(sock);
    return 0;
}

4.2 sendfile

Available since Linux 2.2, sendfile transfers data from a file descriptor to a socket descriptor entirely in the kernel. Since Linux 2.4, on NICs that support scatter-gather DMA, it can hand the NIC a descriptor pointing at the page-cache pages themselves, so no CPU copy is needed at all.

4.3 splice and tee

splice moves data between a pipe and another file descriptor entirely inside the kernel, without user-space copies; at least one of the two descriptors must be a pipe. It is useful for building high-performance proxies. tee duplicates data from one pipe to another without consuming the source, enabling simultaneous logging or fan-out.

4.4 FileChannel.transferTo / transferFrom (Java)

Java NIO’s FileChannel.transferTo (and its counterpart transferFrom) can delegate to the underlying operating system; on Linux, transferTo maps to the sendfile system call, providing zero-copy file transmission from Java.

5. Application Scenarios

Network servers: Web servers such as Nginx use sendfile to serve static files directly from the page cache to the network, reducing CPU load and increasing concurrent throughput.

CDN edge nodes: Zero-copy speeds up content delivery by avoiding user-space copies when streaming cached objects.

Large-file processing: Tools that copy or back up big files (e.g., database backups) benefit from mmap, sendfile, or FileChannel.transferTo to minimise I/O overhead.

High-performance proxies: splice enables zero-copy forwarding of traffic between sockets.

6. Java Zero‑Copy APIs

6.1 Memory‑mapped I/O (MappedByteBuffer)

import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class MmapTest {
    public static void main(String[] args) {
        try (FileChannel readChannel = FileChannel.open(Paths.get("./jay.txt"), StandardOpenOption.READ);
             FileChannel writeChannel = FileChannel.open(Paths.get("./siting.txt"),
                     StandardOpenOption.WRITE, StandardOpenOption.CREATE)) {
            // Map the whole source file read-only; no heap copy of the contents is made.
            MappedByteBuffer data = readChannel.map(FileChannel.MapMode.READ_ONLY, 0, readChannel.size());
            writeChannel.write(data);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

6.2 Zero‑copy file transfer with transferTo

import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class SendFileTest {
    public static void main(String[] args) {
        try (FileChannel src = FileChannel.open(Paths.get("./jay.txt"), StandardOpenOption.READ);
             FileChannel dst = FileChannel.open(Paths.get("./siting.txt"),
                     StandardOpenOption.WRITE, StandardOpenOption.CREATE)) {
            long size = src.size();
            long position = 0;
            // transferTo may move fewer bytes than requested, so loop until done.
            while (position < size) {
                position += src.transferTo(position, size - position, dst);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

6.3 Zero‑Copy File Server Example

The server below listens on port 8888 and streams a file to each client using FileChannel.transferTo, achieving true zero‑copy transmission.

import java.io.FileInputStream;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.FileChannel;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class ZeroCopyFileServer {
    public static void main(String[] args) throws IOException {
        int port = 8888;
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(port));
        while (true) {
            try (SocketChannel client = server.accept()) {
                String filePath = "your_file_path_here"; // replace with actual path
                try (FileInputStream fis = new FileInputStream(filePath);
                     FileChannel fc = fis.getChannel()) {
                    long size = fc.size();
                    long position = 0;
                    // transferTo may send fewer bytes than requested, so loop until done.
                    while (position < size) {
                        position += fc.transferTo(position, size - position, client);
                    }
                    System.out.println("Transferred " + position + " bytes.");
                }
            }
        }
    }
}
[Figures: Zero‑Copy illustration · Traditional I/O data flow · RDMA architecture · RDMA hardware diagram · mmap+write workflow · Zero‑Copy vs Traditional I/O]
Tags: Performance, Linux, DMA, zero-copy, Networking, RDMA, Java NIO
Written by Deepin Linux

Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and Linux kernel, etc.
