
How mmap Supercharges File I/O by Cutting System Calls and Data Copies

mmap maps a file directly into a process's virtual address space, which removes the extra copy between kernel and user space and replaces repeated read/write system calls with plain memory accesses. The result is faster I/O and simpler code, at the cost of careful handling of address space limits, page faults, and concurrency.

Liangxu Linux

Why traditional read/write I/O is slow

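Traditional I/O copies data twice: from disk into the kernel page cache, and then from the page cache into a user-space buffer. Every read or write is also a system call, so each one pays for a switch into the kernel and back. For large files the extra copy and the repeated calls add up to significant CPU and memory overhead.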

Traditional I/O diagram

Disk → kernel page cache

Kernel cache → user space buffer

How mmap bypasses the double‑copy

mmap maps a file directly into the process's virtual address space, so file I/O becomes ordinary memory access. The process's page tables point at the same physical pages that back the kernel page cache, which eliminates the second copy.

mmap diagram
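The mapping step described above can be sketched as a small helper. This is a minimal illustration, not the author's code: the function name `map_file_readonly` is hypothetical, and the choice to return NULL for an empty file is an assumption made here because mmap rejects a zero-length mapping.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: maps `path` read-only and stores its length in
 * *len_out. Returns the mapping, or NULL on error or when the file is
 * empty (mmap does not accept a zero-length mapping). */
const char *map_file_readonly(const char *path, size_t *len_out)
{
    struct stat sb;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    if (fstat(fd, &sb) < 0 || sb.st_size == 0) {
        close(fd);
        return NULL;
    }
    void *addr = mmap(NULL, (size_t)sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (addr == MAP_FAILED)
        return NULL;
    *len_out = (size_t)sb.st_size;
    return addr;
}
```

The caller treats the returned pointer as a read-only byte array of `*len_out` bytes and releases it with `munmap((void *)ptr, len)` when finished.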

Reduced system calls

After a single mmap call, the program reads data with ordinary memory instructions instead of issuing repeated read/write calls. The two functions below read the same file with traditional I/O and with mmap; compare how many system calls each one needs.

// Read a file with traditional I/O
void read_file_traditional(const char* filename){
    int fd = open(filename, O_RDONLY);
    ...
    // Read in a loop; every iteration costs one read() system call
    while ((n = read(fd, buf, sizeof(buf))) > 0) {
        ...
    }
}

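// Read a file with mmap
void read_file_mmap(const char* filename){
    int fd = open(filename, O_RDONLY);
    ...
    // A single mmap system call maps the whole file
    // (sb is presumably filled in by fstat in the elided code above)
    char* addr = mmap(NULL, sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    unsigned long sum = 0;
    // Access the file contents directly through memory; no further system calls
    for (size_t i = 0; i < sb.st_size; i++) {
        sum += addr[i];
    }
}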
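For readers who want to run the comparison, here is one possible complete version of the two approaches. It is a sketch under assumptions: the names `sum_with_read` and `sum_with_mmap`, the 4 KiB buffer, and the `calls` counter are illustrative choices and do not come from the original article.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Sums every byte with a read() loop. *calls receives the number of
 * read() system calls issued. Returns 0 on success, -1 on error. */
int sum_with_read(const char *path, unsigned long *sum, long *calls)
{
    unsigned char buf[4096];
    ssize_t n;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    *sum = 0;
    *calls = 0;
    do {
        n = read(fd, buf, sizeof(buf)); /* one system call per chunk */
        (*calls)++;
        for (ssize_t i = 0; i < n; i++)
            *sum += buf[i];
    } while (n > 0);
    close(fd);
    return n < 0 ? -1 : 0;
}

/* Sums every byte through one read-only mapping of the whole file.
 * Returns 0 on success, -1 on error. */
int sum_with_mmap(const char *path, unsigned long *sum)
{
    struct stat sb;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &sb) < 0) {
        close(fd);
        return -1;
    }
    *sum = 0;
    if (sb.st_size == 0) { /* nothing to map: mmap rejects length 0 */
        close(fd);
        return 0;
    }
    size_t len = (size_t)sb.st_size;
    unsigned char *addr = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (addr == MAP_FAILED)
        return -1;
    for (size_t i = 0; i < len; i++) /* plain loads, no system calls */
        *sum += addr[i];
    munmap(addr, len);
    return 0;
}
```

Both functions should produce the same sum for the same file. With the 4 KiB buffer used here, a 10,000-byte file needs at least four read() calls (three that return data and one that reports end of file), while the mmap path issues a fixed handful of calls regardless of file size.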

Simplified programming model

Because the file appears as an in-memory array, code can search or process it with a simple loop and no buffer management. The file-search example below shows the difference.

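// Search a file with traditional I/O
void search_file_traditional(const char* filename, const char* pattern){
    int fd = open(filename, O_RDONLY);
    char buf[4096];
    ssize_t n;
    size_t len = strlen(pattern);
    // The buffer has to be managed by hand and refilled in a loop
    while ((n = read(fd, buf, sizeof(buf))) > 0) {
        // Look for the pattern inside the current chunk only; a match that
        // straddles two chunks is missed, which a complete version must handle
        for (ssize_t i = 0; i + (ssize_t)len <= n; i++) {
            if (memcmp(buf + i, pattern, len) == 0) {
                printf("Found pattern at offset %ld\n", (long)(lseek(fd, 0, SEEK_CUR) - n + i));
            }
        }
    }
    ...
}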

// Search a file with mmap
void search_file_mmap(const char* filename, const char* pattern){
    int fd = open(filename, O_RDONLY);
    struct stat sb;
    fstat(fd, &sb);
    size_t size = sb.st_size;
    size_t len = strlen(pattern);
    // Map once, then work on memory directly
    char* addr = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    // Walk the file like an array; stop early so the comparison stays inside the mapping
    for (size_t i = 0; i + len <= size; i++) {
        if (memcmp(addr + i, pattern, len) == 0) {
            printf("Found pattern at offset %zu\n", i);
        }
    }
    ...
}
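The same idea extends to standard memory routines, which work on a mapped file as they would on any other buffer. The sketch below counts newline characters with memchr; the function name `count_lines` is hypothetical, and `data` and `len` are assumed to describe a region such as one returned by mmap.

```c
#include <stddef.h>
#include <string.h>

/* Counts '\n' bytes in the region [data, data + len). The region can be
 * a mapped file or any other readable buffer. */
size_t count_lines(const char *data, size_t len)
{
    size_t lines = 0;
    const char *p = data;
    const char *end = data + len;

    while (p < end) {
        /* memchr scans only the bytes that remain in the region */
        const char *nl = memchr(p, '\n', (size_t)(end - p));
        if (nl == NULL)
            break;
        lines++;
        p = nl + 1;
    }
    return lines;
}
```

Because the function only takes a pointer and a length, it never needs to know about chunk boundaries, which is the simplification the mapping buys.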

Zero‑copy data transfer

Because the mapping exposes page-cache pages directly to the process, data is copied only once, from disk into memory. No separate user-space buffer is needed, which saves both memory and the CPU time spent copying.

Zero‑copy diagram
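One way to observe the page sharing is to modify a file through a MAP_SHARED mapping and then read it back with an ordinary system call. The sketch below is illustrative only: the function name `store_then_pread` is hypothetical, and the msync call is included as a portability assumption, since on Linux the store is normally visible to pread without it.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Overwrites the first byte of `path` through a shared mapping, then
 * reads that byte back with pread(). Returns the byte read (0..255),
 * or -1 on error. The file must already hold at least one byte. */
int store_then_pread(const char *path, char value)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    char *addr = mmap(NULL, 1, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) {
        close(fd);
        return -1;
    }

    addr[0] = value;           /* plain store, no write() call */
    msync(addr, 1, MS_SYNC);   /* assumption: kept for portability */

    char seen = 0;
    ssize_t n = pread(fd, &seen, 1, 0); /* read the same byte via a syscall */

    munmap(addr, 1);
    close(fd);
    return n == 1 ? (int)(unsigned char)seen : -1;
}
```

If the call returns the value that was stored, the write made through the mapping reached the same file data that pread sees, without any write() call in between.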

Limitations and cautions

On 32‑bit systems the virtual address space is limited (4 GB in total, part of which is reserved for the kernel), so mapping very large files can fragment or exhaust it. Frequent small writes can trigger many page faults and TLB misses, which can make mmap slower than read/write. Real‑time systems must account for unpredictable page‑fault latency, and when several threads or processes access the same mapping they need explicit synchronization (locks or atomic operations) to avoid data races.
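A common way to work within a limited address space is to map the file one window at a time instead of all at once. The following is a minimal sketch of that approach, not something taken from the original article: the function name `sum_in_windows` and the `window` parameter are hypothetical, and the window is assumed to be a multiple of the page size because mmap requires page-aligned file offsets.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Sums all bytes of `path` while mapping at most `window` bytes at a
 * time. `window` must be a non-zero multiple of the page size.
 * Returns 0 on success, -1 on error. */
int sum_in_windows(const char *path, size_t window, unsigned long *sum)
{
    struct stat sb;
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || window == 0 || window % (size_t)page != 0)
        return -1;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &sb) < 0) {
        close(fd);
        return -1;
    }

    *sum = 0;
    for (off_t off = 0; off < sb.st_size; off += (off_t)window) {
        off_t left = sb.st_size - off;
        size_t len = left < (off_t)window ? (size_t)left : window;

        /* Map only the current window; off is always page-aligned */
        unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, off);
        if (p == MAP_FAILED) {
            close(fd);
            return -1;
        }
        for (size_t i = 0; i < len; i++)
            *sum += p[i];
        munmap(p, len); /* release the window before mapping the next one */
    }
    close(fd);
    return 0;
}
```

Only one window is resident in the address space at any moment, so the peak mapping size is bounded by `window` rather than by the file size. On a 32-bit glibc build, files larger than 2 GB additionally need a 64-bit `off_t`, for example by compiling with `-D_FILE_OFFSET_BITS=64`.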
