Unlocking mmap: How Memory‑Mapped Files Boost Linux Performance
This article explains the mmap system call’s core concepts, mapping process, kernel mechanisms, differences from traditional file I/O, performance benefits, and practical usage details, helping readers understand how memory‑mapped files work in Linux and why they’re advantageous for high‑performance applications.
Preface
Recently a reader interviewed at Huawei was asked about mmap, a topic the company often tests because it probes low‑level principles. The interview was tough, but this article walks through mmap fundamentals and analysis so readers can gain a solid understanding.
1. mmap Basic Concept
mmap maps a file or other object into a process’s address space, establishing a one‑to‑one correspondence between a range of file offsets and a range of virtual addresses.
After the mapping is created, the process can read and write the region through ordinary pointers instead of calling read() and write(). For a shared mapping (MAP_SHARED), the kernel writes dirty pages back to the file automatically, and changes made by one process are visible to every other process that maps the same file, which is what enables file sharing between processes.
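As a minimal sketch of what "reading via pointers" means in practice, the function below maps a file read‑only and scans it like an in‑memory array. The helper name count_byte_mmap is hypothetical, and error handling is reduced to returning -1.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Count how often byte `c` occurs in the file at `path` by mapping the
 * file read-only and scanning the mapping through an ordinary pointer.
 * Returns -1 on error. (count_byte_mmap is a hypothetical helper.) */
long count_byte_mmap(const char *path, char c)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {              /* mmap rejects a zero length */
        close(fd);
        return 0;
    }
    size_t len = (size_t)st.st_size;

    const char *data = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    if (data == MAP_FAILED) {
        close(fd);
        return -1;
    }

    long count = 0;
    for (size_t i = 0; i < len; i++)    /* plain memory reads, no read() calls */
        if (data[i] == c)
            count++;

    munmap((void *)data, len);
    close(fd);
    return count;
}
```

A caller would use it as count_byte_mmap("/var/log/syslog", '\n') to count lines, for example; the path here is only an illustration.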
The process virtual address space consists of multiple virtual memory areas (VMAs), each a contiguous region with uniform properties (e.g., text, data, BSS, heap, stack, memory‑mapped area). The memory‑mapped area resides in the gap between the heap and stack.
Linux represents each VMA with a vm_area_struct. A process usually has many of them; older kernels link them in a list and a red‑black tree, while kernels since 6.1 keep them in a maple tree, in both cases for fast lookup by address.
The vm_area_struct holds the region’s start and end addresses (vm_start, vm_end), its flags, and a vm_ops pointer to the operations that apply to the region. The mmap system call creates a new vm_area_struct and associates it with the mapped file and the offset within it (vm_file, vm_pgoff), not with a physical disk address.
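The VMAs of a process can be observed from user space: on Linux, each line of /proc/self/maps describes one VMA, beginning with its start and end addresses. The sketch below uses that file to check whether an address falls inside any VMA. The helper name address_in_vma is hypothetical, and the code assumes a Linux system with /proc mounted.

```c
#include <stdint.h>
#include <stdio.h>

/* Return 1 if `addr` lies inside one of the calling process's VMAs,
 * 0 if it does not, and -1 if /proc/self/maps cannot be read.
 * (address_in_vma is a hypothetical helper; Linux only.) */
int address_in_vma(const void *addr)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    char line[8192];                    /* generous buffer for long path names */
    unsigned long start, end;
    int found = 0;

    while (fgets(line, sizeof line, maps) != NULL) {
        /* Each line has the form "start-end perms offset dev inode path". */
        if (sscanf(line, "%lx-%lx", &start, &end) == 2 &&
            (uintptr_t)addr >= start && (uintptr_t)addr < end) {
            found = 1;
            break;
        }
    }

    fclose(maps);
    return found;
}
```

The address of a local variable should fall inside the stack VMA, while address 0 is normally not covered by any VMA.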
2. mmap Memory‑Mapping Process
The mmap operation can be divided into three stages:
(1) User‑space mapping creation
1. The process calls the user‑space function void *mmap(void *start, size_t length, int prot, int flags, int fd, off_t offset).
2. The kernel searches the process’s virtual address space for a suitable free region.
3. A new vm_area_struct is allocated and initialized for that region.
4. The new VMA is inserted into the process’s VMA list or tree.
(2) Kernel‑space mapping establishment
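5. Using the file descriptor, the kernel locates the corresponding struct file in the process’s open‑file table.
6. The kernel calls the mmap handler in that file’s file_operations, which the file system or device driver provides: int mmap(struct file *filp, struct vm_area_struct *vma).
7. For a regular file, the handler typically only records the file and offset in the VMA and installs the vm_ops used later for fault handling; disk block addresses are resolved through the inode only when a page is actually read.
8. A device driver may call remap_pfn_range at this point to build page tables for device memory up front. For a regular file, no page tables are populated yet and no data has been read on behalf of the mapping.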
(3) Page‑fault handling and data transfer
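9. When the process first accesses the mapped region, the CPU raises a page fault because no page‑table entry exists yet for that address.
10. The kernel’s fault handler finds the VMA covering the address, determines that the page is missing, and initiates a page‑in request.
11. For a file‑backed mapping, the fault routine (vm_ops->fault in current kernels, nopage in older ones) checks the page cache; if the page is absent, it is read from disk into RAM, and the page is then entered into the process’s page table. The swap cache is consulted only for anonymous or private copy‑on‑write pages that have been swapped out.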
12. The process can now read or write the data. Modified (dirty) pages are eventually written back to disk, either lazily or via an explicit msync() call.
Note: Dirty pages are not written back immediately; msync() forces synchronization.
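The three stages above can be traced in a few lines of user‑space code. This is a minimal sketch, not a complete implementation: set_first_byte is a hypothetical helper, the file is assumed to be non‑empty, and the comments describe the typical behavior for a regular file mapped without MAP_POPULATE.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Overwrite the first byte of a non-empty file through a shared mapping,
 * then force write-back with msync(). Returns 0 on success, -1 on error.
 * (set_first_byte is a hypothetical helper.) */
int set_first_byte(const char *path, char value)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }
    size_t len = (size_t)st.st_size;

    /* Stages 1 and 2: only the mapping is set up; typically no data is read. */
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* Stage 3: the first access faults the page in; the store makes it dirty. */
    p[0] = value;

    /* Without msync() the kernel writes the dirty page back at a later time. */
    int rc = msync(p, len, MS_SYNC);

    munmap(p, len);
    close(fd);
    return rc;
}
```

MS_SYNC makes msync() wait until the write‑back has completed; MS_ASYNC would only schedule it.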
3. Differences Between mmap and Conventional File I/O
Traditional file I/O follows these steps:
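Process issues a read request.
Kernel looks up the file descriptor to find the file and its inode.
The inode’s address_space checks the page cache; if the page is present, its data is copied into the user buffer.
If the page is absent, the kernel first reads it from disk into the page cache and then copies it into the user buffer.
On a cache miss the data therefore moves twice: from disk into the kernel’s page cache, and then from the page cache into user space. Writes take the same path in reverse.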
With mmap, the kernel creates a VMA and, on a page fault, maps the file’s page‑cache pages directly into the process’s address space. The data is still read from disk into the page cache, but the process then works on those same pages, so the second copy into a user buffer disappears. Consequently, mmap reduces data‑copy overhead and often improves performance.
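To make the contrast concrete, here is a sketch of the same computation done both ways: once with read() into a user buffer and once through a mapping. Both function names are hypothetical, and the sketch only illustrates where the extra copy occurs; it makes no claim about which variant is faster for a given workload.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* read() path: the kernel copies each chunk from the page cache into buf. */
long sum_bytes_read(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    unsigned char buf[4096];
    long sum = 0;
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        for (ssize_t i = 0; i < n; i++)
            sum += buf[i];
    }

    close(fd);
    return n < 0 ? -1 : sum;
}

/* mmap path: the loop reads the mapped pages directly,
 * with no intermediate user buffer. */
long sum_bytes_mmap(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }

    long sum = 0;
    if (st.st_size > 0) {
        size_t len = (size_t)st.st_size;
        const unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) {
            close(fd);
            return -1;
        }
        for (size_t i = 0; i < len; i++)
            sum += p[i];
        munmap((void *)p, len);
    }

    close(fd);
    return sum;
}
```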
Advantages of mmap
Avoids the extra copy between the page cache and a user buffer, reducing the number of data copies and speeding up file reads.
Lets user space work directly on pages managed by the kernel; with a shared mapping, a change made through one mapping is immediately visible through every other mapping of the same page.
Provides a mechanism for inter‑process communication and shared memory: unrelated processes can map the same file, and related processes (such as a parent and its child) can share an anonymous MAP_SHARED region.
If multiple processes map the same region, the first fault loads the page from disk; subsequent processes can reuse the already‑loaded page without additional disk I/O.
Facilitates working with very large files, including files larger than available RAM, because pages are loaded on demand and can be evicted under memory pressure instead of the whole file being pulled in through explicit I/O calls.
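The shared‑memory use case can be sketched with an anonymous MAP_SHARED region that a parent and its child both see after fork(). The helper add_in_child is hypothetical and exists only to show the mechanism; real IPC code would also need synchronization, which waitpid() stands in for here.

```c
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANONYMOUS in glibc's <sys/mman.h> */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Compute a + b in a child process and hand the result back to the parent
 * through a shared anonymous mapping. Returns 0 on success, -1 on error.
 * (add_in_child is a hypothetical helper.) */
int add_in_child(int a, int b, int *out)
{
    assert(out != NULL);

    int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(shared, sizeof *shared);
        return -1;
    }
    if (pid == 0) {
        *shared = a + b;    /* the child writes into the shared page */
        _exit(0);
    }

    /* The parent waits for the child, then reads the same physical page. */
    int status = 0;
    int ok = waitpid(pid, &status, 0) == pid &&
             WIFEXITED(status) && WEXITSTATUS(status) == 0;
    if (ok)
        *out = *shared;

    munmap(shared, sizeof *shared);
    return ok ? 0 : -1;
}
```

With MAP_PRIVATE instead of MAP_SHARED, the child’s store would land in its own copy‑on‑write page and the parent would not see it.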
mmap Usage Details
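The kernel maps memory in page‑sized units: the offset argument must be a multiple of the system page size, while length may be any positive value and is effectively rounded up to a whole number of pages.
The accessible part of a mapping is bounded by the current size of the underlying file: touching a page that lies entirely beyond the end of the file raises SIGBUS. If the file later grows, pages within the already mapped range become accessible without remapping; data beyond the mapped range needs a new or extended mapping.
Even after the file descriptor is closed, the mapping remains valid because it holds its own reference to the open file rather than depending on the descriptor. It persists until munmap() is called or the process exits, which also makes it convenient for IPC.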
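Two of these details can be checked with a short sketch: the page‑size granularity and the fact that a mapping outlives its file descriptor. Both helper names are hypothetical, and byte_after_close assumes the caller passes a length no larger than the file size.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round len up to the next multiple of the page size, which is the
 * granularity at which the kernel actually maps memory. */
size_t round_up_to_page(size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    return (len + page - 1) / page * page;
}

/* Map the first `len` bytes of a file, close the descriptor immediately,
 * and read byte `index` through the mapping afterwards.
 * Returns the byte value (0..255) or -1 on error.
 * Assumes len does not exceed the file size. */
int byte_after_close(const char *path, size_t len, size_t index)
{
    assert(path != NULL && index < len);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    const unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);              /* the mapping keeps its own reference to the file */
    if (p == MAP_FAILED)
        return -1;

    int value = p[index];   /* still valid although fd is closed */
    munmap((void *)p, len);
    return value;
}
```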
Java High-Performance Architecture
Sharing Java development articles and resources, including SSM architecture and the Spring ecosystem (Spring Boot, Spring Cloud, MyBatis, Dubbo, Docker), Zookeeper, Redis, architecture design, microservices, message queues, Git, etc.
