Understanding Linux Memory Mapping (mmap): API, Implementation, and Use Cases
This article explains Linux memory mapping (mmap): its purpose, API parameters, mapping types, internal kernel implementation, page‑fault handling, and copy‑on‑write semantics, along with practical use cases and a complete Objective‑C example that maps a file and modifies it.
Overview
Memory mapping is an OS technique that maps a file or device directly into a process's address space, allowing the process to read and write the data as if it were regular memory. It removes the need for explicit read/write calls. With a shared mapping, the kernel writes modified pages back to the underlying file, either in the background or when the process calls msync.
Typical scenarios include handling large files, inter‑process communication, and improving I/O efficiency in network programming.
1. mmap API
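void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
The function creates a new mapping and returns its starting virtual address, or MAP_FAILED on error. If addr is non‑NULL, the kernel treats it as a hint for where to place the mapping (it is binding only with MAP_FIXED); otherwise the kernel chooses a free region. length specifies the size of the region. prot defines read/write/execute permissions. For a file‑backed mapping, fd is an open file descriptor (any valid descriptor, including 0) and offset is the page‑aligned position in the file where the mapping starts. An anonymous mapping passes MAP_ANONYMOUS in flags, conventionally with fd = -1. flags also sets the sharing mode (MAP_SHARED or MAP_PRIVATE) and other attributes.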
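As a minimal sketch of the call described above, the following C function maps a file read‑only and scans it like an ordinary array. The helper name count_byte_mapped is hypothetical and exists only for illustration; error handling is reduced to returning -1.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: count occurrences of `needle` in a file by mapping it
 * read-only. Returns -1 if any system call fails. */
long count_byte_mapped(const char *path, char needle) {
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) { close(fd); return -1; }
    if (st.st_size == 0) { close(fd); return 0; }   /* mmap rejects length 0 */

    /* addr = NULL lets the kernel pick the address; offset 0 starts at the
     * beginning of the file. */
    const char *data = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                       /* the mapping outlives the descriptor */
    if (data == MAP_FAILED) return -1;

    long count = 0;
    for (off_t i = 0; i < st.st_size; i++)
        if (data[i] == needle) count++;

    munmap((void *)data, (size_t)st.st_size);
    return count;
}
```

Calling count_byte_mapped(path, '\n') would count the lines of a text file without a single read call; the file contents are paged in as the loop touches them.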
Combining fd and flags yields four mapping types: shared file, private file, shared anonymous, and private anonymous.
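The difference between the shared and private file variants can be observed from user space. The sketch below assumes a Linux system with a writable /tmp; all helper names are hypothetical. It writes one byte through a mapping and then reads the file back with pread to see whether the write reached the file. A small third helper shows how the anonymous variants are requested.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: write 'X' through a mapping of `fd` created with
 * `share_flag`, then return the first byte as read back from the file.
 * Returns -1 on error. */
int first_byte_after_mapped_write(int fd, int share_flag) {
    assert(share_flag == MAP_SHARED || share_flag == MAP_PRIVATE);
    char *p = mmap(NULL, 1, PROT_READ | PROT_WRITE, share_flag, fd, 0);
    if (p == MAP_FAILED) return -1;
    p[0] = 'X';              /* private: modifies a copy; shared: modifies the file's page */
    munmap(p, 1);
    char c;
    if (pread(fd, &c, 1, 0) != 1) return -1;
    return c;
}

/* Hypothetical helper: create an unlinked temporary file containing "a". */
int make_temp_file(void) {
    char path[] = "/tmp/mmap_types_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    unlink(path);            /* the file disappears once fd is closed */
    if (write(fd, "a", 1) != 1) { close(fd); return -1; }
    return fd;
}

/* Anonymous variants: no file, so fd is -1 and MAP_ANONYMOUS is set. */
void *map_anonymous(size_t length, int share_flag) {
    return mmap(NULL, length, PROT_READ | PROT_WRITE,
                share_flag | MAP_ANONYMOUS, -1, 0);
}
```

With a private mapping the file still holds 'a' after the write; with a shared mapping it holds 'X'.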
2. Implementation Details
The mmap workflow consists of three main steps:
Obtain an unmapped virtual area with get_unmapped_area.
Set appropriate vm_flags based on file‑backed vs. anonymous and shared vs. private.
Call mmap_region to allocate a vm_area_struct (VMA) and link it into the process's tree of VMAs (a red‑black tree in older kernels; Linux 6.1 and later use a maple tree).
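The VMA created by these steps is visible from user space: each line of /proc/self/maps describes one VMA as a start-end address range. The following sketch is Linux‑specific and uses hypothetical helper names; it checks that a fresh mapping's address falls inside one of those ranges. It tests containment rather than an exact start address because the kernel may merge adjacent compatible VMAs.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if a line of /proc/self/maps covers `addr`, 0 if none does,
 * -1 if the file cannot be opened. */
int address_has_vma(const void *addr) {
    assert(addr != NULL);
    FILE *maps = fopen("/proc/self/maps", "r");
    if (!maps) return -1;

    uintptr_t target = (uintptr_t)addr;
    char line[8192];
    int found = 0;
    while (fgets(line, sizeof line, maps)) {
        unsigned long start, end;
        /* Each line begins with "start-end" in hexadecimal. */
        if (sscanf(line, "%lx-%lx", &start, &end) == 2 &&
            target >= start && target < end) {
            found = 1;
            break;
        }
    }
    fclose(maps);
    return found;
}

/* Maps one anonymous page and reports whether a VMA covers it. */
int fresh_mapping_has_vma(void) {
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;
    int found = address_has_vma(p);
    munmap(p, len);
    return found;
}
```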
The kernel does not allocate physical pages at this point; it only records the process's demand for memory. Actual pages are provided lazily on a page‑fault.
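Lazy allocation can be observed with mincore, which reports which pages of a range are resident in memory. The sketch below (hypothetical helper names, Linux assumed) maps a region, counts resident pages, touches some of them, and counts again. On a typical Linux system the first count is expected to be lower than the second, though the exact numbers depend on the kernel and environment.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Counts the resident pages in [addr, addr + npages * page_size).
 * Returns -1 on error. */
long resident_pages(void *addr, size_t npages, size_t page_size) {
    unsigned char *vec = malloc(npages);
    if (!vec) return -1;
    long count = -1;
    if (mincore(addr, npages * page_size, vec) == 0) {
        count = 0;
        for (size_t i = 0; i < npages; i++)
            count += vec[i] & 1;      /* low bit set means the page is resident */
    }
    free(vec);
    return count;
}

/* Maps `npages` anonymous pages, writes to the first `touch` of them, and
 * stores the resident-page counts measured before and after the writes. */
int lazy_allocation_demo(size_t npages, size_t touch, long *before, long *after) {
    assert(touch <= npages);
    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = npages * page_size;
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;

    *before = resident_pages(p, npages, page_size);
    for (size_t i = 0; i < touch; i++)
        p[i * page_size] = 1;         /* the first write to a page faults it in */
    *after = resident_pages(p, npages, page_size);

    munmap(p, len);
    return 0;
}
```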
3. Page‑Fault Handling
When a process accesses a page in a mapped region that has no valid page‑table entry yet, the CPU raises a page fault and the kernel enters its fault handler (do_page_fault on x86 in the kernel versions this walkthrough follows). It locates the relevant VMA, checks access permissions, and then calls handle_mm_fault, which eventually invokes handle_pte_fault. handle_pte_fault distinguishes several cases:
If the PTE is not present and is pte_none, it handles anonymous pages (do_anonymous_page) or file‑backed pages (do_linear_fault in older kernels, do_fault in current ones).
If the PTE encodes a swap entry, it calls do_swap_page to swap the page back in.
If the fault is caused by a write to a read‑only COW page, it triggers do_wp_page to perform copy‑on‑write.
For file‑backed mappings on filesystems that use generic_file_mmap, the VMA’s vm_ops is set to generic_file_vm_ops, whose fault method points to filemap_fault. This routine brings the required file data into the page cache so it can be mapped into the process.
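The fault path itself runs in the kernel, but its effect shows up in a process's fault counters. The sketch below uses getrusage, whose ru_minflt field counts minor page faults, to compare two passes over the same anonymous region. The helper names are hypothetical. On a typical Linux system the first pass is expected to fault once per page and the second pass hardly at all, since the PTEs are then present; no exact counts are claimed here.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Minor faults taken by this process so far, or -1 on error. */
static long minor_faults(void) {
    struct rusage ru;
    return getrusage(RUSAGE_SELF, &ru) == 0 ? ru.ru_minflt : -1;
}

/* Writes one byte to each page; volatile keeps the stores from being elided. */
static void touch_pages(volatile char *p, size_t npages, size_t page_size) {
    for (size_t i = 0; i < npages; i++)
        p[i * page_size] = 1;
}

/* Touches a fresh anonymous region twice and stores how many minor faults
 * each pass caused. Returns 0 on success, -1 on error. */
int fault_counts_for_two_passes(size_t npages, long *first, long *second) {
    assert(first != NULL && second != NULL);
    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = npages * page_size;
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;

    long f0 = minor_faults();
    touch_pages(p, npages, page_size);   /* PTEs are empty: each page faults */
    long f1 = minor_faults();
    touch_pages(p, npages, page_size);   /* PTEs are now present */
    long f2 = minor_faults();

    munmap(p, len);
    if (f0 < 0 || f1 < 0 || f2 < 0) return -1;
    *first = f1 - f0;
    *second = f2 - f1;
    return 0;
}
```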
4. Copy‑On‑Write (COW)
During fork, the parent and child share the same physical pages. Writable private pages are marked read‑only in both processes, so when either process writes to such a page, a page fault occurs and do_wp_page creates a private copy, allowing the processes to diverge. Pages in MAP_SHARED mappings are not copied and remain shared.
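The effect of copy‑on‑write across fork can be checked with a few lines of C. In this sketch (the function name is hypothetical), a child process overwrites a value in an anonymous mapping and the parent reports what it sees afterwards. With MAP_PRIVATE the child's write lands on its own copy; with MAP_SHARED both processes keep using the same page.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stores 1 in an anonymous mapping, forks a child that overwrites it with 2,
 * and returns the value the parent sees once the child has exited.
 * Returns -1 on error. */
int parent_value_after_child_write(int share_flag) {
    assert(share_flag == MAP_SHARED || share_flag == MAP_PRIVATE);
    int *p = mmap(NULL, sizeof *p, PROT_READ | PROT_WRITE,
                  share_flag | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;
    *p = 1;

    pid_t pid = fork();
    if (pid < 0) { munmap(p, sizeof *p); return -1; }
    if (pid == 0) {
        *p = 2;          /* private mapping: this write triggers the COW copy */
        _exit(0);
    }

    int seen = -1;
    if (waitpid(pid, NULL, 0) == pid)
        seen = *p;       /* 1 if the child wrote to a copy, 2 if the page is shared */
    munmap(p, sizeof *p);
    return seen;
}
```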
5. Example Code (Objective‑C)
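// ViewController.m
// TestCode
// Created by zhangdasen on 2020/5/24.
#import "ViewController.h"
#import <fcntl.h>      // open, O_RDWR
#import <unistd.h>     // ftruncate, close
#import <sys/mman.h>
#import <sys/stat.h>
// Declared up front because viewDidLoad calls ProcessFile before it is defined.
int MapFile(const char *inPathName, void **outDataPtr, size_t *outDataLength, size_t appendSize);
void ProcessFile(const char *inPathName);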
@interface ViewController ()
@end
@implementation ViewController
- (void)viewDidLoad {
    [super viewDidLoad];
    NSString *path = [NSHomeDirectory() stringByAppendingPathComponent:@"test.data"];
    NSLog(@"path: %@", path);
    NSString *str = @"test str2";
    [str writeToFile:path atomically:YES encoding:NSUTF8StringEncoding error:nil];
    ProcessFile(path.UTF8String);
    NSString *result = [NSString stringWithContentsOfFile:path encoding:NSUTF8StringEncoding error:nil];
    NSLog(@"result:%@", result);
}
int MapFile(const char *inPathName, void **outDataPtr, size_t *outDataLength, size_t appendSize) {
    int outError = 0;
    struct stat statInfo;
    *outDataPtr = NULL;
    *outDataLength = 0;
    // A writable MAP_SHARED mapping requires a descriptor opened O_RDWR.
    int fileDescriptor = open(inPathName, O_RDWR, 0);
    if (fileDescriptor < 0) {
        return errno;
    }
    if (fstat(fileDescriptor, &statInfo) != 0) {
        outError = errno;
    } else if (ftruncate(fileDescriptor, statInfo.st_size + (off_t)appendSize) != 0) {
        // Grow the file before mapping: writes past end-of-file are not
        // stored in the file and can raise SIGBUS.
        outError = errno;
    } else {
        void *mapped = mmap(NULL, (size_t)statInfo.st_size + appendSize,
                            PROT_READ | PROT_WRITE,
                            MAP_FILE | MAP_SHARED,
                            fileDescriptor, 0);
        if (mapped == MAP_FAILED) {
            outError = errno;
        } else {
            *outDataPtr = mapped;
            // Report the original size; the caller appends after it.
            *outDataLength = (size_t)statInfo.st_size;
        }
    }
    // The mapping stays valid after the descriptor is closed.
    close(fileDescriptor);
    return outError;
}
void ProcessFile(const char *inPathName) {
    size_t dataLength;
    void *dataPtr;
    const char *appendStr = " append_key2";
    size_t appendSize = strlen(appendStr);
    if (MapFile(inPathName, &dataPtr, &dataLength, appendSize) == 0) {
        // Write at the old end of the file, but keep dataPtr as the base:
        // munmap must receive the page-aligned address that mmap returned.
        memcpy((char *)dataPtr + dataLength, appendStr, appendSize);
        munmap(dataPtr, dataLength + appendSize);
    }
}
@end
The example maps a file, appends data through the mapped region, and then unmaps it.
6. Kernel Data Structures Involved
Key structures include file, dentry, inode, and address_space. The inode’s i_mapping points to an address_space that indexes the file's cached pages (with a radix tree in older kernels, an XArray in newer ones), forming the page cache. Shared mappings of the same file use the same physical pages via these structures.
Swap management is represented by struct swap_info_struct, which tracks swap devices, slot counts, and usage maps. A swap entry (swp_entry_t) encodes the swap device index and offset, allowing the kernel to retrieve swapped‑out pages.
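Because shared mappings of one file resolve to the same page‑cache pages, a write through one mapping is visible through another without any flush. The sketch below (hypothetical function name, writable /tmp assumed) maps the same file twice within one process and reads through the second mapping what was written through the first.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Maps one temporary file twice with MAP_SHARED, writes 'Z' through the
 * first mapping, and returns the byte seen through the second.
 * Returns -1 on error. */
int second_mapping_sees_write(void) {
    char path[] = "/tmp/mmap_pagecache_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    unlink(path);                          /* removed once fd is closed */

    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    int seen = -1;
    if (ftruncate(fd, (off_t)len) == 0) {  /* give the file one page of data */
        char *a = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        char *b = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
        if (a != MAP_FAILED && b != MAP_FAILED) {
            assert(a != b);                /* two virtual addresses ... */
            a[0] = 'Z';
            seen = b[0];                   /* ... backed by one physical page */
        }
        if (a != MAP_FAILED) munmap(a, len);
        if (b != MAP_FAILED) munmap(b, len);
    }
    close(fd);
    return seen;
}
```

The same mechanism is what lets unrelated processes communicate through a shared file mapping.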
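The idea of packing a device index and an offset into one word can be sketched in user‑space C. This is an illustration only: the type and function names are hypothetical, and the bit widths (5 bits for the device index in a 64‑bit value) are assumptions. The kernel's real layout depends on architecture and version.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative layout, not the kernel's: the top bits hold the swap device
 * index and the remaining bits hold the slot offset on that device. */
#define SKETCH_TYPE_BITS   5
#define SKETCH_TYPE_SHIFT  (64 - SKETCH_TYPE_BITS)
#define SKETCH_OFFSET_MASK ((UINT64_C(1) << SKETCH_TYPE_SHIFT) - 1)

typedef struct { uint64_t val; } sketch_swp_entry_t;

/* Packs a device index and a slot offset into one entry. */
sketch_swp_entry_t sketch_swp_entry(unsigned type, uint64_t offset) {
    assert(type < (1u << SKETCH_TYPE_BITS));
    assert(offset <= SKETCH_OFFSET_MASK);
    sketch_swp_entry_t e = { ((uint64_t)type << SKETCH_TYPE_SHIFT) | offset };
    return e;
}

/* Extracts the device index. */
unsigned sketch_swp_type(sketch_swp_entry_t e) {
    return (unsigned)(e.val >> SKETCH_TYPE_SHIFT);
}

/* Extracts the slot offset. */
uint64_t sketch_swp_offset(sketch_swp_entry_t e) {
    return e.val & SKETCH_OFFSET_MASK;
}
```

An entry built from device 3 and offset 12345 decodes back to those two values, which is what lets a non‑present PTE carry enough information to locate the swapped‑out page.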
Conclusion
Linux’s mmap mechanism provides a powerful, lazy‑loaded way to access file or anonymous memory, enabling efficient I/O, inter‑process communication, and memory‑conserving techniques such as copy‑on‑write. Understanding the API, kernel pathways, and page‑fault handling is essential for systems programmers and performance‑critical application developers.
Deepin Linux
Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and Linux kernel, etc.