
Unlocking Zero‑Copy: How Linux Shared Memory Boosts IPC Performance

This article explains the fundamentals of Linux memory management, details how shared memory implements zero‑copy inter‑process communication, and provides step‑by‑step code examples of system calls, mmap, sendfile, splice, and synchronization techniques for high‑performance data transfer.

Deepin Linux

1. Linux Memory Management Basics

1.1 Virtual and Physical Memory

In Linux, virtual memory gives each process a large, contiguous address space (e.g., 4 GB on 32‑bit systems), abstracting away the fragmented physical RAM. Physical memory is the actual hardware RAM accessed directly by the CPU. The page table maps virtual addresses to physical frames, enabling isolation and efficient use of memory.

(1) Memory Paging

Linux divides both virtual and physical memory into fixed‑size pages (commonly 4 KB). The page table records the mapping from virtual pages to physical pages. When the CPU accesses a virtual address, it extracts the page number, looks up the corresponding physical frame in the page table, and combines it with the page offset to obtain the physical address.

Example: virtual address 0x12345678 with a 4 KB page size yields page number 0x12345 and offset 0x678. If the page table maps 0x12345 to physical frame 0x98765, the resulting physical address is 0x98765678.

(2) Memory Allocation and Release

Processes request memory through library functions such as malloc, which internally uses the brk or mmap system calls; the kernel tracks free regions with free lists and bitmap structures. When memory is released with free or munmap, the block returns to a free list and can be reused.

1.2 Process Address Space Layout

Text Segment : contains executable code, is read‑only and can be shared among processes.

Data Segment : stores initialized global and static variables.

BSS Segment : holds uninitialized globals and statics; occupies no space in the executable file and is zero‑filled at load time.

Heap : dynamic allocation area managed by malloc / new.

Stack : stores function parameters, locals, and return addresses; grows downward.

Shared Memory Region : a special area that multiple processes can map to the same physical pages for fast IPC.

2. Linux Shared Memory Mechanism

2.1 What Is Shared Memory?

Shared memory allows several processes to map the same physical memory into their virtual address spaces, enabling direct read/write without copying data between user and kernel buffers. This eliminates the overhead of pipes or message queues, making it ideal for high‑throughput scenarios such as databases and graphics pipelines.

2.2 Relevant System Calls

(1) shmget – creates or obtains a shared‑memory identifier.

#include <sys/ipc.h>
#include <sys/shm.h>
int shmget(key_t key, size_t size, int shmflg);

Parameters: key – a unique identifier (often generated with ftok); size – segment size in bytes; shmflg – flags such as IPC_CREAT, IPC_EXCL, and permission bits (e.g., 0666).

(2) shmat – attaches the shared segment to the process address space.

#include <sys/types.h>
#include <sys/shm.h>
void *shmat(int shmid, const void *shmaddr, int shmflg);

Returns a pointer to the mapped region or (void *)-1 on error.

(3) shmdt – detaches the segment.

#include <sys/types.h>
#include <sys/shm.h>
int shmdt(const void *shmaddr);

(4) shmctl – performs control operations (e.g., IPC_STAT, IPC_SET, IPC_RMID).

#include <sys/ipc.h>
#include <sys/shm.h>
int shmctl(int shmid, int cmd, struct shmid_ds *buf);

2.3 Implementation Details

Kernel structures such as struct shmid_kernel store permissions, size, attachment count, timestamps, and a pointer to a pseudo‑file used for mapping. Each process has a mm_struct that holds its memory‑mapping entries; when shmat is called, a new VMA is added linking the shared physical pages into the process’s page tables.

3. Zero‑Copy Communication Core

3.1 What Is Zero‑Copy?

Zero‑copy avoids copying data between user space and kernel space. Traditional I/O copies data from disk → kernel buffer → user buffer → kernel socket buffer → network, incurring multiple CPU‑intensive memory moves. Zero‑copy keeps data in kernel buffers throughout the transfer, reducing CPU load and latency.

3.2 Zero‑Copy Techniques

(1) mmap + write

Map a file into memory with mmap, then write the mapped region directly to a socket. The DMA engine moves data from disk to kernel buffer, and write copies from that buffer to the socket without touching user space.

#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#define FILE_NAME "test.txt"
#define SERVER_IP "127.0.0.1"
#define SERVER_PORT 8888
int main(){
    int file_fd, socket_fd;
    struct stat file_stat;
    char *file_data;
    struct sockaddr_in server_addr;
    file_fd = open(FILE_NAME, O_RDONLY);
    if (file_fd == -1){ perror("open file failed"); exit(EXIT_FAILURE); }
    if (fstat(file_fd, &file_stat) == -1){ perror("fstat failed"); close(file_fd); exit(EXIT_FAILURE); }
    file_data = (char *)mmap(NULL, file_stat.st_size, PROT_READ, MAP_PRIVATE, file_fd, 0);
    if (file_data == MAP_FAILED){ perror("mmap failed"); close(file_fd); exit(EXIT_FAILURE); }
    socket_fd = socket(AF_INET, SOCK_STREAM, 0);
    if (socket_fd == -1){ perror("socket failed"); munmap(file_data, file_stat.st_size); close(file_fd); exit(EXIT_FAILURE); }
    memset(&server_addr, 0, sizeof(server_addr));
    server_addr.sin_family = AF_INET;
    server_addr.sin_port = htons(SERVER_PORT);
    server_addr.sin_addr.s_addr = inet_addr(SERVER_IP);
    if (connect(socket_fd, (struct sockaddr *)&server_addr, sizeof(server_addr)) == -1){
        perror("connect failed"); munmap(file_data, file_stat.st_size); close(file_fd); close(socket_fd); exit(EXIT_FAILURE);
    }
    if (write(socket_fd, file_data, file_stat.st_size) == -1){ perror("write failed"); }
    munmap(file_data, file_stat.st_size);
    close(file_fd);
    close(socket_fd);
    return 0;
}

(2) sendfile

sendfile transfers data directly from a file descriptor to a socket descriptor inside the kernel, optionally using SG‑DMA to move data straight to the NIC.

(3) splice

splice moves data between two file descriptors via a pipe buffer that stays in kernel space, eliminating user‑space copies. It is useful for proxy servers and large‑scale log processing.

3.3 Advantages of Zero‑Copy

Zero‑copy reduces CPU cycles, memory‑bandwidth consumption, and context‑switch overhead, leading to higher throughput, lower latency, and better resource utilization—critical for real‑time communication, streaming, and high‑performance servers.

4. Combining Shared Memory and Zero‑Copy

4.1 Real‑World Scenarios

Database Systems : Shared memory caches frequently accessed rows; zero‑copy (e.g., mmap) loads disk pages directly into the cache, avoiding extra copies when serving client queries.

Distributed Caches (e.g., Redis) : Nodes share cache data via shared memory; sendfile persists snapshots to disk without user‑space buffering.

Video Streaming : Video files are mmap‑ed into shared memory; processing workers read directly, and sendfile streams the final buffer to clients with zero copies.

4.2 Complete Example

The following C program demonstrates creating a POSIX shared memory object, mapping it, loading a file via mmap, synchronizing with a semaphore, and cleaning up. On older glibc versions, link with -lrt -pthread for shm_open and the semaphore functions.

#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <semaphore.h>
#define SHM_SIZE 1024
#define FILE_NAME "test.txt"
int main(){
    int shm_fd, file_fd;
    void *shared_mem, *file_mem;
    sem_t *semaphore;
    struct stat file_stat;
    shm_fd = shm_open("/shared_memory", O_CREAT | O_RDWR, 0666);
    if (shm_fd == -1){ perror("shm_open failed"); exit(EXIT_FAILURE); }
    if (ftruncate(shm_fd, SHM_SIZE) == -1){ perror("ftruncate failed"); close(shm_fd); exit(EXIT_FAILURE); }
    shared_mem = mmap(0, SHM_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, shm_fd, 0);
    if (shared_mem == MAP_FAILED){ perror("mmap shared memory failed"); close(shm_fd); exit(EXIT_FAILURE); }
    file_fd = open(FILE_NAME, O_RDONLY);
    if (file_fd == -1){ perror("open file failed"); munmap(shared_mem, SHM_SIZE); close(shm_fd); exit(EXIT_FAILURE); }
    if (fstat(file_fd, &file_stat) == -1){ perror("fstat failed"); close(file_fd); munmap(shared_mem, SHM_SIZE); close(shm_fd); exit(EXIT_FAILURE); }
    file_mem = mmap(0, file_stat.st_size, PROT_READ, MAP_PRIVATE, file_fd, 0);
    if (file_mem == MAP_FAILED){ perror("mmap file failed"); close(file_fd); munmap(shared_mem, SHM_SIZE); close(shm_fd); exit(EXIT_FAILURE); }
    semaphore = sem_open("/semaphore", O_CREAT, 0666, 0);
    if (semaphore == SEM_FAILED){ perror("sem_open failed"); munmap(file_mem, file_stat.st_size); close(file_fd); munmap(shared_mem, SHM_SIZE); close(shm_fd); exit(EXIT_FAILURE); }
    size_t copy_len = (size_t)file_stat.st_size < SHM_SIZE ? (size_t)file_stat.st_size : SHM_SIZE;
    memcpy(shared_mem, file_mem, copy_len);   /* bounded memcpy: strncpy would stop at a NUL and mishandle binary data */
    if (sem_post(semaphore) == -1) perror("sem_post failed");
    if (sem_wait(semaphore) == -1) perror("sem_wait failed");
    if (sem_close(semaphore) == -1) perror("sem_close failed");
    sem_unlink("/semaphore");
    munmap(file_mem, file_stat.st_size);
    close(file_fd);
    munmap(shared_mem, SHM_SIZE);
    close(shm_fd);
    shm_unlink("/shared_memory");
    return 0;
}

The program creates shared memory, maps a file, copies data into the shared region, uses a semaphore for synchronization, and finally releases all resources.

5. Pitfalls and Best Practices

5.1 Synchronization and Mutual Exclusion

Concurrent access to shared memory can cause race conditions. Use semaphores or mutexes to serialize reads/writes. For example, a semaphore ensures that only one process writes while others wait, preventing inconsistent data.

5.2 Memory Management and Cleanup

Always detach shared segments with shmdt (or munmap) when done, and remove them with shmctl(..., IPC_RMID, …) or shm_unlink. Failure to do so leads to leaked kernel resources.

5.3 Permission Control

Set appropriate permissions (e.g., 0660 for group‑only access) when creating shared memory to avoid unauthorized reads or writes, especially in multi‑user environments.

Written by Deepin Linux. Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and Linux kernel.