Fundamentals 50 min read

What Really Happens Inside Linux System Calls? A Deep Dive

This article explains Linux system calls as the bridge between user‑space programs and the kernel, covering their purpose, importance, the distinction between user and kernel space, the execution process, classifications, invocation methods, and performance considerations, with clear examples in C.

Deepin Linux
Deepin Linux
Deepin Linux
What Really Happens Inside Linux System Calls? A Deep Dive

1. What is a Linux system call?

In Linux, a system call is the interface the kernel provides to user‑space programs, acting as a bridge that lets programs request privileged operations such as file I/O, process creation, or memory allocation.

1.1 Overview

System calls are a set of kernel‑provided interfaces that allow user programs to request services like accessing hardware resources, managing the file system, or controlling processes.

1.2 Importance

System calls are fundamental to the operating system, enabling resource management, process control, file operations, and device management.

1.3 Why They Exist

The kernel implements a collection of sub‑routines called system calls; they run in kernel mode, while ordinary library functions run in user mode.

1.4 Role of System Calls

System calls provide a uniform hardware abstraction, enforce stability and security, and enable the kernel to manage resources safely.

2. User Space vs. Kernel Space

In a 32‑bit Linux system the virtual address space is split: the lower 3 GB (0x00000000‑0xBFFFFFFF) is user space, the upper 1 GB (0xC0000000‑0xFFFFFFFF) is kernel space.

2.1 User Space

User space is where applications run with limited privileges; they cannot directly access hardware or privileged CPU instructions.

2.2 Kernel Space

Kernel space is the execution area of the operating system core, with full access to hardware and the ability to manage processes, memory, and devices.

2.3 Why Separate Spaces

Separating user and kernel space prevents applications from executing dangerous privileged instructions, improving stability and security.

3. Execution Process of a System Call

The execution of a Linux system call involves several steps from the user program to the kernel and back.

3.1 Preparation

Before invoking a system call the program sets the system‑call number and its arguments (e.g., in registers such as EAX, EBX, ECX, EDX on x86).

3.2 Triggering the Call

On x86 the traditional instruction is int 0x80; newer kernels use the faster syscall instruction. ARM uses svc.

3.3 Entering Kernel Mode

The CPU saves the user context, looks up the interrupt vector, and jumps to the kernel entry point.

3.4 Parameter Checking

The kernel validates argument types, ranges, and pointer validity to prevent crashes or security breaches.

3.5 Executing the Kernel Function

Using the system‑call number, the kernel indexes the system‑call table to find the corresponding function and runs it.

3.6 Returning to User Mode

After execution the kernel places the return value in a register (e.g., EAX) and restores the saved user context, allowing the program to continue.

4. Classification of Linux System Calls

System calls can be grouped by functionality.

4.1 File Operations

Functions such as open, read, write, and close manage files.

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>

int main() {
    int fd = open("example.txt", O_RDWR | O_CREAT, 0644);
    if (fd == -1) {
        perror("Error opening file");
        return 1;
    }
    // further file operations
    close(fd);
    return 0;
}

4.2 Process Control

Calls like fork, exec, wait, and exit manage process lifecycles.

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>

int main() {
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork error");
        exit(EXIT_FAILURE);
    } else if (pid == 0) {
        printf("I am the child, pid %d, parent %d
", getpid(), getppid());
    } else {
        printf("I am the parent, pid %d, child %d
", getpid(), pid);
    }
    return 0;
}

4.3 Memory Management

Calls such as mmap, brk, and sbrk allocate and map memory.

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define FILE_SIZE 1024

int main() {
    int fd = open("example.txt", O_RDWR | O_CREAT, 0644);
    if (fd == -1) { perror("open"); return 1; }
    char *addr = mmap(NULL, FILE_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) { perror("mmap"); close(fd); return 1; }
    sprintf(addr, "This is a test
");
    munmap(addr, FILE_SIZE);
    close(fd);
    return 0;
}

4.4 Device Control

Calls like ioctl configure hardware devices.

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

int main() {
    int fd = open("/dev/ttyS0", O_RDWR | O_NOCTTY | O_NDELAY);
    if (fd == -1) { perror("open"); return 1; }
    struct termios options;
    tcgetattr(fd, &options);
    cfsetispeed(&options, B9600);
    cfsetospeed(&options, B9600);
    options.c_cflag |= (CLOCAL | CREAD);
    options.c_cflag &= ~PARENB;
    options.c_cflag &= ~CSTOPB;
    options.c_cflag &= ~CSIZE;
    options.c_cflag |= CS8;
    tcsetattr(fd, TCSANOW, &options);
    close(fd);
    return 0;
}

5. Ways to Invoke System Calls in Linux

5.1 Through glibc Library Functions

glibc wraps system calls in convenient APIs (e.g., open, printf, malloc).

5.2 Directly Using syscall

The generic syscall function lets a program invoke any system call by number.

#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <errno.h>

int main() {
    int rc = syscall(SYS_chmod, "/etc/passwd", 0444);
    if (rc == -1) perror("chmod");
    return 0;
}

5.3 Inline Assembly with int 0x80

Programs can embed the interrupt instruction directly.

#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <errno.h>

int main() {
    long rc;
    char *file = "/etc/passwd";
    unsigned short mode = 0444;
    asm ("int $0x80" : "=a" (rc) : "0" (SYS_chmod), "b" ((long)file), "c" ((long)mode));
    if ((unsigned long)rc >= (unsigned long)-132) { errno = -rc; rc = -1; }
    if (rc == -1) perror("chmod");
    return 0;
}

6. Implementation Mechanism of System Calls

6.1 Triggering

User programs invoke a soft‑interrupt ( int 0x80) or the syscall instruction, causing the CPU to switch to kernel mode.

6.2 Flow

The kernel saves the user context, looks up the system‑call number in the system‑call table, validates arguments, executes the corresponding kernel function, stores the return value, restores the user context, and returns to user mode.

6.3 Context Switch

The transition between user and kernel mode isolates applications, preserving system stability.

6.4 System‑Call Table

The table is an array of function pointers indexed by system‑call numbers (e.g., sys_call_table[__NR_write] = sys_write).

7. Relationship Between System Calls and Library Functions

7.1 Library Wrappers

Standard C library functions such as fopen, fread, and fwrite internally invoke open, read, and write system calls.

7.2 Differences and Connections

System calls run in kernel mode and provide low‑level services; library functions run in user mode, offering higher‑level, easier‑to‑use abstractions.

7.3 Overhead and Optimisation

Each system‑call incurs a user‑to‑kernel context switch. Reducing the number of calls, using batch interfaces ( writev, readv), or memory‑mapping files with mmap can mitigate this overhead.

8. Practical Examples

8.1 File‑Operation Example

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>

#define FILE_PATH "example.txt"
#define BUFFER_SIZE 1024

int main() {
    int fd = open(FILE_PATH, O_RDWR | O_CREAT, S_IRUSR | S_IWUSR);
    if (fd == -1) { perror("open"); return 1; }
    const char *content = "This is some sample content to write to the file.";
    ssize_t written = write(fd, content, strlen(content));
    if (written == -1) { perror("write"); close(fd); return 1; }
    lseek(fd, 0, SEEK_SET);
    char buf[BUFFER_SIZE];
    ssize_t read_bytes = read(fd, buf, BUFFER_SIZE-1);
    if (read_bytes == -1) { perror("read"); close(fd); return 1; }
    buf[read_bytes] = '\0';
    printf("Read: %s
", buf);
    close(fd);
    return 0;
}

8.2 Process‑Control Example

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main() {
    pid_t pid = fork();
    if (pid < 0) { perror("fork"); exit(EXIT_FAILURE); }
    else if (pid == 0) {
        char *args[] = {"/bin/ls", "-l", NULL};
        execvp(args[0], args);
        perror("execvp"); exit(EXIT_FAILURE);
    } else {
        int status;
        wait(&status);
        if (WIFEXITED(status))
            printf("Child exited with status %d
", WEXITSTATUS(status));
        else if (WIFSIGNALED(status))
            printf("Child killed by signal %d
", WTERMSIG(status));
    }
    return 0;
}
KernelprogrammingLinuxUser Spacesystem calls
Deepin Linux
Written by

Deepin Linux

Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and Linux kernel, etc.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.