
Mastering Linux Concurrency: From Processes to Epoll and Coroutines

This article explores Linux process and thread fundamentals, traditional network service models, the C10K challenge, and advanced techniques such as epoll, event‑driven designs, and coroutine‑based asynchronous programming to improve scalability and performance.

MaGe Linux Operations

Nov 20, 2014

1. Context Switching Techniques

Overview

Before diving deeper, we review various context‑switching techniques. A "context" refers to the program state during execution, typically represented by the call stack, which records the current call level and its environment.

A "context switch" is the act of moving from one execution context to another, while "scheduling" decides which context receives the next CPU time slice.

Process

A process is a classic isolation unit with its own address space and resource handles, which prevents processes from interfering with one another. The kernel describes each process with a process descriptor (task_struct on Linux) that is linked into its task list.

Creating a new process requires allocating a new descriptor and a new address space (often using copy‑on‑write with the parent).

Process States

Ignoring the kernel's complex state diagram, we can simplify process states to three: ready, running, and sleeping. Ready and running can transition via scheduling; a running process enters sleep when it waits for conditions such as I/O, and returns to ready once the condition is satisfied.

Blocking

When a process attempts I/O on a file descriptor that has no data, it blocks, moving to the sleeping state and being scheduled out. Once data arrives, the process is awakened and placed back on the ready queue.

If multiple contexts block on the same descriptor, the traditional behavior is to wake all of them even though only one can make progress, which is the "thundering herd" problem. Modern Linux kernels mitigate this for accept by waking only one of the waiting contexts per incoming connection.

Thread

Threads are lightweight processes that share the parent’s address space and descriptor tables, but still require kernel‑mode scheduling for execution.

2. Traditional Network Service Models

Process Model

Each client is assigned a dedicated process. This provides strong isolation, since an error in one process does not affect the others, but process creation and destruction are expensive, which prompts the use of process pools.

Thread Model

Each client is assigned a dedicated thread. Threads are lighter weight and communicate faster, yet a fault in a single thread can jeopardize the entire service.

Example

py_http_fork_thread.py

The example demonstrates that process‑based and thread‑based designs can be interchanged.

How it works:

Parent process listens on a server port.

When a new connection arrives, the parent forks a child process.

The child may exec a CGI program.

The parent blocks in accept after forking.

The scheduler selects the newly created child for execution.

The child blocks on read, entering the sleeping state.

When a new connection or a data packet arrives, the kernel wakes the context blocked on that socket and places it on the run queue.

The context continues until the next block or time‑slice expiration.
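The steps above can be sketched with Python's standard socketserver module, where swapping a single mix-in class switches between one process per connection and one thread per connection. This is not the article's py_http_fork_thread.py, which is not reproduced here; HelloHandler, ThreadServer, ForkServer and fetch are hypothetical names chosen for illustration.

```python
import socket
import socketserver
import threading


class HelloHandler(socketserver.StreamRequestHandler):
    """Hypothetical handler: reads one request line, sends a fixed HTTP reply."""

    def handle(self):
        self.rfile.readline()  # blocks (sleeps) until the client sends a line
        body = b"hello\n"
        self.wfile.write(
            b"HTTP/1.0 200 OK\r\nContent-Length: %d\r\n\r\n%s" % (len(body), body)
        )


class ThreadServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
    """One thread per connection."""
    daemon_threads = True
    allow_reuse_address = True


class ForkServer(socketserver.ForkingMixIn, socketserver.TCPServer):
    """One forked child process per connection (Unix only)."""
    allow_reuse_address = True


def fetch(server_cls):
    """Start server_cls on a free port, make one request, return the raw reply."""
    with server_cls(("127.0.0.1", 0), HelloHandler) as server:
        threading.Thread(target=server.serve_forever, daemon=True).start()
        with socket.create_connection(server.server_address) as conn:
            conn.sendall(b"GET / HTTP/1.0\r\n")
            reply = b""
            while chunk := conn.recv(4096):  # read until the server closes
                reply += chunk
        server.shutdown()
    return reply
```

Calling fetch(ThreadServer) and fetch(ForkServer) exercises the same handler under both models; only the mix-in differs, which is the sense in which the two designs are interchangeable.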

Evaluation

The synchronous model is easy to write; each context can behave as if the others did not exist.

The process model isolates connections, which limits the impact of a failure.

Process creation and destruction overhead is significant, so reuse is essential.

Communication between clients is cumbersome when large amounts of shared data are involved.

Performance

Thread mode on a virtual machine:

1: 909.27 2: 3778.38 3: 4815.37 4: 5000.04 10: 4998.16 50: 4881.93 100: 4603.24 200: 3445.12 500: 1778.26 (error)

Fork mode on a virtual machine:

1: 384.14 2: 435.67 3: 435.17 4: 437.54 10: 383.11 50: 364.03 100: 320.51 (error)

Thread mode on a physical machine:

1: 6942.78 2: 6891.23 3: 6584.38 4: 6517.23 10: 6178.50 50: 4926.91 100: 2377.77

Although Python has a GIL, a thread blocked on network I/O releases the GIL, allowing other threads to run. On virtual machines the kernel‑mode overhead is higher, which explains the differing results.

3. The C10K Problem

Description

The C10K problem is the challenge of serving on the order of 10 000 concurrent connections. Traditional process-per-connection and thread-per-connection models degrade long before that point: performance drops dramatically beyond about 1 000 connections.

Issues with the Process Model

Spawning and destroying thousands of processes is prohibitively expensive. Even pre‑forked process pools cannot fully avoid the overhead.

Issues with the Thread Model

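Every thread reserves its own stack (commonly 8 MiB by default on Linux, taken from the stack size resource limit). Although Linux commits stack pages lazily, so little physical memory is used at first, the reservations alone can exhaust the virtual address space on 32-bit systems; on modern 64-bit systems this is much less of a concern.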
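A rough back-of-the-envelope calculation shows why address space runs out first. It assumes a 32-bit process with 3 GiB of user address space (a common user/kernel split, but an assumption here) and ignores everything else that lives in the address space, so the real limit would be lower.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

# Assumption (not from the article): 3 GiB of user address space per process.
USER_ADDRESS_SPACE = 3 * GIB


def max_threads(stack_reservation):
    """Upper bound on threads if nothing but stacks used the address space."""
    return USER_ADDRESS_SPACE // stack_reservation


print(max_threads(8 * MIB))      # 384
print(max_threads(256 * 1024))   # 12288
```

Under these assumptions an 8 MiB reservation caps a process at a few hundred threads, far short of 10 000, while a much smaller stack pushes the bound past that mark.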

Kernel‑Mode Overhead

Switching from user mode to kernel mode for an I/O system call has a small but non-zero cost (typically on the order of hundreds of clock cycles, depending on the CPU and kernel configuration). This cost is similar across models; the real bottleneck is how often I/O forces a context switch.

Thread‑Switching Cost

Linux schedulers have evolved from the 2.4 scheduler to the O(1) scheduler and then to the Completely Fair Scheduler (CFS). CFS keeps runnable tasks in a red-black tree, so placing a task back on the run queue costs O(log n), where n is the number of runnable threads on that queue.

Analysis

Even O(log n) overhead becomes significant because I/O blocking occurs frequently. As response times increase, more threads remain active, further amplifying the cost.

4. Multiplexing with epoll

Overview

To overcome C10K, the number of active contexts must be reduced. This is achieved by handling many connections within a single context using non‑blocking I/O and readiness notification mechanisms such as epoll.
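As a minimal illustration of non-blocking I/O, the sketch below uses a local socket pair instead of a real network connection (an assumption made to keep it self-contained). A read on an empty non-blocking socket fails immediately with BlockingIOError instead of putting the caller to sleep.

```python
import socket


def try_recv(sock):
    """Return received bytes, or None if the read would have blocked."""
    try:
        return sock.recv(1024)
    except BlockingIOError:
        return None


def demo():
    a, b = socket.socketpair()
    b.setblocking(False)       # reads on b never put the caller to sleep
    before = try_recv(b)       # nothing has been sent yet
    a.send(b"ping")
    after = try_recv(b)        # data is now waiting
    a.close()
    b.close()
    return before, after


print(demo())  # (None, b'ping')
```

Because the call returns instead of sleeping, the same context is free to service other connections, which is what makes a readiness notification mechanism necessary.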

epoll

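epoll_create returns an epoll file descriptor, and other descriptors are registered with it through epoll_ctl. epoll supports Level-Triggered (LT) and Edge-Triggered (ET) modes. In LT mode, every epoll_wait call reports a descriptor for as long as it remains ready; in ET mode, a descriptor is reported only when it becomes ready, so the program must drain it before waiting again.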
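The difference between the two modes can be observed with Python's select.epoll wrapper (Linux only). In this sketch, ready_counts is a hypothetical helper that sends one byte, never reads it, and polls twice.

```python
import select
import socket


def ready_counts(edge_triggered):
    """Poll twice without reading; return how many events each poll reported."""
    a, b = socket.socketpair()
    b.setblocking(False)
    ep = select.epoll()
    flags = select.EPOLLIN | (select.EPOLLET if edge_triggered else 0)
    ep.register(b.fileno(), flags)

    a.send(b"x")                   # b becomes readable and stays readable
    first = len(ep.poll(0.1))
    second = len(ep.poll(0.1))     # the byte is still unread at this point

    ep.close()
    a.close()
    b.close()
    return first, second


print(ready_counts(edge_triggered=False))  # (1, 1)
print(ready_counts(edge_triggered=True))   # (1, 0)
```

In ET mode the second poll reports nothing even though the byte is still unread, which is why edge-triggered code usually reads until it gets BlockingIOError.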

Internally, epoll uses a red-black tree to hold registered descriptors and a doubly linked list for the ready list.

Performance

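With non-blocking I/O, the context switches caused by blocking disappear, and the cost of epoll_wait depends on the number of ready descriptors rather than on the total number registered. However, adding or removing a descriptor costs O(log n) because of the red-black tree, which matters for workloads dominated by short-lived connections.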

Limitations

epoll cannot monitor regular files: they are always considered ready, and epoll_ctl rejects them with EPERM. Runtimes such as Go's work around this by delegating file I/O to separate helper threads.
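The limitation is easy to observe from Python on Linux. In this sketch, can_register_regular_file is a hypothetical helper that tries to register a temporary regular file; the registration is expected to fail with PermissionError, which is how Python surfaces EPERM.

```python
import select
import tempfile


def can_register_regular_file():
    """Try to add a regular file to an epoll set; report whether it worked."""
    ep = select.epoll()
    try:
        with tempfile.TemporaryFile() as f:
            try:
                ep.register(f.fileno(), select.EPOLLIN)
            except PermissionError:   # EPERM from epoll_ctl
                return False
            return True
    finally:
        ep.close()


print(can_register_regular_file())
```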

5. Programming Models under Event Notification

Overview

Event-driven designs require user-level scheduling. Many connections now share one kernel context, so the kernel can only wake the thread waiting in epoll; deciding which connection's logic runs next is left to the program.

User‑Space Scheduling

Frameworks maintain a map from file descriptors to objects; when epoll reports readiness, the framework wakes the associated object for processing.
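A minimal sketch of such a dispatch loop follows, assuming Linux (select.epoll) and local socket pairs in place of real client connections. Echo, run_once and demo are hypothetical names; real frameworks add write buffering, timeouts and error handling.

```python
import select
import socket


class Echo:
    """Hypothetical per-connection object: echoes upper-cased data back."""

    def __init__(self, sock):
        self.sock = sock

    def on_readable(self):
        data = self.sock.recv(1024)
        if data:
            self.sock.send(data.upper())
        return bool(data)  # False means the peer closed the connection


def run_once(ep, handlers, timeout=0.1):
    """One loop iteration: look up and run the object for each ready descriptor."""
    for fd, _events in ep.poll(timeout):
        handler = handlers[fd]
        if not handler.on_readable():
            ep.unregister(fd)
            handler.sock.close()
            del handlers[fd]


def demo():
    ep = select.epoll()
    handlers = {}  # the framework's map: file descriptor -> connection object
    clients = []
    for _ in range(3):
        client, server_side = socket.socketpair()
        server_side.setblocking(False)
        ep.register(server_side.fileno(), select.EPOLLIN)
        handlers[server_side.fileno()] = Echo(server_side)
        clients.append(client)

    clients[1].send(b"hello")
    run_once(ep, handlers)     # only the object bound to the ready fd runs
    reply = clients[1].recv(1024)

    for client in clients:
        client.close()
    run_once(ep, handlers)     # peers closed: objects are removed from the map
    remaining = len(handlers)
    ep.close()
    return reply, remaining


print(demo())  # (b'HELLO', 0)
```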

Coroutines

Coroutines enable context switches without kernel involvement, allowing a coroutine to be bound to a descriptor and resumed when the descriptor becomes ready.
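Python generators are enough to sketch the idea: a coroutine yields the socket it is waiting on, and a small scheduler resumes it when that socket becomes readable. The names echo_upper, spawn and run_until_idle are hypothetical, and local socket pairs stand in for client connections.

```python
import selectors
import socket


def echo_upper(sock):
    """Coroutine: reads like sequential code, yields wherever it would block."""
    while True:
        yield sock                 # suspend until sock is readable
        data = sock.recv(1024)
        if not data:
            break
        sock.send(data.upper())
    sock.close()


def spawn(selector, coro):
    """Run coro until it yields a socket, then park it on that socket."""
    try:
        sock = next(coro)
    except StopIteration:
        return                     # the coroutine finished
    selector.register(sock, selectors.EVENT_READ, coro)


def run_until_idle(selector, timeout=0.1):
    """Resume parked coroutines as their sockets become readable."""
    while selector.get_map():
        events = selector.select(timeout)
        if not events:
            break
        for key, _mask in events:
            selector.unregister(key.fileobj)
            spawn(selector, key.data)   # switch to the coroutine in user space


def demo():
    selector = selectors.DefaultSelector()
    clients = []
    for _ in range(2):
        client, server_side = socket.socketpair()
        server_side.setblocking(False)
        spawn(selector, echo_upper(server_side))
        clients.append(client)

    clients[0].send(b"ping")
    clients[1].send(b"pong")
    run_until_idle(selector)
    replies = [client.recv(1024) for client in clients]

    for client in clients:
        client.close()
    run_until_idle(selector)       # coroutines see end-of-file and finish
    selector.close()
    return replies


print(demo())  # [b'PING', b'PONG']
```

Each switch between coroutines here is an ordinary function call and return inside one thread; the kernel is involved only in the select call.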

Implementation Details

Typical coroutine implementations save registers (setjmp/longjmp) or use makecontext/swapcontext, each with trade‑offs regarding stack handling and performance.

Relation to Threads

Coroutines run inside threads, and a single thread can host many coroutines. Their code reads as if it were synchronous, but only as long as no coroutine makes a truly blocking call: one blocking call stalls the thread and every coroutine on it.

Continuation-Passing Style (CPS) Model

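In the CPS model, every operation delivers its result by calling a callback (the continuation) instead of returning it, which effectively turns every action into asynchronous I/O. For example, with add = lambda f, *nums: f(sum(nums)) and mul = lambda f, *nums: f(reduce(lambda x, y: x * y, nums)), where reduce comes from functools in Python 3, the expression mul(lambda x: add(pprint.pprint, x, 1), 2, 3) multiplies 2 by 3 and passes 6 to the continuation, which adds 1 and prints 7. Python lacks tail-call optimization, so every continuation call adds a stack frame and long chains can grow the stack considerably.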
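The same expression can be written as a small runnable script. This sketch replaces the lambdas with named functions and collects the result in a list instead of printing it, so the final continuation is results.append; those substitutions are illustrative choices, not part of the original example.

```python
from functools import reduce
import operator


def add(k, *nums):
    """Sum nums and hand the result to the continuation k."""
    k(sum(nums))


def mul(k, *nums):
    """Multiply nums and hand the result to the continuation k."""
    k(reduce(operator.mul, nums))


results = []
# (2 * 3) is passed to the lambda, which adds 1 and passes 7 to results.append.
mul(lambda x: add(results.append, x, 1), 2, 3)
print(results)  # [7]
```

Neither add nor mul returns anything useful; control only ever moves forward into the next continuation, which is why each step nests one call deeper.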

State‑Machine Model

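State machines break the work into steps, store the current step and any partial data explicitly, and re-enter the handling function after each I/O event. It is like writing notes for a patient with periodic amnesia: everything needed to continue must be written down before control is given up.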
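As a small sketch of this style, the hypothetical MessageReader below parses a made-up protocol (a decimal length, a newline, then that many bytes of body). feed would be called once per readiness event with whatever bytes arrived, and all progress lives in the object rather than on the call stack.

```python
class MessageReader:
    """Hypothetical state machine for '<length>\\n<body>' messages."""

    def __init__(self):
        self.state = "LENGTH"   # the note left for the next invocation
        self.buffer = b""
        self.length = None
        self.message = None

    def feed(self, data):
        """Consume one chunk of input and return the state reached."""
        self.buffer += data
        if self.state == "LENGTH" and b"\n" in self.buffer:
            header, self.buffer = self.buffer.split(b"\n", 1)
            self.length = int(header)
            self.state = "BODY"
        if self.state == "BODY" and len(self.buffer) >= self.length:
            self.message = self.buffer[:self.length]
            self.buffer = self.buffer[self.length:]
            self.state = "DONE"
        return self.state


reader = MessageReader()
# The message arrives in arbitrary fragments, as it might from a socket.
states = [reader.feed(chunk) for chunk in (b"1", b"1\nhel", b"lo wor", b"ld")]
print(states)          # ['LENGTH', 'BODY', 'BODY', 'DONE']
print(reader.message)  # b'hello world'
```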
