Why io_uring’s Multishot Feature Could End Traditional Event Loops by 2025

This article traces the evolution from select and poll to epoll and finally io_uring, explains how multishot accept and receive work, compares benchmark results, and provides practical guidance for migrating production servers from epoll‑based event loops to the newer io_uring model.


Introduction

High‑performance network servers have traditionally used a reactor‑style event loop based on select(), poll(), or epoll(). These APIs separate readiness notification from the I/O itself, so every ready descriptor still costs an additional read or write system call, and select() and poll() additionally scale poorly as the number of file descriptors grows. Linux 5.1 introduced io_uring, a proactor‑style asynchronous I/O framework that reduces per‑operation system calls, enables batch submission, and provides a unified interface for sockets and files.

Linux Asynchronous I/O History

select() and poll()

select() (BSD 4.2, 1983) scans a fixed‑size bitmap of descriptors on every call, giving O(n) complexity and a hard limit (often 1024 FDs). poll() replaces the bitmap with a variable‑length struct pollfd array but still requires O(n) scans.

epoll

Linux 2.5.45 (2002) added epoll. Applications register interest once with epoll_ctl; the kernel stores the interests in a red‑black tree, and epoll_wait returns only ready events, so its cost scales with the number of ready descriptors rather than the number watched. Each ready event still requires a separate read/write system call.

Linux AIO

Linux native AIO (io_submit), distinct from the thread‑based glibc POSIX AIO, works reliably only with direct‑I/O (O_DIRECT) files and does not support sockets, leaving a gap that io_uring fills.

io_uring Fundamentals

io_uring creates two shared ring buffers: the Submission Queue (SQ) and the Completion Queue (CQ). An application writes an operation descriptor (SQE) into the SQ and issues a single io_uring_enter system call to submit one or many SQEs. The kernel processes the requests asynchronously and posts results to the CQ, which the application retrieves with io_uring_wait_cqe() or by polling the CQ directly. This design eliminates the per‑operation system call overhead of the reactor model.
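A minimal sketch of this submit/complete cycle with liburing (the file path and queue depth are illustrative, not prescriptive):

```c
/* Minimal submit/complete cycle with liburing.
 * The file path and queue depth are illustrative. */
#include <liburing.h>
#include <fcntl.h>
#include <stdio.h>

int main(void)
{
    struct io_uring ring;
    struct io_uring_sqe *sqe;
    struct io_uring_cqe *cqe;
    char buf[4096];

    io_uring_queue_init(8, &ring, 0);      /* create SQ and CQ */

    int fd = open("/etc/hostname", O_RDONLY);
    sqe = io_uring_get_sqe(&ring);         /* grab a free SQE */
    io_uring_prep_read(sqe, fd, buf, sizeof(buf), 0);

    io_uring_submit(&ring);                /* one syscall submits the batch */
    io_uring_wait_cqe(&ring, &cqe);        /* block until a CQE arrives */
    printf("read returned %d\n", cqe->res);

    io_uring_cqe_seen(&ring, cqe);         /* mark the CQE as consumed */
    io_uring_queue_exit(&ring);
    return 0;
}
```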

Multishot Accept and Receive

Multishot Accept

Linux 5.19 added multishot accept, exposed via liburing's io_uring_prep_multishot_accept(). A single SQE remains active across connections, emitting one CQE per accepted client; the request terminates only on explicit cancellation or error. Each CQE carries the IORING_CQE_F_MORE flag while the request remains armed; a CQE without it signals that the multishot accept has ended and must be re‑submitted.
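A sketch of arming and servicing a multishot accept, assuming ring is an initialized io_uring instance and listen_fd a bound, listening socket:

```c
/* Arm a multishot accept (kernel >= 5.19): one SQE keeps producing a
 * CQE per accepted client. ring and listen_fd are assumed to exist. */
#include <liburing.h>

void serve_accepts(struct io_uring *ring, int listen_fd)
{
    struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
    io_uring_prep_multishot_accept(sqe, listen_fd, NULL, NULL, 0);
    io_uring_submit(ring);

    for (;;) {
        struct io_uring_cqe *cqe;
        io_uring_wait_cqe(ring, &cqe);
        if (cqe->res >= 0) {
            int client_fd = cqe->res;   /* newly connected socket */
            (void)client_fd;            /* hand off to the application here */
        }
        if (!(cqe->flags & IORING_CQE_F_MORE)) {
            /* Multishot ended (error/cancel): re-arm before continuing. */
            sqe = io_uring_get_sqe(ring);
            io_uring_prep_multishot_accept(sqe, listen_fd, NULL, NULL, 0);
            io_uring_submit(ring);
        }
        io_uring_cqe_seen(ring, cqe);
    }
}
```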

Multishot Receive

Linux 6.0 introduced multishot receive, exposed via liburing's io_uring_prep_recv_multishot(). One receive SQE stays alive and generates a CQE for each chunk of received data. Applications must provide a pool of buffers and set the IOSQE_BUFFER_SELECT flag. For each completion the kernel selects a buffer, fills it, and posts a CQE whose res field holds the byte count and whose flags encode the chosen buffer ID (the recvmsg variant, io_uring_prep_recvmsg_multishot(), instead returns an io_uring_recvmsg_out structure describing the result). The buffer can then be returned to the pool for reuse. The CQE also carries IORING_CQE_F_MORE until the multishot operation ends (e.g., socket close or buffer exhaustion).
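A sketch using a provided‑buffer ring (liburing 2.4+); BGID, NBUFS, and BUF_SZ are illustrative constants, and ring and client_fd are assumed to exist:

```c
/* Multishot receive with a provided-buffer ring (kernel >= 6.0,
 * liburing >= 2.4). BGID, NBUFS, BUF_SZ are illustrative. */
#include <liburing.h>

#define BGID   1
#define NBUFS  256
#define BUF_SZ 2048

static char bufs[NBUFS][BUF_SZ];

void arm_recv_multishot(struct io_uring *ring, int client_fd)
{
    int ret;
    struct io_uring_buf_ring *br =
        io_uring_setup_buf_ring(ring, NBUFS, BGID, 0, &ret);

    for (int i = 0; i < NBUFS; i++)    /* hand every buffer to the kernel */
        io_uring_buf_ring_add(br, bufs[i], BUF_SZ, i,
                              io_uring_buf_ring_mask(NBUFS), i);
    io_uring_buf_ring_advance(br, NBUFS);

    struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
    io_uring_prep_recv_multishot(sqe, client_fd, NULL, 0, 0);
    sqe->flags |= IOSQE_BUFFER_SELECT;  /* kernel picks a buffer per CQE */
    sqe->buf_group = BGID;
    io_uring_submit(ring);
}

/* On each completion, the chosen buffer ID lives in cqe->flags. */
void on_recv(struct io_uring_buf_ring *br, struct io_uring_cqe *cqe)
{
    if (cqe->res > 0 && (cqe->flags & IORING_CQE_F_BUFFER)) {
        int bid = cqe->flags >> IORING_CQE_BUFFER_SHIFT;
        /* ... process cqe->res bytes in bufs[bid] ... */
        io_uring_buf_ring_add(br, bufs[bid], BUF_SZ, bid,
                              io_uring_buf_ring_mask(NBUFS), 0);
        io_uring_buf_ring_advance(br, 1);  /* return buffer for reuse */
    }
}
```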

Benchmark Highlights

Typical measurements on a 1000‑connection echo server show:

Throughput: multishot receive improves QPS by 6-8% over single-shot receives and up to 10% over epoll.

Latency: p99 and p99.9 tail latency are reduced by 25-30% compared with epoll under saturation.

Standard tools such as wrk or iperf can reproduce these results when the kernel version supports the required io_uring features.

[Figure: benchmark chart]

Impact on Event‑Loop Architecture

Adopting io_uring shifts a server from a reactor loop (register‑then‑react) to a proactor model (submit‑then‑complete). The kernel drives completion, so the application no longer needs to re‑issue accept() or recv() after each event. This simplifies code, reduces system‑call overhead, and scales across cores, typically with one ring per worker thread.
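The resulting loop is organized around completions rather than readiness. A sketch, assuming requests were tagged with per‑request context via io_uring_sqe_set_data() and dispatched through a hypothetical handle_completion() function:

```c
/* Proactor-style completion loop: drain CQEs in batches.
 * handle_completion() is a hypothetical application dispatcher. */
#include <liburing.h>

void handle_completion(void *ctx, int res, unsigned flags);

void completion_loop(struct io_uring *ring)
{
    struct io_uring_cqe *cqe;
    unsigned head, seen;

    for (;;) {
        io_uring_submit_and_wait(ring, 1);  /* flush SQ, block for >= 1 CQE */
        seen = 0;
        io_uring_for_each_cqe(ring, head, cqe) {
            void *ctx = io_uring_cqe_get_data(cqe);  /* per-request context */
            handle_completion(ctx, cqe->res, cqe->flags);
            seen++;
        }
        io_uring_cq_advance(ring, seen);    /* mark the batch consumed */
    }
}
```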

Integration challenges:

Many cross‑platform libraries (libuv, asyncio, Java NIO) still rely on epoll for sockets; a Linux‑specific code path or a library that wraps io_uring (e.g., liburing in C, or tokio-uring and glommio in Rust) is required to reap the full benefits.

Fallback to epoll should be implemented for environments where io_uring is unavailable or disabled by security policies.

Practical Migration Guide

Kernel version: Use Linux 6.0+ for multishot receive or 5.19+ for multishot accept.

Library support: Prefer the official liburing C library. Language‑specific bindings include tokio-uring (Rust), glommio (Rust), and Python wrappers built on liburing.

Two‑stage migration

Stage 1 – Replace epoll_wait with an io_uring poll operation (e.g., IORING_OP_POLL_ADD or multishot poll) to cut the epoll‑related system calls (see the sketch after Stage 2).

Stage 2 – Move the actual read/write operations to io_uring, adopt multishot accept/receive, and manage a provided‑buffer pool.
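A sketch of the Stage 1 substitution, assuming an initialized ring, an existing non‑blocking sock_fd, and a per‑connection conn pointer (the latter two are hypothetical names):

```c
/* Stage 1: multishot poll replaces repeated epoll_wait calls.
 * sock_fd and conn are assumed/hypothetical names. */
#include <liburing.h>
#include <poll.h>

void arm_poll(struct io_uring *ring, int sock_fd, void *conn)
{
    struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
    io_uring_prep_poll_multishot(sqe, sock_fd, POLLIN);
    io_uring_sqe_set_data(sqe, conn);   /* recovered on each readiness CQE */
    io_uring_submit(ring);
    /* The existing read/write handlers run unchanged at this stage;
     * only the wait mechanism has moved to io_uring. */
}
```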

Reproducible testing: Run identical workloads on the epoll and io_uring versions; record QPS, CPU utilization, and latency percentiles (p50, p99, p99.9).

Buffer management: Allocate a sufficient number of buffers (e.g., thousands) sized for the expected message size. Monitor for -ENOBUFS errors, which indicate pool exhaustion.
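A sketch of handling pool exhaustion and re‑arming, reusing the names from the multishot receive sketch above:

```c
/* Re-arm multishot receive after -ENOBUFS (pool exhaustion).
 * ring, client_fd, and BGID follow the earlier sketch; buffers are
 * assumed to have been returned to the ring before re-arming. */
#include <errno.h>
#include <liburing.h>

static void on_recv_cqe(struct io_uring *ring, struct io_uring_cqe *cqe,
                        int client_fd, int bgid)
{
    if (cqe->res == -ENOBUFS) {
        struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
        io_uring_prep_recv_multishot(sqe, client_fd, NULL, 0, 0);
        sqe->flags |= IOSQE_BUFFER_SELECT;
        sqe->buf_group = bgid;
        io_uring_submit(ring);
    }
}
```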

Advanced tuning: Consider SQ polling threads, registered files, and CPU affinity for ultra‑low‑latency deployments.
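A sketch of the first two knobs; the queue depth and idle timeout are illustrative values:

```c
/* SQ polling: a kernel thread polls the SQ so submissions can avoid
 * io_uring_enter entirely. Registered files skip per-op fd lookup.
 * Queue depth and idle timeout are illustrative values. */
#include <liburing.h>
#include <string.h>

int setup_tuned_ring(struct io_uring *ring, int listen_fd)
{
    struct io_uring_params p;
    memset(&p, 0, sizeof(p));
    p.flags = IORING_SETUP_SQPOLL;   /* kernel-side SQ polling thread */
    p.sq_thread_idle = 2000;         /* ms of idle before the thread sleeps */

    int ret = io_uring_queue_init_params(256, ring, &p);
    if (ret < 0)
        return ret;

    int fds[1] = { listen_fd };
    ret = io_uring_register_files(ring, fds, 1);
    /* Subsequent SQEs can use index 0 with the IOSQE_FIXED_FILE flag. */
    return ret;
}
```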

Fallback plan: Provide a runtime switch that falls back to epoll if io_uring encounters kernel bugs or security restrictions.
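A sketch of a startup probe that drives such a switch (the helper name is hypothetical):

```c
/* Probe io_uring availability at startup; fall back to epoll when the
 * syscall is missing (ENOSYS) or blocked by policy (EPERM).
 * io_uring_usable() is a hypothetical helper name. */
#include <liburing.h>
#include <stdbool.h>

static bool io_uring_usable(void)
{
    struct io_uring ring;
    if (io_uring_queue_init(8, &ring, 0) < 0)
        return false;               /* e.g., -ENOSYS or -EPERM */
    io_uring_queue_exit(&ring);
    return true;
}
```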

Reference Implementations

DylanZA/netbench – network benchmark suite that includes multishot receive tests.

frevib/io_uring-echo-server – open‑source echo server using io_uring; useful for side‑by‑side epoll comparison.

alexhultman/io_uring_epoll_benchmark – early benchmark showing the evolution of io_uring performance.

