Understanding Linux I/O Schedulers: NOOP, CFQ, Deadline, and Anticipatory
This article explains the four Linux kernel I/O schedulers—NOOP, Anticipatory, Deadline, and CFQ—covering their design goals, how they manage request queues through merging and sorting, and when each scheduler is best suited for different storage hardware and workloads.
Linux I/O Scheduler Overview
The Linux kernel provides four built‑in I/O schedulers: NOOP, CFQ (Completely Fair Queuing), DEADLINE and ANTICIPATORY. An I/O scheduler determines the order in which block‑device requests are submitted to the device, with the goal of maximizing throughput while minimizing latency.
Fundamental Concepts
When a process reads from or writes to a block device, the request is placed in a per‑device request queue (request_queue).
Each block device (or its partition) maintains its own queue.
The scheduler may reorder, merge, or split requests before they reach the device driver, turning unordered I/O into an ordered, more efficient stream.
Before scheduling, the kernel counts the total number of pending requests in the queue.
The two primary mechanisms that schedulers rely on are:
Merge: Adjacent requests that target consecutive sectors are combined into a single larger request, reducing the number of I/O operations.
Sort: Requests are ordered by sector number (or by deadline) to minimise head movement on rotating media.
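The two mechanisms can be illustrated with a small Python model. This is a minimal sketch, not kernel code: the `Request` class and `sort_and_merge` function are hypothetical names, and the model ignores request direction (read or write) and any limit on merged request size.

```python
from dataclasses import dataclass


@dataclass
class Request:
    sector: int      # first sector touched by the request
    nr_sectors: int  # length of the request in sectors

    @property
    def end(self) -> int:
        # First sector after the request.
        return self.sector + self.nr_sectors


def sort_and_merge(requests):
    """Order requests by start sector, then merge contiguous neighbours."""
    merged = []
    for req in sorted(requests, key=lambda r: r.sector):
        if merged and merged[-1].end == req.sector:
            # The request starts exactly where the previous one ends,
            # so the two can be issued as one larger request.
            prev = merged[-1]
            merged[-1] = Request(prev.sector, prev.nr_sectors + req.nr_sectors)
        else:
            merged.append(Request(req.sector, req.nr_sectors))
    return merged


pending = [Request(100, 8), Request(16, 8), Request(108, 8), Request(8, 8)]
for req in sort_and_merge(pending):
    print(req)
```

In this example four unordered requests collapse into two larger ones, which is the effect the merge and sort steps aim for on rotating media.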
1. NOOP Scheduler
NOOP implements a simple FIFO queue with minimal processing:
If a new request can be merged with the previous one, it is merged.
If merging is not possible, the request is appended to the tail of the queue. NOOP does not sort requests by sector.
Typical use cases: devices that already perform internal scheduling (e.g., hardware RAID, NAS), applications that issue well‑ordered I/O, and non‑rotational storage such as SSDs where seek latency is negligible.
Because NOOP does not attempt to optimise seek patterns, it is the preferred choice for SSDs and other flash‑based devices.
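The FIFO-plus-merge behaviour can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the `NoopQueue` class is hypothetical, only a merge with the most recently queued request is modelled, and no sorting is attempted.

```python
from collections import deque


class NoopQueue:
    """Toy FIFO queue that only merges with the most recently added request."""

    def __init__(self):
        self.fifo = deque()  # each entry is [start_sector, nr_sectors]

    def add(self, sector, nr_sectors):
        if self.fifo:
            last = self.fifo[-1]
            if last[0] + last[1] == sector:
                # New request continues the previous one: extend it in place.
                last[1] += nr_sectors
                return
        # Otherwise keep arrival order and append to the tail.
        self.fifo.append([sector, nr_sectors])

    def dispatch(self):
        """Hand the oldest request to the device, or None if the queue is empty."""
        return tuple(self.fifo.popleft()) if self.fifo else None


q = NoopQueue()
for sector in (0, 8, 100, 50):
    q.add(sector, 8)
```

Requests at sectors 0 and 8 are merged, while 100 and 50 are dispatched in arrival order even though 50 has the lower sector number.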
2. CFQ (Completely Fair Queuing)
CFQ creates a separate request queue for each process and assigns a time slice (service budget) to that queue. While a process holds its slice, its I/O requests are dispatched to the underlying block device. When the slice expires, the queue is paused and another process’s queue is serviced.
CFQ respects three priority classes (high‑to‑low): RT (real‑time), BE (best‑effort) and IDLE. The RT and BE classes are each divided into eight levels (0 to 7, with 0 the highest); IDLE has no levels. The ionice command can be used to view or change a process’s I/O priority; higher priority results in a larger time slice and a larger number of requests processed per slice.
Synchronous (blocking) requests are placed in the per‑process queue, while asynchronous requests from all processes share a common pool of 17 queues (8 RT + 8 BE + 1 IDLE). CFQ became the default scheduler in Linux 2.6.18 and works well for general‑purpose workloads, but benchmarking is recommended for specific use cases.
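The per-process queues and slices can be modelled roughly as follows. This is a sketch under simplifying assumptions: the real scheduler measures slices in time, whereas this model counts requests, and the `slice_budget` formula and function names are hypothetical, chosen only so that a lower priority number yields a larger budget.

```python
from collections import deque


def slice_budget(prio, base=4):
    """Hypothetical budget in requests per turn; prio 0 is highest, 7 lowest."""
    return base + (4 - prio)  # prio 0 -> 8, prio 4 -> 4, prio 7 -> 1


def cfq_dispatch(queues, prios):
    """Serve per-process queues round-robin, each for its own budget.

    queues: dict mapping process name -> deque of requests
    prios:  dict mapping process name -> priority level 0..7
    Returns the dispatch order as a list of (process, request) pairs.
    """
    order = []
    active = deque(pid for pid, q in queues.items() if q)
    while active:
        pid = active.popleft()
        q = queues[pid]
        for _ in range(slice_budget(prios[pid])):
            if not q:
                break
            order.append((pid, q.popleft()))
        if q:
            # Slice used up with work remaining: go to the back of the line.
            active.append(pid)
    return order


queues = {"A": deque(range(6)), "B": deque(range(3))}
prios = {"A": 4, "B": 7}
order = cfq_dispatch(queues, prios)
```

Process A, with the higher priority, gets four requests per turn while B gets one, so A finishes its six requests before B has dispatched its second.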
3. DEADLINE Scheduler
DEADLINE keeps requests in a sector‑sorted queue and adds explicit FIFO queues for reads and writes. Each FIFO has a configurable maximum wait time (default 500 ms for reads, 5 s for writes). A request whose deadline has passed is serviced from its FIFO ahead of the sorted order, with the read FIFO taking precedence over the write FIFO:
FIFO(Read) > FIFO(Write) > sorted queue
This design prevents starvation of latency‑sensitive reads and is well‑suited for database servers or other workloads where read latency is critical. The timeout values, in milliseconds, can be tuned via /sys/block/<dev>/queue/iosched/read_expire and /sys/block/<dev>/queue/iosched/write_expire.
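The priority order above can be sketched as a dispatch function. This is a simplified model, not the kernel algorithm: the `DeadlineQueue` class is hypothetical, it keeps one list instead of separate sorted and FIFO structures, and it omits batching and the kernel's protection against write starvation.

```python
READ_EXPIRE_MS = 500    # default read deadline quoted in the text
WRITE_EXPIRE_MS = 5000  # default write deadline quoted in the text


class DeadlineQueue:
    def __init__(self):
        self.pending = []  # requests in arrival order

    def add(self, op, sector, now_ms):
        expire = READ_EXPIRE_MS if op == "read" else WRITE_EXPIRE_MS
        self.pending.append({"op": op, "sector": sector, "deadline": now_ms + expire})

    def dispatch(self, now_ms):
        """Return (op, sector) of the next request, or None if idle."""
        if not self.pending:
            return None
        for op in ("read", "write"):  # expired reads win over expired writes
            fifo = [r for r in self.pending if r["op"] == op]
            # pending is in arrival order, so fifo[0] is the oldest request.
            if fifo and fifo[0]["deadline"] <= now_ms:
                chosen = fifo[0]
                break
        else:
            # Nothing has expired: fall back to sector order.
            chosen = min(self.pending, key=lambda r: r["sector"])
        self.pending.remove(chosen)
        return chosen["op"], chosen["sector"]


dq = DeadlineQueue()
dq.add("read", 900, now_ms=0)
dq.add("write", 300, now_ms=0)
dq.add("read", 100, now_ms=400)
```

Dispatching at 600 ms serves the read at sector 900 first, because its 500 ms deadline has passed, even though two requests with lower sector numbers are waiting.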
4. ANTICIPATORY Scheduler
ANTICIPATORY builds on DEADLINE by adding a short wait window (default 6 ms) after a read request completes. If another read for an adjacent sector arrives within this window, the scheduler services it immediately instead of seeking elsewhere, anticipating the sequential reads that often follow a random read. This improves performance for mixed random‑and‑sequential workloads. ANTICIPATORY was removed from the mainline kernel in Linux 2.6.33, so it is available only on older kernels.
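The wait-window decision can be sketched as a single function. This is a minimal illustration under stated assumptions: `next_action` is a hypothetical name, "adjacent" is modelled as a read starting exactly where the previous one ended, and the per-process statistics the real scheduler used to decide whether waiting is worthwhile are left out.

```python
ANTIC_EXPIRE_MS = 6  # default wait window quoted in the text


def next_action(last_read_end, last_done_ms, now_ms, queued):
    """Decide what to do after a read has completed.

    last_read_end: first sector after the read that just finished
    last_done_ms:  time at which that read completed
    queued:        list of (op, sector) tuples in arrival order
    Returns ("dispatch", request), ("wait", remaining_ms) or ("idle", None).
    """
    for req in queued:
        op, sector = req
        if op == "read" and sector == last_read_end:
            # The anticipated sequential read arrived: serve it right away.
            return ("dispatch", req)
    remaining = ANTIC_EXPIRE_MS - (now_ms - last_done_ms)
    if remaining > 0:
        # Still inside the window: keep the head where it is.
        return ("wait", remaining)
    if queued:
        # Window expired without a nearby read: resume normal dispatch.
        return ("dispatch", queued[0])
    return ("idle", None)
```

For example, with a write queued at a distant sector, the function keeps returning "wait" until either a contiguous read shows up or 6 ms have elapsed.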
Choosing the Appropriate Scheduler
The optimal scheduler depends on the storage medium and workload characteristics:
Rotating disks (SAS/HDD): CFQ, DEADLINE or ANTICIPATORY can provide good performance; DEADLINE often yields the best latency‑throughput balance for database servers.
Solid‑state drives (SSD, NVMe, Fusion‑IO): NOOP is usually the most efficient because there is no mechanical seek latency to hide.
Workloads with heavy synchronous writes (e.g., fsync‑intensive logging): Avoid DEADLINE if the write queue’s long timeout could delay fsync completion.
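To check which scheduler a device currently uses, the kernel exposes /sys/block/<dev>/queue/scheduler, which lists the available schedulers with the active one in square brackets, and /sys/block/<dev>/queue/rotational, which contains 1 for rotating media. The Python sketch below parses those files; the function names and the simple "deadline for rotating disks, noop otherwise" rule are assumptions that mirror the guidance above, and newer kernels may list different scheduler names.

```python
from pathlib import Path


def parse_scheduler_line(line):
    """Split e.g. 'noop deadline [cfq]' into (active, available)."""
    names = line.split()
    active = next(
        (n[1:-1] for n in names if n.startswith("[") and n.endswith("]")), None
    )
    available = [n.strip("[]") for n in names]
    return active, available


def suggest(rotational, available):
    """Hypothetical rule of thumb taken from the list above."""
    preferred = "deadline" if rotational else "noop"
    # Return None when the preferred scheduler is not offered by this kernel.
    return preferred if preferred in available else None


def inspect(dev):
    """Read the live settings for a device such as 'sda' (Linux only)."""
    queue = Path("/sys/block") / dev / "queue"
    active, available = parse_scheduler_line((queue / "scheduler").read_text())
    rotational = (queue / "rotational").read_text().strip() == "1"
    return active, suggest(rotational, available)
```

Calling inspect("sda") on a Linux host returns the active scheduler and the suggestion; changing the scheduler is done by writing a name to the same scheduler file as root, and any change should be validated with a benchmark of the real workload.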
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge—fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc. (Reply “Linux” to receive essential resources.)
