Multithreaded Parallel Writeback: Vivo’s Exploration of Page Cache Write Acceleration
The article examines Linux's page‑cache writeback mechanism, explains why the single writeback thread per block device becomes a bottleneck under heavy writes, and details Vivo's extensions to the multithreaded writeback patches, including filesystem‑aware inode‑to‑context mapping and a sysfs‑tunable thread count. Reported results include writeback throughput reaching 2.4 GB/s on XFS and a 22 % gain on F2FS, along with a discussion of fragmentation trade‑offs and of matching the thread count to the allocation‑group layout.
Background: Linux Page‑Cache Writeback
When a file is written via buffered I/O, Linux stores the data in the page cache and flushes dirty pages only after a timeout expires or the amount of dirty data exceeds a threshold. Under light write loads the kernel writes the data back in the background without the application noticing, but a burst of writes can fill the pool of allowed dirty pages, at which point the writing process blocks until writeback catches up.
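In a stock kernel, those triggers are controlled by the usual vm sysctls under /proc/sys/vm. A minimal sketch (helper name is illustrative) that prints the relevant knobs:

    /* Minimal sketch: print the vm sysctls that govern when dirty pages
     * are written back (values in centiseconds / percent of memory). */
    #include <stdio.h>

    static void print_knob(const char *path)
    {
        char buf[64];
        FILE *f = fopen(path, "r");

        if (f && fgets(buf, sizeof(buf), f))
            printf("%-40s %s", path, buf);
        if (f)
            fclose(f);
    }

    int main(void)
    {
        print_knob("/proc/sys/vm/dirty_expire_centisecs"); /* age-based flush */
        print_knob("/proc/sys/vm/dirty_background_ratio"); /* background writeback starts */
        print_knob("/proc/sys/vm/dirty_ratio");            /* writers start blocking */
        return 0;
    }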
Prior Work and Limitations
Community research showed that using large folios for writeback can dramatically increase throughput (for example, from 800 MB/s to 2.4 GB/s on XFS over NVMe) because a single thread iterates over larger units. However, this approach does not address the fundamental bottleneck: each block device still has only one writeback thread.
Parallel Writeback Patchset
Kundan Kumar’s patchset introduced multiple writeback contexts per block device, turning the single dirty‑inode list into several lists processed by a thread pool whose default size equals the number of CPUs. This multithreaded design removes the serialisation point of a single writeback thread.
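The patchset itself changes kernel structures, but the shape of the idea can be shown with a small user‑space sketch. All names here (wb_context, pick_context) are hypothetical, not the patch's actual identifiers; the point is one dirty‑inode list per context plus a default modulo mapping from inode number to context.

    /* Illustrative sketch only: one dirty list per writeback context,
     * with inodes spread across contexts by a simple modulo mapping. */
    #include <stdio.h>

    #define NR_CONTEXTS 4   /* the patchset's default is the number of CPUs */

    struct wb_context {
        unsigned long nr_dirty_inodes;   /* stand-in for the per-context dirty list */
    };

    static struct wb_context contexts[NR_CONTEXTS];

    /* Default policy: pick a context from the inode number. */
    static struct wb_context *pick_context(unsigned long ino)
    {
        return &contexts[ino % NR_CONTEXTS];
    }

    int main(void)
    {
        for (unsigned long ino = 1; ino <= 16; ino++)
            pick_context(ino)->nr_dirty_inodes++;

        for (int i = 0; i < NR_CONTEXTS; i++)
            printf("context %d: %lu dirty inodes\n", i, contexts[i].nr_dirty_inodes);
        return 0;
    }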
Vivo’s Extensions
Building on the above, Vivo’s contributors Wang Yufei and Zhang Xirui focused on two enhancements, both implemented for XFS:
Allow the filesystem to bind specific inodes to particular writeback contexts instead of using a simple modulo mapping, reducing cross‑thread allocation‑group lock contention.
Expose the number of writeback threads via a sysfs entry (/sys/class/bdi/<major>:<minor>/nwritebacks) so that the count can be tuned independently of the CPU count; both changes are sketched below.
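A hedged sketch of both ideas, again with hypothetical names: choosing a context from the inode's allocation group instead of a plain modulo over inode numbers, and writing the desired thread count to the nwritebacks entry (which only exists with the patches applied; "8:16" stands in for a real device's major:minor numbers).

    /* Hypothetical illustration, not the actual kernel patch. */
    #include <stdio.h>

    /* Simplifying assumption: inodes are numbered contiguously within an
     * allocation group, so dividing by inodes-per-AG yields the AG index.
     * Real XFS encodes the AG number in the high bits of the inode number. */
    static unsigned int context_for_inode(unsigned long ino,
                                          unsigned long inodes_per_ag,
                                          unsigned int nr_contexts)
    {
        unsigned long ag = ino / inodes_per_ag;
        return ag % nr_contexts;   /* all inodes of one AG land in the same context */
    }

    int main(void)
    {
        /* Tune the per-device thread count via the sysfs entry described
         * above; the path is a placeholder and requires the patches. */
        FILE *f = fopen("/sys/class/bdi/8:16/nwritebacks", "w");
        if (f) {
            fprintf(f, "4\n");     /* e.g. one thread per allocation group */
            fclose(f);
        }

        printf("inode 123456 maps to context %u\n",
               context_for_inode(123456, 65536, 4));
        return 0;
    }

Keeping every inode of an allocation group on the same writeback thread is what avoids the cross‑thread AG lock contention mentioned above.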
Experimental Results
QEMU tests with an emulated 20 GB NVMe SSD (8 CPU cores, 4 GB RAM) showed that setting the writeback thread count equal to the number of allocation groups (AGs) yields the best performance for XFS. On real hardware, Samsung’s measurements reported writeback speeds rising from 800 MB/s to 2.4 GB/s when multiple threads are used.
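As a sketch of how that tuning rule might be applied, assuming the xfsprogs development headers (<xfs/xfs.h>) are installed, the allocation‑group count of a mounted XFS filesystem can be read with the XFS_IOC_FSGEOMETRY ioctl and used as the value written to nwritebacks:

    /* Sketch: query the XFS geometry and report the allocation-group count
     * as the suggested writeback thread count. Error handling kept minimal;
     * assumes the xfsprogs headers are available. */
    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <xfs/xfs.h>

    int main(int argc, char **argv)
    {
        struct xfs_fsop_geom geo;
        int fd = open(argc > 1 ? argv[1] : "/mnt/xfs", O_RDONLY);

        if (fd < 0 || ioctl(fd, XFS_IOC_FSGEOMETRY, &geo) < 0) {
            perror("xfs geometry");
            return 1;
        }
        /* Per the experiments above, one writeback thread per AG worked best. */
        printf("agcount = %u; suggested nwritebacks = %u\n",
               geo.agcount, geo.agcount);
        close(fd);
        return 0;
    }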
For F2FS on UFS devices, the authors observed a 22 % performance gain with their parallel writeback implementation.
Fragmentation Trade‑offs
The authors note that binding each inode to a single writeback context can reduce filesystem fragmentation caused by concurrent writes to the same inode, but it also eliminates parallelism for that inode, which may hurt workloads that heavily write a single file.
Even with each inode confined to a single writeback context, the final experiments indicated that multithreaded writeback still increased fragmentation overall, suggesting that further investigation is needed.
Conclusion
Multithreaded writeback can substantially improve write throughput on modern SSDs and filesystems, provided that the number of writeback threads is tuned to the allocation‑group layout and that inode‑to‑context mapping is managed to balance performance against fragmentation.