
Analysis of ext4 Soft Lockup Caused by Extent Status LRU Lock Contention in Linux 3.10

This article examines a soft lockup in the Linux 3.10 kernel in which the ext4 extent-status LRU spinlock is held for more than 20 seconds under memory pressure. It covers ext4's delayed-allocation mechanism, the block lookup path, and extent-status cache shrinking, and presents the community's mitigation.

Tencent Database Technology

The article opens with a production incident: a customer saw a sudden spike in slow queries together with gaps in monitoring data. The kernel log showed a soft lockup (CPU#2 stuck for 22 seconds) in ext4_es_lru_add, indicating that the CPU was stuck waiting for the spinlock s_es_lru_lock because the lock was being held for too long.

It then analyses the kernel source of ext4_es_lru_add, which acquires s_es_lru_lock, updates the inode's position on the LRU list, and releases the lock. This critical section is short, so the contention has to come from the other user of the lock: under heavy memory pressure the same lock is held while the LRU list and the extent-status trees are traversed to reclaim large numbers of extents.
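The locking pattern described above can be sketched in user space. This is a minimal model, not the kernel source: lru_sb, lru_inode, and the list helpers are hypothetical names, a C11 atomic_flag stands in for the kernel spinlock, and the assumption that a touched inode moves to the tail of the list is part of the model.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Minimal circular doubly linked list, similar in spirit to list_head. */
struct list_node {
    struct list_node *prev, *next;
};

static void list_init(struct list_node *n) { n->prev = n->next = n; }

static int list_is_empty(const struct list_node *n) { return n->next == n; }

static void list_unlink(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    list_init(n);
}

static void list_push_tail(struct list_node *n, struct list_node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Hypothetical stand-ins for the per-filesystem and per-inode state. */
struct lru_sb {
    atomic_flag lru_lock;      /* models s_es_lru_lock */
    struct list_node lru;      /* coldest inode first, hottest last */
};

struct lru_inode {
    int ino;
    struct list_node lru_link;
};

void lru_sb_init(struct lru_sb *sb)
{
    atomic_flag_clear(&sb->lru_lock);
    list_init(&sb->lru);
}

void lru_inode_init(struct lru_inode *in, int ino)
{
    in->ino = ino;
    list_init(&in->lru_link);
}

/* Every touch takes the global lock, even though the work inside is tiny. */
void lru_add(struct lru_sb *sb, struct lru_inode *in)
{
    while (atomic_flag_test_and_set_explicit(&sb->lru_lock, memory_order_acquire))
        ; /* spin: this is where a CPU burns time if the holder is slow */
    if (!list_is_empty(&in->lru_link))
        list_unlink(&in->lru_link);
    list_push_tail(&in->lru_link, &sb->lru);
    atomic_flag_clear_explicit(&sb->lru_lock, memory_order_release);
}

/* Inode number at the cold end, or -1 if the list is empty.
 * Takes no lock, so it is only safe for single-threaded inspection. */
int lru_coldest(const struct lru_sb *sb)
{
    if (list_is_empty(&sb->lru))
        return -1;
    const struct lru_inode *in = (const struct lru_inode *)
        ((const char *)sb->lru.next - offsetof(struct lru_inode, lru_link));
    return in->ino;
}
```

The point of the sketch is that the hot path and any long-running list walker share one lock, so a cheap operation can still spin for as long as the walker holds it.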

The article explains ext4's delayed-allocation mechanism, which postpones block allocation from write time until writeback so that the filesystem can allocate contiguous extents and reduce fragmentation. It also describes the block lookup path through ext4_get_block, which fills in a struct ext4_map_blocks, calls ext4_map_blocks, and ultimately consults the extent-status tree.
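The effect of delayed allocation can be illustrated with a toy model. Everything here is an assumption made for illustration (dalloc_file, dalloc_allocator, a fixed 64-block file, a bump allocator); it only shows the idea that writes mark blocks as delayed and that writeback allocates one contiguous range per run of delayed blocks.

```c
#include <stdbool.h>
#include <stddef.h>

#define DALLOC_MAX_BLOCKS 64

/* Hypothetical per-file state: which logical blocks are dirty but unallocated. */
struct dalloc_file {
    bool delayed[DALLOC_MAX_BLOCKS];
    long phys[DALLOC_MAX_BLOCKS];   /* physical block number, -1 if unmapped */
};

/* Hypothetical bump allocator: hands out the next free physical block. */
struct dalloc_allocator {
    long next_free;
};

void dalloc_file_init(struct dalloc_file *f)
{
    for (size_t i = 0; i < DALLOC_MAX_BLOCKS; i++) {
        f->delayed[i] = false;
        f->phys[i] = -1;
    }
}

/* Buffered write: remember that the block needs space, but pick no block yet. */
void dalloc_write(struct dalloc_file *f, size_t lblk)
{
    if (lblk < DALLOC_MAX_BLOCKS && f->phys[lblk] < 0)
        f->delayed[lblk] = true;
}

/* Writeback: allocate one contiguous physical range per run of delayed
 * logical blocks. Returns the number of extents that were allocated. */
size_t dalloc_writeback(struct dalloc_file *f, struct dalloc_allocator *a)
{
    size_t extents = 0;
    size_t i = 0;

    while (i < DALLOC_MAX_BLOCKS) {
        if (!f->delayed[i]) {
            i++;
            continue;
        }
        size_t start = i;
        while (i < DALLOC_MAX_BLOCKS && f->delayed[i])
            i++;
        size_t len = i - start;
        long pstart = a->next_free;
        a->next_free += (long)len;
        for (size_t j = 0; j < len; j++) {
            f->phys[start + j] = pstart + (long)j;
            f->delayed[start + j] = false;
        }
        extents++;
    }
    return extents;
}
```

In this model, four writes issued in any order to logical blocks 0 to 3 end up in a single four-block extent, because the allocation decision is made once, when the full range is known.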

Further, it details how the extent-status tree is searched (ext4_es_lookup_extent) and how a miss falls back to the on-disk extent tree via ext4_ext_find_extent. It also presents the layout of an extent-status node (struct extent_status) and the management of the LRU list.
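A simplified picture of the cache lookup is sketched below. It is not the kernel data structure: the kernel keeps extent-status nodes in a red-black tree, whereas this sketch assumes a sorted array and binary search, and the type and function names (es_extent, es_lookup, es_map_block) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum es_status { ES_WRITTEN, ES_UNWRITTEN, ES_DELAYED, ES_HOLE };

/* Hypothetical cached extent: a run of logical blocks and where it lives. */
struct es_extent {
    uint32_t lblk;          /* first logical block covered */
    uint32_t len;           /* number of blocks covered */
    uint64_t pblk;          /* first physical block (if allocated) */
    enum es_status status;
};

/* Find the cached extent covering lblk. The array must be sorted by lblk
 * and non-overlapping. Returns NULL on a cache miss. */
const struct es_extent *es_lookup(const struct es_extent *es, size_t n,
                                  uint32_t lblk)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (lblk < es[mid].lblk)
            hi = mid;
        else if (lblk - es[mid].lblk >= es[mid].len)
            lo = mid + 1;
        else
            return &es[mid];
    }
    return NULL;
}

/* Translate a logical block to a physical one using only the cache.
 * Returns 1 and sets *pblk when the cache yields a physical block.
 * Returns 0 on a miss (where a real filesystem would consult the on-disk
 * extent tree) and for delayed or hole extents, which have no block yet. */
int es_map_block(const struct es_extent *es, size_t n, uint32_t lblk,
                 uint64_t *pblk)
{
    const struct es_extent *e = es_lookup(es, n, lblk);

    if (e == NULL)
        return 0;
    if (e->status != ES_WRITTEN && e->status != ES_UNWRITTEN)
        return 0;
    *pblk = e->pblk + (lblk - e->lblk);
    return 1;
}
```

A hit is answered from memory with a handful of comparisons, which is why the cache is worth keeping and why its size has to be bounded by a shrinker.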

The extent-status cache shrinker (ext4_es_shrink) is examined next. It takes s_es_lru_lock, iterates over the inodes on the LRU list, and reclaims written and unwritten extents until a target count is reached. Under high memory pressure the shrinker runs frequently and holds s_es_lru_lock for long stretches, which can cause the observed soft lockup.
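Why the hold time can grow is easier to see in a toy version of the scan. The sketch below is an assumption-laden model rather than the kernel function: shrink_inode and shrink_scan are hypothetical names, the LRU list is a plain array, and the lock is only indicated by comments.

```c
#include <stddef.h>

/* Hypothetical per-inode counters for cached extents. */
struct shrink_inode {
    int nr_reclaimable;   /* written/unwritten extents that may be dropped */
    int nr_delayed;       /* delayed extents, which must stay cached */
};

struct shrink_result {
    int reclaimed;        /* extents actually freed */
    int inodes_scanned;   /* list entries visited while "holding the lock" */
};

/* Walk the LRU from the cold end and reclaim up to nr_to_scan extents. */
struct shrink_result shrink_scan(struct shrink_inode *lru, size_t n,
                                 int nr_to_scan)
{
    struct shrink_result r = { 0, 0 };

    /* In the scenario described above, the LRU lock is taken here ... */
    for (size_t i = 0; i < n && nr_to_scan > 0; i++) {
        int take = lru[i].nr_reclaimable < nr_to_scan
                       ? lru[i].nr_reclaimable
                       : nr_to_scan;

        r.inodes_scanned++;
        lru[i].nr_reclaimable -= take;
        r.reclaimed += take;
        nr_to_scan -= take;
    }
    /* ... and only released here, after the whole walk. */
    return r;
}
```

If the cold end of the list is dominated by inodes that hold only delayed extents, the loop visits many entries before it frees anything, so the time under the lock scales with the length of the list rather than with the number of extents requested.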

To mitigate the issue, the community added a timestamp field (i_touch_when) to ext4_inode_info and sorts the LRU list by last access time, which lets the shrinker skip recently used inodes and reduces lock contention. The patch also records the time of the last sort in es_stats_last_sorted.
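The idea behind the mitigation can be sketched as follows. This is loosely modelled on the description above and does not reproduce the patch's control flow: ts_inode, ts_touch, and ts_shrink are hypothetical names, the list is an array sorted with qsort, and the retry-after-sort policy is an assumption of the sketch.

```c
#include <stdlib.h>

/* Hypothetical inode state: touch_when plays the role of i_touch_when. */
struct ts_inode {
    int ino;
    unsigned long touch_when;
    int nr_reclaimable;
};

static int by_touch_when(const void *a, const void *b)
{
    const struct ts_inode *x = a, *y = b;

    return (x->touch_when > y->touch_when) - (x->touch_when < y->touch_when);
}

/* Hot path: only record the access time, without reordering the list. */
void ts_touch(struct ts_inode *in, unsigned long now)
{
    in->touch_when = now;
}

/* Reclaim up to nr_to_scan extents. Inodes touched after the last sort are
 * skipped, because their position in the list no longer reflects their age.
 * If nothing could be reclaimed, sort by touch time once and retry.
 * Assumes every touch_when is <= now. Returns the number reclaimed. */
int ts_shrink(struct ts_inode *lru, size_t n, unsigned long *last_sorted,
              unsigned long now, int nr_to_scan)
{
    int reclaimed = 0;

    for (int pass = 0; pass < 2; pass++) {
        for (size_t i = 0; i < n && nr_to_scan > 0; i++) {
            if (lru[i].touch_when > *last_sorted)
                continue; /* recently used: leave it alone */
            int take = lru[i].nr_reclaimable < nr_to_scan
                           ? lru[i].nr_reclaimable
                           : nr_to_scan;
            lru[i].nr_reclaimable -= take;
            reclaimed += take;
            nr_to_scan -= take;
        }
        if (reclaimed > 0 || pass == 1)
            break;
        qsort(lru, n, sizeof *lru, by_touch_when); /* coldest first */
        *last_sorted = now;
    }
    return reclaimed;
}
```

The design choice worth noting is that ordering work moves from every access to the comparatively rare shrink pass, so the frequent path touches a single field instead of rearranging a shared list.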

In conclusion, the analysis draws two lessons. First, identify potential bottlenecks such as long-held spinlocks, and either narrow the critical section or, where the code is allowed to sleep, replace the spinlock with a sleeping primitive such as a mutex. Second, proactively optimise code paths that may become hot under load.

Tags: Linux kernel, ext4, extent status tree, LRU spin lock, memory pressure, soft lockup
Written by

Tencent Database Technology

Tencent's Database R&D team supports internal services such as WeChat Pay, WeChat Red Packets, Tencent Advertising, and Tencent Music, and provides external support on Tencent Cloud for TencentDB products like CynosDB, CDB, and TDSQL. This public account aims to promote and share professional database knowledge, growing together with database enthusiasts.
