
How One Line of Code Boosted Linux Kernel Memory Allocation by 40×

A single commit to the Linux kernel restricted transparent huge page (THP) alignment to anonymous mappings whose size is a multiple of the PMD size. In a synthetic benchmark this delivered a near‑40‑fold increase in 1‑byte malloc throughput, illustrating how a precise memory‑management tweak can dramatically improve system performance.

Liangxu Linux
Source: Linux LKML public inbox (link1, link2)

Performance boost observed in the will‑it‑scale benchmark

The benchmark was run on a server equipped with four Intel Xeon Platinum 8380H (Cooper Lake) sockets, providing 224 hardware threads. In the synthetic will‑it‑scale test, the throughput of 1‑byte malloc increased by 3889 % (approximately 40×) after a single kernel commit.

Kernel commit that enabled the improvement

Commit ID: d4148aeab412432bf928f311eca8a2ba52bb05df
Change summary: In the memory‑management (mm) mmap path, limit transparent huge page (THP) alignment of anonymous mappings to sizes that are a multiple of the Page Middle Directory (PMD) size.

Background: earlier THP alignment and its regression

An earlier commit, efa7df3e3bb5, started aligning large anonymous mappings to THP boundaries. While this helps large allocations get THP backing, it introduced severe regressions on workloads that create many mappings whose size is not a multiple of the PMD size (e.g., the cactusBSSN benchmark). That workload generates numerous 4632 kB mappings. Each of them was aligned to its own PMD boundary, leaving gaps between neighbouring mappings, and the resulting layout caused TLB and cache‑aliasing penalties, with slowdowns of up to 600 %.

Regression fix

The new change tightens the alignment condition: a mapping is PMD‑aligned only if its size is an exact multiple of the PMD size, not merely “greater than or equal to” a PMD. In effect the check becomes size % PMD_SIZE == 0. Mappings with an irregular size, such as the 4632 kB ones, are no longer forced onto PMD boundaries, which avoids the TLB/cache issues and restores the original performance of the affected benchmarks.

Practical significance

The 40× gain was measured in a synthetic test; real‑world applications may see smaller but still meaningful improvements. The result demonstrates how a precise adjustment in the kernel’s memory‑management code can have a dramatic impact on allocation throughput, especially in high‑load, high‑performance environments.

Tags: Performance optimization, memory management, Linux kernel, PMD, THP, kernel commit