Why Switching Linux Page Size to 2 MiB Can Skyrocket Performance
The article explains how the default 4 KiB pages cause frequent TLB misses, how using 2 MiB huge pages expands a single TLB entry’s coverage by 512×, reduces page‑walk depth and page‑table overhead, and provides C++ examples for both hugetlbfs and Transparent Huge Pages.
Building on a previous discussion of TLB misses, the author shows that on 64‑bit x86 Linux with the default 4 KiB page size, every TLB miss forces the CPU to walk a multi‑level page table, and that large working sets quickly exhaust the few thousand TLB slots available per core.
Assuming roughly 2 000 TLB entries per core, the default page size limits effective coverage to about 8 MiB; any workload that touches more memory than that triggers heavy TLB churn. Switching to 2 MiB pages expands the memory covered by a single TLB entry 512 times (about 4 GiB for the same 2 000 entries), which lowers TLB miss rates, removes one level from each page walk, shrinks the page tables (mapping 1 TiB with 4 KiB pages takes about 2 GiB of last‑level page‑table entries), and can even improve CPU prefetch behavior.
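The arithmetic behind these figures can be checked at compile time. The sketch below assumes the 2 000‑entry TLB from the text and 8‑byte page‑table entries, as on x86‑64; the constant and function names are illustrative only, and the page‑table figure counts last‑level entries for a single fully mapped address space.

```cpp
#include <cstdint>

// Assumed figure from the text: roughly 2 000 TLB entries per core.
constexpr std::uint64_t kTlbEntries = 2000;

constexpr std::uint64_t kKiB = 1024;
constexpr std::uint64_t kMiB = 1024 * kKiB;
constexpr std::uint64_t kGiB = 1024 * kMiB;
constexpr std::uint64_t kTiB = 1024 * kGiB;

constexpr std::uint64_t kSmallPage = 4 * kKiB;
constexpr std::uint64_t kHugePage = 2 * kMiB;

// Memory reachable without a TLB miss: entries * page size.
constexpr std::uint64_t tlb_reach(std::uint64_t entries, std::uint64_t page_size) {
    return entries * page_size;
}

// Bytes of last-level page-table entries needed to map `mapped` bytes,
// assuming 8-byte entries (x86-64).
constexpr std::uint64_t leaf_table_bytes(std::uint64_t mapped, std::uint64_t page_size) {
    return (mapped / page_size) * 8;
}

static_assert(kHugePage / kSmallPage == 512, "one huge page spans 512 base pages");
static_assert(tlb_reach(kTlbEntries, kSmallPage) == 8000 * kKiB, "about 7.8 MiB");
static_assert(tlb_reach(kTlbEntries, kHugePage) == 4000 * kMiB, "about 3.9 GiB");
static_assert(leaf_table_bytes(kTiB, kSmallPage) == 2 * kGiB, "2 GiB of leaf entries");
static_assert(leaf_table_bytes(kTiB, kHugePage) == 4 * kMiB, "4 MiB of leaf entries");
```

Real TLBs are split into several levels with different entry counts for each page size, so these numbers are an upper bound on reach rather than a prediction of miss rates.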
To form a 2 MiB huge page the kernel must find 512 physically contiguous 4 KiB pages aligned on a 2 MiB boundary, which can be difficult on a fragmented system.
Linux provides two mechanisms for huge pages:
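HugeTLBfs: a dedicated huge‑page pool, usually reserved at boot (it can also be resized at runtime through /proc/sys/vm/nr_hugepages, provided enough contiguous memory is still free). Applications must request it explicitly, e.g., by passing MAP_HUGETLB to mmap() or SHM_HUGETLB to shmget().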
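Transparent Huge Pages (THP): an automatic kernel feature introduced over a decade ago (Linux 2.6.38). The kernel can back a region with huge pages at page‑fault time, and the khugepaged daemon merges regular pages into huge pages in the background when possible. When enabled, THP operates in one of two modes: always (the kernel uses huge pages wherever it can) or madvise (only for regions explicitly marked with madvise(MADV_HUGEPAGE)); a third setting, never, disables it.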
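Before choosing between the two mechanisms it helps to see how the machine is configured. The following is a minimal sketch, assuming C++17 and the standard Linux locations /sys/kernel/mm/transparent_hugepage/enabled and /proc/meminfo; the helper names active_choice, meminfo_value, and print_huge_page_config are hypothetical and not taken from the original article.

```cpp
#include <fstream>
#include <initializer_list>
#include <optional>
#include <sstream>
#include <string>

// Extracts the bracketed choice from a sysfs-style line such as
// "always [madvise] never"; returns nullopt if there is none.
std::optional<std::string> active_choice(const std::string& line) {
    const auto open = line.find('[');
    if (open == std::string::npos) return std::nullopt;
    const auto close = line.find(']', open + 1);
    if (close == std::string::npos) return std::nullopt;
    return line.substr(open + 1, close - open - 1);
}

// Finds "Key:   <number> ..." in /proc/meminfo-style text and returns the number.
std::optional<long> meminfo_value(std::istream& in, const std::string& key) {
    const std::string prefix = key + ":";
    std::string line;
    while (std::getline(in, line)) {
        if (line.compare(0, prefix.size(), prefix) == 0) {
            std::istringstream rest(line.substr(prefix.size()));
            long value = 0;
            if (rest >> value) return value;
            return std::nullopt;
        }
    }
    return std::nullopt;
}

// Prints the THP mode and the state of the hugetlbfs pool.
// Hugepagesize is reported by the kernel in kB.
void print_huge_page_config(std::ostream& out) {
    std::ifstream thp("/sys/kernel/mm/transparent_hugepage/enabled");
    std::string line;
    if (thp && std::getline(thp, line)) {
        out << "THP mode: " << active_choice(line).value_or("unknown") << '\n';
    } else {
        out << "THP mode: unavailable\n";
    }
    for (const char* key : {"HugePages_Total", "HugePages_Free", "Hugepagesize"}) {
        std::ifstream meminfo("/proc/meminfo");
        const auto value = meminfo_value(meminfo, key);
        out << key << ": ";
        if (value) out << *value; else out << "unavailable";
        out << '\n';
    }
}
```

Calling print_huge_page_config(std::cout) from a small main() is enough to try it. A HugePages_Total of 0 means a MAP_HUGETLB request has no pool to draw from, and a THP mode of never means madvise(MADV_HUGEPAGE) hints will not produce huge pages.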
Below are minimal C++ snippets that demonstrate both approaches.
#include <cerrno>
#include <cstddef>
#include <iostream>
#include <system_error>
#include <sys/mman.h>

constexpr std::size_t kHugePageSize = 2 * 1024 * 1024;

// RAII wrapper: maps an anonymous region and unmaps it on destruction.
class MMapRegion {
public:
    MMapRegion(std::size_t size, int prot, int flags)
        : size_(size) {
        ptr_ = mmap(nullptr, size_, prot, flags, -1, 0);
        if (ptr_ == MAP_FAILED) {
            ptr_ = nullptr;
            throw std::system_error(errno, std::generic_category(), "mmap failed");
        }
    }
    ~MMapRegion() { if (ptr_) munmap(ptr_, size_); }
    // Non-copyable: a copy would unmap the same region twice.
    MMapRegion(const MMapRegion&) = delete;
    MMapRegion& operator=(const MMapRegion&) = delete;
    void* data() const noexcept { return ptr_; }
private:
    void* ptr_{nullptr};
    std::size_t size_{0};
};

// Explicit huge page from the hugetlbfs pool; fails if the pool has no free pages.
void allocate_hugetlbfs() {
    std::cout << "\n[Hugetlbfs]\n";
    try {
        MMapRegion region(kHugePageSize, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB);
        std::cout << "[+] Successfully allocated 2 MiB huge page via hugetlbfs\n";
    } catch (const std::system_error& e) {
        std::cerr << "[-] Hugetlbfs allocation failed: " << e.code().message() << '\n';
    }
}

// Ordinary mapping plus a hint; the kernel decides whether to use huge pages.
// Note: mmap() only guarantees base-page alignment, so a region that is not
// 2 MiB-aligned may still end up backed by 4 KiB pages.
void allocate_thp_madvise() {
    std::cout << "\n[Transparent Huge Pages]\n";
    try {
        MMapRegion region(kHugePageSize, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS);
        if (madvise(region.data(), kHugePageSize, MADV_HUGEPAGE) != 0) {
            throw std::system_error(errno, std::generic_category(), "madvise(MADV_HUGEPAGE) failed");
        }
        std::cout << "[+] MADV_HUGEPAGE successfully applied\n";
    } catch (const std::system_error& e) {
        std::cerr << "[-] THP setup failed: " << e.code().message() << '\n';
    }
}

int main() {
    allocate_hugetlbfs();
    allocate_thp_madvise();
    return 0;
}
Finally, the author cautions that huge pages are not a silver bullet: they shine for memory‑intensive, low‑latency workloads that process tens of gigabytes or more, but developers should always verify the impact with tools like perf rather than rely on intuition.
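A successful madvise(MADV_HUGEPAGE) call is only a hint, so it is worth checking whether the kernel actually backed the region with huge pages. The sketch below is one way to do that under stated assumptions: it over‑allocates by one huge page so that a 2 MiB‑aligned window fits, touches the memory, and then sums the AnonHugePages fields in /proc/self/smaps. The names align_up, anon_huge_kb, and thp_backed_kb are hypothetical helpers, not part of the original listing.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <fstream>
#include <sstream>
#include <string>
#include <sys/mman.h>

constexpr std::size_t kHugePage = 2 * 1024 * 1024;

// Rounds an address up to the next 2 MiB boundary.
constexpr std::uintptr_t align_up(std::uintptr_t addr) {
    return (addr + kHugePage - 1) & ~(static_cast<std::uintptr_t>(kHugePage) - 1);
}

// Sums every "AnonHugePages: <n> kB" field in smaps-style text.
long anon_huge_kb(std::istream& smaps) {
    const std::string key = "AnonHugePages:";
    std::string line;
    long total = 0;
    while (std::getline(smaps, line)) {
        if (line.compare(0, key.size(), key) == 0) {
            std::istringstream rest(line.substr(key.size()));
            long kb = 0;
            if (rest >> kb) total += kb;
        }
    }
    return total;
}

// Maps `size` bytes plus one huge page of slack, advises THP on the aligned
// window, touches it, and returns the process-wide AnonHugePages total in kB.
// Returns -1 if mmap() or madvise() fails.
long thp_backed_kb(std::size_t size) {
    const std::size_t span = size + kHugePage;
    void* raw = mmap(nullptr, span, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED) return -1;
    auto* aligned = reinterpret_cast<unsigned char*>(
        align_up(reinterpret_cast<std::uintptr_t>(raw)));
    long kb = -1;
    if (madvise(aligned, size, MADV_HUGEPAGE) == 0) {
        std::memset(aligned, 1, size);  // first touch triggers the page faults
        std::ifstream smaps("/proc/self/smaps");
        kb = anon_huge_kb(smaps);
    }
    munmap(raw, span);
    return kb;
}
```

A result of 0 from thp_backed_kb(kHugePage) suggests that THP is disabled or that no contiguous 2 MiB block was available at fault time. Because the total covers the whole process, other THP‑backed mappings would be counted too. For the performance question itself, comparing a counter such as dTLB-load-misses under perf stat with and without huge pages is one reasonable starting point, although available event names vary by CPU.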
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
IT Services Circle
