How HugePages Boost Database and Hadoop Performance on Linux
This article explains Linux HugePages, how to view and configure them, demonstrates code and Kubernetes examples, and details how larger memory pages reduce management overhead and lock memory to improve performance for memory‑intensive services like databases and Hadoop.
Memory is a critical resource, and while many services have modest needs, databases and the Hadoop ecosystem consume gigabytes to terabytes of RAM to accelerate computation. Linux introduces HugePages (large pages) to manage such memory more efficiently.
Most CPU architectures support larger pages, known as HugePages on Linux, SuperPages on BSD, and LargePages on Windows. Linux’s default page size is 4 KiB, while huge pages are typically 2 MiB, which is 512 times larger than the default. Some architectures, such as x86-64, also support 1 GiB pages, which are 262,144 times larger.
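The size ratios are easy to check with integer arithmetic. The following is a minimal sketch in C; the helper names size_ratio and pages_needed are illustrative only and are not part of any kernel or libc API.

```c
#include <assert.h>
#include <stdint.h>

#define KIB (UINT64_C(1) << 10)
#define MIB (UINT64_C(1) << 20)
#define GIB (UINT64_C(1) << 30)

/* How many times larger `big` is than `small` (both in bytes). */
static uint64_t size_ratio(uint64_t big, uint64_t small) {
    assert(small != 0);
    return big / small;
}

/* Pages needed to map `region` bytes with pages of `page` bytes, rounding up. */
static uint64_t pages_needed(uint64_t region, uint64_t page) {
    assert(page != 0);
    return (region + page - 1) / page;
}
```

With these helpers, size_ratio(2 * MIB, 4 * KIB) evaluates to 512 and size_ratio(GIB, 4 * KIB) to 262,144. Likewise, pages_needed(GIB, 4 * KIB) is 262,144, while pages_needed(GIB, 2 * MIB) is only 512.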
To inspect HugePages on a Linux machine, run:
$ cat /proc/meminfo | grep Huge
AnonHugePages: 71680 kB
ShmemHugePages: 0 kB
FileHugePages: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 0 kB
The /proc/sys/vm/nr_hugepages file holds the number of pre‑allocated huge pages. By default it is 0, but you can allocate pages by writing to it as root, e.g.:
$ echo 1 > /proc/sys/vm/nr_hugepages
$ cat /proc/meminfo | grep HugePages_
HugePages_Total: 1
HugePages_Free: 1
...
Applications can request huge pages via the mmap system call with the MAP_HUGETLB flag and release them with munmap:
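/* Map one 2 MiB huge page; needs <sys/mman.h> and <stdio.h>. */
size_t s = (2U * 1024 * 1024);
char *m = mmap(NULL, s, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB,
               -1, 0);
if (m == MAP_FAILED)
    perror("mmap");   /* e.g. no free huge pages in the pool */
else
    munmap(m, s);
In container orchestration, Kubernetes treats huge pages as a distinct resource. A pod can request them as follows: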
apiVersion: v1
kind: Pod
metadata:
  name: huge-pages-example
spec:
  containers:
  - name: example
    ...
    volumeMounts:
    - mountPath: /hugepages-2Mi
      name: hugepage-2mi
    - mountPath: /hugepages-1Gi
      name: hugepage-1gi
    resources:
      limits:
        hugepages-2Mi: 100Mi
        hugepages-1Gi: 2Gi
        memory: 100Mi
      requests:
        memory: 100Mi
  volumes:
  - name: hugepage-2mi
    emptyDir:
      medium: HugePages-2Mi
  - name: hugepage-1gi
    emptyDir:
      medium: HugePages-1Gi
HugePages improve performance for memory‑intensive workloads in two main ways:
They reduce memory‑management overhead: larger pages mean a shallower page‑table walk, lower page‑table memory usage, and higher TLB hit rates, because each TLB entry covers far more memory. For example, mapping 1 GiB takes 262,144 page‑table entries with 4 KiB pages but only 512 with 2 MiB pages, so TLB misses and the page walks they trigger become much rarer.
They lock memory, preventing the OS from swapping it out. Because huge pages are pre‑allocated and locked in RAM, they remain resident even under memory pressure, eliminating costly swap‑in/out cycles.
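Because huge pages come from a pre‑allocated pool, an mmap call with MAP_HUGETLB fails when the pool is empty or exhausted. One way to handle that is to fall back to regular pages. The sketch below assumes a 2 MiB huge page size (check Hugepagesize in /proc/meminfo on the target host); alloc_prefer_huge and free_prefer_huge are hypothetical helper names, not library functions.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Assumed huge page size; verify against Hugepagesize in /proc/meminfo. */
#define HUGE_PAGE_BYTES (2UL * 1024 * 1024)

/* Round len up to a whole number of huge pages. */
static size_t round_up_huge(size_t len) {
    return (len + HUGE_PAGE_BYTES - 1) / HUGE_PAGE_BYTES * HUGE_PAGE_BYTES;
}

/*
 * Hypothetical helper: try to map `len` bytes from the huge page pool and
 * fall back to regular pages if that request fails.
 * Sets *used_huge to 1 or 0; returns NULL only if both attempts fail.
 */
static void *alloc_prefer_huge(size_t len, int *used_huge) {
    assert(len > 0 && used_huge != NULL);
    size_t n = round_up_huge(len);
    void *p = mmap(NULL, n, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    *used_huge = (p != MAP_FAILED);
    if (p == MAP_FAILED) {
        /* Typically ENOMEM when nr_hugepages is 0 or the pool is exhausted. */
        p = mmap(NULL, n, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    }
    return p == MAP_FAILED ? NULL : p;
}

/* Release a mapping obtained from alloc_prefer_huge with the same len. */
static int free_prefer_huge(void *p, size_t len) {
    return munmap(p, round_up_huge(len));
}
```

The length is rounded up in both helpers because a huge page mapping has to be unmapped in multiples of the huge page size. A caller that needs to know whether it actually received huge pages can inspect the used_huge flag and log or adjust accordingly.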
Transparent Huge Pages (THP) provide automatic huge‑page handling but are generally not recommended for databases and similar workloads due to potential latency spikes.
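Before acting on that advice, it helps to know whether THP is active on a host. The kernel exposes the system‑wide mode in /sys/kernel/mm/transparent_hugepage/enabled as a line such as "always [madvise] never", with the active mode in brackets. The following sketch extracts it; thp_active_mode and thp_system_mode are illustrative names, and the code assumes the file uses that bracketed format.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy the bracketed (active) mode from a line such as
 * "always [madvise] never" into out. Returns 0 on success, -1 if no
 * bracketed token is found or it does not fit in out.
 */
static int thp_active_mode(const char *line, char *out, size_t out_len) {
    assert(line != NULL && out != NULL);
    const char *open = strchr(line, '[');
    const char *close = open ? strchr(open, ']') : NULL;
    if (close == NULL)
        return -1;
    size_t n = (size_t)(close - open - 1);
    if (n + 1 > out_len)
        return -1;
    memcpy(out, open + 1, n);
    out[n] = '\0';
    return 0;
}

/* Read the system-wide THP setting; returns -1 if the file is unavailable. */
static int thp_system_mode(char *out, size_t out_len) {
    char line[128];
    FILE *f = fopen("/sys/kernel/mm/transparent_hugepage/enabled", "r");
    if (f == NULL)
        return -1;
    char *ok = fgets(line, sizeof line, f);
    fclose(f);
    return ok ? thp_active_mode(line, out, out_len) : -1;
}
```

A database host tuned along the lines described here would usually be expected to report madvise or never rather than always, though the right setting depends on the specific database's own guidance.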
In summary, enabling HugePages can lower page‑table management costs, increase TLB efficiency, and keep critical memory resident, thereby boosting the performance of databases, Hadoop, and other RAM‑heavy services. The trade‑off is added configuration complexity, and the gains are smaller for typical web or backend services.
ITPUB
