How Huge Linux Pages Can Boost Database Throughput on Kubernetes by Up to 8×

This article explains how Linux page size (the default 4 KB versus 2 MB or 1 GB huge pages) affects database performance, details the role of TLB hits and misses, presents benchmark results showing up to an eightfold throughput increase, and offers practical guidance for configuring huge pages on Kubernetes nodes.

Liangxu Linux

Oct 22, 2023

Linux page sizes and TLB impact

On x86-64 Linux, three page sizes are available: 4 KB (default), 2 MB and 1 GB. Small pages minimise internal fragmentation for tiny allocations, while large pages reduce the number of page-table entries, and therefore the number of Translation Lookaside Buffer (TLB) entries, required to map a memory region. Every memory access must translate a virtual address to a physical address, and the CPU caches recent translations in the TLB. A TLB hit is resolved in hardware almost immediately, whereas a miss triggers a page-table walk: on x86-64 the CPU's hardware page walker traverses the multi-level page tables that the kernel maintains, which can cost several additional memory accesses and is far slower than a hit. Databases perform millions of memory accesses, so TLB miss rates directly affect read/write latency, especially for wide rows that span multiple 4 KB pages.
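To make the page-count argument concrete, here is a minimal Python sketch that counts how many pages, and therefore how many distinct translations, are needed to map a region at each page size. The 1 GiB region is an arbitrary illustration and `pages_needed` is a hypothetical helper name; neither comes from the benchmark.

```python
# Page sizes available on x86-64 Linux, in bytes.
PAGE_SIZES = {"4 KB": 4 * 1024, "2 MB": 2 * 1024**2, "1 GB": 1024**3}


def pages_needed(region_bytes: int, page_bytes: int) -> int:
    """Number of pages required to map a region (rounded up)."""
    return -(-region_bytes // page_bytes)  # ceiling division


if __name__ == "__main__":
    region = 1024**3  # illustrative 1 GiB region, not a benchmark figure
    for name, size in PAGE_SIZES.items():
        print(f"{name}: {pages_needed(region, size)} pages")
```

For a 1 GiB region this works out to 262,144 translations with 4 KB pages, 512 with 2 MB pages, and a single one with a 1 GB page.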

Benchmark methodology

Benchmarks were executed on an AMD EPYC 7J1C3 @ 2.55 GHz processor. One hundred million rows were pre‑loaded into DRAM and accessed via IPC (no TCP). Three row sizes were tested:

128 B (fits in a single 4 KB page)

8 KB (spans two 4 KB pages)

16 KB (spans four 4 KB pages)

For each row size, the workload was run with three Linux page-size configurations: 4 KB, 2 MB and 1 GB. The database client used 128 concurrent connections, and the server ran on a single node with the entire dataset resident in RAM.
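The page counts quoted for each row size are best-case figures for rows that start on a page boundary. The sketch below, which assumes rows are stored contiguously (the benchmarked database's actual layout is not described), shows how many pages a row overlaps for a given starting offset. `pages_touched` is a hypothetical helper.

```python
def pages_touched(offset: int, row_bytes: int, page_bytes: int) -> int:
    """Count the pages overlapped by a row that starts at byte `offset`."""
    first_page = offset // page_bytes
    last_page = (offset + row_bytes - 1) // page_bytes
    return last_page - first_page + 1


if __name__ == "__main__":
    small_page = 4 * 1024
    for row in (128, 8 * 1024, 16 * 1024):
        aligned = pages_touched(0, row, small_page)
        shifted = pages_touched(4000, row, small_page)  # arbitrary unaligned start
        print(f"{row} B row: {aligned} page(s) aligned, {shifted} unaligned")
```

An unaligned row can touch one page more than the aligned minimum, whereas with 2 MB pages all three row sizes almost always fall inside a single page.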

Results

Using 2 MB pages instead of 4 KB increased throughput dramatically:

128 B rows – up to 8× higher

8 KB rows – up to 8× higher

16 KB rows – up to 5× higher

Switching from 2 MB to 1 GB pages yielded a modest additional gain of 1 %–21 % depending on row width (all rows still fit within a single 2 MB page, so the benefit comes from reduced TLB miss probability).

CPU TLB characteristics

Typical entry counts for modern CPUs:

Intel Ice Lake

4 KB L1 TLB – 64 entries

2 MB L1 TLB – 32 entries

1 GB L1 TLB – 8 entries

L2 TLB (4 KB + 2 MB) – 1 024 entries

L2 TLB (4 KB + 1 GB) – 1 024 entries

AMD EPYC Zen 3

L1 TLB (4 KB + 2 MB + 1 GB) – 64 entries total

L2 TLB (4 KB + 2 MB) – 512 entries

Because the L1 TLB holds only a few dozen 4 KB entries, workloads with wide rows or high concurrency quickly exhaust the cache, causing frequent misses. Switching to 2 MB pages effectively expands the address range covered by each TLB entry, dramatically reducing miss rates.
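A quick way to see the effect is to compute "TLB reach", the amount of address space covered when every entry is in use (entries × page size). The sketch below uses the Intel Ice Lake L1 entry counts listed above; `tlb_reach` is a hypothetical helper name.

```python
KIB, MIB, GIB = 1024, 1024**2, 1024**3

# L1 TLB entry counts quoted above for Intel Ice Lake: (entries, page size).
ICE_LAKE_L1 = {
    "4 KB": (64, 4 * KIB),
    "2 MB": (32, 2 * MIB),
    "1 GB": (8, GIB),
}


def tlb_reach(entries: int, page_bytes: int) -> int:
    """Bytes of address space covered when every TLB entry is in use."""
    return entries * page_bytes


if __name__ == "__main__":
    for name, (entries, page) in ICE_LAKE_L1.items():
        print(f"{name} pages: {tlb_reach(entries, page) // KIB} KiB reach")
```

With these counts the L1 TLB covers 256 KiB using 4 KB pages, 64 MiB using 2 MB pages, and 8 GiB using 1 GB pages.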

Optimizing Kubernetes nodes for databases

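Kubernetes does not allocate huge pages itself. They must be pre-allocated on the host OS; the kubelet then reports them as node capacity that pods can request, so allocate them before pods are scheduled, ideally before the kubelet starts.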

Disable Transparent Huge Pages (THP) to avoid unpredictable memory usage:

echo never > /sys/kernel/mm/transparent_hugepage/enabled
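To confirm the setting took effect, the same sysfs file can be read back: it lists all modes and marks the active one in square brackets (for example `always madvise [never]`). A minimal parsing sketch follows; `active_thp_mode` is a hypothetical helper.

```python
import re


def active_thp_mode(text: str) -> str:
    """Return the bracketed (active) mode from the sysfs file's contents."""
    match = re.search(r"\[(\w+)\]", text)
    if match is None:
        raise ValueError(f"no active mode found in {text!r}")
    return match.group(1)


# On a node, pass in the file contents, e.g.:
#   from pathlib import Path
#   path = Path("/sys/kernel/mm/transparent_hugepage/enabled")
#   print(active_thp_mode(path.read_text()))
```

A value written with `echo` does not survive a reboot, so it is worth re-checking after a node restart.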

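Allocate the desired number of huge pages. Pre-allocated huge pages are unavailable to ordinary allocations, so size the pool below total RAM. For example, to reserve 128 GB as 2 MB pages on a node with 256 GB RAM:

echo 65536 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages

Adjust the count to leave enough memory for the kernel, the kubelet and other workloads; 131072 pages would claim the node's entire 256 GB.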
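The page count is simply the reservation divided by the page size: 128 GiB of 2 MB pages is 65,536 pages, and 131,072 pages corresponds to 256 GiB. The sketch below does that arithmetic and also pulls the huge-page counters out of `/proc/meminfo` text so the allocation can be verified. Both function names are hypothetical, and the sample figures in the comment are illustrative.

```python
def hugepages_for(reserve_gib: float, page_mib: int = 2) -> int:
    """Number of huge pages needed to reserve `reserve_gib` GiB."""
    return int(reserve_gib * 1024) // page_mib


def parse_hugepage_info(meminfo: str) -> dict:
    """Extract HugePages_* counters and Hugepagesize (in kB) from /proc/meminfo text."""
    info = {}
    for line in meminfo.splitlines():
        key, _, rest = line.partition(":")
        if key.startswith("HugePages_") or key == "Hugepagesize":
            info[key] = int(rest.split()[0])
    return info


# On a node:
#   with open("/proc/meminfo") as f:
#       print(parse_hugepage_info(f.read()))
# If HugePages_Total is lower than the count written to nr_hugepages,
# the kernel could not find enough contiguous free memory.
```

Comparing `HugePages_Total` against the requested count is a cheap sanity check, since the kernel may allocate fewer pages than asked for when memory is fragmented.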

If 1 GB pages are required, add kernel boot parameters (e.g., default_hugepagesz=1G hugepagesz=1G hugepages=64) and reboot; reserving them at boot avoids allocation failures caused by fragmented memory.

Request huge pages for a pod through the resources.limits and resources.requests fields of its container spec. Kubernetes requires huge-page requests to equal limits, e.g.:

resources:
  limits:
    hugepages-2Mi: "4Gi"
  requests:
    hugepages-2Mi: "4Gi"
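When sizing the node pool it helps to translate a pod's huge-page quantity into a page count. The sketch below is a simplification that only handles integer quantities with the `Ki`, `Mi` and `Gi` suffixes, not the full Kubernetes quantity grammar; both helper names are hypothetical.

```python
UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def quantity_bytes(quantity: str) -> int:
    """Convert a binary-suffixed quantity such as "4Gi" to bytes."""
    number, suffix = quantity[:-2], quantity[-2:]
    return int(number) * UNITS[suffix]


def pages_consumed(resource_name: str, quantity: str) -> int:
    """Huge pages one resource entry takes from the node's pre-allocated pool."""
    page_size = resource_name.split("-", 1)[1]  # "hugepages-2Mi" -> "2Mi"
    return quantity_bytes(quantity) // quantity_bytes(page_size)


if __name__ == "__main__":
    print(pages_consumed("hugepages-2Mi", "4Gi"))
```

The `4Gi` request in the fragment above therefore consumes 2,048 of the node's 2 MB pages.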

Label and taint nodes that are provisioned with huge pages (e.g., node-role.kubernetes.io/db=true) and use pod node selectors or affinity rules so that the database pod lands on a suitable node.
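As an illustration, the scheduling fields of a pod spec can be generated from the label. This sketch assumes the taint reuses the label's key and value with a `NoSchedule` effect; the source only gives the label example, so treat the taint details and the `db_scheduling_fields` helper as hypothetical.

```python
import json


def db_scheduling_fields(label_key: str, label_value: str) -> dict:
    """Build nodeSelector and tolerations for a pod spec targeting prepared nodes."""
    return {
        "nodeSelector": {label_key: label_value},
        "tolerations": [
            {
                "key": label_key,
                "operator": "Equal",
                "value": label_value,
                "effect": "NoSchedule",  # assumed taint effect, not from the source
            }
        ],
    }


if __name__ == "__main__":
    fields = db_scheduling_fields("node-role.kubernetes.io/db", "true")
    print(json.dumps(fields, indent=2))
```

The resulting dictionary can be merged into the `spec` of a pod manifest; the node selector steers the pod to the prepared nodes, while the toleration lets it past the taint that keeps other workloads away.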

What can and cannot be controlled

Cannot control: row/record width, total row count, database working-set size, query concurrency, CPU TLB size.

Can control: Linux page size on each node, number of huge pages allocated, pod memory requests/limits, node labeling/tainting to ensure placement on a node with the appropriate huge-page configuration.

Practical recommendations

For most OLTP workloads, configure 2 MB huge pages on dedicated database nodes; in the benchmark above this yielded up to an 8× throughput increase for 128 B and 8 KB rows and up to 5× for 16 KB rows.

Consider 1 GB pages only if the workload consistently accesses rows larger than 2 MB or if the node has abundant RAM; the gain over 2 MB pages is modest (1 %–21 %).

Always disable THP to prevent memory waste and unpredictable latency.

Allocate enough huge pages to cover the database’s active working set while leaving headroom for other system components.

Use node labels/taints and pod affinity to schedule database pods onto nodes that have been prepared with the required huge‑page configuration.
