
Master Linux Memory Management: Core Commands & Tuning in 10 Minutes

This guide walks through Linux memory management end to end: prerequisites, a quick checklist, monitoring-tool installation, memory diagnostics, kernel parameter tuning, THP and swap optimization, NUMA affinity tuning, validation, Prometheus alerting, security hardening, troubleshooting, rollback procedures, best practices, and ready-to-use scripts and configuration snippets.

Ops Community

Applicable Scenarios & Prerequisites

Applicable Scenarios : database servers, cache clusters, container platforms, high‑concurrency web applications

Operating Systems : RHEL/CentOS 7.x‑9.x, Ubuntu 18.04‑24.04

Permissions : root or sudo access

Required Tools : sysstat (sar/iostat), procps‑ng (vmstat/free), numactl

Minimum Specs : 2 CPU 4 GB for testing, 8 CPU 16 GB+ recommended for production

Environment & Version Matrix

Component | RHEL/CentOS | Ubuntu/Debian | Minimum Specs
--- | --- | --- | ---
Kernel version | 3.10+ / 5.14+ | 4.15+ / 5.15+ | –
sysstat | 11.7.3 / 12.5.4 | 11.6.1 / 12.5.6 | –
numactl | 2.0.12+ | 2.0.12+ | –
CPU | – | – | 2 cores (8 cores recommended)
Memory | – | – | 4 GB+ (16 GB recommended)
Storage | – | – | SSD 50 GB+

Quick Checklist

Install monitoring tools (sysstat, numactl, htop)

Diagnose current memory state (free, vmstat, sar)

Identify memory pressure sources (top, ps, systemd‑cgtop)

Adjust kernel memory parameters (/etc/sysctl.conf)

Configure Transparent Huge Pages (THP)

Optimize Swap strategy (swappiness, zswap)

NUMA affinity tuning (numactl binding)

Validate tuning effects (before/after comparison)

Set up monitoring & alerts (Prometheus metrics)

Rollback & backup mechanisms (configuration version control)

Implementation Steps

Step 1: Install Monitoring Toolset

RHEL/CentOS:

sudo yum install -y sysstat numactl htop procps-ng
sudo systemctl enable sysstat
sudo systemctl start sysstat

Ubuntu/Debian:

sudo apt update
sudo apt install -y sysstat numactl htop procps
sudo systemctl enable sysstat
sudo systemctl start sysstat

Verify installation:

sar -V
numactl --hardware
htop --version

Step 2: Diagnose Current Memory State

Basic memory diagnostics:

# Show memory overview (human‑readable)
free -h
# Continuous monitoring (5 samples at 2-second intervals)
vmstat 2 5
# Historical memory usage recorded by sysstat today (10-minute samples)
sar -r

Key metrics interpretation:

- available: truly usable memory (including reclaimable cache)
- buff/cache: file cache, safe to reclaim
- si/so (vmstat): swap in/out; non-zero values indicate memory pressure
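The `available` figure can also be read programmatically from /proc/meminfo, which is what `free` itself parses. A minimal sketch (the `avail_pct` helper and the 10% threshold are illustrative, not part of any standard tool):

```shell
#!/bin/sh
# Illustrative helper: percentage of memory the kernel estimates is
# truly available (MemAvailable / MemTotal from /proc/meminfo).
avail_pct() {
  awk '/^MemTotal:/ {total=$2}
       /^MemAvailable:/ {avail=$2}
       END {printf "%.0f\n", avail / total * 100}' /proc/meminfo
}

pct=$(avail_pct)
echo "Available memory: ${pct}%"
if [ "$pct" -lt 10 ]; then
  echo "WARNING: memory pressure (available < 10%)"
fi
```

This is the same estimate `free -h` shows in its available column, so thresholds derived from it line up with the table above.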

Step 3: Identify Memory Pressure Sources

Process‑level inspection:

# Top 20 processes by memory usage
ps aux --sort=-%mem | head -20
# Real‑time monitoring sorted by memory
top -o %MEM
# cgroup level monitoring (systemd managed)
systemd-cgtop --depth=3

Memory leak detection:

# Sample every 60 seconds, repeat 10 times
for i in {1..10}; do
  ps aux --sort=-%mem | head -5
  sleep 60
done
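For a single suspect process, sampling VmRSS from /proc/<pid>/status gives a cleaner growth signal than eyeballing repeated `ps` output. A hedged sketch (`rss_kb` is an illustrative helper; the PID and interval are placeholders to adapt):

```shell
#!/bin/sh
# Illustrative helper: read a process's resident set size in kB.
rss_kb() { awk '/^VmRSS:/ {print $2}' "/proc/$1/status"; }

pid=$$             # placeholder: monitor this shell itself
before=$(rss_kb "$pid")
sleep 1            # in practice, sample every 60s as in the loop above
after=$(rss_kb "$pid")
echo "RSS delta for PID ${pid}: $((after - before)) kB"
```

A steadily positive delta across many samples is the classic leak signature; occasional spikes that fall back are usually normal allocator behavior.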

Container scenario:

# Docker memory stats
docker stats --no-stream --format "table {{.Name}}\t{{.MemUsage}}\t{{.MemPerc}}"
# K8s pod memory usage
kubectl top pod --all-namespaces --sort-by=memory

Step 4: Adjust Kernel Memory Parameters

Backup original configuration (important!):

sudo cp /etc/sysctl.conf /etc/sysctl.conf.backup.$(date +%F)

Core parameter configuration:

sudo tee -a /etc/sysctl.conf >/dev/null <<EOF
# Memory reclamation
vm.swappiness = 10
vm.vfs_cache_pressure = 50
vm.dirty_ratio = 15
vm.dirty_background_ratio = 5

# OOM Killer tuning
vm.overcommit_memory = 1
vm.overcommit_ratio = 50
vm.panic_on_oom = 0

# Static huge pages (HugeTLB). Note: THP itself is not a sysctl — it is
# controlled via /sys/kernel/mm/transparent_hugepage (see Step 5)
vm.nr_hugepages = 0
EOF

Parameter explanations:

- swappiness=10: prefer physical memory over swap (default 60)
- dirty_ratio=15: dirty pages flushed at 15% of memory (reduces I/O spikes)
- overcommit_memory=1: allow over-allocation (required for containers)

Apply immediately:

sudo sysctl -p

Verify configuration:

sysctl vm.swappiness vm.dirty_ratio vm.overcommit_memory
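The same verification can be scripted against /proc/sys directly, which works even where the sysctl binary is absent. A sketch assuming the values set above (`check` is an illustrative helper):

```shell
#!/bin/sh
# Compare live kernel values under /proc/sys against expected settings.
check() {  # usage: check <path-under-/proc/sys> <expected>
  actual=$(cat "/proc/sys/$1")
  if [ "$actual" = "$2" ]; then
    echo "OK   $1 = $actual"
  else
    echo "DIFF $1 = $actual (expected $2)"
  fi
}

check vm/swappiness 10
check vm/dirty_ratio 15
check vm/overcommit_memory 1
```

Any DIFF line means `sysctl -p` was not applied or another tool (e.g. a tuned profile) overrode the value afterwards.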

Step 5: Configure Transparent Huge Pages (THP)

Scenario decision:

Databases (MySQL/PostgreSQL/MongoDB): disable THP

General applications / container platforms: keep default

Disable THP (recommended for databases):

# Temporary disable
echo never | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
echo never | sudo tee /sys/kernel/mm/transparent_hugepage/defrag

# Persistent disable via systemd service
sudo tee /etc/systemd/system/disable-thp.service >/dev/null <<'EOF'
[Unit]
Description=Disable Transparent Huge Pages (THP)
DefaultDependencies=no
After=sysinit.target local-fs.target
Before=mysqld.service mongod.service

[Service]
Type=oneshot
ExecStart=/bin/sh -c 'echo never > /sys/kernel/mm/transparent_hugepage/enabled'
ExecStart=/bin/sh -c 'echo never > /sys/kernel/mm/transparent_hugepage/defrag'

[Install]
WantedBy=basic.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable disable-thp.service
sudo systemctl start disable-thp.service

Verify status:

cat /sys/kernel/mm/transparent_hugepage/enabled
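The active mode is the bracketed word in that file (e.g. `always madvise [never]`). A small sketch that extracts it for use in scripts (the file exists only on kernels built with THP support):

```shell
#!/bin/sh
# Extract the active THP mode — the bracketed value — from sysfs.
thp_file=/sys/kernel/mm/transparent_hugepage/enabled
if [ -r "$thp_file" ]; then
  mode=$(grep -o '\[[a-z]*\]' "$thp_file" | tr -d '[]')
  echo "THP mode: $mode"
else
  echo "THP not available on this kernel"
fi
```

After the disable-thp service runs, the extracted mode should be `never`.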

Step 6: Optimize Swap Strategy

Scenario strategies:

Physical/VM hosts: swappiness=10 (keep a small swap buffer)

Container nodes: disable swap entirely (swapoff -a); note that swappiness=0 only minimizes swapping, it does not disable it

High‑memory‑pressure cases: enable zswap compression

Enable zswap (kernel 3.11+):

# Check support
grep -r . /sys/module/zswap/parameters/

# Temporary enable compression
echo 1 | sudo tee /sys/module/zswap/parameters/enabled
echo lz4 | sudo tee /sys/module/zswap/parameters/compressor
echo z3fold | sudo tee /sys/module/zswap/parameters/zpool

# Persist via GRUB
sudo sed -i 's/GRUB_CMDLINE_LINUX="/&zswap.enabled=1 zswap.compressor=lz4 zswap.zpool=z3fold /' /etc/default/grub
sudo grub2-mkconfig -o /boot/grub2/grub.cfg   # RHEL
# sudo update-grub   # Ubuntu

Verify zswap status:

grep -H . /sys/module/zswap/parameters/*

Step 7: NUMA Affinity Tuning

Check NUMA topology:

numactl --hardware
lscpu | grep NUMA

Bind processes to NUMA nodes:

# MySQL bound to node 0
numactl --cpunodebind=0 --membind=0 /usr/sbin/mysqld &
# Docker container binding
docker run --cpuset-cpus="0-3" --cpuset-mems="0" nginx
# K8s node‑level policy
sudo tee /etc/systemd/system/kubelet.service.d/numa.conf >/dev/null <<EOF
[Service]
Environment="KUBELET_EXTRA_ARGS=--topology-manager-policy=single-numa-node"
EOF
sudo systemctl daemon-reload
sudo systemctl restart kubelet
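When numactl is not installed (e.g. on minimal container images), the NUMA topology can still be read from sysfs. A sketch (single-node machines show only node0):

```shell
#!/bin/sh
# Enumerate NUMA nodes and their memory totals without numactl.
for node in /sys/devices/system/node/node[0-9]*; do
  [ -d "$node" ] || continue
  echo "${node##*/}: $(grep -m1 MemTotal "$node/meminfo" 2>/dev/null || echo 'no meminfo')"
done
```

Seeing more than one node here is the cue that the binding steps above are worth applying; on single-node machines they are a no-op.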

Step 8: Validate Tuning Effects

Capture a baseline, run a short stress test, then repeat the same sampling after tuning and compare:

# Baseline: swap usage and swap in/out activity
echo "=== Before ==="
free -h | grep Swap
vmstat 1 5 | awk 'NR>2 {print $7,$8}'   # si/so columns

# Example sysbench memory stress test (~5 minutes under load)
sysbench memory --memory-total-size=10G --memory-oper=write run

# After applying the tuning, repeat the benchmark and the sampling
echo "=== After ==="
free -h | grep Swap
vmstat 1 5 | awk 'NR>2 {print $7,$8}'

Key validation points:

- available memory > 10% of total
- si/so (swap in/out) close to 0
- %wa (iowait) < 10%
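These checks can be scripted against /proc directly so the comparison is repeatable. A sketch (thresholds mirror the validation points above; pswpin/pswpout are cumulative since boot, so compare deltas between runs rather than absolute values):

```shell
#!/bin/sh
# Check 1: available memory as a percentage of total.
total=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
avail=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
pct=$((avail * 100 / total))
echo "available: ${pct}% of total"
if [ "$pct" -gt 10 ]; then echo "PASS: available > 10%"; else echo "FAIL: available <= 10%"; fi

# Check 2: cumulative swap activity since boot (diff these between runs).
swpin=$(awk '/^pswpin/ {print $2}' /proc/vmstat)
swpout=$(awk '/^pswpout/ {print $2}' /proc/vmstat)
echo "pswpin=${swpin} pswpout=${swpout}"
```

Running this before and after tuning gives a numeric record you can attach to the change ticket instead of screenshots.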

Monitoring & Alerts (Ready to Use)

Prometheus Metrics

# High memory pressure alert (available < 10%)
- alert: HighMemoryPressure
  expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) > 0.90
  for: 5m
  labels:
    severity: warning

# Swap usage alert
- alert: HighSwapActivity
  expr: rate(node_vmstat_pswpin[5m]) + rate(node_vmstat_pswpout[5m]) > 100
  for: 10m
  labels:
    severity: critical

# OOM Killer trigger alert
- alert: OOMKillDetected
  expr: increase(node_vmstat_oom_kill[10m]) > 0
  for: 10m
  labels:
    severity: critical

Linux Native Monitoring Script

#!/bin/bash
THRESHOLD=90
while true; do
  MEM_USED=$(free | awk '/Mem/{printf("%.0f", $3/$2*100)}')
  if [ "$MEM_USED" -gt "$THRESHOLD" ]; then
    echo "[ALERT] Memory usage: ${MEM_USED}% at $(date)"
    ps aux --sort=-%mem | head -6 >> /var/log/mem-alert.log
  fi
  sleep 60
done

Performance & Capacity (Reproducible)

Benchmark

Memory bandwidth test (sysbench):

# Install sysbench
sudo yum install sysbench -y   # RHEL
sudo apt install sysbench -y   # Ubuntu

# Sequential write test
sysbench memory \
  --memory-total-size=100G \
  --memory-oper=write \
  --memory-access-mode=seq \
  run

# Random read test
sysbench memory \
  --memory-total-size=50G \
  --memory-oper=read \
  --memory-access-mode=rnd \
  --threads=8 \
  run

Target metrics:

Sequential write bandwidth > 10 GB/s (DDR4 dual‑channel)

Random read latency < 100 ns

Memory utilization > 80% (avoid waste)

Parameter Quick Reference

Database (MySQL) example:

# /etc/my.cnf
[mysqld]
innodb_buffer_pool_size = 12G   # 70‑80% of physical memory
innodb_buffer_pool_instances = 8   # number of CPU cores
innodb_page_cleaners = 4   # dirty page flush threads

Cache (Redis) example:

# /etc/redis/redis.conf
maxmemory 8gb
maxmemory-policy allkeys-lru

Container (Docker) example:

# Limit container memory and swap
docker run -m 2g --memory-swap 2g \
  --memory-reservation 1.5g \
  --oom-kill-disable=false \
  nginx

Security & Compliance (Minimum Required)

Permissions & Auditing

# Only root can modify sysctl configuration
sudo chmod 644 /etc/sysctl.conf
sudo chown root:root /etc/sysctl.conf

# Audit changes to sysctl
sudo auditctl -w /etc/sysctl.conf -p wa -k sysctl_changes

Sensitive Data Protection

# Disable core dumps at process level
ulimit -c 0

# System‑wide limit
echo "* hard core 0" | sudo tee -a /etc/security/limits.conf
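The process-level limit can be confirmed immediately in the same shell (note that `ulimit -c 0` affects only the current shell and its children; the limits.conf entry covers new logins):

```shell
#!/bin/sh
# Disable core dumps for this shell, then confirm the limit took effect.
ulimit -c 0
echo "core file size limit: $(ulimit -c)"
```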

Common Issues & Troubleshooting

Symptom: System slow, high load

Diagnostics: vmstat 1 5, sar -r
Possible cause: memory exhaustion, frequent swapping
Quick fix: sync; echo 3 | sudo tee /proc/sys/vm/drop_caches
Permanent fix: add physical memory or optimize applications

Symptom: OOM Killer frequently triggered

Diagnostics: dmesg | grep -i oom
Possible cause: over-commit or memory leak
Quick fix: systemctl restart <service>
Permanent fix: fix the leak, adjust cgroup limits

Symptom: Swap usage continuously rises

Diagnostics: swapon -s, vmstat 1
Possible cause: swappiness too high
Quick fix: sudo sysctl vm.swappiness=10
Permanent fix: add vm.swappiness=10 to /etc/sysctl.conf

Symptom: NUMA imbalance

Diagnostics: numastat -p <PID>
Possible cause: process not bound to a NUMA node
Quick fix: numactl --membind=0 <cmd>
Permanent fix: add numactl binding to service start scripts

Symptom: Transparent Huge Page fragmentation

Diagnostics: cat /proc/buddyinfo
Possible cause: THP enabled on database workloads
Quick fix: echo never > .../enabled
Permanent fix: deploy the systemd service from Step 5 to disable THP persistently

Symptom: Buffer/Cache usage too high

Diagnostics: free -h, slabtop
Possible cause: normal behavior (cache is reclaimable)
Quick fix: no action needed
Permanent fix: monitor the available metric instead

Change & Rollback Playbook

Change Window

Gray‑scale strategy: single node → 10% nodes → full rollout

Health checks: run sar -r 1 60 for 5 minutes at each stage

Rollback condition: available memory < 5% or OOM event occurs

Rollback Commands

# Restore sysctl configuration (adjust the date suffix to match the
# backup taken in Step 4 — $(date +%F) only resolves correctly same-day)
sudo cp /etc/sysctl.conf.backup.$(date +%F) /etc/sysctl.conf
sudo sysctl -p

# Restart affected services (example: MySQL)
sudo systemctl restart mysqld

# Restore THP default
echo always | sudo tee /sys/kernel/mm/transparent_hugepage/enabled

Backup Checklist

# System configuration backup
sudo tar czf /backup/sysctl-$(date +%F).tar.gz /etc/sysctl.conf /etc/sysctl.d/

# Service configuration (MySQL example)
sudo cp /etc/my.cnf /backup/my.cnf.$(date +%F)

# Current memory state snapshot
free -h > /backup/memory-baseline-$(date +%F).txt
numactl --hardware > /backup/numa-topology-$(date +%F).txt
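A backup is only useful if it restores cleanly. A self-contained sketch that exercises the tar round-trip with a throwaway file (paths under mktemp directories are placeholders, not the real /etc files):

```shell
#!/bin/sh
# Create a throwaway "config", archive it, restore to a scratch dir,
# and diff against the original to prove the round-trip is lossless.
workdir=$(mktemp -d)
mkdir -p "$workdir/etc"
echo "vm.swappiness = 10" > "$workdir/etc/sysctl.conf"

tar czf "$workdir/backup.tar.gz" -C "$workdir" etc/sysctl.conf

scratch=$(mktemp -d)
tar xzf "$workdir/backup.tar.gz" -C "$scratch"
diff "$workdir/etc/sysctl.conf" "$scratch/etc/sysctl.conf" && echo "backup verified"
```

Running the same pattern against the real archive before a change window catches truncated or unreadable backups while there is still time to re-take them.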

Best Practices (10 Key Points)

Memory reservation : keep 15‑20% free in production for traffic spikes

Swap configuration : for physical memory ≥ 64 GB, set swap to a fixed 8‑16 GB

THP decision : disable for databases, keep default (madvise) for general apps

Monitoring granularity : 1‑minute sampling (sar) for general, 10‑second for critical workloads

Over‑commit limits : in container nodes, memory.limit_in_bytes ≤ 80% of node memory

NUMA binding : bind processes when a single instance uses > 50% of node memory

OOM priority : set oom_score_adj=-1000 for critical services to avoid kill

Cache strategy : set database buffer pool to 70‑80% of physical memory

Stress testing : run stress-ng --vm 8 --vm-bytes 80% before release

Change approval : kernel parameter changes require 72 hours of stability testing in a pre‑production environment

Appendix (Sample Assets)

Full sysctl configuration (production‑grade)

# /etc/sysctl.d/99-memory-tuning.conf
# Memory management
vm.swappiness = 10
vm.vfs_cache_pressure = 50
vm.dirty_ratio = 15
vm.dirty_background_ratio = 5
vm.dirty_writeback_centisecs = 500
vm.dirty_expire_centisecs = 3000

# Over‑commit
vm.overcommit_memory = 1
vm.overcommit_ratio = 50
vm.panic_on_oom = 0
vm.oom_kill_allocating_task = 0

# NUMA auto‑balancing (disable for high load)
kernel.numa_balancing = 0

# Transparent Huge Pages
vm.nr_hugepages = 0

Systemd service template (memory limits)

# /etc/systemd/system/myapp.service
[Unit]
Description=My Application
After=network.target

[Service]
Type=simple
User=appuser
ExecStart=/usr/local/bin/myapp
Restart=on-failure

# Memory limits and protection
MemoryMax=8G
MemoryHigh=6G
MemorySwapMax=0
OOMPolicy=continue

[Install]
WantedBy=multi-user.target

Prometheus alert rules (complete)

# memory-alerts.yaml
groups:
- name: memory
  interval: 30s
  rules:
  - alert: HighMemoryUsage
    expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) > 0.85
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "Memory usage > 85%"
      description: "Node {{ $labels.instance }} memory usage {{ $value | humanizePercentage }}"
  - alert: SwapUsageHigh
    expr: (1 - node_memory_SwapFree_bytes / node_memory_SwapTotal_bytes) > 0.50
    for: 10m
    labels:
      severity: warning
  - alert: MemoryPressure
    expr: rate(node_vmstat_pgmajfault[5m]) > 1000
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "Major page fault rate too high"
      description: "Node {{ $labels.instance }} experiencing high major faults"

Ansible playbook (memory tuning)

# memory-tuning.yml
- name: Optimize Linux memory parameters
  hosts: all
  become: yes
  tasks:
    - name: Backup original sysctl.conf
      copy:
        src: /etc/sysctl.conf
        dest: "/etc/sysctl.conf.backup.{{ ansible_date_time.date }}"
        remote_src: yes

    - name: Apply memory tuning parameters
      sysctl:
        name: "{{ item.key }}"
        value: "{{ item.value }}"
        state: present
        reload: yes
      loop:
        - { key: 'vm.swappiness', value: '10' }
        - { key: 'vm.dirty_ratio', value: '15' }
        - { key: 'vm.overcommit_memory', value: '1' }

    - name: Disable Transparent Huge Pages permanently
      lineinfile:
        path: /etc/rc.local
        line: 'echo never > /sys/kernel/mm/transparent_hugepage/enabled'
        create: yes
      notify: Restart system services

  handlers:
    - name: Restart system services
      command: /bin/true