Practical Guide to Diagnosing and Fixing NFS Mount Failures
This guide explains the NFS protocol, common mount failures, five root‑cause categories, step‑by‑step installation, configuration, verification, detailed error analysis, real‑world case studies, performance tuning, automation scripts, best‑practice recommendations and monitoring techniques for reliable NFS deployments on Ubuntu 24.04 and Rocky Linux 9.5.
Overview
NFS (Network File System) is one of the most widely used network file-sharing protocols in UNIX/Linux environments. It provides transparent remote file access by exposing standard POSIX file-system interfaces.
NFS Versions
NFSv3 (RFC 1813, 1995) : Stateless, uses multiple ports (mountd, nlockmgr, statd) and portmapper (rpcbind). Supports async writes and file handles of up to 64 bytes. Still common for latency-sensitive HPC workloads.
NFSv4 (RFC 7530, which obsoletes the original RFC 3530 from 2003) : Stateful, introduces leases and delegations, consolidates all operations over TCP 2049, simplifying firewall rules. Adds compound operations, ACLs, and Kerberos authentication.
NFSv4.1 (RFC 8881, which obsoletes the original RFC 5661 from 2010) : Adds sessions and pNFS for parallel I/O.
NFSv4.2 (RFC 7862, 2016) : Server‑side copy, sparse file support, IO_ADVISE, SELinux‑labelled NFS. Supported from Linux kernel 4.0+.
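To see which of these versions a Linux server will actually accept, the kernel lists them in /proc/fs/nfsd/versions, with a "+" prefix for enabled and "-" for disabled. The helper below is a minimal sketch that filters such a string; the function name enabled_versions is made up for this example.

```shell
# Print the NFS versions marked as enabled ("+") in a string formatted
# like /proc/fs/nfsd/versions, e.g. "-2 +3 +4 +4.1 +4.2".
# enabled_versions is a hypothetical helper name, not part of nfs-utils.
enabled_versions() {
    for token in $1; do
        case "$token" in
            +*) printf '%s\n' "${token#+}" ;;
        esac
    done
}

# On an NFS server (the nfsd filesystem must be mounted, usually needs root):
#   enabled_versions "$(cat /proc/fs/nfsd/versions)"
```

A client asking for a version that is missing from this list typically fails with "Protocol not supported".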
Mount Process (NFSv3)
1. Client queries rpcbind (port 111) for mountd port.
2. Client sends MOUNT request to mountd with export path.
3. Server checks /etc/exports and client IP.
4. Server returns a file handle.
5. Client uses the handle on NFS port (2049) for further ops.
6. File locks are managed by nlockmgr (separate port).
7. After server reboot, lock recovery is coordinated by statd.
Mount Process (NFSv4)
1. Client connects directly to server TCP 2049.
2. Sends SETCLIENTID (NFSv4.0) or EXCHANGE_ID (NFSv4.1+) to establish identity.
3. On NFSv4.1+, sends CREATE_SESSION to start a session.
4. Sends PUTROOTFH + LOOKUP to locate the export.
5. Server checks export permissions and authentication.
6. Client receives a file handle and can perform normal file ops.
7. Client periodically renews the lease (RENEW/SEQUENCE).
Common Failure Categories
Network unreachable – RPC ports (111, 2049, dynamic ports) blocked by firewalls or routing issues.
Server configuration errors – wrong /etc/exports, services not started, missing export path or wrong permissions.
Authentication and permission problems – root_squash, UID/GID mismatches, Kerberos mis‑configuration, NFSv4 idmapping issues.
Version or parameter incompatibility – client requests an unsupported NFS version, inappropriate rsize/wsize, missing kernel modules.
Runtime faults – stale file handle, server not responding (process stuck in D state), intermittent failures caused by network jitter.
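For the first category, it helps to know exactly which ports each protocol version needs before testing reachability. The sketch below assumes the pinned mountd/statd/lockd ports from this guide's nfs.conf example (20048, 32765, 32803) and assumes netcat (nc) is installed; both function names are made up for this example.

```shell
# Ports a firewall must allow for each NFS major version.
# 20048/32765/32803 are the pinned mountd/statd/lockd ports from this
# guide's nfs.conf example; an unpinned server uses dynamic ports instead.
nfs_ports() {
    case "$1" in
        3)  echo "111 2049 20048 32765 32803" ;;
        4*) echo "2049" ;;
        *)  echo "unknown NFS version: $1" >&2; return 1 ;;
    esac
}

# Probe every required TCP port on a host; prints one line per port.
# Assumes a netcat build that supports -z (scan) and -w (timeout).
probe_nfs_ports() {
    host=$1
    for port in $(nfs_ports "$2"); do
        if nc -z -w 3 "$host" "$port" 2>/dev/null; then
            echo "$port open"
        else
            echo "$port blocked-or-closed"
        fi
    done
}

# Usage: probe_nfs_ports nfs-server.example.com 3
```

A port reported as blocked-or-closed only says the TCP connection failed; rpcinfo -p on the server side shows whether the service is registered at all.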
Core Concepts
RPC & Port Mapping : NFSv3 relies on rpcbind to discover dynamic service ports. Fixing these ports avoids firewall complications.
File Handle : NFS identifies remote files by an opaque binary handle generated by the server. If the underlying inode changes (e.g., after a storage migration), the handle becomes stale and the client sees ESTALE.
root_squash : By default, NFS maps the client’s root UID 0 to the anonymous UID 65534 (nobody). This prevents privileged writes from remote roots but can cause permission denials for backup scripts.
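The squash behaviour can be pictured as a small mapping from the client's UID to the UID the server uses for permission checks. The function below is only a model for reasoning about exports; the real mapping happens inside the NFS server, and the name squashed_uid is made up for this example.

```shell
# Model of how an export's squash option maps a client UID to the UID the
# server checks permissions against. Illustration only.
#   $1 = client UID
#   $2 = root_squash | no_root_squash | all_squash
#   $3 = anonymous UID (defaults to 65534)
squashed_uid() {
    uid=$1 mode=$2 anon=${3:-65534}
    case "$mode" in
        all_squash)     echo "$anon" ;;
        root_squash)    if [ "$uid" -eq 0 ]; then echo "$anon"; else echo "$uid"; fi ;;
        no_root_squash) echo "$uid" ;;
        *) echo "unknown squash mode: $mode" >&2; return 1 ;;
    esac
}

# A backup script running as root against a root_squash export:
#   squashed_uid 0 root_squash
```

Under this model a root-owned backup job writes as UID 65534, which explains a "Permission denied" on directories owned by any other user.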
Applicable Scenarios
Mount command returns an error and the share cannot be mounted.
Already‑mounted NFS share suddenly becomes inaccessible, I/O hangs.
Write operations fail with “Permission denied” despite correct local permissions.
Mount command hangs for a long time and finally times out.
Application sees “Stale file handle” errors after server storage migration.
Read/write performance far below expectations.
Mount fails after server reboot.
Kubernetes Pods cannot start when NFS is used as a Persistent Volume backend.
Environment Requirements
Operating System: Ubuntu 24.04 LTS or Rocky Linux 9.5.
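Linux Kernel: 6.12+ is recommended for the latest NFS client/server improvements. The stock kernels are older (6.8 on Ubuntu 24.04, 5.14 on Rocky Linux 9.5) but already support NFSv4.2 and nconnect.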
Packages: nfs-common (client) / nfs-kernel-server (Ubuntu) or nfs-utils (Rocky) plus rpcbind, tcpdump, wireshark-cli, prometheus-node-exporter (>=1.9) for monitoring.
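Before troubleshooting, a quick preflight can confirm that the diagnostic commands used later in this guide are present. This is a small sketch; missing_tools is a made-up helper name, and the tool list in the usage line is just one reasonable selection.

```shell
# Print every command from the argument list that is not found in PATH.
# missing_tools is a hypothetical helper for this guide.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Usage (showmount and nfsstat come with the NFS client packages,
# rpcinfo with rpcbind):
#   missing_tools showmount rpcinfo nfsstat tcpdump
```

An empty result means all listed tools are installed; any name printed points to a package that still needs installing.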
Detailed Procedure
Preparation
Install NFS Tools
# Ubuntu client
sudo apt update
sudo apt install -y nfs-common
# Rocky Linux client
sudo dnf install -y nfs-utils
sudo systemctl enable --now rpcbind
Install NFS Server Packages
# Ubuntu server
sudo apt install -y nfs-kernel-server
sudo systemctl enable --now nfs-kernel-server
# Rocky Linux server
sudo dnf install -y nfs-utils
sudo systemctl enable --now nfs-server
Server Configuration
Export Definition
/data/shared 10.0.0.0/24(rw,sync,no_subtree_check,no_root_squash)
/data/readonly *(ro,sync,no_subtree_check)
/home 192.168.1.0/24(rw,sync,no_subtree_check,root_squash)
After editing, reload with sudo exportfs -ra and verify with sudo exportfs -v.
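When a client is refused, it is worth checking by hand whether its address really falls inside an exported network such as 10.0.0.0/24. The following is a minimal sketch for IPv4 CIDR entries only; hostnames, wildcards and netgroups in /etc/exports are not handled, and both function names are made up for this example.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Return 0 if IPv4 address $1 lies inside CIDR block $2 (e.g. 10.0.0.0/24).
ip_in_cidr() {
    net=${2%/*} bits=${2#*/}
    mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
    [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

# Usage:
#   ip_in_cidr 10.0.1.42 10.0.0.0/24 || echo "client is outside the export range"
```

The address to test is the one the server sees, which may differ from the client's local address behind NAT.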
Fix Dynamic Ports (NFSv3)
Set fixed ports in /etc/nfs.conf so that firewall rules can reference them:
[mountd]
port=20048
[statd]
port=32765
[lockd]
port=32803
udp-port=32803
Adjust nfsd Thread Count
The default is 8 threads. For high concurrency, raise it by setting threads=32 in the [nfsd] section of /etc/nfs.conf (older Debian/Ubuntu releases use RPCNFSDCOUNT=32 in /etc/default/nfs-kernel-server), then restart the service.
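The running thread count can be read from /proc/fs/nfsd/threads and raised at runtime with rpc.nfsd, which is handy for testing a value before making it permanent. The helper below is a sketch that only prints the command; nfsd_thread_advice is a made-up name.

```shell
# Decide whether the running nfsd thread count needs raising.
#   $1 = current thread count (e.g. from /proc/fs/nfsd/threads)
#   $2 = desired thread count
# Prints the runtime command to apply, or nothing if the count is sufficient.
nfsd_thread_advice() {
    if [ "$1" -lt "$2" ]; then
        echo "sudo rpc.nfsd $2"
    fi
}

# Usage on the server:
#   nfsd_thread_advice "$(cat /proc/fs/nfsd/threads)" 32
```

A change made with rpc.nfsd does not survive a service restart, so the configuration file still needs updating.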
Client Configuration
Verify Kernel Modules
lsmod | grep nfs
# Load if missing
sudo modprobe nfs
sudo modprobe nfsd # only on server
Check Export Availability
# Show what the server exports
showmount -e nfs-server.example.com
# List RPC services
rpcinfo -p nfs-server.example.com
Mount Commands and Error Mapping
# Basic mount
sudo mount -t nfs nfs-server:/data/shared /mnt/nfs
# Specify version
sudo mount -t nfs -o vers=4.2 nfs-server:/data/shared /mnt/nfs
# Verbose mount for troubleshooting
sudo mount -t nfs -o vers=4.2 -v nfs-server:/data/shared /mnt/nfs
Common error messages and their causes:
mount.nfs: Connection timed out – network or firewall block (usually port 111 for NFSv3).
mount.nfs: access denied by server – export does not allow the client IP.
mount.nfs: No such file or directory – export path missing on server or mount point missing locally.
mount.nfs: Operation not permitted – SELinux blocks the operation.
mount.nfs: Protocol not supported – client requests an unsupported NFS version.
mount.nfs: Stale file handle – server-side inode changed; unmount and remount.
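The error-to-cause mapping above can be turned into a lookup for use in scripts or runbooks. This is a sketch that matches on message substrings taken from the list; explain_mount_error is a made-up name, and the causes are the likely ones named in this guide, not an exhaustive diagnosis.

```shell
# Map a mount.nfs error message to the likely cause listed in this guide.
# explain_mount_error is a hypothetical helper; matching is by substring.
explain_mount_error() {
    case "$1" in
        *"Connection timed out"*)      echo "network or firewall block (usually port 111 for NFSv3)" ;;
        *"access denied by server"*)   echo "export does not allow this client IP" ;;
        *"No such file or directory"*) echo "export path missing on server or mount point missing locally" ;;
        *"Operation not permitted"*)   echo "SELinux blocks the operation" ;;
        *"Protocol not supported"*)    echo "client requests an unsupported NFS version" ;;
        *"Stale file handle"*)         echo "server-side inode changed; unmount and remount" ;;
        *) echo "unrecognised error; rerun the mount with -v" ; return 1 ;;
    esac
}

# Usage: capture the error text from a failed mount and explain it.
#   err=$(sudo mount -t nfs nfs-server:/data/shared /mnt/nfs 2>&1) || explain_mount_error "$err"
```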
Permission and Authentication Checks
UID/GID Mapping (NFSv3)
NFSv3 uses numeric UID/GID for permission control. The same UID/GID must exist on both client and server, otherwise permission mismatches occur.
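One way to find such mismatches is to dump the account database on both machines (for example with getent passwd) and compare the two files. The sketch below assumes passwd-format input and reports users present on both sides with a different UID or GID; uid_gid_mismatches and the file paths in the usage line are made up for this example.

```shell
# Compare two passwd-format files (name:x:uid:gid:...) and print every user
# that exists in both but with a different UID or GID.
#   $1 = client dump, $2 = server dump
uid_gid_mismatches() {
    awk -F: '
        NR == FNR { ids[$1] = $3 ":" $4; next }          # first file: remember uid:gid
        ($1 in ids) && ids[$1] != ($3 ":" $4) {          # second file: compare
            printf "%s client=%s server=%s\n", $1, ids[$1], $3 ":" $4
        }
    ' "$1" "$2"
}

# Usage (run getent passwd on each host first and copy the files together):
#   uid_gid_mismatches client.passwd server.passwd
```

Users that exist on only one side are not reported by this sketch, although they can cause the same kind of permission surprise.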
root_squash
Check the active options with sudo exportfs -v. Because root_squash is the default, allowing root writes means either setting no_root_squash (riskier) or using all_squash,anonuid=1000,anongid=1000 to map all users to a dedicated UID.
NFSv4 idmapping
Domain must match on client and server (see /etc/idmapd.conf). Mismatched domains cause all files to appear as nobody:nogroup.
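To compare the setting across hosts, the Domain value can be pulled out of the configuration file with a short filter. This is a sketch that assumes the usual "Domain = value" line format and skips commented lines; idmap_domain is a made-up helper name.

```shell
# Print the Domain value from an idmapd.conf-style file
# (first uncommented "Domain = ..." line). Hypothetical helper.
idmap_domain() {
    awk -F= '
        /^[ \t]*#/ { next }                               # skip comments
        tolower($1) ~ /^[ \t]*domain[ \t]*$/ {
            gsub(/[ \t]/, "", $2); print $2; exit
        }
    ' "$1"
}

# Usage: run on client and server, then compare the two results.
#   idmap_domain /etc/idmapd.conf
```

If the line is absent, the result is empty and the effective domain is derived from the host's DNS domain; nfsidmap -d prints that effective value.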
Kerberos
When sec=krb5 is used, verify rpc.gssd is running, the keytab exists, and a valid ticket can be obtained with kinit -k -t /etc/krb5.keytab nfs/hostname@REALM.
Performance Tuning
Key mount options for high throughput:
# General high‑performance example
sudo mount -t nfs -o vers=4.2,rsize=1048576,wsize=1048576,hard,noatime,proto=tcp nfs-server:/data/shared /mnt/nfs
rsize/wsize – larger values (up to 1 MiB) improve large-file throughput.
nconnect=N (Linux 5.3+) – opens multiple TCP connections (2-16) to overcome single-connection bandwidth limits.
hard vs soft – hard guarantees data integrity; soft may cause data loss but prevents processes from hanging.
noatime – reduces metadata writes.
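Since nconnect depends on the kernel version, a wrapper can add it only where it is available. The sketch below compares the major.minor part of a kernel release string against 5.3; vendor kernels sometimes backport features, so treat the version test as a rough guide. Both function names and the default of 4 connections are assumptions made for this example.

```shell
# Return 0 if kernel release $1 (e.g. "6.8.0-45-generic") is at least
# version $2 (e.g. "5.3"), comparing major.minor only.
kernel_at_least() {
    have=${1%%-*}
    have_major=${have%%.*}
    rest=${have#*.}
    have_minor=${rest%%.*}
    want_major=${2%%.*}
    want_minor=${2#*.}
    [ "$have_major" -gt "$want_major" ] ||
        { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# Build the option string from this section; nconnect is appended only
# when the kernel release in $1 is 5.3 or newer. $2 = connection count.
nfs_mount_opts() {
    opts="vers=4.2,rsize=1048576,wsize=1048576,hard,noatime,proto=tcp"
    if kernel_at_least "$1" 5.3; then
        opts="$opts,nconnect=${2:-4}"
    fi
    echo "$opts"
}

# Usage:
#   sudo mount -t nfs -o "$(nfs_mount_opts "$(uname -r)" 8)" nfs-server:/data/shared /mnt/nfs
```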
Automation Scripts
NFS Health‑Check Script (bash)
#!/bin/bash
# Checks each NFS mount point: mounted, responsive, and writable when mounted rw.
MOUNT_POINTS="/mnt/nfs-shared /mnt/nfs-data /mnt/nfs-backup"
CHECK_TIMEOUT=10
WEBHOOK_URL="https://example.com/webhook"
# Reserved for recovery-attempt tracking; not used by the checks below.
MAX_RECOVERY_ATTEMPTS=3
RECOVERY_STATE_DIR="/tmp/nfs_recovery_state"
TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S')

# Post a text alert to the webhook; \\n keeps the JSON string valid on one line.
send_alert(){
    local msg="$1"
    curl -s -X POST "$WEBHOOK_URL" -H "Content-Type: application/json" \
        -d "{\"msgtype\":\"text\",\"text\":{\"content\":\"[NFS Alert] $TIMESTAMP\\n$msg\"}}" >/dev/null 2>&1
}
check_mount_exists(){ findmnt -t nfs,nfs4 "$1" >/dev/null 2>&1; }
check_mount_responsive(){ timeout "$CHECK_TIMEOUT" stat "$1" >/dev/null 2>&1; }
check_mount_writable(){
    local test_file="$1/.nfs_check_$$"
    timeout "$CHECK_TIMEOUT" bash -c "echo test > '$test_file' && rm -f '$test_file'" >/dev/null 2>&1
}
# True if the mount options contain rw as a whole option, not as a substring.
is_mounted_rw(){ findmnt -n -o OPTIONS "$1" | tr ',' '\n' | grep -qx rw; }

for mp in $MOUNT_POINTS; do
    if ! check_mount_exists "$mp"; then
        echo "[$TIMESTAMP] WARN: $mp not mounted, trying..."
        # Relies on an /etc/fstab entry for the mount point.
        if sudo mount "$mp" 2>/dev/null; then
            send_alert "$mp mounted successfully"
        else
            send_alert "$mp not mounted and remount failed"
        fi
        continue
    fi
    if ! check_mount_responsive "$mp"; then
        echo "[$TIMESTAMP] ERROR: $mp unresponsive"
        send_alert "$mp unresponsive"
        continue
    fi
    if is_mounted_rw "$mp" && ! check_mount_writable "$mp"; then
        echo "[$TIMESTAMP] WARN: $mp mounted rw but not writable"
        send_alert "$mp mounted rw but not writable"
        continue
    fi
    echo "[$TIMESTAMP] OK: $mp healthy"
    # reset recovery counters here if needed
done
Configuration Backup Script
#!/bin/bash
# Archives NFS server and client configuration into a timestamped tarball.
BACKUP_DIR="/backup/nfs/$(date +%Y%m%d_%H%M%S)"
mkdir -p "$BACKUP_DIR" || exit 1
# Server config
if systemctl is-active --quiet nfs-kernel-server || systemctl is-active --quiet nfs-server; then
    cp /etc/exports "$BACKUP_DIR/" 2>/dev/null
    cp /etc/nfs.conf "$BACKUP_DIR/" 2>/dev/null
    cp -r /etc/exports.d "$BACKUP_DIR/" 2>/dev/null
    sudo exportfs -v > "$BACKUP_DIR/active_exports.txt"
fi
# Client config
grep -E "nfs|nfs4" /etc/fstab > "$BACKUP_DIR/fstab_nfs.txt" 2>/dev/null
mount -t nfs,nfs4 > "$BACKUP_DIR/current_mounts.txt" 2>/dev/null
cp /etc/idmapd.conf "$BACKUP_DIR/" 2>/dev/null
# Group the -name tests so that -exec applies to both unit types.
find /etc/systemd/system/ \( -name "*.mount" -o -name "*.automount" \) -exec cp {} "$BACKUP_DIR/" \; 2>/dev/null
cp /etc/auto.master "$BACKUP_DIR/" 2>/dev/null
cp /etc/auto.nfs "$BACKUP_DIR/" 2>/dev/null
# Remove the staging directory only if the archive was created.
tar czf "${BACKUP_DIR}.tar.gz" -C "$(dirname "$BACKUP_DIR")" "$(basename "$BACKUP_DIR")" \
    && rm -rf "$BACKUP_DIR"
echo "NFS configuration backed up to ${BACKUP_DIR}.tar.gz"
Best Practices and Caveats
Export only to specific IP ranges; avoid * with write permissions.
Prefer sync on the server for data safety; async risks data loss on crash.
Disable subtree checking with no_subtree_check to improve performance.
Scale nfsd threads according to concurrent clients (e.g., threads = client_count * 2 as a starting point, then adjust based on observed load).
Limit supported NFS versions to those needed; disable older versions to reduce attack surface.
Use hard mounts in production and rely on monitoring to detect server outages.
Tune rsize/wsize and nconnect for large‑file workloads.
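Some of these recommendations can be checked mechanically. The sketch below reads exports-format lines on stdin and warns about writable wildcard exports, no_root_squash and async; it is a simple pattern check, not a full parser of the exports syntax, and lint_exports is a made-up name.

```shell
# Read exports-format lines on stdin and print a warning for each risky entry.
# lint_exports is a hypothetical helper; it inspects each client(options) field.
lint_exports() {
    awk '
        /^[ \t]*(#|$)/ { next }                 # skip comments and blank lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^\*\(/ && $i ~ /[(,]rw[,)]/)
                    print "WARN " $1 ": writable export to any host (*)"
                if ($i ~ /[(,]no_root_squash[,)]/)
                    print "WARN " $1 ": no_root_squash grants remote root full access"
                if ($i ~ /[(,]async[,)]/)
                    print "WARN " $1 ": async risks data loss on server crash"
            }
        }
    '
}

# Usage on the server:
#   lint_exports < /etc/exports
```

Run against the export definitions shown earlier in this guide, it flags only /data/shared, because of no_root_squash.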
Monitoring and Alerting
Collect NFS metrics with node_exporter (enable --collector.nfs and --collector.nfsd) and expose them to Prometheus.
node_nfs_rpc_retransmissions_total – RPC retransmission count (high values indicate network issues).
node_nfsd_server_threads – number of active nfsd threads (monitor utilization).
node_nfs_requests_total – total client requests.
Use nfsstat -c and cat /proc/self/mountstats for on‑host diagnostics.
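The client counters behind nfsstat -c are also available in /proc/net/rpc/nfs, where the line starting with "rpc" holds total calls, retransmissions and authentication refreshes. The sketch below computes the retransmission ratio from that line; rpc_retrans_ratio is a made-up name.

```shell
# Read /proc/net/rpc/nfs-format text on stdin and print the client RPC
# retransmission ratio (retransmissions / calls) with four decimals.
# The "rpc" line has the fields: calls, retransmissions, auth refreshes.
rpc_retrans_ratio() {
    awk '$1 == "rpc" {
        if ($2 > 0)
            printf "%.4f\n", $3 / $2
        else
            print "0.0000"
        exit
    }'
}

# Usage on a client:
#   rpc_retrans_ratio < /proc/net/rpc/nfs
```

These counters are cumulative since boot, so the ratio is a long-term average; to catch a recent problem, sample twice and compare the deltas.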
Example Prometheus alert for high retransmission rate:
groups:
  - name: nfs_alerts
    rules:
      - alert: NFSHighRetransmissions
        expr: rate(node_nfs_rpc_retransmissions_total[5m]) / rate(node_nfs_rpcs_total[5m]) > 0.01
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "NFS RPC retransmission rate high"
          description: "{{ $labels.instance }} NFS retransmission rate {{ $value | humanizePercentage }}: possible network problem."
Recovery Workflow for Stale File Handles
# Attempt graceful unmount
sudo umount /mnt/nfs
# If busy, find processes
sudo lsof +D /mnt/nfs
# Force unmount if needed
sudo umount -f /mnt/nfs
# Lazy unmount as last resort
sudo umount -l /mnt/nfs
# Remount
sudo mount -t nfs -o vers=4.2 nfs-server:/data/shared /mnt/nfs
Technical Takeaways
NFSv4 simplifies firewall configuration by using only TCP 2049.
Diagnose mount failures step‑by‑step: client config → network reachability → server export and service status.
root_squash is essential for security; when a remote root user must write, prefer all_squash with explicit anonuid/anongid over no_root_squash.
Stale file handles stem from inode changes; always unexport before migrating storage.
Use hard mounts for data integrity and pair them with robust monitoring.
Performance hinges on rsize/wsize and nconnect for parallel I/O.
Further Learning
Parallel NFS (pNFS) – NFSv4.1 feature for distributed data layout.
NFS‑Ganesha – userspace NFS server supporting CephFS, GlusterFS, etc.
NFS over RDMA – ultra‑low latency file access on InfiniBand or RoCE networks (Linux 6.x support).
References
Linux NFS Wiki: https://linux-nfs.org/wiki/index.php/Main_Page
nfs‑utils source: https://git.linux-nfs.org/?p=steved/nfs-utils.git
RFC 7530 – NFS Version 4 Protocol
RFC 8881 – NFS Version 4.1 Protocol
RFC 7862 – NFS Version 4.2 Protocol
Red Hat NFS Administration Guide
man 5 exports, man 5 nfs, man 8 mount.nfs
Raymond Ops
Linux ops automation, cloud-native, Kubernetes, SRE, DevOps, Python, Golang and related tech discussions.
