Overview of NFSv4 and the NFS‑Ganesha Architecture
This article summarizes NFSv4’s design goals and its security and performance improvements, then covers NFS‑Ganesha: its four major advantages, modular architecture, memory and thread management, caching mechanisms, and a practical Ceph RGW integration example.
Brief Overview of NFSv4
NFS was originally designed by Sun Microsystems in 1984 (NFSv2) and later evolved to NFSv3 and NFSv4, the latter being driven by the IETF with goals of improving inter‑network access and performance, providing security, enhancing cross‑platform operation, and facilitating future extensions.
NFSv4 introduces a strongly stateful protocol, abandoning the stateless design of earlier versions; according to the original article, this enables better load balancing and reduces client/server round‑trip time (RTT). It also requires a congestion‑controlled transport such as TCP rather than the UDP commonly used by earlier versions, makes RPCSEC_GSS support mandatory for security, and adds extensibility through minor versions such as NFSv4.1, which supports RDMA, pNFS, and directory delegation.
Four Major Advantages of NFS‑Ganesha
NFS‑Ganesha was developed around 2007 to address limitations of HSM‑based NFS bridges. Its original design goals were:
Management of million‑scale data caches to avoid underlying filesystem bottlenecks.
Compatibility with HPSS and other filesystems.
Support for NFSv4 with adaptability, extensibility, and security.
Resolution of software‑induced performance bottlenecks.
Open‑source licensing and Unix compatibility.
Although a user‑space implementation may have lower raw performance than kernel NFS, it provides richer functionality. Running in user space brings four main advantages:
1. Flexible Memory Allocation
Running in user space allows allocation of large memory pools (e.g., 4 GB for million‑scale caches, up to 32 GB on x86_64) to hold internal caches.
2. Strong Portability
Being user‑space, Ganesha can be compiled for multiple operating systems and can adapt to various filesystems, unlike kernel‑only solutions.
3. Convenient Access Mechanism
Kernel NFSv4 relies on the complex rpc_pipefs bridge to reach user‑space services such as Kerberos and ID mapping. Ganesha already runs in user space, so it can call those services through their regular APIs.
4. FUSE Integration
Because it runs in user space, Ganesha can export a FUSE filesystem over NFS directly, without routing requests through the kernel.
NFS‑Ganesha Framework Overview
The architecture is modular; each module (e.g., Memory Manager, RPCSEC_GSS, NFS protocol module, Metadata Cache, File Content Cache, FSAL, Hash Tables) handles a specific responsibility, reducing inter‑module coupling and simplifying independent development and testing.
Memory Management
Ganesha employs a custom Buddy allocator with madvise to reserve large memory blocks, avoiding fragmentation and swap‑induced performance loss.
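The buddy scheme itself can be illustrated with a minimal sketch. This is not Ganesha's allocator: the class name BuddyAllocator, its methods, and the use of integer offsets in place of real addresses are all assumptions made for illustration, and the madvise step is omitted. It only shows how power‑of‑two blocks are split on allocation and merged with their buddy on release, which is what limits fragmentation.

```python
class BuddyAllocator:
    """Toy buddy allocator (hypothetical); offsets stand in for addresses."""

    def __init__(self, total_size, min_block=64):
        assert total_size & (total_size - 1) == 0, "arena must be a power of two"
        assert min_block & (min_block - 1) == 0, "min block must be a power of two"
        self.total_size = total_size
        self.min_block = min_block
        # free_lists[size] holds the offsets of free blocks of that size.
        self.free_lists = {total_size: {0}}
        self.allocated = {}  # offset -> block size

    def _round_up(self, size):
        # Smallest power-of-two block (>= min_block) that can hold `size`.
        block = self.min_block
        while block < size:
            block *= 2
        return block

    def alloc(self, size):
        block = self._round_up(size)
        # Find the smallest free block that is large enough.
        candidate = block
        while candidate <= self.total_size and not self.free_lists.get(candidate):
            candidate *= 2
        if candidate > self.total_size:
            return None  # no block large enough is free
        offset = self.free_lists[candidate].pop()
        # Split repeatedly, returning the upper half to the free lists.
        while candidate > block:
            candidate //= 2
            self.free_lists.setdefault(candidate, set()).add(offset + candidate)
        self.allocated[offset] = block
        return offset

    def free(self, offset):
        size = self.allocated.pop(offset)
        # Merge with the buddy block for as long as the buddy is also free.
        while size < self.total_size:
            buddy = offset ^ size
            peers = self.free_lists.get(size, set())
            if buddy not in peers:
                break
            peers.remove(buddy)
            offset = min(offset, buddy)
            size *= 2
        self.free_lists.setdefault(size, set()).add(offset)
```

Allocating two 100‑byte requests from a 1024‑byte arena yields two adjacent 128‑byte blocks; freeing both lets them coalesce step by step back into the single 1024‑byte block.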
Thread Management
Numerous POSIX threads handle parallel requests. The design uses read‑write locks, a dispatcher thread, worker threads, statistics threads, and an admin gateway, with careful handling of deadlocks and garbage collection to prevent bottlenecks.
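The dispatcher/worker split described above can be sketched as follows. This is an illustration under stated assumptions, not Ganesha's code: the function run_server, the (xid, payload) request tuples, and the upper‑casing "work" are all hypothetical, and the statistics and admin threads are left out.

```python
import queue
import threading


def run_server(requests, num_workers=4):
    """Dispatch (xid, payload) requests to a pool of worker threads."""
    pending = queue.Queue()          # shared queue fed by the dispatcher
    results = {}
    results_lock = threading.Lock()  # protects the shared results dict

    def worker():
        while True:
            req = pending.get()
            if req is None:          # sentinel: no more work
                break
            xid, payload = req
            reply = payload.upper()  # placeholder for real request handling
            with results_lock:
                results[xid] = reply

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()

    # Dispatcher role: read each request and enqueue it for a worker.
    for req in requests:
        pending.put(req)
    for _ in workers:
        pending.put(None)            # one sentinel per worker
    for w in workers:
        w.join()
    return results
```

Keeping the dispatcher limited to reading and enqueueing means a slow request only ties up one worker rather than stalling the intake of new requests.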
Hash Tables
Red‑black tree based hash tables provide efficient associative look‑ups; a tree‑array design reduces contention when multiple threads update the structures.
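The idea of an array of independently locked trees can be sketched as below. Several things here are assumptions rather than Ganesha's implementation: the class name PartitionedTable is hypothetical, a sorted list searched with bisect stands in for each red‑black tree (Python's standard library has none), and a plain mutex replaces the read‑write lock. The point is only that a key hashes to one partition, so threads touching different partitions do not contend.

```python
import bisect
import threading


class PartitionedTable:
    """Array of ordered partitions, each guarded by its own lock (sketch)."""

    def __init__(self, num_partitions=8):
        # Each partition: (sorted keys, parallel values, lock).
        self._parts = [([], [], threading.Lock()) for _ in range(num_partitions)]

    def _part(self, key):
        # The hash selects the partition; ordering is kept inside it.
        return self._parts[hash(key) % len(self._parts)]

    def set(self, key, value):
        keys, values, lock = self._part(key)
        with lock:
            i = bisect.bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                values[i] = value        # overwrite existing entry
            else:
                keys.insert(i, key)      # keep keys sorted
                values.insert(i, value)

    def get(self, key, default=None):
        keys, values, lock = self._part(key)
        with lock:
            i = bisect.bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return values[i]
            return default
```

Keys stored in the same table must be mutually comparable (for example, all integers), since each partition keeps them in sorted order.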
Cache Handling
Metadata and file‑content caches are tightly coupled; LRU lists per thread manage eviction, and a write‑through strategy with optional write‑back for large files ensures consistency. Cache instances are reclaimed only after thorough checks to avoid premature deletion.
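A minimal sketch of LRU eviction combined with write‑through follows. The class WriteThroughLRU and the dict used as a backend are hypothetical stand‑ins. The sketch keeps a single LRU list rather than one per thread, and it omits the optional write‑back path and the reclamation checks mentioned above.

```python
from collections import OrderedDict


class WriteThroughLRU:
    """LRU cache that pushes every write to the backend immediately (sketch)."""

    def __init__(self, backend, capacity):
        self.backend = backend        # any dict-like store (assumption)
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest entry first, newest last

    def read(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            return self.entries[key]
        value = self.backend[key]          # cache miss: fetch from backend
        self._insert(key, value)
        return value

    def write(self, key, value):
        self.backend[key] = value          # write-through: backend first
        self._insert(key, value)

    def _insert(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
```

Because every write reaches the backend before the cache is updated, evicting an entry never loses data; the entry can simply be reloaded on the next read.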
FSAL (File System Abstraction Layer)
FSAL offers a generic interface for both inode and content caches, allowing plug‑in modules (e.g., FSAL_RGW) to adapt Ganesha to different storage backends.
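The plug‑in idea can be sketched as an abstract interface with interchangeable backends. The names below (Fsal, MemoryFsal, and the lookup/read/write methods) are hypothetical and far simpler than the real FSAL API, which is written in C. They only show how the upper layers can stay unaware of which backend is in use.

```python
from abc import ABC, abstractmethod


class Fsal(ABC):
    """Hypothetical backend-neutral interface used by the cache layers."""

    @abstractmethod
    def lookup(self, path):
        """Return an opaque handle for `path`."""

    @abstractmethod
    def read(self, handle, offset, length):
        """Return up to `length` bytes starting at `offset`."""

    @abstractmethod
    def write(self, handle, offset, data):
        """Write `data` at `offset`; return the number of bytes written."""


class MemoryFsal(Fsal):
    """Toy in-memory backend; a real module would call into its store."""

    def __init__(self):
        self._files = {}

    def lookup(self, path):
        self._files.setdefault(path, bytearray())  # create on first lookup
        return path                                # the path doubles as handle

    def read(self, handle, offset, length):
        return bytes(self._files[handle][offset:offset + length])

    def write(self, handle, offset, data):
        buf = self._files[handle]
        if len(buf) < offset:
            buf.extend(b"\0" * (offset - len(buf)))  # zero-fill any gap
        buf[offset:offset + len(data)] = data
        return len(data)
```

Swapping storage backends then means providing another subclass of the same interface, while the caching code above it stays unchanged.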
NFS‑Ganesha Integration with Ceph RGW
The original article illustrates the request flow for an open() call. On the client, the system call passes through the VFS to the kernel NFS client, which forwards the request to Ganesha via RPC. In Ganesha, the dispatcher reads the request and enqueues it; a worker thread then processes it, consulting the FSAL (configured to use Ceph RGW), which ultimately invokes rgw_fsal_open2() via librgw. Responses are cached in a hash table for duplicate‑request detection and sent back to the client.
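The duplicate‑detection step at the end of this flow can be sketched as a reply cache. The class DuplicateRequestCache and the (client, xid) key are assumptions for illustration, not Ganesha's actual data structures. The sketch shows why a retransmitted request gets the cached reply instead of being executed a second time.

```python
class DuplicateRequestCache:
    """Caches replies by (client, xid) so retransmissions are not re-run."""

    def __init__(self):
        self._replies = {}  # (client, xid) -> cached reply

    def handle(self, client, xid, handler, *args):
        key = (client, xid)
        if key in self._replies:
            return self._replies[key]  # retransmission: replay cached reply
        reply = handler(*args)         # first time: actually execute
        self._replies[key] = reply
        return reply
```

This matters for non‑idempotent operations: replaying the stored reply avoids applying the same change twice when a client retries after a lost response.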
Source: https://zhuanlan.zhihu.com/p/34833897