Distributed File Systems: Overview, Design Requirements, Architecture Models, and Key Considerations

This article provides a comprehensive overview of distributed file systems, covering their historical evolution, essential design requirements, centralized and decentralized architecture models, persistence, scalability, high availability, performance optimization, security, and additional practical aspects such as space allocation, file deletion, small‑file handling, and deduplication.

Big Data Technology & Architecture

Overview

Distributed file systems are a foundational building block of distributed computing, with GFS and HDFS the best-known examples; understanding their design principles offers insights that carry over to similar systems.

Many other products exist beyond HDFS and GFS, each with distinct characteristics, and studying them broadens that perspective.

This article analyzes the problems such systems must solve, the available solutions, and the criteria for choosing among them.

Past

In the 1980s, systems like Sun's Network File System (NFS) separated disks from hosts, enabling larger capacity, host switching, data sharing, backup, and disaster recovery.

With the rise of the Internet, massive data growth required horizontal scaling, fault tolerance, high availability, persistence, and elasticity.

Requirements

Conform to POSIX file interface standards.

Transparent to users, behaving like a local file system.

Persistence to prevent data loss.

Scalability to accommodate growing data volume and load.

Robust security mechanisms.

Data consistency across reads.

Additional desirable features include large storage capacity, high concurrency, high performance, and efficient hardware utilization.

Architecture Models

Two main routes exist: centralized and decentralized.

1. Centralized (e.g., GFS)

The master node handles file location, metadata, fault detection, and data migration. Clients query the master for chunk locations, then communicate directly with chunk servers for data transfer.

Master nodes typically do not participate in data reads/writes, reducing bottlenecks.
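The lookup-then-transfer flow can be sketched in a few lines. This is a toy model under stated assumptions, not GFS code: the names Master, ChunkServer, write_file, and read_file are hypothetical, chunks are 4 bytes so the example stays readable, and replication is left out.

```python
from dataclasses import dataclass, field

CHUNK_SIZE = 4  # bytes; tiny on purpose (an assumption for illustration)


@dataclass
class ChunkServer:
    chunks: dict = field(default_factory=dict)  # chunk_id -> bytes

    def read(self, chunk_id):
        return self.chunks[chunk_id]


@dataclass
class Master:
    # file name -> ordered list of (chunk_id, server_name); metadata only
    table: dict = field(default_factory=dict)

    def locate(self, name):
        return self.table[name]


def write_file(master, servers, name, data):
    """Split data into chunks, spread them round-robin, record locations."""
    names = sorted(servers)
    locations = []
    for i in range(0, len(data), CHUNK_SIZE):
        index = i // CHUNK_SIZE
        chunk_id = f"{name}#{index}"
        server = names[index % len(names)]
        servers[server].chunks[chunk_id] = data[i:i + CHUNK_SIZE]
        locations.append((chunk_id, server))
    master.table[name] = locations


def read_file(master, servers, name):
    """Ask the master where the chunks live, then fetch them directly."""
    return b"".join(servers[s].read(cid) for cid, s in master.locate(name))
```

The master only ever holds the location table; the file bytes move between the client and the chunk servers.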

2. Decentralized (e.g., Ceph)

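All nodes are autonomous. In the simplest picture the cluster consists of a single node type, and each node stores both data and the metadata describing it (in Ceph, this is the RADOS layer). Ceph uses the CRUSH algorithm to compute which nodes hold an object, so clients can locate data without asking a central coordinator.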
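The idea of computing a location instead of looking it up can be illustrated with rendezvous (highest-random-weight) hashing. This is a stand-in chosen for brevity and does not reproduce CRUSH, which additionally accounts for device weights and failure-domain hierarchy. The function names score and place are hypothetical.

```python
import hashlib


def score(node, key):
    """Deterministic pseudo-random score for a (node, key) pair."""
    digest = hashlib.sha256(f"{node}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place(key, nodes, replicas=3):
    """Return the nodes that hold `key`: the `replicas` highest scorers."""
    ranked = sorted(nodes, key=lambda n: score(n, key), reverse=True)
    return ranked[:replicas]
```

Any client that knows the node list computes the same placement, and removing a node that does not hold a given key leaves that key's placement unchanged.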

Persistence

Data is persisted via multiple replicas. Challenges include ensuring replica consistency, dispersing replicas to avoid correlated failures, detecting corrupted or stale replicas, and selecting the appropriate replica for client reads.

Synchronous writes guarantee consistency but increase latency.

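Parallel writes (the client sends to all replicas at once) and chain writes (each replica forwards the data to the next) improve write performance.

Quorum writes with W + R > N, where N is the number of replicas, W the number that must acknowledge a write, and R the number consulted on a read, lower write latency at the price of more expensive reads. Because every read set overlaps every write set, a read always reaches at least one up-to-date replica.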
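A minimal sketch of the W + R > N rule follows. It assumes a single global version counter and caller-supplied lists of reachable replicas, both simplifications made for illustration; QuorumStore is a hypothetical name.

```python
class QuorumStore:
    def __init__(self, n, w, r):
        if w + r <= n:
            raise ValueError("need W + R > N so read and write sets overlap")
        self.n, self.w, self.r = n, w, r
        self.replicas = [(0, None)] * n  # (version, value) per replica
        self.version = 0                 # assumption: one global counter

    def write(self, value, reachable):
        """Update the reachable replicas; fail if fewer than W can ack."""
        if len(reachable) < self.w:
            return False
        self.version += 1
        for i in reachable:
            self.replicas[i] = (self.version, value)
        return True

    def read(self, contacted):
        """Return the newest value among at least R contacted replicas."""
        if len(contacted) < self.r:
            raise RuntimeError("not enough replicas for a read quorum")
        newest = max((self.replicas[i] for i in contacted), key=lambda t: t[0])
        return newest[1]
```

With N = 3 and W = R = 2, a write that reaches only replicas 0 and 1 is still visible to a read that contacts replicas 1 and 2, because the two sets share replica 1.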

Scalability

1. Storage node scaling

Balance load across nodes using metrics such as disk usage, CPU, and network traffic.

Prefer nodes with lower utilization when allocating new space.

Perform data migration when nodes become overloaded.

Introduce new nodes gradually (pre‑heat) to avoid sudden load spikes.
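One way to combine "prefer less-utilized nodes" with a gradual warm-up is to inflate the apparent load of recently joined nodes. The sketch below is an assumption-laden illustration: it considers disk usage only, the linear decay and the one-hour default are arbitrary choices, and the dictionary keys are hypothetical.

```python
def effective_load(node, now, warmup_seconds=3600):
    """Disk utilisation, inflated for nodes that joined recently.

    A node that has just joined is treated as full; the penalty decays
    linearly to zero over `warmup_seconds`.
    """
    age = now - node["joined_at"]
    warm = min(max(age / warmup_seconds, 0.0), 1.0)
    usage = node["used"] / node["capacity"]
    return usage + (1.0 - warm) * (1.0 - usage)


def pick_node(nodes, now):
    """Choose the node with the lowest effective load for new data."""
    return min(nodes, key=lambda n: effective_load(n, now))["name"]
```

An empty node that joined a quarter of the warm-up period ago scores 0.75, so it only starts receiving allocations once it looks less loaded than its established peers.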

2. Central node scaling

Use larger data blocks (e.g., 64 MiB chunks in GFS and early HDFS releases; 128 MiB is the HDFS default since Hadoop 2) to reduce metadata volume.

Adopt multi‑level metadata hierarchies.

Deploy stateless master nodes sharing a common storage backend (e.g., iRODS).
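The effect of block size on metadata volume is simple arithmetic. The helper below is a back-of-the-envelope sketch; the 150 bytes per block entry is a placeholder assumption rather than a measured figure for any particular system.

```python
def metadata_bytes(total_bytes, block_bytes, per_block_bytes=150):
    """Return (block count, estimated metadata size in bytes)."""
    blocks = -(-total_bytes // block_bytes)  # ceiling division
    return blocks, blocks * per_block_bytes


PIB = 2 ** 50
large = metadata_bytes(PIB, 64 * 2 ** 20)  # 64 MiB blocks
small = metadata_bytes(PIB, 4 * 2 ** 10)   # 4 KiB blocks
```

For 1 PiB of data, 64 MiB blocks give about 16.8 million entries, while 4 KiB blocks would give about 275 billion, a factor of 16,384 more for the central node to track.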

High Availability

Both master and storage nodes require HA. Master HA can be achieved via active‑passive replication or shared storage; storage HA is inherently provided by replica mechanisms discussed in persistence.

Persist metadata in databases or log‑based storage with periodic snapshots.
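Log-based persistence with periodic snapshots can be sketched as follows. This is an in-memory illustration under stated assumptions: the log and snapshot are plain Python values standing in for durable storage, and MetadataStore and its operation format are hypothetical.

```python
import json


class MetadataStore:
    def __init__(self):
        self.state = {}       # path -> metadata dict
        self.log = []         # serialised operations since the last snapshot
        self.snapshot = "{}"  # serialised state at the last checkpoint

    def apply(self, op):
        if op["type"] == "put":
            self.state[op["path"]] = op["meta"]
        elif op["type"] == "delete":
            self.state.pop(op["path"], None)

    def commit(self, op):
        self.log.append(json.dumps(op))  # record the operation first
        self.apply(op)                   # then mutate in-memory state

    def checkpoint(self):
        """Capture the full state and truncate the log."""
        self.snapshot = json.dumps(self.state)
        self.log.clear()

    @classmethod
    def recover(cls, snapshot, log):
        """Rebuild state from a snapshot plus the operations logged after it."""
        store = cls()
        store.state = json.loads(snapshot)
        store.snapshot = snapshot
        for line in log:
            store.apply(json.loads(line))
        store.log = list(log)
        return store
```

A standby master that receives the snapshot and the log tail can reach the same state as the active one by calling recover, and the snapshot keeps replay time bounded.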

Performance Optimization & Cache Consistency

Cache file contents in memory.

Prefetch data blocks.

Batch read/write requests.

Caching introduces consistency problems such as lost updates and stale reads, which can be mitigated by treating cached files as read-only or by locking at an appropriate granularity.
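Locking granularity can be illustrated with one lock per file, so that writers to different files do not block each other while read-modify-write cycles on the same file cannot interleave. This is a single-process sketch using threads; a real distributed file system would need a distributed lock service, and LockedFiles is a hypothetical name.

```python
import threading
from collections import defaultdict


class LockedFiles:
    def __init__(self):
        self.data = defaultdict(int)                # path -> counter "contents"
        self.locks = defaultdict(threading.Lock)    # one lock per path
        self.guard = threading.Lock()               # protects the lock table

    def _lock_for(self, path):
        with self.guard:
            return self.locks[path]

    def increment(self, path):
        """Read-modify-write under the file's own lock."""
        with self._lock_for(path):
            current = self.data[path]
            self.data[path] = current + 1
```

Without the per-file lock, two writers could both read the same value and one increment would be lost.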

Security

Distributed file systems serve multiple tenants and must enforce robust access control. Common models include:

DAC (discretionary access control, e.g., Unix-style owner/group/other permissions).

MAC (mandatory access control, e.g., SELinux).

RBAC (role-based access control).

Systems like Ceph and Hadoop integrate these models, sometimes extending them with custom solutions.
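As an illustration of the DAC model, the following sketch checks Unix-style permission bits. It is a simplified assumption of how such a check works: superuser bypass, ACLs, and directory traversal are omitted, and may_access is a hypothetical helper.

```python
import stat

# Permission bits for each class of user, indexed by "r", "w", "x".
OWNER = {"r": stat.S_IRUSR, "w": stat.S_IWUSR, "x": stat.S_IXUSR}
GROUP = {"r": stat.S_IRGRP, "w": stat.S_IWGRP, "x": stat.S_IXGRP}
OTHER = {"r": stat.S_IROTH, "w": stat.S_IWOTH, "x": stat.S_IXOTH}


def may_access(mode, owner_uid, owner_gid, uid, gids, want):
    """Return True if `uid` (member of `gids`) has every right in `want`.

    `want` is a string drawn from "rwx". Exactly one class applies:
    owner first, then group, then other.
    """
    if uid == owner_uid:
        bits = OWNER
    elif owner_gid in gids:
        bits = GROUP
    else:
        bits = OTHER
    return all(mode & bits[c] for c in want)
```

For a file with mode 0o640, the owner may read and write, group members may only read, and everyone else is denied.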

Other Topics

1. Space Allocation

Two approaches: contiguous space (fast I/O but prone to fragmentation) and linked‑list space (low fragmentation but slower random reads). Index tables (i‑nodes) mitigate linked‑list drawbacks.

2. File Deletion

Logical deletion with delayed reclamation is common, allowing recovery before permanent removal.
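Logical deletion with delayed reclamation might look like the sketch below. It is an in-memory model with hypothetical names (Namespace, retention_seconds), and timestamps are passed in explicitly to keep the behavior deterministic.

```python
class Namespace:
    def __init__(self, retention_seconds):
        self.retention = retention_seconds
        self.files = {}  # path -> data
        self.trash = {}  # path -> (data, deleted_at)

    def delete(self, path, now):
        """Hide the file from the namespace but keep its data."""
        self.trash[path] = (self.files.pop(path), now)

    def restore(self, path):
        """Undo a deletion that has not been reclaimed yet."""
        data, _ = self.trash.pop(path)
        self.files[path] = data

    def reclaim(self, now):
        """Permanently drop entries older than the retention period."""
        expired = [p for p, (_, t) in self.trash.items()
                   if now - t >= self.retention]
        for p in expired:
            del self.trash[p]
        return expired
```

Running reclaim from a background task keeps the delete call itself cheap, since it only moves a metadata entry.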

3. Small‑File Distributed Systems

Store small files as metadata pointing to offsets within large data blocks, leveraging the efficiency of big‑block storage while keeping metadata lightweight.
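The offset-based layout can be sketched as a single append-only block plus an index. PackedBlock is a hypothetical name, and the sketch ignores block size limits, deletion, and compaction.

```python
class PackedBlock:
    def __init__(self):
        self.block = bytearray()  # one large block shared by many small files
        self.index = {}           # file name -> (offset, length)

    def put(self, name, data):
        """Append the file's bytes and remember where they landed."""
        self.index[name] = (len(self.block), len(data))
        self.block += data

    def get(self, name):
        offset, length = self.index[name]
        return bytes(self.block[offset:offset + length])
```

Each small file costs one index entry rather than one block entry on the central node, and reads become a single ranged read within the large block.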

4. File Fingerprinting & Deduplication

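Cryptographic hash fingerprints (MD5, SHA-256) identify identical content for deduplication and integrity checks, while similarity hashes (SimHash, MinHash) estimate how alike two pieces of content are, which supports near-duplicate detection and version comparison.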
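Exact-match deduplication can be sketched as a content-addressed store keyed by SHA-256 digest. DedupStore is a hypothetical name, and the sketch deduplicates whole files only; chunk-level deduplication and reference counting are out of scope.

```python
import hashlib


class DedupStore:
    def __init__(self):
        self.blobs = {}  # hex digest -> bytes, stored once per distinct content
        self.names = {}  # file name -> hex digest

    def put(self, name, data):
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)  # skip storage if already present
        self.names[name] = digest
        return digest

    def get(self, name):
        return self.blobs[self.names[name]]

    def verify(self, name):
        """Integrity check: recompute the digest and compare."""
        digest = self.names[name]
        return hashlib.sha256(self.blobs[digest]).hexdigest() == digest
```

Two files with the same bytes share one stored blob, and the stored digest doubles as an integrity check on read.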

Conclusion

Distributed file systems involve a wide range of considerations beyond those covered here; the article provides a concise analysis to guide future design decisions and encourages deeper exploration of specific solutions.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
