Exploring Popular Distributed File Systems: From GFS to FastDFS

This article surveys common distributed file systems such as GFS, HDFS, Lustre, Ceph, GridFS, MogileFS, TFS, and FastDFS, explaining their origins, key characteristics, typical use cases, and practical considerations for large‑scale storage.

ITFLY8 Architecture Home

Common distributed file systems include GFS, HDFS, Lustre, Ceph, GridFS, MogileFS, TFS, FastDFS, and others, each suited to different domains.

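Most of them are application‑level distributed storage services, accessed through their own client libraries or APIs, rather than system‑level file systems. A few, such as Lustre, Ceph, and MooseFS, can also be mounted like an ordinary POSIX file system.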

Many of these systems trace their design back to Google’s published research papers on the Google File System, MapReduce, Bigtable, and Chubby.

Key systems:

GFS (Google File System) : Proprietary, built for Google’s internal needs; the source code has never been released.

HDFS : Hadoop Distributed File System, an open‑source implementation inspired by GFS; part of the Hadoop ecosystem.

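Ceph : Originated at UC Santa Cruz; written in C++, offers a FUSE client alongside a Linux kernel client, and is designed to avoid single points of failure. Early releases recommended Btrfs as the backing file system and were not considered production‑ready; current releases store data with the BlueStore backend and are widely deployed in production.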

Lustre : Parallel cluster file system for high‑performance computing, created by Cluster File Systems and later acquired by Sun; designed for large‑scale clusters (10,000+ nodes, petabyte storage).

FastDFS : Open‑source lightweight DFS written in C, suitable for storing large numbers of small files such as images or videos.

TFS (Taobao File System) : Highly scalable, highly available DFS used by Taobao for massive small‑file storage, built on Linux clusters with an HA architecture.

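GridFS : Built‑in file storage layer of MongoDB, intended for files larger than the 16 MB document size limit; splits each file into chunks (255 KB by default in current drivers) and uses two collections, one for the chunk data and one for the file metadata.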

MogileFS : Perl‑based DFS from Danga Interactive, used by sites like Yupoo; consists of trackers (mogilefsd), a metadata database, and storage nodes (mogstored), with client APIs for Perl, PHP, and other languages.

MooseFS : FUSE‑based lightweight DFS written in C; it relies on a single master server for metadata, which makes the master a potential single point of failure.
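
To make the GridFS entry above more concrete, here is a minimal in‑memory sketch of the chunking idea: one "collection" holds a metadata document per file, the other holds numbered chunk documents. It does not talk to MongoDB. The class name ToyGridFS and its methods are hypothetical, and the 255 KiB default chunk size is an assumption taken from the default used by current MongoDB drivers; the field names (filename, length, chunkSize, files_id, n, data) follow the GridFS document layout.

```python
from dataclasses import dataclass, field

# Assumption: mirrors the default chunk size of current MongoDB drivers.
CHUNK_SIZE = 255 * 1024


@dataclass
class ToyGridFS:
    """In-memory stand-in for the two GridFS collections (hypothetical class)."""

    files: dict = field(default_factory=dict)   # file_id -> metadata document
    chunks: list = field(default_factory=list)  # chunk documents
    _next_id: int = 0

    def put(self, filename: str, data: bytes, chunk_size: int = CHUNK_SIZE) -> int:
        """Store a file: one metadata document plus N numbered chunks."""
        file_id = self._next_id
        self._next_id += 1
        # Metadata goes into the "files" collection.
        self.files[file_id] = {
            "filename": filename,
            "length": len(data),
            "chunkSize": chunk_size,
        }
        # The payload is cut into fixed-size pieces; the last may be shorter.
        for n, start in enumerate(range(0, len(data), chunk_size)):
            self.chunks.append(
                {"files_id": file_id, "n": n, "data": data[start:start + chunk_size]}
            )
        return file_id

    def get(self, file_id: int) -> bytes:
        """Reassemble a file by concatenating its chunks in order of n."""
        parts = sorted(
            (c for c in self.chunks if c["files_id"] == file_id),
            key=lambda c: c["n"],
        )
        return b"".join(c["data"] for c in parts)
```

A short usage example: storing a 768,000‑byte payload with the default chunk size yields three chunk documents, and reading it back returns the original bytes.

```python
fs = ToyGridFS()
payload = bytes(range(256)) * 3000
file_id = fs.put("photo.jpg", payload)
assert fs.get(file_id) == payload
```

Keeping metadata separate from chunk data means a file listing only has to read the small metadata documents, while a partial read only has to fetch the chunks that cover the requested byte range.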

Additional resources and download links are available on various forums and code repositories.

Written by

ITFLY8 Architecture Home
