Understanding FastDFS: A Lightweight Distributed File System
This article explains the motivations for using a distributed file system, then introduces the architecture and core concepts of FastDFS (tracker, storage, client, and group), its upload and download mechanisms, synchronization management, and the design of its file identifiers, giving developers a comprehensive overview.
In the previous article "A FastDFS Concurrency Issue Investigation Experience", the author described a production concurrency problem; this piece aims to give a complete introduction to FastDFS for readers unfamiliar with the software.
Why Use a Distributed File System?
Initially, projects often store static files directly in a project directory (e.g., resources\static\file or resources\static\img), which is simple but couples files to application code, leaves storage disorganized, and causes resource contention under high traffic.
Introducing an independent file server separates files from the application server, allowing load balancing, easier scaling, disaster recovery, and caching strategies.
A distributed file system further solves single‑point‑of‑failure and storage‑capacity limits by providing high availability, elastic scaling, and data redundancy across multiple nodes.
FastDFS
FastDFS is an open‑source, lightweight distributed file system designed for storing large volumes of small to medium files (4 KB – 500 MB). It offers high performance, scalability, and APIs for C, Java, and PHP.
Key Concepts
FastDFS consists of three roles:
Tracker server: a lightweight coordinator that maintains in‑memory metadata about groups and storage servers, performs load balancing, and directs client requests.
Storage server: stores files and their metadata; organized into groups (or volumes), where each group contains multiple storage nodes with replicated data.
Client: uses proprietary APIs (upload, download, delete, etc.) over TCP/IP to interact with trackers and storage nodes.
Additional concepts include the group (a collection of storage servers that replicate each other's files) and metadata (key‑value attributes attached to a file, such as width=1024, height=768).
Upload Mechanism
The client first contacts a tracker to obtain a storage server's IP and port, then uploads the file to that storage server. The storage server writes the file to disk, generates a file ID, and returns the group name and file path (which together form the file ID) to the client.
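The two-step flow above can be sketched as follows. This is a simplified simulation, not the real FastDFS client API: the function names, the returned fields, and the example addresses are all illustrative assumptions.

```python
# Hypothetical sketch of the FastDFS upload flow (illustrative names and
# addresses; the real client APIs are in C, Java, and PHP).

def query_tracker_for_storage(tracker_addrs):
    """Step 1: ask a tracker which storage server should take the upload.
    A real tracker replies with the storage server's IP, port, and the
    index of the store path to use."""
    return {"ip": "192.168.0.11", "port": 23000, "store_path_index": 0}

def upload_to_storage(storage, content: bytes, ext: str) -> str:
    """Step 2: send the file to the chosen storage server.
    The storage server, not the client, generates the file ID."""
    return "group1/M00/3F/52/example_generated_name" + ext

storage = query_tracker_for_storage(["192.168.0.10:22122"])
file_id = upload_to_storage(storage, b"hello", ".jpg")
print(file_id)
```

The key design point preserved here is that the client never chooses the file name: the storage server generates it and hands back a file ID the client must store (typically in a database) to retrieve the file later.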
Selection rules:
Tracker selection: round‑robin, specified group, or load‑balance based on free space.
Storage server selection within a group: round‑robin, IP order, or priority order.
Storage path selection: round‑robin among configured directories or the one with most free space.
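The selection policies above can be sketched in a few lines. The data structures and field names below are assumptions for illustration, not FastDFS internals:

```python
# Illustrative sketch of group selection policies: specified group,
# load balance by free space, or round robin (assumed data model).

def pick_group(groups, policy="max_free_space", specified=None):
    if policy == "specified":
        # Client configuration pins uploads to one named group.
        return next(g for g in groups if g["name"] == specified)
    if policy == "max_free_space":
        # Load balance: choose the group with the most free space.
        return max(groups, key=lambda g: g["free_mb"])
    # Round robin would keep a cursor across calls; shown here as first item.
    return groups[0]

groups = [
    {"name": "group1", "free_mb": 1200},
    {"name": "group2", "free_mb": 3400},
]
print(pick_group(groups)["name"])  # group2 (most free space)
```

The same pattern applies one level down when picking a storage server inside the group and a store path on that server.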
After choosing a storage path, the storage server creates a two‑level 256×256 subdirectory hierarchy and stores the file using a hashed file ID. The final file name combines group, storage path, subdirectories, file ID, and the original file extension.
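One simple way to map a generated file name onto such a 256×256 directory tree is to hash the name and use the first two bytes as the two directory levels. This is an illustrative hash, not the exact function FastDFS uses:

```python
import hashlib

def data_path(file_name: str) -> str:
    """Place a file into a two-level 256x256 directory tree.
    (Illustrative MD5-based mapping; FastDFS derives the path from
    its own internal ID fields.)"""
    digest = hashlib.md5(file_name.encode()).digest()
    level1, level2 = digest[0], digest[1]  # each value is in 0..255
    return f"{level1:02X}/{level2:02X}/{file_name}"

print(data_path("wKgBWGB0_example.jpg"))
```

Spreading files across 65,536 leaf directories keeps any single directory from accumulating so many entries that filesystem operations slow down.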
Download Mechanism
The client requests the tracker for the storage server’s address using the file name. The tracker parses the file name to determine the group and selects a suitable storage server based on synchronization status, preferring the original storage node or a node that has completed replication.
Synchronization Time Management
Each storage server periodically reports its latest synchronization timestamp to the tracker. The tracker uses these timestamps to decide which storage node can safely serve read requests, ensuring that the requested file has been fully replicated.
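The read-routing rule implied here can be sketched as: a replica may serve a file only if its last-reported sync timestamp is at or past the file's creation time, while the node that originally received the upload can always serve it. The model below is a simplified assumption about the tracker's bookkeeping:

```python
# Sketch of sync-aware read routing (assumed simplified model of the
# tracker's per-server sync timestamps).

def readable_servers(servers, file_create_ts, source_ip):
    """Return the IPs of storage servers safe to read the file from."""
    ok = []
    for s in servers:
        if s["ip"] == source_ip:
            # The original writer always has the file.
            ok.append(s["ip"])
        elif s["synced_until"] >= file_create_ts:
            # Replication has caught up past the file's creation time.
            ok.append(s["ip"])
    return ok

servers = [
    {"ip": "10.0.0.1", "synced_until": 1700000100},
    {"ip": "10.0.0.2", "synced_until": 1700000050},
]
# File created at 1700000080 on 10.0.0.2: both nodes qualify --
# 10.0.0.1 has synced past the create time, 10.0.0.2 is the source.
print(readable_servers(servers, 1700000080, "10.0.0.2"))
```

This is why the tracker prefers the original storage node or a node whose reported sync timestamp proves replication has completed.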
File ID (FID) Design
A FastDFS file ID encodes the group name, virtual disk path, two‑level data directories, and a generated file name that includes the source storage IP, creation timestamp, file size, a random number, and the file extension, enabling rapid location of the file on the storage server.
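Splitting a file ID back into its path components is straightforward; decoding the embedded fields (source IP, timestamp, size, random number) from the generated name is omitted here, and the sample ID below is fabricated for illustration:

```python
# Hedged sketch: splitting a FastDFS file ID into its path components.
# The trailing name also encodes source IP, creation timestamp, file
# size, and a random number; decoding those fields is not shown.

def parse_fid(fid: str) -> dict:
    group, rest = fid.split("/", 1)          # group name, then the path
    store_path, dir1, dir2, name = rest.split("/")
    return {
        "group": group,            # e.g. group1
        "store_path": store_path,  # virtual disk path, e.g. M00
        "dir1": dir1,              # first-level data directory
        "dir2": dir2,              # second-level data directory
        "file_name": name,         # generated name + original extension
    }

fid = "group1/M00/02/44/wKgBWGB0AbCdEfGh.jpg"
print(parse_fid(fid)["group"])  # group1
```

Because every component of the location is encoded in the ID itself, a storage server can resolve a download request to a file on disk without any database lookup.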
For deployment details, refer to the author's blog post on building a FastDFS cluster.