Why Traditional s3fs and goofys Fail Large‑File Backups and How US3FS Solves Them

This article examines the limitations of s3fs and goofys for large‑file backup scenarios, explains the design and implementation of the US3FS FUSE‑based file system, and presents benchmark results showing its superior performance and lower resource consumption.

UCloud Tech

Introduction

To address reliability, capacity, and cost issues in data‑backup scenarios, many users prefer object storage. However, using US3 object storage directly for backups can be inconvenient, especially for database backups that require logical or physical dumps before uploading, and for log archiving that needs custom SDK code.

Open‑Source Solutions

Projects such as s3fs and goofys map object‑storage buckets to a file system via FUSE, but both exhibit problems in our large‑file backup use case.

s3fs

s3fs mounts an S3 bucket through FUSE. Our testing showed very poor performance for large-file writes: s3fs first writes data to a local temporary cache file and then uploads it in multipart chunks, and when local disk space runs out it falls back to uploading synchronously inside the write path, as the abbreviated snippet below shows:

// Abbreviated excerpt from s3fs's FdEntity::Write.
ssize_t FdEntity::Write(const char* bytes, off_t start, size_t size) {
    int result;
    // ...
    // Not enough local disk space: switch to multipart uploading without a local cache file.
    if (0 != (result = NoCachePreMultipartPost())) {
        S3FS_PRN_ERR("failed to switch multipart uploading with no cache(errno=%d)", result);
        return static_cast<int>(result);
    }
    // Synchronously load the data written so far and upload it as multipart parts,
    // blocking this write call until the upload completes.
    if (0 != (result = NoCacheLoadAndPost(0, start))) {
        S3FS_PRN_ERR("failed to load uninitialized area and multipart uploading it(errno=%d)", result);
        return static_cast<int>(result);
    }
    // ...
}

Because our primary scenario involves large files, we abandoned s3fs.

goofys

goofys, written in Go, also mounts S3‑compatible storage but suffers from three main issues:

First, write operations lack concurrency control: each file is split into chunks and each chunk is uploaded by its own goroutine, so a large file can open many simultaneous HTTP connections and consume excessive memory.

Second, read operations easily fall back to synchronous mode, resulting in poor performance: read-ahead is issued only for sequential reads and stops entirely after three out-of-order reads, as the condition below shows:

// Read-ahead is attempted only for large sequential reads, and is skipped
// once three out-of-order reads (numOOORead) have occurred on this handle.
if !fs.flags.Cheap && fh.seqReadAmount >= uint64(READAHEAD_CHUNK) && fh.numOOORead < 3 {
    err = fh.readAhead(uint64(offset), len(buf))
}

Third, goofys uses a fixed 4 MiB multipart part size, which does not match US3's multipart requirements, and because the S3 protocol has no native rename, it implements rename by copying the object and then deleting the source.
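
Because the S3 protocol has no native rename, any client in this family has to emulate it. The sketch below shows the copy-then-delete pattern against a hypothetical minimal client interface (the names are illustrative, not goofys code); it is why moving a large file costs a full server-side copy plus a delete.

package s3rename

import "context"

// ObjectStore is a hypothetical, minimal client for an S3-compatible API.
type ObjectStore interface {
    CopyObject(ctx context.Context, srcKey, dstKey string) error
    DeleteObject(ctx context.Context, key string) error
}

// Rename emulates mv on object storage: the data is copied server-side to
// the new key and the old key is then deleted. For a large object this is
// far more expensive than a metadata-only rename would be.
func Rename(ctx context.Context, s ObjectStore, srcKey, dstKey string) error {
    if err := s.CopyObject(ctx, srcKey, dstKey); err != nil {
        return err
    }
    return s.DeleteObject(ctx, srcKey)
}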

US3FS Design Overview

US3FS is a custom FUSE-based file system that maps a US3 bucket to a POSIX-style file tree. It implements the usual VFS concepts in user space via FUSE, which avoids kernel-level development complexity at the cost of extra user-kernel transitions on each request.

VFS and FUSE Basics

The Virtual File System (VFS) provides a uniform interface for user‑space applications, translating calls to underlying file‑system implementations. Dentries cache name‑to‑inode mappings, while inodes store metadata such as uid, gid, size, and mtime. FUSE moves file‑system logic to user space, communicating with the kernel via /dev/fuse.

Figure: FUSE architecture
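
To make the user-space side concrete, here is a minimal FUSE file system in Go built on the bazil.org/fuse library. The library choice, type names, and hard-coded attributes are assumptions for illustration; the article does not say which FUSE binding US3FS uses. The kernel forwards VFS calls through /dev/fuse to this process, which serves them.

package main

import (
    "context"
    "log"
    "os"

    "bazil.org/fuse"
    "bazil.org/fuse/fs"
)

// FS is the user-space file system; the kernel forwards VFS requests to it
// through /dev/fuse.
type FS struct{}

func (FS) Root() (fs.Node, error) { return Dir{}, nil }

// Dir is the (empty) root directory node.
type Dir struct{}

func (Dir) Attr(ctx context.Context, a *fuse.Attr) error {
    a.Inode = 1
    a.Mode = os.ModeDir | 0o755
    return nil
}

func main() {
    // Mount at ./mnt and serve kernel requests until unmounted.
    c, err := fuse.Mount("./mnt", fuse.FSName("us3fs-demo"))
    if err != nil {
        log.Fatal(err)
    }
    defer c.Close()

    if err := fs.Serve(c, FS{}); err != nil {
        log.Fatal(err)
    }
}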

Metadata Design

US3FS maintains an in-memory directory tree of the bucket's objects and caches metadata with a short TTL, so repeated lookups within the TTL avoid extra HTTP requests. Inodes keep only the essential fields (uid, gid, size, mtime), and metadata is persisted using US3's native object-metadata feature.

Figure: Metadata mapping
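
As a rough illustration of this design, the sketch below shows an in-memory inode carrying only the fields named above plus a TTL check that decides when cached metadata must be refreshed from US3; the type names and the TTL value are hypothetical.

package us3fs

import "time"

// metaTTL bounds how long cached metadata is trusted before a fresh
// request to US3 is issued (illustrative value).
const metaTTL = 5 * time.Second

// Inode holds only the essential per-object metadata US3FS keeps in memory.
type Inode struct {
    Name     string
    UID, GID uint32
    Size     uint64
    Mtime    time.Time

    fetchedAt time.Time         // when this metadata was last fetched from US3
    children  map[string]*Inode // directory entries, for directory nodes
}

// Fresh reports whether the cached metadata is still within its TTL and can
// be served without an extra HTTP round trip.
func (in *Inode) Fresh(now time.Time) bool {
    return now.Sub(in.fetchedAt) < metaTTL
}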

IO Flow

For writes, US3FS caches data locally and uploads it in 4 MiB chunks using a token‑bucket limiter to control concurrency. The final chunk is uploaded when the file is closed.
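
A hedged sketch of that write path follows, with the token-bucket limiter modeled as a counting semaphore: the locally cached data is cut into 4 MiB parts and uploaded concurrently, but never with more parts in flight than the bucket allows. The concurrency limit and the uploadPart callback are illustrative, not US3FS's actual interfaces.

package us3fs

import "sync"

const partSize = 4 << 20 // 4 MiB parts

// tokens acts as the limiter: each in-flight part upload holds one token,
// so at most cap(tokens) uploads run at the same time (illustrative limit).
var tokens = make(chan struct{}, 8)

// uploadParts splits locally cached data into 4 MiB parts and uploads them
// concurrently, bounded by the token bucket.
func uploadParts(data []byte, uploadPart func(partNum int, part []byte) error) error {
    var wg sync.WaitGroup
    errs := make(chan error, 1)

    for i, off := 0, 0; off < len(data); i, off = i+1, off+partSize {
        end := off + partSize
        if end > len(data) {
            end = len(data)
        }
        part := data[off:end]

        tokens <- struct{}{} // acquire a token; blocks when the bucket is empty
        wg.Add(1)
        go func(n int, p []byte) {
            defer wg.Done()
            defer func() { <-tokens }() // release the token
            if err := uploadPart(n, p); err != nil {
                select {
                case errs <- err:
                default:
                }
            }
        }(i, part)
    }

    wg.Wait()
    select {
    case err := <-errs:
        return err
    default:
        return nil
    }
}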

For reads, US3FS implements a pre‑read algorithm that expands the read window exponentially up to a configured threshold, improving sequential read performance for large files. The kernel page cache further accelerates repeated reads.

Figure: Read prefetch algorithm
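
The sketch below captures the window-growth idea: consecutive sequential reads double the prefetch window up to a configured ceiling, while an out-of-order read resets it to the minimum. The constants and names are illustrative, not the actual US3FS parameters.

package us3fs

const (
    minReadAhead = 4 << 20   // initial prefetch window (illustrative)
    maxReadAhead = 128 << 20 // ceiling for the window (illustrative)
)

// readAheadState tracks sequential-read progress for one open file.
type readAheadState struct {
    nextOffset int64 // offset where the next sequential read is expected
    window     int64 // current prefetch window size
}

// advise returns how many bytes to prefetch starting at offset: sequential
// reads grow the window exponentially up to maxReadAhead, while a random
// read shrinks it back to the minimum.
func (r *readAheadState) advise(offset int64) int64 {
    switch {
    case r.window == 0 || offset != r.nextOffset:
        // First read or out-of-order read: start from the smallest window.
        r.window = minReadAhead
    case r.window < maxReadAhead:
        // Sequential read: double the window, capped at the ceiling.
        r.window *= 2
        if r.window > maxReadAhead {
            r.window = maxReadAhead
        }
    }
    r.nextOffset = offset + r.window
    return r.window
}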

Data Consistency

Because object storage does not make multipart data visible until the upload is completed, data written through US3FS cannot be read back until the file is closed and the final multipart-completion request has been sent.
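
In code terms, the last buffered chunk is flushed and the multipart upload is completed only when the file is closed; until that completion call returns, readers see nothing. The sketch below uses hypothetical field and helper names for the US3 multipart calls.

package us3fs

// FileHandle represents one file being written through US3FS. The function
// fields stand in for the real US3 multipart API calls.
type FileHandle struct {
    buf                     []byte // locally cached data not yet uploaded
    nextPartNum             int
    uploadPart              func(partNum int, part []byte) error
    completeMultipartUpload func() error
}

// Close flushes the final buffered chunk and completes the multipart upload.
// Only after the completion call succeeds does the object become visible,
// and therefore readable, in US3.
func (f *FileHandle) Close() error {
    if len(f.buf) > 0 {
        if err := f.uploadPart(f.nextPartNum, f.buf); err != nil {
            return err
        }
        f.buf = nil
    }
    return f.completeMultipartUpload()
}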

Benchmark Comparison

The benchmark ran a 64-thread, 4 MiB I/O workload against a 40 GiB file for sequential write and read. US3FS showed stable memory usage of roughly 305 MiB versus roughly 3.3 GiB for goofys, and write performance comparable to s3fs, whose throughput drops when local disk space is limited.

In sequential read tests, goofys performed poorly, while US3FS achieved orders‑of‑magnitude higher throughput. Moving a 1 GiB file demonstrated that US3FS can improve performance by hundreds of times in large‑file scenarios.

Figure: Benchmark results
Figure: File move benchmark

Conclusion

s3fs and goofys each have strengths and weaknesses for large‑file workloads, but the internally developed US3FS delivers superior read and write performance, tighter integration with US3, and easier extensibility.
