
HadaFS: A Scalable Burst Buffer File System for Exascale Supercomputers

This article summarizes HadaFS, a burst-buffer file system that combines the scalability and performance of local burst buffers with the data-sharing and cost advantages of shared burst buffers. It covers the system's Localized Triage Architecture (LTA) and metadata handling, and its evaluation on the SNS supercomputer against BeeGFS and the machine's native global file system (GFS).


This paper, presented at FAST 2023 by researchers from the National Supercomputing Center in Wuxi, Tsinghua University, Shandong University, and the Chinese Academy of Engineering, reports on HadaFS, a new burst‑buffer file system designed for high‑performance computing (HPC) environments.

Background: HPC workloads demand ever‑increasing I/O bandwidth, leading to the adoption of burst buffers (BB) deployed either locally on compute nodes or as shared resources. Local BBs offer scalability and performance but suffer from poor data sharing and high deployment cost, while shared BBs provide data sharing and lower cost but struggle with scalability.

Motivation: Existing BB solutions face challenges in scaling to exascale I/O concurrency, flexible consistency semantics, and dynamic data migration. HadaFS aims to unify the benefits of local and shared BBs to meet the needs of future supercomputers.

Design and Implementation: HadaFS introduces a Localized Triage Architecture (LTA) where each client connects to a single bridge server that forwards I/O requests, providing local‑BB‑like performance while enabling global sharing through a fully connected server mesh. It uses full‑path indexing for metadata, storing file metadata in key‑value stores (LMDB for local metadata, GMDB for global metadata) backed by RocksDB. The system includes a data‑management tool, Hadash, for metadata queries and data migration.
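The full-path indexing described above can be sketched in a few lines: instead of walking a directory tree component by component, each file's metadata is stored under its full path as a flat key, and the path is hashed to pick the server that owns the record. The following Python sketch is illustrative only; the server count, hash choice, and record layout are assumptions, not HadaFS's actual implementation (which keeps metadata in RocksDB-backed databases).

```python
import hashlib

NUM_SERVERS = 4  # illustrative bridge-server count

def server_for_path(path: str) -> int:
    """Map a full file path to the server that owns its metadata."""
    digest = hashlib.md5(path.encode()).hexdigest()
    return int(digest, 16) % NUM_SERVERS

# Each server keeps a flat key-value table: full path -> metadata record.
metadata_store = [dict() for _ in range(NUM_SERVERS)]

def create(path: str, meta: dict) -> None:
    """Register a new file's metadata on its owning server."""
    metadata_store[server_for_path(path)][path] = meta

def stat(path: str) -> dict:
    """A single key lookup replaces a component-by-component directory walk."""
    return metadata_store[server_for_path(path)][path]

create("/job42/output/rank0.dat", {"size": 0, "mode": 0o644})
print(stat("/job42/output/rank0.dat"))
```

The key point is that a `stat` or `open` resolves in one key-value lookup regardless of path depth, which is what makes this scheme attractive at exascale metadata rates.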

Metadata Synchronization: Three modes are offered—mode1 (asynchronous, eventual consistency), mode2 (mixed synchronous/asynchronous), and mode3 (synchronous for all operations)—allowing applications to balance consistency requirements against performance.
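The three modes can be thought of as a per-application routing policy for metadata updates: push to the global store synchronously, asynchronously, or a mix depending on the operation. The sketch below is a simplified model under assumed names (`push_sync`, `push_async`, the `"create"` operation split are illustrative, not the paper's API).

```python
from enum import Enum
from queue import Queue

class SyncMode(Enum):
    MODE1 = 1  # all global-metadata updates asynchronous (eventual consistency)
    MODE2 = 2  # critical ops (here assumed: create) synchronous, the rest async
    MODE3 = 3  # every update synchronous

async_queue = Queue()   # updates applied later by a background process
global_metadata = {}    # globally visible metadata store

def push_sync(key, value):
    global_metadata[key] = value  # visible to all clients immediately

def push_async(key, value):
    async_queue.put((key, value))  # deferred; other clients may see stale data

def update_metadata(mode: SyncMode, op: str, key, value):
    """Route a metadata update according to the chosen synchronization mode."""
    if mode is SyncMode.MODE3 or (mode is SyncMode.MODE2 and op == "create"):
        push_sync(key, value)
    else:
        push_async(key, value)
```

An application that only shares files after a job phase completes could run in mode1 for speed, while one that coordinates through the file system mid-run would pay for mode3's stronger consistency.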

Performance Evaluation: Deployed on the SNS supercomputer (over 100,000 compute nodes, 600,000 HadaFS clients), HadaFS was benchmarked against BeeGFS and the native GFS (Lustre/LWFS). Metadata benchmarks (MDTest) showed HadaFS outperforming both competitors, especially at large scales. I/O bandwidth tests (IOR) demonstrated that HadaFS achieved near the SSDs' theoretical read limits and significantly higher write throughput than BeeGFS and GFS. In data-migration tests, Hadash outperformed DataWarp, achieving up to 140 GB/s for stage-out operations and handling millions of small files efficiently.

Conclusion: HadaFS delivers a scalable, high‑performance burst‑buffer solution with flexible consistency, integrated data‑management, and proven capability to serve hundreds of applications on a supercomputer with up to 600,000 clients and an aggregate I/O bandwidth of 3.1 TB/s.

Tags: metadata, file system, performance evaluation, scalable storage, HPC, burst buffer, SNS supercomputer
Written by

Architects' Tech Alliance

Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.
