Understanding Burst Buffer Technology and Its Role in HPC at NERSC
The article explains what Burst Buffer technology is, how NERSC integrates Cray DataWarp‑based flash/SSD buffers into its Cori supercomputer to boost I/O bandwidth and IOPS for scientific workloads, and describes the architecture, software stack, performance benefits, roadmap, and competing solutions from other vendors.
Burst Buffer is a storage‑class memory technology that sits between compute nodes and parallel file systems, providing a fast flash/SSD layer to absorb I/O bursts; this article introduces the concept and its relevance to high‑performance computing (HPC) by examining the U.S. National Energy Research Scientific Computing Center (NERSC).
NERSC collaborates with Cray to equip its latest Cori system with Cray DataWarp‑based Burst Buffers, using flash or SSD devices to dramatically improve I/O performance for large scientific applications.
NERSC’s mission is to accelerate DOE scientific discovery through massive computational modeling in areas such as photosynthesis, climate, combustion, magnetic fusion, astrophysics, and computational biology, all of which generate bursty I/O demands.
Burst Buffers increase the total bandwidth available to applications and raise the IOPS of the underlying file system, enabling faster checkpoint/restart, quicker small‑block transfers, temporary high‑speed storage for external applications, and large‑file staging for coupled simulations.
The architecture places the Burst Buffer layer physically between compute and storage nodes; it resides on dedicated XC40 nodes running the DataWarp software stack, with SSDs installed in each node and managed by the scheduler.
DataWarp connects flash disks to XC40 nodes via PCIe; it supports Lustre, GPFS, and PanFS parallel file systems, providing a global flash cache layer and using intelligent scheduling to prefetch data from the parallel file system.
Each Burst Buffer node contains an Intel Xeon processor, 64 GB DDR3 memory, and two 3.2 TB NAND SSD modules connected via PCIe Gen3 x8; the node links to the Cray Aries network via PCIe Gen3 x16, offering roughly 6.4 TB of capacity and a peak sequential bandwidth of about 5.7 GB/s.
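The per‑node figures above scale roughly to the system totals quoted later in the article. A quick back‑of‑the‑envelope check, assuming on the order of 288 Burst Buffer nodes (the node count is an assumption; the article gives only per‑node and system‑wide numbers):

```python
# Rough aggregate figures for the Cori Burst Buffer.
# NODES is an assumed count, not stated in the article.
NODES = 288
CAPACITY_PER_NODE_TB = 6.4    # two 3.2 TB SSD modules per node
BANDWIDTH_PER_NODE_GBS = 5.7  # peak sequential bandwidth per node

total_capacity_pb = NODES * CAPACITY_PER_NODE_TB / 1000
total_bandwidth_tbs = NODES * BANDWIDTH_PER_NODE_GBS / 1000

print(f"Aggregate capacity:  {total_capacity_pb:.2f} PB")    # ~1.8 PB
print(f"Aggregate bandwidth: {total_bandwidth_tbs:.2f} TB/s") # ~1.6 TB/s
```

The result lines up with the article's figures of about 1.8 PB of capacity and roughly 1.7 TB/s of peak I/O (the quoted peak is slightly higher than this linear scaling, as peaks often reflect best‑case measurements).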
The DataWarp software stack creates mount‑point services, LVM volumes, XFS file systems, and the DataWarp File System (DWFS) that presents a unified namespace to compute nodes, handling data movement in and out of the Burst Buffer.
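Conceptually, the per‑node layering that DataWarp automates resembles a standard LVM‑over‑SSD setup with XFS on top. The sketch below is purely illustrative (device names and the mount path are hypothetical; on a real system the DataWarp service performs these steps itself):

```shell
#!/bin/sh
# Illustrative sketch of the DataWarp per-node storage layering.
# Device names and mount path are hypothetical examples.

# Pool the node's two SSDs into one LVM volume group.
pvcreate /dev/nvme0n1 /dev/nvme1n1
vgcreate dwcache /dev/nvme0n1 /dev/nvme1n1

# Carve out a logical volume for one allocation fragment.
lvcreate -n frag0 -L 200G dwcache

# Put an XFS file system on it and mount it; DWFS then exports
# such per-node fragments as a unified namespace to compute nodes.
mkfs.xfs /dev/dwcache/frag0
mount /dev/dwcache/frag0 /mnt/dwfs/frag0
```

This is a sketch of the layering only; DWFS itself, which stitches the per‑node fragments into one namespace and moves data to and from the parallel file system, is proprietary Cray software.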
Key features include scheduler integration for allocating Burst Buffer resources, a transparent cache mode that accelerates large Lustre file systems without code changes, and on‑node data filtering/analysis capabilities.
The Burst Buffer roadmap consists of four delivery stages; the first stage was released in fall 2015 alongside Cori, with an early‑access program that gathered successful application use cases.
Through the Slurm batch system, users can request Burst Buffer allocations (size, striping, persistence) via DataWarp directives and the DataWarp API; the Cori Burst Buffer delivers roughly 1.7 TB/s of peak I/O bandwidth, 28 million IOPS, and about 1.8 PB of storage capacity.
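In practice, a Slurm job requests its Burst Buffer allocation through `#DW` directives in the batch script. A minimal sketch (paths, sizes, and the application name are illustrative, not from the article):

```shell
#!/bin/bash
# Hypothetical Slurm job script using DataWarp #DW directives.
#SBATCH --nodes=4
#SBATCH --time=01:00:00

# Request a 200 GiB striped scratch allocation for the job's lifetime.
#DW jobdw capacity=200GiB access_mode=striped type=scratch

# Stage input in from the parallel file system before the job starts,
# and stage results back out after it ends.
#DW stage_in  source=/lustre/project/input.dat  destination=$DW_JOB_STRIPED/input.dat  type=file
#DW stage_out source=$DW_JOB_STRIPED/output.dat destination=/lustre/project/output.dat type=file

# $DW_JOB_STRIPED points at the job's Burst Buffer mount on compute nodes.
srun ./my_app --in "$DW_JOB_STRIPED/input.dat" --out "$DW_JOB_STRIPED/output.dat"
```

Persistent reservations (data that outlives a single job) use a separate directive family, but the jobdw/stage pattern above covers the common checkpoint and staging cases the article describes.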
Cray’s broader HPC portfolio includes the XC40 and CS400 supercomputers, which combine Haswell CPUs, NVIDIA Tesla GPUs, and Intel Xeon accelerators, and leverage DataWarp Burst Buffers for enhanced I/O.
Other vendors such as DDN and EMC also develop Burst Buffer solutions; DDN’s IME and EMC’s Active Burst Buffer Appliance (aBBa) provide similar flash‑based acceleration, supporting multiple parallel file systems (Lustre, Isilon, PanFS, HDFS, VNX) and delivering up to 30 % overall compute performance gains.
In summary, Burst Buffer technology offers a cost‑effective way to balance storage investment and performance for bursty HPC workloads, leveraging a modest amount of SSD capacity to deliver high peak I/O while allowing the underlying parallel file system to handle baseline traffic.
Architects' Tech Alliance
Sharing project experiences and insights into cutting-edge architectures, with a focus on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, and industry practices and solutions.