High‑Performance Computing Platforms for Translational Medicine: The ASTRA System and DAOS Storage
The article explains how the ASTRA high‑performance computing platform, built for translational medicine, integrates a three‑tier storage architecture with Intel's DAOS to overcome data‑intensive challenges, improve AI and big‑data workloads, and achieve top rankings on the IO500 benchmark.
High‑performance computing (HPC) originally served national scientific projects but has expanded into industry and many fields such as finance, weather forecasting, and life sciences, where it now underpins translational medicine research.
Science/AAAS and Intel launched the second season of the "Architects Growth Program" with a course titled "High‑Performance Computing Platforms in the Context of Translational Medicine," featuring speakers from Ruijin Hospital and Intel.
The ASTRA platform was created to support large‑scale genomic, transcriptomic, epigenomic, and pharmacokinetic analyses. It offers 4,000 CPU cores, a 10 PB parallel file system, 200 Gb/s HDR InfiniBand networking, and 15 PetaFLOPS of AI compute, using a hybrid CPU‑GPU architecture optimized for scientific workloads.
To address storage bottlenecks, ASTRA adopts a three‑layer storage hierarchy: backup storage for infrequently accessed raw data, a commercial parallel file system for completed analyses, and DAOS (Distributed Asynchronous Object Storage) for hot data.
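The three-layer split can be pictured as a small routing rule. The following is a minimal sketch, not ASTRA's actual policy: the `Dataset` fields, the `choose_tier` function, and the 90-day cutoff are all hypothetical names and values introduced for illustration, since the article does not say how "infrequently accessed" is defined.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    BACKUP = "backup storage"
    PARALLEL_FS = "commercial parallel file system"
    DAOS = "DAOS"


@dataclass
class Dataset:
    name: str
    is_raw: bool             # unprocessed source data
    analysis_complete: bool  # output of a finished analysis
    days_since_access: int


# Hypothetical cutoff; the article gives no concrete threshold.
COLD_AFTER_DAYS = 90


def choose_tier(ds: Dataset) -> Tier:
    """Map a dataset to one of the three layers described in the text."""
    # Raw data that is rarely touched goes to backup storage.
    if ds.is_raw and ds.days_since_access >= COLD_AFTER_DAYS:
        return Tier.BACKUP
    # Results of completed analyses live on the parallel file system.
    if ds.analysis_complete:
        return Tier.PARALLEL_FS
    # Everything still being worked on counts as hot data and goes to DAOS.
    return Tier.DAOS


if __name__ == "__main__":
    samples = [
        Dataset("archived-reads", is_raw=True, analysis_complete=False, days_since_access=365),
        Dataset("finished-cohort", is_raw=False, analysis_complete=True, days_since_access=10),
        Dataset("active-training-set", is_raw=False, analysis_complete=False, days_since_access=0),
    ]
    for ds in samples:
        print(f"{ds.name}: {choose_tier(ds).value}")
```

In practice such a policy would likely be driven by a scheduler or data-management tool rather than a per-dataset function, but the ordering of the checks mirrors the hierarchy: cold raw data first, finished results second, hot data last.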
DAOS, an open‑source object store from Intel, provides superior random I/O, metadata performance, and low‑latency writes, enabling ASTRA to achieve 85 GB/s throughput and rank eighth on the 2021 IO500 10‑node list.
DAOS stores metadata on Intel Optane persistent memory and large data blocks on NVMe SSDs, supports multi‑replica, erasure‑coded, and sharded object layouts, and offers billions of IOPS with high bandwidth, making it suitable for AI and big‑data workloads as well as HPC.
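The practical difference between multi-replica and erasure-coded layouts is the raw capacity each one consumes for the same fault tolerance. The sketch below works through that arithmetic. It does not call any DAOS API, the function names are hypothetical, and the 3-replica and 8+2 configurations are illustrative choices rather than ASTRA's actual settings.

```python
from fractions import Fraction


def replication_overhead(replicas: int) -> Fraction:
    """Raw bytes stored per user byte when every object is kept `replicas` times."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return Fraction(replicas, 1)


def erasure_coding_overhead(data_cells: int, parity_cells: int) -> Fraction:
    """Raw bytes stored per user byte for a data_cells + parity_cells stripe."""
    if data_cells < 1 or parity_cells < 0:
        raise ValueError("need at least one data cell and non-negative parity")
    return Fraction(data_cells + parity_cells, data_cells)


def tolerated_failures_replication(replicas: int) -> int:
    """Copies that can be lost while one intact copy remains."""
    return replicas - 1


def tolerated_failures_erasure_coding(parity_cells: int) -> int:
    """Cells per stripe that can be lost and still be reconstructed."""
    return parity_cells


if __name__ == "__main__":
    # Illustrative configurations that both survive two failures.
    print(replication_overhead(3))             # 3
    print(tolerated_failures_replication(3))   # 2
    print(erasure_coding_overhead(8, 2))       # 5/4
    print(tolerated_failures_erasure_coding(2))  # 2
```

Under these assumptions, both layouts survive two failures, but three-way replication stores 3 bytes per user byte while an 8+2 erasure code stores 1.25. Replication usually remains attractive for small or latency-sensitive objects, because erasure coding adds encoding work and needs full stripes to be efficient.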
Integrations such as DAOS‑TensorFlow enable direct data loading for AI, while Intel is developing WORM containers and a DAOS Pipeline API to move data‑intensive processing to storage.
A round‑table discussion with the three experts highlighted the challenges of combining high compute power with fast storage, the limitations of traditional file systems, and the practical benefits of DAOS for both HPC and emerging AI/big‑data applications.
Architects' Tech Alliance