
High‑Performance Computing Platforms for Translational Medicine: The ASTRA System and DAOS Storage

The article explains how the ASTRA high‑performance computing platform, built for translational medicine, integrates a three‑tier storage architecture with Intel's DAOS to overcome data‑intensive challenges, improve AI and big‑data workloads, and achieve top rankings on the IO500 benchmark.

Architects' Tech Alliance

High‑performance computing (HPC) originally served national scientific projects. It has since spread to industry and to fields such as finance, weather forecasting, and the life sciences, where it now underpins translational medicine research.

Science/AAAS and Intel launched the second season of the "Architects Growth Program" with a course titled "High‑Performance Computing Platforms in the Context of Translational Medicine," featuring speakers from Ruijin Hospital and Intel.

The ASTRA platform was built to support large‑scale genomic, transcriptomic, epigenomic, and pharmacokinetic analyses. It offers 4,000 CPU cores, a 10 PB parallel file system, 200 Gb/s HDR InfiniBand networking, and 15 PFLOPS of AI compute, on a hybrid CPU‑GPU architecture optimized for scientific workloads.

To address storage bottlenecks, ASTRA adopts a three‑layer storage hierarchy: backup storage for infrequently accessed raw data, a commercial parallel file system for completed analyses, and DAOS (Distributed Asynchronous Object Storage) for hot data.
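The three‑layer hierarchy can be read as a simple placement rule. The following is a minimal sketch, not ASTRA's actual policy: the `Dataset` fields, the `choose_tier` function, and the 90‑day threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    DAOS = "daos"              # hot data under active analysis
    PARALLEL_FS = "pfs"        # completed analyses
    BACKUP = "backup"          # infrequently accessed raw data


@dataclass
class Dataset:
    name: str
    is_raw: bool               # raw instrument output, not derived results
    analysis_complete: bool    # True once the analysis has finished
    days_since_access: int


# Hypothetical threshold; the article does not say when data counts as cold.
COLD_AFTER_DAYS = 90


def choose_tier(ds: Dataset, cold_after_days: int = COLD_AFTER_DAYS) -> Tier:
    """Map a dataset to one of the three storage layers described above."""
    if ds.is_raw and ds.days_since_access >= cold_after_days:
        return Tier.BACKUP
    if ds.analysis_complete:
        return Tier.PARALLEL_FS
    return Tier.DAOS


if __name__ == "__main__":
    samples = [
        Dataset("run_a_fastq", is_raw=True, analysis_complete=False, days_since_access=200),
        Dataset("cohort_variants", is_raw=False, analysis_complete=True, days_since_access=3),
        Dataset("training_shards", is_raw=False, analysis_complete=False, days_since_access=0),
    ]
    for ds in samples:
        print(ds.name, "->", choose_tier(ds).value)
```

A real deployment would more likely drive placement from scheduler or access‑log data than from a single age threshold. The sketch only makes the three roles explicit.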

DAOS, an open‑source object store from Intel, delivers strong random I/O and metadata performance along with low‑latency writes. With it, ASTRA reached 85 GB/s of throughput and ranked eighth on the 2021 IO500 ten‑node list.
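For context on what an IO500 ranking measures: the benchmark combines bandwidth tests and metadata tests into one score, computed as the geometric mean of a bandwidth score (GiB/s) and a metadata score (kIOPS), each itself a geometric mean over its tests. The sketch below shows that arithmetic only. The function name and the sample figures are hypothetical and are not ASTRA's submitted results.

```python
from math import sqrt
from statistics import geometric_mean


def io500_score(bandwidth_gib_s, metadata_kiops):
    """Combine per-test results into a single IO500-style score.

    bandwidth_gib_s: bandwidth test results in GiB/s
    metadata_kiops:  metadata test results in kIOPS
    """
    bw_score = geometric_mean(bandwidth_gib_s)
    md_score = geometric_mean(metadata_kiops)
    return sqrt(bw_score * md_score)


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    print(round(io500_score([4.0, 9.0], [16.0, 25.0]), 3))
```

Because the score is a geometric mean, a system cannot rank well on bandwidth alone. Weak metadata performance pulls the total down, which is why the metadata strengths of DAOS matter for this list.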

DAOS stores metadata on Intel Optane persistent memory and large data blocks on NVMe SSDs. It supports replicated, erasure‑coded, and sharded object layouts and offers billions of IOPS with high bandwidth, which makes it suitable for AI and big‑data workloads as well as HPC.
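The trade‑off between replicated and erasure‑coded layouts comes down to simple arithmetic: how much raw capacity each usable byte costs, and how many device failures the layout survives. The sketch below works through that comparison. The helper names and the 3‑replica and 8+2 examples are illustrative assumptions, not DAOS defaults or ASTRA's configuration.

```python
from fractions import Fraction


def replication_overhead(replicas: int) -> Fraction:
    """Raw bytes stored per usable byte with N full copies."""
    if replicas < 1:
        raise ValueError("replicas must be >= 1")
    return Fraction(replicas, 1)


def erasure_coding_overhead(data_cells: int, parity_cells: int) -> Fraction:
    """Raw bytes stored per usable byte with a k+p erasure code."""
    if data_cells < 1 or parity_cells < 0:
        raise ValueError("need data_cells >= 1 and parity_cells >= 0")
    return Fraction(data_cells + parity_cells, data_cells)


def usable_capacity(raw_tb: float, overhead: Fraction) -> float:
    """Usable capacity left after paying the redundancy overhead."""
    return raw_tb / float(overhead)


if __name__ == "__main__":
    raw_tb = 100.0  # hypothetical raw capacity
    rep3 = replication_overhead(3)          # survives 2 lost copies
    ec8p2 = erasure_coding_overhead(8, 2)   # survives 2 lost cells
    print("3 replicas:", float(rep3), "x raw,", usable_capacity(raw_tb, rep3), "TB usable")
    print("EC 8+2:    ", float(ec8p2), "x raw,", usable_capacity(raw_tb, ec8p2), "TB usable")
```

Both example layouts tolerate two failures, but the erasure‑coded one stores 1.25 raw bytes per usable byte instead of 3. Replication in turn avoids the encode and reconstruct cost, which tends to matter more for small, latency‑sensitive writes.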

Integrations such as DAOS‑TensorFlow let AI jobs load data directly from DAOS, and Intel is developing WORM (write‑once, read‑many) containers and a DAOS Pipeline API that moves data‑intensive processing into the storage layer.

A round‑table discussion with the three experts highlighted the challenges of combining high compute power with fast storage, the limitations of traditional file systems, and the practical benefits of DAOS for both HPC and emerging AI/big‑data applications.

