Tagged articles
9 articles
Baidu Geek Talk
May 14, 2025 · Industry Insights

How RapidFS Boosts AI Model Training with 10 TiB/s Throughput

The article explains how large‑scale AI model training and inference require massive data handling, describes the RapidFS storage acceleration cluster deployed on a 30,000‑card Kunlun chip system with hundreds of domestic CPU servers, and presents performance tests showing throughput scaling linearly to more than 1 TiB/s, demonstrating the impact of high‑performance storage on compute efficiency.

AI training · High‑performance computing · Performance Testing
0 likes · 5 min read
Baidu Intelligent Cloud Tech Hub
Apr 25, 2025 · Operations

How RapidFS Accelerates AI Model Training with 10 TiB/s Storage Performance

The article explains how RapidFS, a near‑compute storage acceleration solution built on BOS object storage, delivers up to 10 TiB/s throughput for massive AI model training, detailing its architecture, deployment on a 30,000‑card Kunlun cluster, and performance test results that show linear scaling from 20 to 70 nodes.

AI training · High‑performance computing · Performance Testing
0 likes · 6 min read
DataFunSummit
Nov 16, 2024 · Big Data

Data Lake Storage Acceleration: Evolution, Challenges, and Solutions for AI and Big Data Workloads

This article surveys the evolution of data‑lake storage acceleration, compares different architectural stages, analyzes why acceleration is needed for AI and big‑data scenarios, and details the key techniques—metadata acceleration, read/write speedup, and end‑to‑end workflow optimization—used to overcome performance and cost challenges.

AI · Cloud Native · caching
0 likes · 23 min read
Baidu Geek Talk
Nov 13, 2024 · Industry Insights

Why Cloud‑Native Data Lakes Are the New Standard for Storage Acceleration

This article analyzes the evolution of data‑lake storage acceleration, compares traditional parallel file systems, object‑storage‑based solutions and modern cache‑enabled architectures, and explains how cloud‑native data lakes address scalability, cost, and performance challenges for AI and big‑data workloads.

AI · Big Data · Cloud Native
0 likes · 24 min read
Baidu Geek Talk
Nov 4, 2024 · Big Data

Why Object Storage Is Replacing HDFS for Modern Data Lakes: Baidu’s 2.0 Acceleration

Data lakes have evolved from HDFS to object storage, addressing resource inefficiency, scalability limits, and operational burdens; Baidu’s Data Lake Storage Acceleration 2.0 introduces hierarchical Namespace 2.0, a streaming storage engine, RapidFS caching, and a fully HDFS‑compatible BOS‑HDFS layer to boost performance and support massive AI workloads.

AI · Baidu · Big Data
0 likes · 12 min read
DataFunTalk
Nov 5, 2023 · Cloud Native

Cloud‑Native Storage Acceleration: Experience and Practices with CloudFS on Volcano Engine

This article presents the demands on cloud‑native storage acceleration, evaluates what constitutes a good acceleration solution, and details the design, implementation, and real‑world practice of CloudFS, including metadata acceleration, data‑plane caching, FUSE enhancements, and AI training and multi‑cloud data‑lake use cases. It also outlines the future roadmap.

AI · CloudFS · Kubernetes
0 likes · 15 min read
Baidu Intelligent Cloud Tech Hub
Oct 19, 2022 · Artificial Intelligence

Why Storage Systems Bottleneck AI Training and How to Accelerate Them

This article examines the comprehensive challenges AI applications face from storage to compute, traces the evolution of AI training infrastructure, analyzes key bottlenecks such as compute acceleration, resource scheduling, massive data handling and data flow, and presents Baidu Cloud's storage acceleration solutions—including parallel file systems, caching, and the Fluid scheduler—to dramatically improve AI training performance.

AI training · Cloud Native · Data Lake
0 likes · 38 min read