Tagged articles
19 articles
Baidu Geek Talk
Mar 23, 2026 · Databases

How Baidu’s MEG Platform Revamped ClickHouse with a Lakehouse Architecture

This article analyzes the challenges of scaling ClickHouse within Baidu’s MEG data platform and details a lakehouse solution that decouples storage and compute, integrates a meta‑service for transparent data access, optimizes query performance through caching, data roll‑up, and layout tuning, and introduces a unified query gateway that gracefully falls back to Spark for complex workloads.

ClickHouse · Data Platform · Lakehouse
0 likes · 25 min read
Big Data Technology Tribe
Jul 28, 2025 · Fundamentals

How Speculative Path Resolution Cuts Metadata Latency in InfiniFS

This article explains InfiniFS's speculative path resolution, detailing how predictable directory IDs and parallel lookups transform traditional linear RPC-based path traversal into constant‑time operations, dramatically reducing metadata access latency in large, deep directory trees.

Distributed File System · InfiniFS · metadata service
0 likes · 8 min read
ByteDance Cloud Native
Mar 13, 2025 · Backend Development

Inside DeepSeek 3FS: Architecture of a High‑Performance Parallel File System

This article dissects DeepSeek's 3FS parallel file system, detailing its four‑component architecture, high‑throughput RDMA networking, metadata handling with FoundationDB, client access methods, chain replication (CRAQ), custom FFRecord format, and recovery mechanisms, offering a deep technical perspective for storage engineers.

Distributed File System · High-performance storage · RDMA
0 likes · 22 min read
Volcano Engine Developer Services
Mar 7, 2025 · Operations

Inside 3FS: How DeepSeek’s Parallel File System Powers AI Training

This article dives deep into DeepSeek's 3FS parallel file system, detailing its four-component architecture, RDMA‑based high‑speed networking, client options, metadata and storage services, replication protocols, dynamic stripe sizing, and recovery mechanisms that enable efficient AI model training and inference.

AI training · Distributed File System · RDMA
0 likes · 21 min read
AntData
Mar 4, 2025 · Big Data

Design and Analysis of 3FS: An AI‑Optimized Distributed File System

The article provides a comprehensive English overview of 3FS, an AI‑focused distributed file system that leverages FoundationDB for metadata, CRAQ for chunk replication, and a hybrid FUSE/native client architecture, detailing its design, components, fault handling, and performance considerations for large‑scale training workloads.

AI storage · CRAQ replication · Distributed File System
0 likes · 25 min read
Didi Tech
Sep 5, 2024 · Industry Insights

How Didi Built a Multi‑Protocol, Petabyte‑Scale Storage System for AI Training

Facing petabyte‑level data, billions of small files, and the need for POSIX, S3, and HDFS compatibility, Didi designed OrangeFS, a new generation of unstructured storage, by analyzing internal systems, combining multiple storage solutions, reusing GIFT technology, and implementing a high‑performance metadata service, multi‑protocol fusion, and robust scalability features.

AI storage · Big Data · Cloud Native
0 likes · 27 min read
vivo Internet Technology
Sep 27, 2023 · Big Data

Horizontal Scaling of Hive Metastore Service at Vivo: Evaluation, TiDB Migration, and Lessons Learned

Vivo’s big‑data team evaluated two ways to scale its Hive Metastore horizontally, MySQL sharding (Waggle‑Dance) and a migration to TiDB, and ultimately adopted TiDB. After a synchronized cut‑over, the migration delivered ~15% faster queries, an 80% reduction in DDL latency, linear scaling, and low resource use; the article also shares the operational lessons learned.

Big Data · Hive Metastore · SQL
0 likes · 19 min read
DataFunTalk
Sep 17, 2023 · Cloud Native

REDck: A Cloud‑Native Real‑Time Data Warehouse Built on ClickHouse

REDck is a cloud‑native, storage‑compute separated real‑time OLAP data warehouse derived from ClickHouse that addresses scalability, operational cost, and reliability challenges through a unified metadata service, object‑storage optimizations, multi‑level caching, distributed task scheduling, and two‑phase commit transactions.

ClickHouse · Distributed Transactions · Real-time OLAP
0 likes · 18 min read
DataFunTalk
Sep 15, 2023 · Cloud Computing

Design and Architecture of Baidu CFS Large‑Scale Distributed File System and Metadata Service

The talk from DataFun Summit 2023 explains how Baidu's CFS storage builds a trillion‑file‑scale distributed file system by revisiting file system fundamentals, POSIX limitations, historical storage architectures, and introducing a lock‑free metadata service with single‑shard primitives, data‑layout optimizations, and a simplified client‑centric architecture that achieves high scalability and performance.

CFS · Distributed File System · POSIX
0 likes · 31 min read
ITPUB
Sep 11, 2023 · Cloud Native

How REDck Transforms ClickHouse into a Scalable Cloud‑Native Real‑Time Data Warehouse

Xiaohongshu built REDck, a cloud‑native, storage‑compute separated real‑time OLAP warehouse on ClickHouse, addressing scaling, cost, and reliability challenges through a unified metadata service, object‑storage optimizations, multi‑level caching, distributed task scheduling, bucketing, and exactly‑once transaction support.

ClickHouse · Distributed Transactions · Real-time OLAP
0 likes · 21 min read
Baidu Intelligent Cloud Tech Hub
Aug 29, 2023 · Cloud Computing

How Baidu CFS Scales to Billions of Files with a Lock‑Free Metadata Service

This article explains Baidu's CFS architecture for building a billion‑file‑scale distributed file system, covering basic file system concepts, POSIX limitations, metadata service modeling, performance metrics, evolution of metadata architectures, and CFS's lock‑free design that achieves high scalability, low latency, and balanced load in cloud storage.

Distributed File System · Scalability · cloud storage
0 likes · 32 min read
Baidu Geek Talk
May 29, 2023 · Backend Development

CFS: Scaling Metadata Service for Distributed File System via Pruned Scope of Critical Sections - Baidu's Implementation Journey

Baidu’s CFS metadata service scales to billions of files by shrinking critical sections through a lock‑free Namespace 2.0 design that confines conflicts to single shards, uses field‑level atomic primitives, and integrates the proxy into the client, delivering up to 76× throughput gains and significant latency reductions in production.

Baidu CFS · Distributed File System · EuroSys 2023
0 likes · 40 min read
Baidu Intelligent Cloud Tech Hub
May 25, 2023 · Cloud Native

How Baidu’s CFS Achieved Billion‑File Scale with a Lock‑Free Metadata Service

This article explains the design and evolution of Baidu Cloud File System's (CFS) metadata service, detailing how a novel lock‑free architecture and strategic data layout enable POSIX‑compatible, highly scalable storage that can handle billions of files while maintaining high performance and consistency.

Distributed File System · Scalability · cloud storage
0 likes · 42 min read
UCloud Tech
Nov 21, 2022 · Cloud Computing

How UCloud Revamped US3 Metadata Service for 80% Cost Savings and Faster Performance

UCloud’s US3 object storage metadata service, originally built on a chained MongoDB architecture, faced scalability, performance, and cost challenges, prompting a redesign that introduces a high‑compatibility DB‑Gateway, a distributed KV store (UKV) with custom RocksDB, delivering faster reads, zero list‑service latency, 80% cost reduction, and simpler operations.

Distributed KV · Performance Optimization · cloud architecture
0 likes · 8 min read
YunZhu Net Technology Team
Jan 26, 2022 · Databases

Graph Database Selection and NebulaGraph Architecture for a Knowledge‑Graph Platform

The article explains how the cloud‑construction platform evaluated graph‑database options on open‑source availability, scalability, latency, storage capacity, and import capabilities, ultimately choosing NebulaGraph. It then details NebulaGraph’s distributed meta, storage, and query services, the overall multi‑layer knowledge‑graph platform architecture, and future application scenarios.

Knowledge Graph · NebulaGraph · Query Service
0 likes · 11 min read
dbaplus Community
Aug 27, 2019 · Big Data

How eBay Scales Real‑Time Monitoring with Flink: Metadata‑Driven Streaming

This article explains how eBay’s Sherlock.IO monitoring platform processes billions of logs, events, and metrics daily using Flink Streaming jobs, detailing a metadata‑driven architecture, shared job strategies, Heartbeat‑based monitoring, job isolation, back‑pressure handling, and real‑world use cases such as Event Alerting, Eventzon, and Netmon.

Big Data · Flink · Real-time Processing
0 likes · 18 min read