Tagged articles
43 articles
360 Zhihui Cloud Developer
Apr 7, 2026 · Cloud Native

Boost OpenStack Storage Efficiency with Ceph RBD Erasure Coding

This article explains how to integrate Ceph's erasure‑coded RBD pools with OpenStack, covering the design principles, storage pool layout, performance trade‑offs, and step‑by‑step configuration for Nova and Cinder to achieve higher storage utilization while maintaining high availability.

Ceph · Hybrid Architecture · OpenStack
0 likes · 13 min read
MaGe Linux Operations
Feb 28, 2026 · Cloud Computing

Deploying MinIO: A Complete Guide to Private S3‑Compatible Object Storage

This guide explains why traditional block and file storage struggle with massive unstructured data, introduces MinIO as a high‑performance, Go‑based S3‑compatible object storage, and provides step‑by‑step instructions for single‑node and erasure‑coded multi‑node deployments, TLS setup, client usage, policies, monitoring, backup, and troubleshooting.

Backup · Kubernetes · Minio
0 likes · 35 min read
360 Zhihui Cloud Developer
Jun 6, 2025 · Fundamentals

How Erasure Coding Cuts Storage Costs in Ozone: A Deep Dive

This article explains how Erasure Coding (EC) improves data reliability and dramatically reduces storage overhead in Ozone by leveraging hot‑cold data characteristics, intelligent tiering, dynamic EC ratios, and repair throttling, while also discussing performance trade‑offs and limitations.

Data Reliability · Ozone · Storage Optimization
0 likes · 9 min read
DataFunSummit
Jan 16, 2025 · Big Data

Zhihu Big Data Cost‑Reduction Practices: FinOps, Erasure Coding, ZSTD Compression, Spark Auto‑Tuning, and Remote Shuffle Service

This article details Zhihu's comprehensive cost‑reduction and efficiency‑boosting initiatives for its big‑data platform, covering FinOps‑driven financial operations, hybrid‑cloud architecture, cost allocation models, operational monitoring, and technical optimizations such as erasure coding, ZSTD compression, Spark auto‑tuning, and a remote shuffle service.

Big Data · Cloud Cost Management · Cost Optimization
0 likes · 22 min read
Baidu Tech Salon
Nov 8, 2024 · Cloud Computing

Design and Evolution of Baidu Canghai Storage Unified Technology Stack

Baidu Canghai Storage’s unified technology stack—comprising a meta‑aware distributed metadata layer, a hybrid single‑node‑distributed namespace, and an online erasure‑coding data layer—delivers AI‑driven, high‑performance, low‑cost, ZB‑scale cloud storage by modularizing metadata, namespace, and data services for object, file, and block workloads.

Baidu · Distributed Systems · Microservices
0 likes · 16 min read
Baidu Geek Talk
Nov 6, 2024 · Cloud Computing

Baidu Canghai Storage Unified Technology Base: Architecture and Evolution of Metadata, Namespace, and Data Layers

Baidu’s Canghai Storage unifies metadata, hierarchical namespace, and data layers into a Meta‑Aware, three‑generation architecture that scales to trillions of metadata items and zettabyte‑scale data, using a distributed transactional KV store, single‑machine‑distributed namespace, and online erasure‑coding micro‑services to deliver high performance, low cost, and seamless scalability.

Big Data · Distributed Systems · NewSQL
0 likes · 18 min read
Baidu Intelligent Cloud Tech Hub
Nov 4, 2024 · Cloud Computing

How Baidu’s Unified Storage Platform Tackles AI‑Era Data Challenges

This article details Baidu’s unified storage architecture—covering its metadata, hierarchical namespace, and data layers—explaining how meta‑aware design, custom partitioning, flexible engines, and micro‑service based erasure coding together meet the scalability, performance, and cost demands of modern AI‑driven cloud storage workloads.

Microservices · cloud storage · erasure coding
0 likes · 17 min read
DataFunTalk
Aug 30, 2023 · Big Data

Design and Implementation of Baidu Cloud Block Storage EC System for Large‑Scale Data

This article presents Baidu Cloud's block storage architecture, comparing replication and erasure‑coding fault‑tolerance methods, detailing the challenges of applying EC to mutable block data, and describing a two‑layer append‑engine solution with selective 3‑replica caching, cost‑benefit compaction, and performance optimizations for low‑cost, high‑throughput storage.

Big Data · append engine · block storage
0 likes · 14 min read
Baidu Intelligent Cloud Tech Hub
Aug 22, 2023 · Fundamentals

How Baidu’s CDS Uses Erasure Coding to Cut Storage Costs and I/O Amplification

This article explains Baidu Intelligent Cloud's block storage (CDS) architecture, comparing fault‑tolerance methods, detailing the challenges of large‑scale erasure‑coded storage, and describing Baidu's two‑layer append‑engine solution that reduces I/O amplification while keeping costs low.

I/O amplification · Storage Optimization · append engine
0 likes · 15 min read
vivo Internet Technology
Jun 7, 2023 · Big Data

Erasure Coding Technology in the Evolution of Vivo Storage Systems

Combining academic advances and industry practice, the article surveys erasure‑coding techniques, then details Vivo’s optimized storage stack—enhancing Reed‑Solomon with bit‑matrix scheduling, parallel cross‑AZ repair, LRC and MSR layers, and intermediate‑result optimization—to achieve high reliability while minimizing bandwidth and storage overhead.

Regenerating Codes · Reliability · data redundancy
0 likes · 48 min read
Bilibili Tech
Mar 14, 2023 · Big Data

Bilibili HDFS Erasure Coding Strategy and Implementation

Bilibili reduced petabyte‑scale storage costs by back‑porting erasure‑coding patches to its HDFS 2.8.4 cluster, deploying a parallel EC‑enabled cluster, adding a data‑proxy service, intelligent routing and block‑checking, and automating cold‑data migration, while noting write overhead and planning native acceleration.

Big Data · Data Reliability · Distributed Systems
0 likes · 14 min read
ITPUB
Dec 25, 2022 · Cloud Native

How to Build a Scalable Distributed File System with MinIO

This guide explains the fundamentals of distributed file systems, compares them with traditional storage, introduces MinIO’s architecture and features, and provides step‑by‑step instructions for deploying a multi‑node MinIO cluster with Nginx load balancing on Linux.

Deployment · Linux · Minio
0 likes · 16 min read
Baidu Intelligent Cloud Tech Hub
Nov 28, 2022 · Cloud Computing

How Baidu’s ARIES Powers Exabyte-Scale Cloud Storage for Baidu Netdisk

This article presents a comprehensive overview of Baidu’s ARIES storage platform, detailing its design philosophy, architecture, key concepts, and engineering challenges, and explains how it underpins Baidu Netdisk’s massive data‑plane storage with high availability, cost‑performance trade‑offs, and robust monitoring.

Distributed Systems · Resource Management · cloud storage
0 likes · 36 min read
ITPUB
Oct 20, 2022 · Big Data

Will HDFS Be Replaced? Analyzing Its Drawbacks and Future Alternatives

The article examines why Hadoop's Distributed File System may become obsolete by detailing its three main shortcomings—deployment complexity, metadata memory limits, and high replication overhead—and explores how newer architectures and erasure coding could address these issues.

Big Data · Distributed File System · HDFS
0 likes · 8 min read
DataFunTalk
Jul 4, 2022 · Big Data

Apache Ozone: Architecture, Advantages, and New Features Overcoming HDFS Limitations

This article explains the shortcomings of HDFS at large scale, describes the Federation and Scaling approaches, and details how Apache Ozone redesigns metadata storage, introduces container abstraction, object semantics, and new features such as optimized OM, streaming writes, erasure coding, and RocksDB consolidation to improve scalability and performance.

Apache Ozone · HDFS · RocksDB
0 likes · 11 min read
Bilibili Tech
May 20, 2022 · Backend Development

Design and Implementation of Bilibili Object Storage Service (BOSS): Architecture, Topology, Metadata, Erasure Coding, and Scaling

The article chronicles Bilibili’s 13‑day development of BOSS, a custom object storage service, detailing how it replaces MySQL‑based routing and ID generation with replicated etcd or Raft KV stores, models metadata via protobuf, adopts erasure coding and a Bitcask‑style engine, and implements safe delete, replica repair, and horizontal scaling for a resilient large‑scale system.

Distributed Systems · erasure coding · metadata design
0 likes · 28 min read
Kuaishou Big Data
Oct 28, 2021 · Big Data

How Kuaishou Cut Object Storage Costs by 50% with LRC Erasure Coding

Kuaishou halved its massive object storage expenses by redesigning its architecture to use HBase indexing, HDFS large‑file storage, MemoryCache, and a cross‑IDC LRC erasure‑coding warm layer that maintains disaster recovery while dynamically moving data from hot to warm to cold tiers.

Big Data · Kuaishou · LRC
0 likes · 12 min read
Big Data Technology & Architecture
Oct 8, 2021 · Big Data

Hadoop HDFS Storage Optimization, Erasure Coding, Heterogeneous Storage, and Cluster Tuning Guide

This article provides a comprehensive guide to optimizing Hadoop HDFS storage through erasure coding and heterogeneous storage policies, explains fault‑tolerance techniques such as safe mode and slow‑disk monitoring, and shares practical MapReduce performance tuning and enterprise‑level configuration examples for large‑scale clusters.

Cluster Tuning · HDFS · Hadoop
0 likes · 32 min read
vivo Internet Technology
Jul 28, 2021 · Industry Insights

How to Quantify Data Reliability in Distributed Storage Systems

This article analyzes the quantitative model for data reliability in distributed storage, covering factors such as disk count, replication factor, recovery time, annualized failure rate, and copyset configuration, and derives formulas to estimate yearly data loss probability for both replica and erasure‑coding schemes.

AFR · Data Reliability · copyset
0 likes · 16 min read
58 Tech
May 28, 2021 · Big Data

Practical Upgrade Experience of Hadoop 3.2.1 in 58.com Data Platform: HDFS, YARN, and MR3

This article details the end‑to‑end upgrade of a 5000‑node Hadoop 2.6.0 cluster to Hadoop 3.2.1 at 58.com, covering HDFS migration, RBF and EC adoption, Yarn federation and rolling upgrades, MR3 integration, extensive compatibility testing, and operational lessons learned for large‑scale big‑data platforms.

Big Data · Cluster Upgrade · HDFS
0 likes · 19 min read
Big Data Technology Architecture
Mar 25, 2021 · Big Data

Implementing Erasure Coding in HDFS: Migration, Testing, and Data Lifecycle Management at JD

This article details JD's end‑to‑end implementation of HDFS erasure coding, covering the migration from replication to EC, the three‑phase upgrade and rollback process, comprehensive automated testing, a custom data‑lifecycle management system for hot‑warm‑cold data, and multi‑layer integrity safeguards to achieve significant storage cost reduction while maintaining reliability.

Data Lifecycle · HDFS · Storage Optimization
0 likes · 17 min read
JD Tech
Mar 20, 2021 · Big Data

Implementing Erasure Coding in HDFS: Migration Strategy, Testing Framework, and Data Lifecycle Management

This article details JD's practical experience migrating HDFS to erasure coding, covering the decision between upgrade and porting, the step‑by‑step upgrade and rollback procedures, automated testing, a custom data‑lifecycle management system for hot‑warm‑cold data, and comprehensive data‑integrity safeguards to achieve significant storage cost reductions while maintaining production reliability.

Cluster Upgrade · Data Lifecycle Management · HDFS
0 likes · 17 min read
dbaplus Community
Mar 17, 2021 · Big Data

How We Cut PBs of Waste and Optimized HDFS with Tiered Storage and Cloud Migration

This article details a three‑part technical talk that covers cost governance for offline Hadoop clusters, a large‑scale data‑center migration with architecture upgrades, and a tiered storage strategy using EC and COS to reduce storage costs and improve performance in a cloud‑native big‑data environment.

Big Data Migration · COS · Cloud Native
0 likes · 10 min read
Open Source Linux
Jan 27, 2021 · Fundamentals

Unlocking Ceph: A Deep Dive into Distributed Storage Architecture and Features

This article provides a comprehensive overview of Red Hat Ceph’s distributed object‑storage architecture, covering storage pools, authentication, placement groups, the CRUSH algorithm, replication, erasure coding, internal operations, high‑availability mechanisms, client interfaces, and encryption, illustrated with diagrams and practical details.

CRUSH · Ceph · Replication
0 likes · 40 min read
Architects' Tech Alliance
Jan 25, 2021 · Fundamentals

Ceph Storage Architecture Overview and Detailed Technical Features

This article provides a comprehensive technical overview of Red Hat Ceph, covering its distributed object storage design, cluster architecture, storage pools, authentication, placement groups, CRUSH algorithm, I/O operations, replication, erasure coding, internal management tasks, high availability, client interfaces, data striping, and encryption mechanisms.

CRUSH · Ceph · Data Striping
0 likes · 42 min read
Didi Tech
Jan 22, 2021 · Big Data

Erasure Coding Practice in HDFS at Didi: Principles, Implementation, and Lessons Learned

Didi migrated HDFS to Hadoop 3.2 and implemented erasure coding—using XOR and Reed‑Solomon RS(6,3) striping—to replace three‑replica storage for cold data, building back‑ported clients, automated conversion tools, and cross‑datacenter backup pipelines, while addressing operational bugs and noting performance trade‑offs.

Big Data · Didi · HDFS
0 likes · 11 min read
Architects' Tech Alliance
Nov 9, 2020 · Cloud Computing

Ceph Storage Architecture: Overview, Cluster Design, Client Interfaces, and Encryption

This article provides a comprehensive technical overview of Red Hat Ceph, covering its distributed storage architecture, cluster components, storage pool types, authentication, placement algorithms, I/O paths, replication and erasure‑coding strategies, internal management operations, high‑availability mechanisms, client libraries, data striping, and encryption details.

CRUSH · Ceph · Data Striping
0 likes · 39 min read
Didi Tech
Jan 5, 2020 · Big Data

Rolling Upgrade of HDFS from 2.7 to 3.2: Experience, Issues and Solutions

The team performed a rolling upgrade of HDFS from 2.7 to 3.2 on large clusters. It resolved EditLog, Fsimage, StringTable, and authentication incompatibilities by omitting EC data, using fallback images, rolling back commits, and first upgrading to the latest 2.x release. The upgrade followed a staged JournalNode‑NameNode‑DataNode procedure, was validated with rehearsals and a custom trash‑management tool, and achieved uninterrupted service with improved stability, performance, and cost efficiency.

Big Data · Cluster Migration · HDFS
0 likes · 11 min read
Architects' Tech Alliance
Sep 27, 2019 · Cloud Native

MinIO Object Storage System: Architecture, Design Principles, Features, and Performance

This article provides a comprehensive technical overview of MinIO, an open‑source, S3‑compatible object storage system, covering its design philosophy, data organization, distributed architecture, erasure‑coding, lock management, lambda notifications, backup strategies, performance optimizations, and a comparative analysis with Ceph, highlighting its suitability for AI, big‑data, and cloud‑native deployments.

Cloud Native · Distributed Systems · Minio
0 likes · 22 min read
58 Tech
May 21, 2019 · Backend Development

Design and Architecture of WOS: 58 Group's Self‑Developed Object Storage System

This article presents the architecture and key design features of WOS, the 58 Group’s self‑developed object storage system, covering its overall framework, proxy, store, directory, and detector modules, its instant‑upload mechanism, and its erasure‑coding solution for efficient, scalable, and reliable unstructured data storage.

Backend Architecture · WOS · distributed storage
0 likes · 12 min read
Architects' Tech Alliance
May 15, 2019 · Fundamentals

Red Hat Ceph Storage Architecture Guide – Overview and Core Concepts

This article provides a comprehensive overview of Red Hat Ceph's distributed object storage architecture, covering storage pools, CRUSH placement, authentication, I/O workflows, internal operations, client interfaces, data striping, erasure coding, high availability, and encryption mechanisms for secure, scalable deployments.

CRUSH · Ceph · distributed storage
0 likes · 40 min read
Architects' Tech Alliance
Sep 5, 2018 · Fundamentals

Red Hat Ceph Storage Architecture Overview and Key Components

This article provides a comprehensive English translation of the Red Hat Ceph Storage Architecture Guide, covering Ceph's distributed object storage concepts, cluster architecture, storage pools, CRUSH algorithm, replication and erasure‑coding I/O, internal operations, high‑availability mechanisms, client interfaces, and encryption considerations for cloud environments.

CRUSH · Ceph · Replication
0 likes · 40 min read
dbaplus Community
Oct 25, 2017 · Big Data

Optimizing HDFS Storage with Heterogeneous Media, Erasure Coding, and Smart Storage Management

This article explains the challenges of growing data volumes, small files, and hot‑cold data in Hadoop HDFS, then details heterogeneous storage options, erasure‑coding techniques, and the open‑source SSM (Smart Storage Management) system that automates tiered storage based on data access patterns.

Data Tiering · Heterogeneous Storage · Smart Storage Management
0 likes · 14 min read
Meituan Technology Team
May 19, 2017 · Databases

Speculative Partial Writes in Erasure-Coded Storage Systems (PBS Design and Evaluation)

This article translates a USENIX'17 study on speculative partial writes for erasure‑coded storage. It introduces PBS (Parity‑Based Speculation), which logs parity changes to turn costly random writes into sequential appends, achieving write IOPS and latency comparable to triple‑replication while maintaining fast recovery and high EC encoding throughput.

Partial Writes · Performance Evaluation · Speculative Writes
0 likes · 6 min read
Tencent Architect
Apr 12, 2017 · Databases

Tencent File System (TFS): Architecture, 3D Indexing, High‑Performance Key‑Value Store, and Storage Engines

The article details Tencent File System (TFS), describing its platform components, 3D indexing techniques, high‑performance key‑value storage (TSSD) with MHT, dual‑read and smooth scaling mechanisms, hybrid index storage, host‑level FTL, Append‑Only and erasure‑coding storage engines, and how these innovations deliver scalable, low‑cost, high‑performance data storage for massive workloads.

Key-Value · SSD · TFS
0 likes · 12 min read
Art of Distributed System Architecture Design
May 30, 2015 · Industry Insights

How Facebook’s Cold Storage Cuts Power Use by 75% While Scaling Performance

Facebook’s Cold Storage system redesigns hardware and software to store rarely accessed data with up to 75% lower power consumption, using modular 2U racks holding 30 disks, Reed‑Solomon erasure coding for cheap redundancy, and a self‑healing “anti‑entropy” process that improves performance as the system scales.

Data center · Facebook · cold storage
0 likes · 12 min read
MaGe Linux Operations
Jul 16, 2014 · Cloud Computing

Why Modern Cloud Storage Is Getting So Complex—and How Qiniu Solved It

From the evolution of single‑machine file systems to today’s distributed, erasure‑coded cloud storage, this article examines why storage has become increasingly complex, the limitations of traditional replication, and how Qiniu’s next‑gen architecture leverages EC, faster repairs, and cost reductions to meet scalability, reliability, and availability demands.

Reliability · Scalability · cloud storage
0 likes · 15 min read