Tagged articles
265 articles
ByteDance SE Lab
Apr 17, 2026 · Industry Insights

How DisCoGC Cuts Storage Costs by 20%: A Deep Dive into ByteStore’s New GC Paradigm

This article analyzes the DisCoGC algorithm introduced by ByteDance, explaining how its discard‑centric garbage collection eliminates the write‑amplification vs. space‑amplification trade‑off in log‑structured storage, details the engineering challenges of multi‑layer deployment, and presents production results showing up to 20% TCO reduction without impacting latency.

Cost reduction · Garbage Collection · Performance Optimization
0 likes · 19 min read
Deepin Linux
Mar 6, 2026 · Backend Development

Unlocking Ultra‑Low Latency: How RDMA Transforms High‑Performance Networking

This article explains the fundamentals of Remote Direct Memory Access (RDMA), its low‑latency, zero‑copy and kernel‑bypass mechanisms, programming interfaces, and real‑world applications in data‑center networks, high‑performance computing, and distributed storage, providing developers with practical guidance and code examples.

High‑performance computing · Low latency · Network programming
0 likes · 31 min read
macrozheng
Mar 3, 2026 · Backend Development

Explore Free FS: A Modern Spring Boot File Management System with Full Stack Demo

Free FS is an enterprise‑grade, Spring Boot‑based file management system that offers high‑performance storage, plug‑in architecture, and real‑time progress, with detailed backend and frontend installation guides, multi‑format preview, sharing, and support for various storage backends, all available via its public Git repository.

Backend Development · Spring Boot · distributed storage
0 likes · 6 min read
Baidu Geek Talk
Feb 9, 2026 · Databases

How Mantle Redefined Cloud Object Storage Metadata for Billion‑File Scale

This article recounts how Baidu's storage team tackled the performance and scalability limits of traditional object storage by redesigning metadata handling with the Mantle and MantleX architectures, introducing a centralized IndexNode, strong consistency, delta‑record writes, and a seamless single‑node to distributed transition for massive file systems.

Filesystem · Performance Optimization · Scalability
0 likes · 37 min read
Raymond Ops
Jan 19, 2026 · Operations

Master Ceph: Complete Guide to Deploying and Managing a Production-Ready Distributed Storage Cluster

This comprehensive guide explains why Ceph is a leading software‑defined storage solution, details hardware and network design, walks through step‑by‑step deployment with cephadm, covers pool creation, monitoring, performance tuning, troubleshooting, scaling, backup, security hardening, and advanced automation for production environments.

Ceph · Cluster Deployment · Linux
0 likes · 15 min read
Alibaba Cloud Developer
Dec 29, 2025 · Artificial Intelligence

How Alibaba’s Tair KVCache Manager Revolutionizes Enterprise‑Level LLM Cache Management

This article details the architecture and implementation of Tair KVCache Manager, an enterprise‑grade service that centralises KVCache metadata, decouples inference engines from storage, provides elastic scaling, multi‑tenant isolation, high availability, and performance‑optimised cache management for large‑scale LLM inference workloads.

Cache Management · KVCache · LLM
0 likes · 28 min read
Java Companion
Dec 27, 2025 · Cloud Native

Is Minio Turning Paid? 5 Free Distributed Storage Alternatives You Should Consider

The article explains Minio's recent licensing shift to AGPLv3, why it matters for SaaS and proprietary software vendors, and presents five open‑source distributed storage systems—SeaweedFS, Garage, Ceph, GlusterFS, and OpenStack Swift—detailing their licenses, deployment complexity, performance characteristics, and suitable use cases.

Ceph · GlusterFS · Minio
0 likes · 21 min read
Alibaba Cloud Developer
Dec 17, 2025 · Cloud Native

How 3FS Powers High‑Performance KVCache for AI Inference: Architecture, Optimizations, and Cloud‑Native Deployment

This article details the design and engineering of the 3FS distributed file system as a scalable KVCache backend for large‑language‑model inference, covering its architecture, performance tuning, reliability fixes, integration with SGLang/vLLM, and cloud‑native Kubernetes operator deployment.

3FS · AI inference · Cloud Native
0 likes · 30 min read
Raymond Ops
Dec 7, 2025 · Operations

Ceph Uncovered: Architecture, Deployment, and Ops Best Practices

Ceph is an open‑source distributed storage platform offering object, block, and file services with high availability, scalability, and self‑management; the guide explains its core components, CRUSH algorithm, storage interfaces, deployment steps using ceph‑deploy, operational monitoring, performance tuning, and common use cases in cloud and big‑data environments.

Big Data · Ceph · Deployment
0 likes · 11 min read
Ray's Galactic Tech
Nov 30, 2025 · Cloud Native

Mastering etcd: The Core of Kubernetes State Management and High‑Availability

etcd is the distributed, strongly consistent key‑value store that serves as Kubernetes' single source of truth, handling all cluster state data; this guide explains its architecture, data model, watch mechanism, high‑availability deployment, backup, monitoring, security, and operational best practices for reliable cluster management.

Kubernetes · distributed storage · etcd
0 likes · 8 min read
Instant Consumer Technology Team
Nov 4, 2025 · Backend Development

How a Concurrent Append‑Only Architecture Doubles Storage Performance

This article examines the design and implementation of a proprietary concurrent Append‑Only distributed object storage system, detailing its unified persistence layer, heavyweight client, hardware optimizations, metadata simplification, flexible redundancy, high availability, and real‑world performance gains across big‑data, AI, and log‑archiving workloads.

append-only · distributed storage · high performance
0 likes · 20 min read
Tech Freedom Circle
Oct 23, 2025 · Databases

Why Consistent Hashing Fails: Why Redis, HBase, TiDB and Ceph Have Dropped It

The article examines the fundamental limitations of consistent hashing—its inability to preserve data locality, support range queries, and handle topology awareness—explaining why major storage systems such as Redis Cluster, TiDB, Ceph, and HBase have adopted alternative sharding strategies like hash slots, range partitioning, and CRUSH.

CRUSH · Ceph · HBase
0 likes · 45 min read
Mike Chen's Internet Architecture
Aug 29, 2025 · Fundamentals

Understanding Distributed Storage: HDFS, CephFS, GlusterFS, and FastDFS Compared

This article compares four major distributed storage solutions—HDFS, CephFS, GlusterFS, and FastDFS—detailing their architectures, strengths, weaknesses, and ideal use cases for big‑data processing, cloud-native environments, and high‑concurrency file services, and how they fit into modern infrastructure strategies.

Big Data · CephFS · FastDFS
0 likes · 5 min read
MaGe Linux Operations
Aug 27, 2025 · Operations

Master Ceph on Linux: Complete Guide to Deploying and Managing a Production-Ready Cluster

This comprehensive guide walks you through the fundamentals of Ceph, hardware recommendations, network design, step‑by‑step deployment with cephadm, storage pool configuration, performance tuning, troubleshooting, scaling, backup, security hardening, and automation scripts for production‑grade Linux clusters.

Ceph · Cluster Deployment · distributed storage
0 likes · 16 min read
macrozheng
Jul 31, 2025 · Cloud Computing

Unlock RustFS: High‑Performance Distributed Storage with Docker & SpringBoot

This guide introduces RustFS, a high‑performance Rust‑based distributed object storage system, covering its key features, Docker installation, console usage, and step‑by‑step integration with SpringBoot for file upload and deletion, including code snippets and configuration details.

Docker · RustFS · S3 Compatibility
0 likes · 11 min read
JD Cloud Developers
Jul 16, 2025 · Databases

How JD Ads Cut Storage Costs 87% with Apache Doris Hot‑Cold Tiering

This article details JD Advertising's journey from a 1 PB Apache Doris data lake to a multi‑level hot‑cold tiering architecture, describing two tiering strategies, the performance and schema‑change challenges faced during the upgrade to Doris 2.0, and the optimizations that reduced storage costs by about 87% while boosting query throughput.

Apache Doris · Schema Change · cold data
0 likes · 19 min read
MaGe Linux Operations
Jul 11, 2025 · Fundamentals

Mastering Ceph: A Deep Dive into Distributed Storage Architecture and Operations

This article provides a comprehensive overview of the open‑source Ceph distributed storage system, covering its core features, architecture components, data placement algorithms, storage interfaces, deployment best practices, operational management, and real‑world use cases for cloud, big data, and backup scenarios.

Ceph · Data Management · cloud computing
0 likes · 11 min read
Su San Talks Tech
Jun 28, 2025 · Backend Development

Essential Microservice Architecture Components: From Nginx to Distributed Storage

This article outlines the key building blocks of a microservice architecture—including Nginx as the traffic entry, Spring Cloud Gateway, service registries, Redis caching, MySQL persistence, Elasticsearch, message queues, ELK logging, distributed schedulers, and object storage—explaining their roles, deployment patterns, and common technology choices.

Backend Architecture · NGINX · distributed cache
0 likes · 10 min read
Ops Community
Jun 21, 2025 · Operations

Master Ceph: The Ultimate Distributed Storage Operations Handbook

This guide introduces Ceph as a leading open‑source distributed storage solution, explains why enterprises choose it for scalable data platforms, and provides a comprehensive operations manual covering common tasks, troubleshooting, and advanced management to help storage engineers efficiently run Ceph clusters.

Ceph · Storage Management · distributed storage
0 likes · 3 min read
dbaplus Community
Jun 8, 2025 · Databases

How NeighborHash Boosts Real‑Time Recommendation Queries with Low Latency

To meet the ultra‑low latency demands of modern recommendation systems, the authors designed a distributed batch‑query architecture featuring the NeighborHash optimization—a cache‑line‑aware hash table that reduces memory accesses, combined with NVMe‑backed storage and AMAC techniques, achieving high throughput and near‑optimal bandwidth utilization.

NVMe · batch query · distributed storage
0 likes · 19 min read
Linux Ops Smart Journey
Jun 4, 2025 · Cloud Native

Deploy Longhorn on Kubernetes with Helm: Step‑by‑Step Guide

This article provides a comprehensive, hands‑on tutorial for deploying the open‑source Longhorn distributed block storage system on a Kubernetes cluster using Helm, covering prerequisites, Helm chart preparation, installation, validation, and PVC mounting to ensure reliable stateful workloads.

Kubernetes · Longhorn · Persistent Volumes
0 likes · 11 min read
MaGe Linux Operations
Jun 2, 2025 · Operations

How to Deploy a High‑Availability MinIO Distributed Cluster on Rocky 9

This guide walks you through deploying a highly available MinIO distributed object storage cluster on Rocky 9, covering prerequisites, environment preparation, user and directory setup, configuration files, systemd service creation, testing, Nginx load balancing, and verification of cluster health.

Minio · distributed storage · high availability
0 likes · 20 min read
Liangxu Linux
May 22, 2025 · Cloud Computing

Master Ceph: Step‑by‑Step Guide to Deploy a Scalable Distributed Storage Cluster

Learn how to design, configure, and deploy a Ceph distributed storage cluster using ceph‑deploy, covering storage fundamentals, Ceph architecture, component roles, network planning, OS preparation, mon, mgr, osd setup, and dashboard activation, with detailed commands and best‑practice recommendations for production environments.

Ceph · Dashboard · distributed storage
0 likes · 28 min read
Architects' Tech Alliance
Apr 26, 2025 · Industry Insights

Why Distributed Storage Is the Next Backbone of the Digital Economy

This article analyzes the evolution of distributed storage—from traditional compute‑storage separation to edge‑centric, AI‑enabled architectures—covering service models, key technologies such as CXL and erasure coding, reliability strategies, performance optimizations, vendor landscapes, and emerging green and intelligent trends.

AI storage · CXL · NVMe-oF
0 likes · 13 min read
Linux Cloud Computing Practice
Apr 10, 2025 · Cloud Computing

Unlock Scalable, Reliable Storage: A Complete Guide to Deploying Ceph

This article provides a comprehensive overview of Ceph distributed storage, covering storage fundamentals, Ceph architecture, advantages, version lifecycle, and step‑by‑step deployment using ceph‑deploy, including environment preparation, monitor and OSD setup, manager configuration, and dashboard activation.

Ceph · Cluster Management · Dashboard
0 likes · 28 min read
AntData
Mar 14, 2025 · Fundamentals

Analysis of DeepSeek 3FS Storage Service Architecture and Design

This article provides an in‑depth technical analysis of DeepSeek's open‑source 3FS distributed file system, focusing on the StorageService architecture, space pooling, allocation mechanisms, reference counting, fragmentation handling, and the RDMA‑based read/write data path.

RDMA · Zero Copy · allocation
0 likes · 15 min read
Linux Kernel Journey
Feb 27, 2025 · Cloud Native

Designing FUSE: From Kernel VFS to Userspace and JuiceFS Performance

This article explains the evolution of file system architecture from kernel‑level VFS to userspace via FUSE, reviews the historical role of NFS, details JuiceFS's implementation on top of FUSE, and presents benchmark results that demonstrate its high throughput and practical limitations.

FUSE · JuiceFS · Linux kernel
0 likes · 15 min read
Deepin Linux
Feb 23, 2025 · Cloud Computing

Understanding Ceph Distributed Storage Architecture and Its Core Components

Ceph is a unified, open‑source distributed storage system whose layered architecture—comprising RADOS, LIBRADOS, and upper‑level services like RADOSGW, RBD, and CephFS—provides high performance, reliability, scalability, and flexible data access for cloud, big‑data, and AI workloads.

Big Data · Ceph · architecture
0 likes · 25 min read
Bilibili Tech
Jan 17, 2025 · Backend Development

NeighborHash: An Enhanced Batch Query Architecture for Real‑time Recommendation Systems

NeighborHash is a distributed batch‑query architecture for real‑time recommendation systems that combines a cache‑line‑optimized hash table—featuring Lodger Relocation, bidirectional cache‑aware probing, and inline‑chaining—with an NVMe‑backed key‑value service, versioned updates, and asynchronous memory‑access chaining to achieve sub‑microsecond, high‑throughput top‑N retrieval.

AMAC · NVMe · Performance Optimization
0 likes · 20 min read
JD Tech
Dec 26, 2024 · Databases

Optimizing Query Performance for JD's BIP Procurement System with JED, JimKV, and Elasticsearch

This article details how JD's BIP procurement system tackled massive query‑performance challenges by segmenting order data, leveraging the JED distributed MySQL solution, introducing JimKV for hot‑data caching, and offloading complex searches to Elasticsearch, resulting in dramatically reduced load and faster user experiences.

Big Data · Database Optimization · Elasticsearch
0 likes · 11 min read
JD Retail Technology
Dec 12, 2024 · Databases

Optimizing Query Performance for JD's BIP Procurement System with JED, JimKV, and Elasticsearch

This article presents a comprehensive case study of how JD's procurement system (BIP) tackled massive data volume and complex query challenges by redesigning data models, introducing heterogeneous storage for inbound orders, leveraging JED and JimKV, and offloading complex searches to Elasticsearch, resulting in dramatically reduced database load and improved user experience.

Database Optimization · Elasticsearch · JD supply chain
0 likes · 11 min read
Open Source Tech Hub
Oct 31, 2024 · Big Data

How Bilibili Scaled Its Search Index with Distributed KV Storage and Spark

Bilibili transformed its search indexing pipeline by replacing a manual, low‑throughput process with a distributed KV store (Taishan) and Spark‑based construction, achieving unified data ingestion, reduced resource consumption, faster full‑ and incremental builds, and a shift from daily to hourly indexing cycles.

Big Data · KV Store · Protobuf
0 likes · 25 min read
dbaplus Community
Sep 4, 2024 · Big Data

How Ctrip Scaled Its Data Platform to Multi‑IDC Architecture with Spark 3, Kyuubi, and Celeborn

This article details how Ctrip’s data platform evolved from a single‑IDC design to a multi‑IDC, tiered storage and scheduling architecture, covering the challenges of rapid data growth, the migration to Spark 3 via Kyuubi, the introduction of Celeborn shuffle service, and the resulting performance and reliability gains.

Big Data · HDFS · Kyuubi
0 likes · 23 min read
DevOps Operations Practice
Aug 15, 2024 · Cloud Native

Five Best Open-Source Kubernetes Storage Solutions

This article reviews five leading open‑source storage solutions for Kubernetes (OpenEBS, Rook, GlusterFS, Ceph, and Longhorn), detailing their architectures, key features, and ideal use cases to help readers select the most appropriate storage option for their application requirements.

Kubernetes · distributed storage
0 likes · 6 min read
Bilibili Tech
Aug 13, 2024 · Big Data

How Bilibili Re‑engineered Its Search Indexing with Distributed Storage and Spark

This article details Bilibili's transformation of its search offline indexing pipeline, moving from manual MySQL‑based processes to a high‑capacity, distributed KV store and Spark‑driven builds, addressing performance, maintenance, and scalability challenges while improving resource efficiency and iteration speed.

Big Data · Bilibili · KV Store
0 likes · 24 min read
Architects' Tech Alliance
Jun 11, 2024 · Industry Insights

Why Traditional Distributed Storage Struggles and How New Compute‑Storage Separation Can Transform Cloud Data Centers

The article analyzes the limitations of current server‑based distributed storage—such as data‑lifecycle mismatches, performance‑resource trade‑offs, serverless workload demands, and the costly "datacenter tax"—and presents emerging hardware trends and a novel compute‑storage separation architecture that promises higher efficiency, reliability, and scalability for cloud and internet data centers.

CXL · Compute-Storage Separation · DPU
0 likes · 13 min read
DataFunTalk
May 27, 2024 · Big Data

JD Retail’s Unified HDFS Storage: Cross‑Region and Hierarchical Storage Practices

This article details JD Retail’s large‑scale HDFS deployment, describing how cross‑region storage challenges were solved with a full‑copy topology, asynchronous block replication, flow‑control mechanisms, and a tiered storage strategy that automatically moves hot, warm, and cold data among SSD, HDD, and high‑density HDD nodes to improve performance and cut costs.

Big Data · Data Management · HDFS
0 likes · 20 min read
DataFunTalk
May 21, 2024 · Big Data

Applying Alluxio to Autonomous Driving Model Training: Deployment, Performance, and Operational Insights

This article details how Alluxio was adopted to replace NAS in autonomous driving model training, describing the data closed‑loop workflow, the challenges of the previous system, Alluxio's architectural benefits, deployment strategies across single and multiple data centers, functional and performance testing, operational tuning, and the resulting cost and efficiency gains.

Alluxio · Model Training · Performance Optimization
0 likes · 15 min read
Ops Development Stories
Apr 12, 2024 · Cloud Native

Mastering etcd: Architecture, Monitoring & Performance Tuning

This article provides a comprehensive overview of etcd—including its origins, role in Kubernetes, version evolution, layered architecture, key terminology, operational commands, monitoring metrics, benchmarking procedures, disk‑performance testing, and tuning recommendations—for building reliable cloud‑native clusters.

Benchmark · cloud-native · distributed storage
0 likes · 17 min read
Sohu Tech Products
Mar 13, 2024 · Databases

DingoDB Multi-Modal Vector Database: Design Philosophy, Architecture and Applications

DingoDB is a multi‑modal vector database that unifies storage and analysis of structured, semi‑structured and unstructured data through a Raft‑based distributed architecture, offering MySQL‑compatible SQL, high‑performance APIs, automatic sharding, real‑time index optimization, and hybrid scalar‑vector queries for enterprise knowledge bases, LLM memory, and real‑time decision‑making.

Data Architecture · DingoDB · LLM applications
0 likes · 11 min read
Volcano Engine Developer Services
Feb 22, 2024 · Cloud Native

How BMQ’s Cloud‑Native Compute‑Storage Separation Revolutionizes Message Queues

This article explains how ByteDance’s BMQ, a cloud‑native message engine with a compute‑storage separated architecture, overcomes Kafka’s scalability and operational limits by using Proxy, Broker, Coordinator, and Controller modules, a distributed storage model, and advanced caching to achieve rapid scaling, high throughput, and resilient operations.

Cloud Native · Message Queue · Operations
0 likes · 15 min read
DataFunSummit
Feb 6, 2024 · Big Data

Exploring ByteDance's EB‑Scale HDFS: Architecture, Multi‑Datacenter Challenges, Tiered Storage, and Data Protection Practices

This article presents an in‑depth overview of ByteDance's EB‑scale HDFS, covering its new features, multi‑datacenter architecture, tiered storage implementation, data management services, capacity and fault‑tolerance strategies, as well as practical data‑protection mechanisms and related Q&A.

Big Data · Data Protection · HDFS
0 likes · 22 min read
dbaplus Community
Jan 28, 2024 · Databases

How ByteGraph 3.0 Redefines Scalable Graph Database Architecture

This article presents a comprehensive technical overview of ByteGraph, covering its evolution from 2.0 to 3.0, core graph modeling capabilities, Gremlin query interface, architectural layers, performance bottlenecks, cost‑reduction strategies, and future roadmap for large‑scale graph data services.

ByteGraph · Gremlin · architecture
0 likes · 20 min read

Exploring Container Federation: Multi‑Cluster Management with FOOT V3.5

This article examines the challenges of managing multiple Kubernetes clusters, outlines key business pain points, reviews open‑source federation solutions, and details the FOOT V3.5 platform’s architecture—including hub‑cluster design, push/pull registration, application policies, APISIX gateway integration, and Ceph‑based distributed storage—while also looking ahead to AI, edge, and security trends.

APISIX · FOOT platform · Kubernetes
0 likes · 18 min read
Su San Talks Tech
Oct 29, 2023 · Operations

What Are the Best Distributed File Storage Systems and How to Choose One?

This article introduces the concept of distributed storage, outlines its key advantages, reviews major distributed file systems such as GFS, HDFS, Ceph, Lustre, TFS, FastDFS, and GridFS, explains POSIX basics, and provides practical criteria for selecting the most suitable system for different workloads.

Ceph · HDFS · Selection Guide
0 likes · 12 min read
php Courses
Sep 28, 2023 · Backend Development

Implementing Distributed Data Storage and Retrieval with PHP Microservices

This article explains the challenges of traditional single-node data storage, introduces microservice architecture, and provides step-by-step PHP Swoole code examples for creating storage and retrieval microservices and a client script, demonstrating how to achieve scalable, fault‑tolerant distributed data storage and retrieval.

Backend Development · Microservices · PHP
0 likes · 5 min read
Baidu Intelligent Cloud Tech Hub
Sep 4, 2023 · Big Data

How Baidu’s Aries Cloud Storage Leverages Tape Libraries for Massive Cold Data Archiving

This article explains Baidu Intelligent Cloud’s tape‑library based cold‑data storage architecture, covering tape media basics, the Aries cloud storage system, its modular design, data flow, write and retrieval processes, and a real‑world deployment case that demonstrates cost‑effective petabyte‑scale archival.

aries · cloud storage · cold data
0 likes · 31 min read
DataFunSummit
Jul 6, 2023 · Big Data

Design and Practice of Alibaba Cloud's Billion‑Scale Real‑Time Log Analysis

This article presents Alibaba Cloud's SLS billion‑scale real‑time log analysis architecture, covering business background, core challenges such as low‑latency queries, massive data scale, high concurrency, and multi‑tenant isolation, and detailing key design solutions like LSM‑based storage, index‑columnar storage, data locality, layered caching, and future directions.

Big Data · distributed storage · high concurrency
0 likes · 17 min read
Architect
Jul 5, 2023 · Backend Development

Evolution of Bilibili's Relationship Chain Service: From MySQL to KV Storage, Multi‑Layer Caching, and Hotspot Resilience

The article details how Bilibili's relationship‑chain service scaled from a MySQL‑based design to a distributed KV store, introduced successive caching layers (memcached, Redis hash, Redis KV, bloom filter), and implemented hotspot mitigation techniques to sustain near‑million QPS traffic while ensuring high availability and data consistency.

Backend Architecture · distributed storage · mysql
0 likes · 16 min read
Open Source Linux
Jun 19, 2023 · Fundamentals

Why Dual‑RAID Beats Triple‑Replication in Distributed Storage

This article compares triple‑replication and dual‑RAID architectures for distributed storage, outlining the performance, reliability and operational drawbacks of triple‑replication and demonstrating how dual‑RAID’s local RAID plus two‑copy strategy delivers better bandwidth usage, fault isolation, and near‑all‑flash performance.

Ceph · cloud storage · distributed storage
0 likes · 6 min read
vivo Internet Technology
Jun 7, 2023 · Big Data

Erasure Coding Technology in the Evolution of Vivo Storage Systems

Combining academic advances and industry practice, the article surveys erasure‑coding techniques, then details Vivo’s optimized storage stack—enhancing Reed‑Solomon with bit‑matrix scheduling, parallel cross‑AZ repair, LRC and MSR layers, and intermediate‑result optimization—to achieve high reliability while minimizing bandwidth and storage overhead.

Regenerating Codes · Reliability · data redundancy
0 likes · 48 min read
Bilibili Tech
Jun 6, 2023 · Backend Development

Evolution of Bilibili Relationship Chain Service: From MySQL to KV Storage and Multi‑Layer Caching

Bilibili’s relationship‑chain service, which handles follows, blacklists, whispers and mutual follows, migrated from a single sharded MySQL instance to an internal distributed KV store and introduced a three‑tier cache (memcached, Redis and a Bloom filter) plus automated hotspot routing, achieving near‑million QPS, lower latency, and preparing for multi‑tenant reuse.

KV Store · distributed storage · mysql
0 likes · 17 min read
dbaplus Community
May 21, 2023 · Big Data

How Cloud Migration Transforms Big Data Architecture: Lessons from G‑Line

This article examines the limitations of traditional physical‑server Hadoop clusters and explains how adopting cloud‑native technologies, distributed object storage, and compute‑storage separation can improve resource utilization, disaster recovery, performance, security, observability, and cost efficiency for large‑scale big data workloads.

Hadoop · cloud migration · distributed storage
0 likes · 12 min read
Architects' Tech Alliance
May 12, 2023 · Industry Insights

From Hyper‑Converged Infrastructure to Hybrid Cloud: Evolution, Benefits, and Challenges

This article analyzes the development of hyper‑converged infrastructure (HCI), its relationship with software‑defined storage, the shift toward hybrid cloud architectures, and the technical considerations for adopting SDS appliances in enterprise environments.

Hyper-Converged Infrastructure · SDS appliances · Software-Defined Storage
0 likes · 12 min read
dbaplus Community
May 10, 2023 · Backend Development

How Bilibili Scaled Its KV Store to Handle Billions of Requests

This article explains how Bilibili’s KV storage system evolved from early Redis/Memcache solutions to a custom distributed KV architecture, detailing design goals, architecture components, shard management, Raft replication, multi‑active disaster recovery, and the operational challenges solved to support massive traffic growth.

Bilibili · KV Store · Raft
0 likes · 14 min read
vivo Internet Technology
May 10, 2023 · Databases

Design and Optimization of a Disk‑Based KV Store Compatible with Redis on TiKV

The article details a Redis‑compatible, disk‑based KV service built atop TiKV using a compute‑storage split (Tula), describes custom key encoding and expiration mechanisms, and explains four optimization stages that introduce slot‑based hashing and adaptive concurrency to dramatically cut garbage‑collection latency while preserving write performance.

Database Optimization · Garbage Collection · KV Store
0 likes · 21 min read
ITPUB
Apr 13, 2023 · Databases

Inside ByteGraph: How ByteDance Scales Distributed Graph Databases with Index and Execution Optimizations

This article summarizes ByteDance engineer Chen Chao's DTCC 2022 talk on ByteGraph, covering its purpose, Gremlin query interface, three‑layer architecture, indexing strategies, distributed transaction handling, performance optimizations such as adaptive throttling and write‑amplification reduction, and integration with offline data pipelines.

ByteGraph · Gremlin · distributed storage
0 likes · 18 min read
AntTech
Mar 7, 2023 · Databases

CeresDB 1.0 Release: Cloud‑Native Time‑Series Database Design, Features, and Performance Evaluation

CeresDB 1.0, the open‑source cloud‑native time‑series database from Ant Group, introduces a next‑generation architecture that supports both traditional and analytical workloads, offers column‑mixed storage, distributed compute‑storage separation, multi‑language SDKs, and demonstrates significant write and query performance gains over InfluxDB in benchmark tests.

CeresDB · cloud-native · distributed storage
0 likes · 9 min read
ITPUB
Feb 3, 2023 · Databases

How KGraph Enables Billion‑Scale Graph Processing for Social and E‑Commerce Recommendations

KGraph, developed by Kuaishou since late 2019, is a self‑built graph platform that supports massive social, e‑commerce, and security workloads, offering distributed KV storage, a high‑performance RPC framework, and advanced graph modeling to achieve tens of millions of QPS and low latency for real‑time recommendation and offline graph analytics.

KGraph · distributed storage · graph database
0 likes · 20 min read
Bilibili Tech
Feb 3, 2023 · Backend Development

Design and Architecture of Bilibili's Thumb-up Service

The article details Bilibili’s thumb‑up service architecture, covering required business and platform capabilities, handling high read/write traffic and hot‑spot pressures, a three‑tier storage design using TiDB, Redis cache and local heap cache, disaster‑recovery mechanisms, asynchronous jobs, and future modularization plans.

Backend · Bilibili · distributed storage
0 likes · 14 min read
DataFunTalk
Jan 23, 2023 · Databases

KGraph: Architecture, Performance, and Applications of Kuaishou's In‑House Graph Platform

This article introduces KGraph, Kuaishou's self‑developed graph platform, detailing its directed heterogeneous property‑graph model, distributed KV storage with PMem persistence, high‑performance RPC framework, key challenges it solves, benchmark results, real‑time recommendation use cases, and future development directions.

KGraph · distributed storage · e-commerce recommendation
0 likes · 16 min read
MaGe Linux Operations
Jan 6, 2023 · Cloud Computing

Mastering Ceph: From Overview to Cluster Deployment and Management

This comprehensive guide explains Ceph's background, architecture, key features, terminology, step‑by‑step cluster deployment on CentOS, dashboard setup, CephFS creation, and a Java client example, providing everything needed to build and operate a reliable distributed storage system.

Ceph · CephFS · Cluster Deployment
0 likes · 19 min read
Zuoyebang Tech Team
Nov 25, 2022 · Databases

Cut Storage Costs 400%: Inside BitalosDB’s High‑Performance KV Engine

An in‑depth look at BitalosDB, the home‑grown NoSQL storage engine behind Zuoyebang’s massive KV traffic, covering its novel IO architecture, KV‑separation design, Raft‑based consistency, multi‑cloud CRDT replication, and benchmark results that show up to 400% cost savings versus standard Redis.

CRDT · Database Performance · KV storage
0 likes · 11 min read
Architects' Tech Alliance
Oct 31, 2022 · Industry Insights

What Drives Distributed Storage: Product Forms, Ecosystem, and Key Use Cases

Distributed storage encompasses integrated appliances and pure‑software solutions, each with distinct hardware strategies, and forms a multi‑dimensional industry ecosystem that spans commercial and open‑source software, specialized and generic hardware, serving critical scenarios such as virtualization/cloud, high‑performance computing, and big‑data analytics.

Big Data · High‑performance computing · Industry analysis
0 likes · 15 min read
Python Crawling & Data Mining
Oct 30, 2022 · Big Data

Why Ozone Is the Next‑Generation Distributed Object Store for Big Data

This article explains how Ozone, the Hadoop community’s new distributed object‑storage system, overcomes HDFS’s small‑file limitations with a hierarchical Volume‑Bucket‑Object model, detailing its architecture, components, data flow for creating and reading objects, and the benefits of its scalable, fault‑tolerant design.

Big Data · Hadoop · Ozone
0 likes · 12 min read
DataFunSummit
Oct 14, 2022 · Databases

ByteGraph: ByteDance’s In‑house Graph Database Architecture and Implementation

ByteGraph is ByteDance’s internally developed graph database that stores and queries massive graph data efficiently, featuring a three‑layer architecture of query engine, storage engine, and disk storage, supporting Gremlin, partitioning, indexing, caching, high availability, and integration with online/offline data pipelines.

ByteGraph · Gremlin · distributed storage
0 likes · 12 min read
DataFunTalk
Oct 8, 2022 · Backend Development

Evolution, Design, and Implementation of Bilibili's Distributed KV Storage System

This article details how Bilibili's KV storage system evolved from early solutions like Redis and MySQL to a highly scalable, high‑availability distributed architecture, describing its overall design, node components, data splitting, multi‑active disaster recovery, typical use cases, and operational challenges.

Backend Engineering · Bilibili · KV
0 likes · 13 min read
DataFunSummit
Aug 21, 2022 · Big Data

Alluxio Stress Testing Methods and Practices

This article explains the purpose, sources, and manifestations of pressure in Alluxio, describes its built‑in stress testing framework, outlines how to run and configure stress tools, and provides guidance on result calculation, reporting, common issues, and debugging for effective performance evaluation.

Alluxio · Big Data · Performance Evaluation
0 likes · 11 min read
Architect's Guide
Aug 9, 2022 · Databases

Seven Key Aspects of Distributed Storage Systems: Replication, Storage Engine, Transactions, Analytics, Multi‑core, Compute, and Compilation

The article presents a comprehensive guide to distributed storage, organizing its design and implementation into seven essential dimensions—replication, storage engine, transaction processing, analytical query execution, multi‑core scaling, compute engine architecture, and compilation techniques—each explained with core concepts, challenges, and practical considerations.

Analytics · Database Architecture · Transactions
0 likes · 13 min read
ITPUB
Jul 28, 2022 · Cloud Native

Inside Curve: How NetEase’s Cloud‑Native Distributed Storage Beats Ceph in Performance

In this interview, NetEase’s Curve project lead Wang Pan explains the architecture, block and file storage services, performance advantages over Ceph, management modules, ongoing CPU and I/O optimizations, deployment tooling, and the open‑source roadmap that positions Curve as a high‑performance, cloud‑native storage solution.

Curve · block storage · cloud-native
0 likes · 13 min read
Meituan Technology Team
Jul 6, 2022 · Big Data

Meituan Distributed Storage Technology Seminar

The 2022 Meituan Distributed Storage Technology Seminar, co‑hosted by Meituan’s tech team and its science society, gathered industry and academic experts to showcase the company’s MStore meta‑server, EBS block storage, and EFS file storage architectures, discussing design, implementation challenges, and future innovations for high‑scale, cloud‑native distributed storage.

Academic Seminar · Big Data · Data Systems
0 likes · 4 min read
Meituan Technology Team
Jul 6, 2022 · Artificial Intelligence

Engineering Practices for Large-Scale Deep Learning Models in Meituan Takeaway Advertising

The article details Meituan's engineering journey from small DNNs to hundred‑gigabyte deep learning models for food‑delivery ads, analyzing online latency and offline efficiency challenges and presenting distributed storage, CPU/GPU acceleration, OpenVINO, TensorRT, CodeGen, and data‑pipeline optimizations that dramatically improve throughput, memory usage, and sample‑building speed.

CPU acceleration · Deep Learning · GPU Acceleration
0 likes · 45 min read
DataFunSummit
Jul 2, 2022 · Big Data

Technical Evolution and Optimization of Kuaishou HDFS

Over the past four years, Kuaishou's data volume grew dozens of times over, creating scalability and storage‑cost challenges. This article details the architectural evolution, performance and cost optimizations, cross‑region expansion, and future plans of Kuaishou's HDFS system.

Big Data · HDFS · Scalability
0 likes · 20 min read
DataFunTalk
Jun 5, 2022 · Big Data

JD Big Data Platform: Cross‑Region and Tiered Storage Architecture and Practices

This article presents JD's large‑scale big‑data platform, detailing its overall architecture, the challenges of cross‑region storage, the design of a unified cross‑domain data synchronization mechanism, and the implementation of tiered storage to improve performance, cost efficiency, and data reliability across multi‑datacenter clusters.

Big Data · Data Platform · HDFS
0 likes · 15 min read
IT Architects Alliance
May 23, 2022 · Industry Insights

Why RDMA Is Replacing TCP/IP for AI and High‑Performance Storage

The article analyzes how the AI boom and high‑performance SSD storage demand sub‑microsecond latency, exposing TCP/IP’s inherent context‑switch and CPU overhead, and explains why RDMA’s kernel‑bypass, zero‑copy design and 1 µs latency make it the preferred network stack for modern data‑center workloads despite challenges in Ethernet deployment.

AI computing · Data Center Network · Low latency
0 likes · 11 min read
Ctrip Technology
May 19, 2022 · Backend Development

Design and Implementation of a Scalable Delay Message Service Using Apache BookKeeper

This article describes how the QMQ delay‑message service was refactored by separating business logic from storage, adopting Apache BookKeeper for high‑availability, zone‑aware disaster recovery, a configurable DNS resolver, a ZooKeeper‑based task coordinator, and a multi‑level sliding‑time‑bucket scheduler to achieve a stateless, elastic architecture.

BookKeeper · Elastic Architecture · Java
0 likes · 13 min read
Bilibili Tech
May 17, 2022 · Cloud Computing

Bilibili Object Storage Service (BOSS) Design: Building a Large-Scale Distributed Storage System in 13 Days

In just 13 days Bilibili transformed a simple MySQL‑based S3 prototype into the BOSS distributed object storage system by separating metadata and data, adding an RPC abstraction layer, implementing two‑level sharding, switching to RocksDB, and deploying a three‑replica, multi‑zone high‑availability architecture.

Replication · RocksDB · S3 protocol
0 likes · 15 min read
Architecture Digest
Mar 31, 2022 · Databases

Seven Key Aspects of Distributed Storage Systems

This article outlines the motivation and seven fundamental aspects of distributed storage—replication, storage engine, transactions, analytics, multi‑core processing, computation, and compilation—detailing their roles, challenges, and design considerations for building scalable, reliable, and high‑performance data systems.

Database Architecture · Transactions · distributed storage
0 likes · 14 min read
Top Architect
Mar 28, 2022 · Databases

Key Aspects of Distributed Storage Systems: Replication, Engines, Transactions, Analytics, Multi‑Core, Computation, and Compilation

This article provides a comprehensive overview of distributed storage, covering seven core aspects such as replication, storage engines, transaction processing, analytical query execution, multi‑core scalability, computation models, and compilation techniques, while also highlighting practical challenges and design considerations for modern database systems.

Analytics · Compilation · Storage Engine
0 likes · 13 min read
IT Architects Alliance
Mar 21, 2022 · Databases

Seven Key Aspects of Distributed Storage Systems

This article outlines seven fundamental aspects of distributed storage—replication, storage engine, transactions, analysis, multi‑core processing, computation, and compilation—explaining their roles, challenges, and design considerations for building scalable, reliable, and high‑performance storage systems.

Database Architecture · distributed storage · performance
0 likes · 13 min read
Bilibili Tech
Mar 11, 2022 · Databases

Design and Architecture of Bilibili's High‑Performance Distributed KV Store

Bilibili’s high‑performance distributed KV store combines hash and range partitioning, Raft‑based multi‑replica consistency, and a Metaserver‑managed topology of pools, zones, nodes, tables, shards and replicas, offering features such as partition splitting, binlog streaming, multi‑active replication, bulk loading, KV‑storage separation, and automated load, leader and health balancing for reliable, scalable data services.

Partitioning · Raft consensus · bulk load
0 likes · 22 min read
DataFunTalk
Jan 16, 2022 · Big Data

Time Series Database Capabilities and Application Scenarios in IoT, Smart Cities, and Edge Computing

This article explains the fundamentals of time‑series data, outlines the architecture and core technical advantages of Baidu Cloud's TSDB, and demonstrates how the database powers IoT, smart‑city, industrial, power‑grid, and autonomous‑driving use cases through multi‑level storage, distributed query optimization, and edge‑cloud integration.

Big Data · Data Analytics · Edge Computing
0 likes · 11 min read
Cloud Native Technology Community
Dec 8, 2021 · Cloud Native

Step-by-Step Guide to Build a Distributed Rook/Ceph Storage Cluster on Kubernetes

This tutorial walks you through preparing three identical VMs, installing required packages, configuring Rook and Ceph versions, deploying the storage cluster on a Kubernetes 1.20 environment, exposing the Ceph dashboard, and cleaning up the installation, complete with command examples and troubleshooting tips.

Ceph · Cloud Native Storage · Deployment
0 likes · 14 min read