
Understanding Ceph Distributed Storage Architecture and Its Core Components

Ceph is a unified, open‑source distributed storage system whose layered architecture—comprising RADOS, LIBRADOS, and upper‑level services like RADOSGW, RBD, and CephFS—provides high performance, reliability, scalability, and flexible data access for cloud, big‑data, and AI workloads.


1. Ceph Architecture Overview

Ceph is a unified, distributed storage system designed for excellent performance, reliability, and scalability, offering object, block, and file storage interfaces.

Unified: provides object, block, and file storage.

Distributed: data is spread across many nodes by a cluster-wide placement algorithm, with no centralized directory in the data path.

1.1 Layered Architecture

At the base is RADOS (Reliable Autonomic Distributed Object Store), the core of Ceph, consisting of OSDs (Object Storage Daemons) and Monitors that manage storage devices and cluster state.

LIBRADOS abstracts RADOS and offers APIs for C, C++, Python, and other languages.

Above RADOS are services: RADOSGW (object gateway with S3/Swift compatibility), RBD (block device for virtual machines), and CephFS (POSIX‑compatible file system using MDS).
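The layering described above can be sketched in miniature: a toy in-memory "RADOS" that stores named binary objects, with thin object-gateway and block-device facades on top. All class and method names here (ToyRados, ToyObjectGateway, ToyBlockDevice) are invented for illustration and are not the real librados, RADOSGW, or RBD APIs; the point is only that every upper-level service ultimately reduces to reads and writes of named objects in one flat store.

```python
class ToyRados:
    """Stands in for RADOS: a flat namespace of named binary objects."""
    def __init__(self):
        self.objects = {}

    def write(self, name, data):
        self.objects[name] = bytes(data)

    def read(self, name):
        return self.objects[name]


class ToyObjectGateway:
    """RADOSGW-like facade: bucket/key pairs map onto RADOS objects."""
    def __init__(self, rados):
        self.rados = rados

    def put(self, bucket, key, data):
        self.rados.write(f"{bucket}/{key}", data)

    def get(self, bucket, key):
        return self.rados.read(f"{bucket}/{key}")


class ToyBlockDevice:
    """RBD-like facade: a linear device striped over fixed-size objects."""
    def __init__(self, rados, name, block_size=4):
        self.rados, self.name, self.block_size = rados, name, block_size

    def write_block(self, index, data):
        assert len(data) == self.block_size
        self.rados.write(f"{self.name}.{index}", data)

    def read_block(self, index):
        return self.rados.read(f"{self.name}.{index}")


rados = ToyRados()

gw = ToyObjectGateway(rados)
gw.put("photos", "cat.jpg", b"\xff\xd8")        # object-style access

dev = ToyBlockDevice(rados, "vm-disk")
dev.write_block(0, b"boot")                      # block-style access

print(gw.get("photos", "cat.jpg"))
print(dev.read_block(0))
print(sorted(rados.objects))                     # both land in one object store
```

Both facades end up as plain objects ("photos/cat.jpg", "vm-disk.0") in the same backing store, which mirrors how RADOSGW buckets and RBD image stripes are all just RADOS objects.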

1.2 Key Features

Ceph’s decentralised design eliminates single points of failure, enabling horizontal scaling by adding nodes.

The CRUSH algorithm maps objects to storage devices using a deterministic hash, ensuring balanced placement, fault tolerance, and automatic recovery.
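A deterministic placement function in the spirit of CRUSH can be sketched in a few lines. Real CRUSH walks a weighted device hierarchy and honours failure-domain rules; this simplified sketch (all function names are illustrative) only shows the core idea that an object name hashes to a placement group, and the placement group deterministically ranks OSDs, so any client can compute placement without consulting a central directory.

```python
import hashlib


def stable_hash(s):
    # md5 is used only for a hash that is stable across runs and
    # interpreters (unlike Python's builtin hash()); not for security.
    return int(hashlib.md5(s.encode()).hexdigest(), 16)


def object_to_pg(obj_name, pg_num):
    # Step 1: the object name hashes to one of pg_num placement groups.
    return stable_hash(obj_name) % pg_num


def pg_to_osds(pg, osd_ids, replicas=3):
    # Step 2: rank OSDs by a per-(pg, osd) pseudo-random draw and take
    # the top `replicas` -- a simplified straw-style selection; real
    # CRUSH also applies device weights and failure-domain constraints.
    ranked = sorted(osd_ids, key=lambda o: stable_hash(f"{pg}:{o}"), reverse=True)
    return ranked[:replicas]


osds = list(range(8))
pg = object_to_pg("rbd_data.1234.0000", pg_num=128)
placement = pg_to_osds(pg, osds)
print(pg, placement)  # identical inputs always yield identical placement
```

Because the computation is pure, every client and OSD that holds the same maps derives the same placement, which is what makes the balanced, lookup-free data distribution possible.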

2. Core Components

2.1 OSD (Object Storage Daemon)

OSDs store and retrieve data, replicate objects according to the configured replica count, and participate in self‑healing when failures occur.

2.2 Monitor

Monitors maintain cluster maps (OSD, CRUSH, etc.), provide authentication via CephX, and ensure cluster health; a typical deployment uses at least three monitors for redundancy.
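The "at least three monitors" guidance follows from majority-quorum arithmetic: monitors agree on map updates via a Paxos-style majority, so more than half must be alive. A short calculation makes the recommendation concrete:

```python
def quorum_size(n_monitors):
    # Paxos-style majority: strictly more than half must agree.
    return n_monitors // 2 + 1


def tolerated_failures(n_monitors):
    # Failures the cluster survives while still forming a quorum.
    return n_monitors - quorum_size(n_monitors)


for n in (1, 2, 3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

Two monitors tolerate zero failures, no better than one, while three tolerate one and five tolerate two; this is why odd monitor counts are the standard recommendation.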

2.3 MDS (Metadata Server)

MDS manages CephFS metadata, handling namespace operations, access control, and load‑balanced metadata distribution.

2.4 RGW (Object Gateway)

RGW offers RESTful object storage compatible with Amazon S3 and OpenStack Swift, supporting multi‑tenant isolation and easy migration from existing S3/Swift applications.

2.5 RBD (Block Device)

RBD provides high‑performance block storage for virtual machines and integrates with cloud platforms such as OpenStack, supporting snapshots, cloning, and mirroring.

3. Operational Mechanisms

3.1 Write Path

Clients obtain the current cluster maps from the monitors, then write as follows: the data is split into fixed-size objects; CRUSH maps each object to a Placement Group (PG) and the PG to a primary OSD plus secondary OSDs; the client sends the write to the primary, which journals it locally and replicates it to the secondaries; once the replicas have committed, the primary acknowledges the write to the client.
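The write path above can be simulated end to end with toy components. The names (ToyOSD, client_write) are illustrative; real OSDs journal to persistent storage and speak a replication protocol over the network, but the primary-then-fan-out-then-ack shape is the same:

```python
class ToyOSD:
    """Illustrative OSD: journals a write, then applies it to its store."""
    def __init__(self, osd_id):
        self.osd_id = osd_id
        self.journal = []   # stand-in for the on-disk journal / write-ahead log
        self.store = {}

    def write(self, obj, data):
        self.journal.append((obj, data))  # journaled first for crash safety
        self.store[obj] = data
        return True                       # ack back to the primary


def client_write(obj, data, acting_set):
    # acting_set[0] is the primary OSD; the rest are secondaries.
    primary, secondaries = acting_set[0], acting_set[1:]
    acks = [primary.write(obj, data)]
    acks += [osd.write(obj, data) for osd in secondaries]  # primary fans out
    return all(acks)  # client is acknowledged once the replicas confirm


cluster = [ToyOSD(i) for i in range(3)]
ok = client_write("obj.0001", b"hello", cluster)
print(ok, [osd.store["obj.0001"] for osd in cluster])
```

After the call, all three replicas hold identical data and each has a journal entry, mirroring the journal-then-replicate-then-ack sequence in the text.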

3.2 Read Path

Clients locate the PG and OSDs using CRUSH, read from primary or secondary OSDs, and reassemble objects into the original data.
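The key property of the read path is that the reader recomputes the same placement the writer used, with no directory lookup in between. A self-contained sketch (the `locate` helper and the hashing scheme are illustrative stand-ins for CRUSH, not its real algorithm):

```python
import hashlib


def locate(obj_name, osd_ids, pg_num=128, replicas=3):
    # The same deterministic computation the writer performed.
    h = lambda s: int(hashlib.md5(s.encode()).hexdigest(), 16)
    pg = h(obj_name) % pg_num
    ranked = sorted(osd_ids, key=lambda o: h(f"{pg}:{o}"), reverse=True)
    return pg, ranked[:replicas]


stores = {i: {} for i in range(8)}                 # per-OSD object stores

# A prior write placed the object on its acting set.
pg, acting = locate("volume.chunk.7", list(stores))
for osd in acting:
    stores[osd]["volume.chunk.7"] = b"DATA"

# The reader independently recomputes the identical placement and
# reads from the primary (first OSD in the acting set).
_, acting_again = locate("volume.chunk.7", list(stores))
primary = acting_again[0]
print(stores[primary]["volume.chunk.7"])
```

Because placement is a pure function of the object name and the cluster map, reads scale with the number of OSDs rather than bottlenecking on a metadata lookup service.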

3.3 Consistency and Fault Tolerance

Data is stored with multiple replicas (default three), CRUSH handles re‑placement on failures, and PG logs record changes to enable recovery.
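Re-placement on failure can be illustrated with the same straw-style ranking idea: when an OSD drops out of the map, surviving replicas keep their positions and only the lost copy is mapped to a new device. The `place` function below is an illustrative simplification, not real CRUSH:

```python
import hashlib


def place(pg, osd_ids, replicas=3):
    # Rank candidate OSDs by a per-(pg, osd) pseudo-random draw.
    h = lambda s: int(hashlib.md5(s.encode()).hexdigest(), 16)
    ranked = sorted(osd_ids, key=lambda o: h(f"{pg}:{o}"), reverse=True)
    return ranked[:replicas]


osds = list(range(6))
before = place(42, osds)

failed = before[0]                             # say the primary dies
survivors = [o for o in osds if o != failed]
after = place(42, survivors)

print(before, "->", after)
```

Removing one OSD leaves the relative ranking of the survivors unchanged, so the two surviving replicas stay put and exactly one new replica is chosen; in real Ceph, the peers then backfill that copy, consulting the PG log to determine which objects changed.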

4. Application Scenarios

4.1 Cloud Storage

Ceph integrates with OpenStack (RBD for block storage, RGW for object storage) to provide scalable, reliable storage for public and private clouds.

4.2 Big Data Analytics

Its parallel read/write capabilities and compatibility with Hadoop or Spark make Ceph suitable for large‑scale data processing.

4.3 Virtualization and Containerization

RBD supplies persistent block storage for VMs, while Ceph’s CSI plugin delivers dynamic volumes for Kubernetes, supporting snapshots and clones for rapid provisioning.
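For the Kubernetes case, dynamic provisioning is typically wired up with a StorageClass pointing at the ceph-csi RBD driver. The fragment below is a sketch: the `clusterID`, pool name, secret names, and namespace are placeholders that must match your cluster and your csi-rbd deployment, and it assumes the ceph-csi driver is already installed.

```yaml
# Sketch of a ceph-csi RBD StorageClass; values are placeholders.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph-rbd
provisioner: rbd.csi.ceph.com
parameters:
  clusterID: <ceph-cluster-id>        # fsid of the Ceph cluster
  pool: kubernetes                    # RBD pool backing the volumes
  imageFeatures: layering             # enables snapshot/clone layering
  csi.storage.k8s.io/provisioner-secret-name: csi-rbd-secret
  csi.storage.k8s.io/provisioner-secret-namespace: ceph-csi
  csi.storage.k8s.io/node-stage-secret-name: csi-rbd-secret
  csi.storage.k8s.io/node-stage-secret-namespace: ceph-csi
reclaimPolicy: Delete
allowVolumeExpansion: true
```

A PersistentVolumeClaim referencing `storageClassName: ceph-rbd` then triggers on-demand creation of an RBD image, and the snapshot/clone support mentioned above is exposed through the corresponding VolumeSnapshot resources.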

Written by Deepin Linux

Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and the Linux kernel.
