
Master Ceph: From Storage Basics to Full Cluster Deployment

This guide covers storage fundamentals (DAS, NAS, SAN), the limits of single‑node storage, and the ideas behind distributed storage. It then describes Ceph’s architecture, advantages, data flow, and version lifecycle, and walks step by step through deploying a Ceph cluster with ceph‑deploy: monitors, OSDs, managers, and the web dashboard.

MaGe Linux Operations

Apr 1, 2025

Storage Basics

Direct‑Attached Storage (DAS) connects disks directly to a host via IDE, SATA, SCSI, SAS, or USB. Network‑Attached Storage (NAS) provides file‑level access over NFS, CIFS, or FTP. A Storage Area Network (SAN) offers block‑level access over a dedicated network, using Fibre Channel (FC SAN) or iSCSI (IP SAN).

Problems of Single‑Node Storage

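Insufficient I/O capacity: one device or host caps throughput. The figures cited are roughly 100 IOPS for a traditional IDE disk, 500 IOPS for a SATA SSD, and 2000‑4000 IOPS for a high‑performance SSD. Current SSDs are typically much faster, but a single node still has a hard ceiling.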

Limited total capacity per disk.

Single point of failure.

Distributed Storage (Software‑Defined Storage)

Solutions like Ceph, TFS, FastDFS, MooseFS, HDFS, and GlusterFS spread data across many nodes, offering high scalability, performance, and availability.

Ceph Overview

What is Ceph?

Ceph is an open‑source, self‑healing, self‑managing distributed storage system written in C++. It is widely used by cloud platforms such as OpenStack and Kubernetes.

Advantages

High scalability – scales to thousands of nodes and from terabytes to exabytes of capacity.

High reliability – no single point of failure, automatic replication and recovery.

High performance – uses the CRUSH algorithm for balanced data placement.

Unified interfaces – block (RBD), file (CephFS), and object (RGW) storage.

Architecture

Ceph consists of four layers from bottom to top:

RADOS – the core object store, composed of OSDs and Monitors.

LIBRADOS – client library providing API access to RADOS.

High‑level interfaces – RGW (object), RBD (block), CephFS (POSIX file system).

Application layer – hosts and services that consume the storage.

Key components:

OSD (Object Storage Daemon) – stores data, handles replication, balancing, and recovery.

PG (Placement Group) – virtual grouping used for data placement.

Pool – logical namespace containing a set of PGs.

Monitor – maintains cluster maps and provides quorum.

Manager – tracks runtime metrics and provides monitoring interfaces.

MDS – metadata server required for CephFS.
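
The Monitor entry above mentions quorum. Monitors agree on cluster state by majority, which is why an odd number of monitors is the usual choice. The arithmetic can be sketched as follows; the function names are hypothetical helpers, not part of Ceph.

```shell
# Hypothetical helpers (not part of Ceph): majority-quorum arithmetic.

# Smallest number of monitors that forms a majority of $1 monitors.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Number of monitors that can fail while a majority remains.
tolerated_failures() {
    echo $(( ($1 - 1) / 2 ))
}

for n in 1 3 5; do
    echo "monitors=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

With three monitors, as in the deployment later in this guide, two must be up and one may fail. A fourth monitor raises the quorum to three without tolerating any additional failure.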

OSD Storage Engine

Two options are available:

Filestore – legacy engine that stores objects on a traditional file system (XFS) with a key/value database.

BlueStore – default since Luminous; writes objects directly to raw block devices for better performance and reliability.

Data Flow

Client obtains the latest cluster map from a Monitor.

Data is split into fixed‑size objects (default 4 MiB) identified by an OID (inode + object number).

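The OID is hashed to a PG within the object's pool. Conceptually, PGID = HASH(OID) % PG_NUM, and the full PG identifier is prefixed with the pool ID.

CRUSH maps the PG to an ordered set of OSDs: a primary plus as many replicas as the pool's replication size requires.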
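
The first mapping steps can be sketched in a few lines of shell. This is an illustration under loud assumptions: cksum (CRC‑32) stands in for Ceph's real object hash, a plain modulo stands in for its PG mapping, and the object name and PG count are made up. It shows the shape of the calculation, not Ceph's implementation.

```shell
# Illustrative only: cksum (CRC-32) stands in for Ceph's real object hash,
# and a plain modulo stands in for its PG mapping.

OBJECT_SIZE=$((4 * 1024 * 1024))   # 4 MiB default object size, from the text

# Number of objects needed to hold a payload of $1 bytes.
object_count() {
    echo $(( ($1 + OBJECT_SIZE - 1) / OBJECT_SIZE ))
}

# Map object name $1 to a PG index in the range [0, $2).
pg_for_object() {
    h=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
    echo $(( h % $2 ))
}

echo "objects for 10 MiB: $(object_count $((10 * 1024 * 1024)))"
echo "PG for example-object.0: $(pg_for_object example-object.0 128)"
```

The same name always lands in the same PG, so a client can compute placement without asking a central lookup service. On a running cluster, ceph osd map <pool> <object> reports the PG and OSD set that Ceph actually chose.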

Ceph Version Lifecycle

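Ceph names its major releases alphabetically (for example, Mimic is 13 and Nautilus is 14) and ships roughly one per year. Versions use the format x.y.z: x is the major release number, y marks the stage (0 for development, 1 for release candidate, 2 for stable), and z is the point release within that stage. Nautilus 14.2.0, for example, is the first stable Nautilus release.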
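
The numbering rule is easy to apply mechanically. The helper below is a hypothetical sketch that classifies a version string by its middle component; it is not a Ceph tool.

```shell
# Hypothetical helper: classify a Ceph version string by its middle component.
release_kind() {
    minor=$(printf '%s' "$1" | cut -d . -f 2)
    case "$minor" in
        0) echo development ;;
        1) echo release-candidate ;;
        2) echo stable ;;
        *) echo unknown ;;
    esac
}

release_kind 14.2.0   # prints: stable
```

Production clusters should run versions that classify as stable. ceph --version reports the version installed on a node.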

Deploying a Ceph Cluster with ceph‑deploy

Environment Planning

Decide on hostnames, the public and cluster networks, and which roles each node runs. Set each node's hostname and add matching /etc/hosts entries on every node so that the nodes can resolve one another by name.
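
As a sketch of the name‑resolution step: the node names below match the deployment commands in this guide, but the IP addresses are assumptions chosen to sit on the 192.168.10.0/24 public network and must be replaced with real ones.

```shell
# Assumed addresses on the public network; adjust to your environment.
hosts_entries() {
    cat <<'EOF'
192.168.10.11 node01
192.168.10.12 node02
192.168.10.13 node03
EOF
}

hosts_entries            # review the entries first
# As root on every node, once the addresses are confirmed:
#   hosts_entries >> /etc/hosts
#   hostnamectl set-hostname node01   # use the node's own name
```

Printing the entries before appending them keeps a typo from landing in /etc/hosts on every node.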

Preparation Steps

Disable firewalld and SELinux.

Configure password‑less SSH between nodes.

Synchronize time using chrony.

Add the Ceph yum repository.

Install required packages and ceph‑deploy.
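
The host‑level steps above might look like the following on a yum‑based system. This is a sketch, not the author's exact procedure: it assumes systemd, the root account for SSH, and the node names used later in this guide. The repository and package steps are left out because they depend on the Ceph release you choose. The script defaults to a dry run that only prints each command.

```shell
# Dry run by default: each command is printed, not executed.
# Set DRY_RUN=0 and run as root to apply the changes.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run on every node.
prepare_node() {
    run systemctl disable --now firewalld
    run setenforce 0
    run sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
    run yum install -y chrony
    run systemctl enable --now chronyd
}

# Run once on the admin node; assumes root logins and nodes node01 to node03.
distribute_ssh_key() {
    run ssh-keygen -t rsa -N '' -f "$HOME/.ssh/id_rsa"
    for node in node01 node02 node03; do
        run ssh-copy-id "root@$node"
    done
}

prepare_node
distribute_ssh_key
```

setenforce 0 changes SELinux for the running system only; the sed line makes the change persist across reboots.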

Cluster Initialization

mkdir -p /etc/ceph
cd /etc/ceph
ceph-deploy new --public-network 192.168.10.0/24 --cluster-network 192.168.100.0/24 node01 node02 node03

This writes ceph.conf and the monitor keyring (ceph.mon.keyring) to the current directory.
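
Before moving on, it can help to confirm that both files exist. The helper below is a hypothetical convenience, not a ceph‑deploy feature; it only checks for the two file names.

```shell
# Hypothetical helper: confirm that "ceph-deploy new" left its files in directory $1.
check_new_outputs() {
    missing=0
    for f in ceph.conf ceph.mon.keyring; do
        if [ -f "$1/$f" ]; then
            echo "found   $f"
        else
            echo "missing $f"
            missing=1
        fi
    done
    return "$missing"
}

check_new_outputs /etc/ceph || echo "run ceph-deploy new from /etc/ceph first"
```

Later ceph‑deploy commands read ceph.conf from the working directory, so run them from the same place.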

Deploy Monitors and Admin Key

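# Create the monitor daemons on the three nodes
ceph-deploy mon create node01 node02 node03
# Wait for the monitors to form a quorum, then gather the bootstrap keys
ceph-deploy --overwrite-conf mon create-initial
# Push ceph.conf and the admin keyring to each node so the ceph CLI works there
ceph-deploy admin node01 node02 node03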

Install OSDs

Prepare raw disks (no partitions) and create OSDs:

ceph-deploy osd create node01 --data /dev/sdb
ceph-deploy osd create node02 --data /dev/sdb
ceph-deploy osd create node03 --data /dev/sdb

Additional disks can be added later with similar commands.
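
For several disks per node, the same command can be generated in a loop. The sketch below only prints the commands so they can be reviewed; /dev/sdc and /dev/sdd are example device names, not values from this deployment.

```shell
# Prints the commands instead of running them, so the list can be reviewed first.
# The device names passed in are examples; check each node with lsblk.
osd_create_commands() {
    host=$1
    shift
    for dev in "$@"; do
        echo "ceph-deploy osd create $host --data $dev"
    done
}

for node in node01 node02 node03; do
    osd_create_commands "$node" /dev/sdc /dev/sdd
done
```

Once the list looks right, pipe the output to sh from the directory that holds ceph.conf.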

Deploy Managers

ceph-deploy mgr create node01 node02

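Clear the health warning “mons are allowing insecure global_id reclaim” by disabling insecure reclaim. Do this only after all clients have been upgraded, because older clients may fail to reconnect: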

ceph config set mon auth_allow_insecure_global_id_reclaim false

Enable the Dashboard

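# Install the dashboard module on each manager node
yum install -y ceph-mgr-dashboard
ceph mgr module enable dashboard --force
# Serve plain HTTP on port 8000 on all interfaces (lab setup; keep SSL enabled in production)
ceph config set mgr mgr/dashboard/ssl false
ceph config set mgr mgr/dashboard/server_addr 0.0.0.0
ceph config set mgr mgr/dashboard/server_port 8000
# Placeholder password; replace it with a strong one
echo "12345678" > dashboard_passwd.txt
ceph dashboard set-login-credentials admin -i dashboard_passwd.txt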

The dashboard is served by the active manager. Access the UI at http://<active_mgr_ip>:8000 using the credentials above.

Verification

Check cluster health and status:

ceph -s
ceph osd df
ceph osd tree

Typical output shows three monitors, an active manager, and OSDs in the “up” state.
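
For scripted checks, the one‑line output of ceph health is easier to test than the full status report. The helper below is a hypothetical sketch that reads that output on standard input; the HEALTH_WARN line is sample input, not output from a real cluster.

```shell
# Reads "ceph health" output on stdin; succeeds only when the cluster reports HEALTH_OK.
is_healthy() {
    read -r status _rest
    [ "$status" = "HEALTH_OK" ]
}

# Usage on a node with the admin keyring:
#   ceph health | is_healthy && echo "cluster healthy"
echo "HEALTH_WARN 1 osds down" | is_healthy || echo "needs attention"
```

Reading from standard input keeps the check testable on machines without a cluster.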
