Master Ceph: Step‑by‑Step Guide to Deploy a Scalable Distributed Storage Cluster

Learn how to design, configure, and deploy a Ceph distributed storage cluster using ceph‑deploy, covering storage fundamentals, Ceph architecture, component roles, network planning, OS preparation, mon, mgr, osd setup, and dashboard activation, with detailed commands and best‑practice recommendations for production environments.

Liangxu Linux

Storage Basics

Traditional storage falls into three classes: DAS (direct-attached storage), NAS (network-attached storage) and SAN (storage area network). DAS connects disks directly to a host over IDE, SATA, SCSI, SAS or USB. NAS provides file-level access through protocols such as NFS, CIFS or FTP. SAN offers block-level storage over a dedicated network, carrying SCSI commands over Fibre Channel (FC SAN) or iSCSI (IP SAN).

A single storage server has limited I/O throughput and capacity and is a single point of failure, which makes it unsuitable for large-scale services. Distributed, software-defined storage (SDS) systems such as Ceph, FastDFS, MooseFS, HDFS and GlusterFS spread data across multiple nodes to provide scalability, performance and availability.

Ceph Overview

Ceph is an open‑source, self‑healing, self‑managing distributed storage system written in C++. It provides block (RBD), file (CephFS) and object (RGW) interfaces and integrates with OpenStack and Kubernetes.

Advantages

High scalability – decentralized design supports thousands of nodes and petabyte‑scale storage.

High reliability – no single point of failure; automatic replication and recovery.

High performance – CRUSH algorithm balances data placement and maximizes parallelism.

Rich functionality – block, file and object interfaces in a single platform.

Architecture

Ceph consists of four logical layers:

RADOS – core object store composed of OSD daemons and Monitor daemons.

LIBRADOS – client library that gives applications direct access to RADOS; RBD, RGW and CephFS are built on top of it.

High‑level interfaces – RBD (block), RGW (object, S3/Swift compatible), CephFS (POSIX file system).

Application layer – hosts that consume the above interfaces.

Key components:

OSD (Object Storage Daemon) – stores data, handles replication, recovery and backfill, and reports health to the monitors; at least three OSDs are needed for the default three-replica redundancy.

PG (Placement Group) – virtual grouping of objects; objects are hashed to a PG, which CRUSH maps to OSDs.

Pool – logical namespace containing a configurable number of PGs; can use replicated or erasure‑coded storage.

Monitor (MON) – maintains cluster maps (OSD map, PG map, CRUSH map) and provides quorum; typically an odd number of monitors (3 or 5).

Manager (MGR) – gathers metrics, provides monitoring interfaces (Prometheus, Zabbix) and a dashboard.

MDS (Metadata Server) – manages file system metadata (directories, permissions, file layout); required only for CephFS.

OSD Storage Engines

Ceph supports two OSD backends. FileStore stores objects as files on a local file system (typically XFS) and keeps metadata in a key/value database (LevelDB or RocksDB). BlueStore (the default since Ceph Luminous 12.2.0) writes objects directly to raw block devices and keeps its metadata in RocksDB, offering lower latency and higher throughput.

Data Flow

Clients obtain the latest cluster map from a monitor.

Data is split into fixed‑size objects (default 4 MiB) identified by an OID (FileID + object number).

The OID is hashed, and the hash modulo the pool's PG count, combined with the pool ID, yields the PG ID.

CRUSH maps the PG to one or more OSDs based on the configured replica count.
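
The object-to-PG step above can be illustrated with a toy model. This is only a sketch: real Ceph hashes the object name with its own rjenkins function and uses a "stable mod" rather than a plain modulo, so the cksum CRC used here is a stand-in and the PG numbers will not match a real cluster. The pool ID, PG count and object names are made-up values, and pg_for_object is a hypothetical helper.

```shell
# Toy model of "hash the OID, take it modulo the PG count".
# cksum is a stand-in for Ceph's real hash; results differ from a live cluster.
pg_for_object() {
    local oid=$1 pg_num=$2 hash
    hash=$(printf '%s' "$oid" | cksum | cut -d' ' -f1)
    echo $(( hash % pg_num ))
}

pool_id=1      # assumed pool ID
pg_num=128     # assumed PG count of that pool

# PG IDs are written as <pool id>.<PG number in hex>
for oid in 10000000001.00000000 10000000001.00000001; do
    printf '%s -> PG %d.%x\n' "$oid" "$pool_id" "$(pg_for_object "$oid" "$pg_num")"
done
```

On a running cluster, ceph osd map <pool> <object> reports the real PG and the set of OSDs that CRUSH selects for an object.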

Version Lifecycle

Ceph has published roughly one new stable release per year (e.g., Mimic 13, Nautilus 14, Octopus 15). Version numbers follow x.y.z, where x is the release series and y marks the release type: 0 for development, 1 for release candidates and 2 for stable releases.
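
As a small illustration of the x.y.z scheme, the sketch below classifies a version string by its middle field. The function name ceph_release_kind is hypothetical and the logic only encodes the rule stated above.

```shell
# Classify a Ceph version string by its second field (y in x.y.z).
ceph_release_kind() {
    local minor
    minor=$(printf '%s' "$1" | cut -d. -f2)
    case "$minor" in
        0) echo "development" ;;
        1) echo "release candidate" ;;
        2) echo "stable" ;;
        *) echo "unknown" ;;
    esac
}

ceph_release_kind 14.2.22   # prints "stable"
```

On an installed node, ceph --version prints the version string of the local packages, which can be passed to a helper like this one.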

Deployment Options

Common methods:

ceph-deploy – mature automation tool; it is no longer actively maintained and is not tested on releases newer than Nautilus.

cephadm – container-based installer (recommended for Octopus and later).

Manual binary installation – full control, higher complexity.

Planning a Ceph Cluster

Production guidelines:

Use 10 GbE or faster networks; separate public and cluster networks.

Deploy MON, MGR and OSD services on distinct hosts (or combine in test labs).

Minimum hardware: an Intel Xeon E5-2620 v3 class CPU and at least 64 GiB of RAM per node; SATA disks are acceptable for OSDs.

Distribute nodes across racks to avoid single‑point power or network failures.

Sample Host Layout

Hostname   PublicIP        ClusterIP        Roles
admin      192.168.10.120  -                admin (management), client
node01     192.168.10.121  192.168.100.11   mon,mgr,osd(/dev/sdb,/dev/sdc,/dev/sdd)
node02     192.168.10.122  192.168.100.12   mon,mgr,osd(/dev/sdb,/dev/sdc,/dev/sdd)
node03     192.168.10.123  192.168.100.13   mon,osd(/dev/sdb,/dev/sdc,/dev/sdd)
client     192.168.10.124  -                client

Preparation Steps (on each node)

Disable SELinux and firewalld:

systemctl disable --now firewalld
setenforce 0   # permissive mode until the next reboot
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config   # persist across reboots

Set hostnames:

hostnamectl set-hostname admin   # repeat for node01, node02, node03, client

Configure /etc/hosts for name resolution.
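
A minimal sketch of the entries, using the hostnames and public IPs from the sample layout. The staging file name ceph-hosts.txt and the HOSTS_FRAGMENT variable are assumptions made so the fragment can be reviewed before it touches /etc/hosts.

```shell
# Write the name-resolution entries to a staging file first.
HOSTS_FRAGMENT=${HOSTS_FRAGMENT:-./ceph-hosts.txt}   # assumed staging path

cat > "$HOSTS_FRAGMENT" <<'EOF'
192.168.10.120 admin
192.168.10.121 node01
192.168.10.122 node02
192.168.10.123 node03
192.168.10.124 client
EOF

# After reviewing the file, append it on every node (needs root):
#   sudo tee -a /etc/hosts < "$HOSTS_FRAGMENT"
```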

Create a dedicated Ceph user with password‑less sudo:

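useradd cephadm
passwd cephadm
# Grant password-less sudo through a drop-in file instead of editing sudoers interactively
echo "cephadm ALL=(root) NOPASSWD:ALL" > /etc/sudoers.d/cephadm
chmod 0440 /etc/sudoers.d/cephadm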

Install common dependencies (epel-release, ntp, development tools, etc.).

Set up password‑less SSH from the admin node to all other nodes.
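
One possible way to prepare this, sketched under the assumption that ceph-deploy should log in as the cephadm user created earlier: generate an SSH client config fragment so that each node name maps to that user. The staging file name ceph-ssh-config and the SSH_CONFIG_FRAGMENT variable are hypothetical.

```shell
# Build an ssh_config fragment that makes "ssh node01" log in as cephadm.
SSH_CONFIG_FRAGMENT=${SSH_CONFIG_FRAGMENT:-./ceph-ssh-config}   # assumed staging path

: > "$SSH_CONFIG_FRAGMENT"            # start with an empty file
for host in node01 node02 node03 client; do
    cat >> "$SSH_CONFIG_FRAGMENT" <<EOF
Host $host
    Hostname $host
    User cephadm
EOF
done

# Review the result, then append it to ~/.ssh/config on the admin node:
#   cat "$SSH_CONFIG_FRAGMENT" >> ~/.ssh/config && chmod 600 ~/.ssh/config
```

The key itself is created once on the admin node with ssh-keygen and copied to each node with ssh-copy-id, for example ssh-copy-id cephadm@node01.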

Synchronize time using chronyd (or ntpdate).
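
Time synchronization matters because the monitors report a clock-skew health warning when their clocks drift apart; the default tolerance (mon_clock_drift_allowed) is 0.05 seconds. The helper below is a hypothetical sketch that checks a measured offset, given in seconds, against that limit.

```shell
# Return success if the absolute clock offset (seconds) is within the limit.
# The 0.05 s default mirrors Ceph's mon_clock_drift_allowed default.
skew_ok() {
    local offset=$1 limit=${2:-0.05}
    awk -v o="$offset" -v l="$limit" \
        'BEGIN { if (o < 0) o = -o; exit !(o <= l) }'
}

if skew_ok 0.012; then
    echo "clock offset is within the monitor tolerance"
fi
```

The offset to test can be read from the "System time" line of chronyc tracking on each node.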

Add the official Ceph repository and install ceph-deploy (example for Nautilus):

wget https://download.ceph.com/rpm-nautilus/el7/noarch/ceph-release-1-1.el7.noarch.rpm --no-check-certificate
rpm -ivh ceph-release-1-1.el7.noarch.rpm --force
yum install -y ceph-deploy

Cluster Bootstrap with ceph-deploy

Create the configuration directory on the admin node:

mkdir -p /etc/ceph

Generate an initial ceph.conf specifying public and cluster networks and the monitor hosts:

cd /etc/ceph
ceph-deploy new --public-network 192.168.10.0/24 --cluster-network 192.168.100.0/24 node01 node02 node03
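
Before installing packages it can help to confirm that the generated ceph.conf contains the expected settings. The check below is a sketch, not part of ceph-deploy: check_ceph_conf is a hypothetical helper, and it accepts option names written with either underscores or spaces because Ceph treats both spellings as the same option.

```shell
# Verify that ceph.conf defines the options this guide relies on.
check_ceph_conf() {
    local conf=$1 key pattern missing=0
    for key in fsid public_network cluster_network mon_initial_members mon_host; do
        # Allow "public_network" as well as "public network"
        pattern=$(printf '%s' "$key" | sed 's/_/[ _]/g')
        if ! grep -Eq "^[[:space:]]*${pattern}[[:space:]]*=" "$conf"; then
            echo "missing: $key" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage on the admin node:
#   check_ceph_conf /etc/ceph/ceph.conf && echo "ceph.conf looks complete"
```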

Install Ceph packages on all nodes (specify release if needed):

ceph-deploy install --release nautilus node01 node02 node03 admin

Create the initial monitor quorum and distribute keys:

ceph-deploy mon create-initial
ceph-deploy gatherkeys node01

Push the generated ceph.conf and admin keyring to all nodes:

ceph-deploy admin node01 node02 node03

Deploy OSDs (example using /dev/sdb on each node):

ceph-deploy osd create node01 --data /dev/sdb
ceph-deploy osd create node02 --data /dev/sdb
ceph-deploy osd create node03 --data /dev/sdb
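
The sample layout lists three data disks per node, while the commands above create an OSD only on /dev/sdb. A loop such as the following sketch covers all nine devices. The function name create_all_osds and the CEPH_DEPLOY variable are hypothetical; the variable exists only so the command can be swapped out for a rehearsal.

```shell
CEPH_DEPLOY=${CEPH_DEPLOY:-ceph-deploy}

# Create one OSD per data disk on every storage node; stop at the first failure.
create_all_osds() {
    local host dev
    for host in node01 node02 node03; do
        for dev in /dev/sdb /dev/sdc /dev/sdd; do
            "$CEPH_DEPLOY" osd create "$host" --data "$dev" || return 1
        done
    done
}

# Rehearse first (prints the arguments instead of running ceph-deploy):
#   CEPH_DEPLOY=echo create_all_osds
# Then run it from the directory that holds ceph.conf:
#   create_all_osds
```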

Deploy manager daemons (at least two for active/standby):

ceph-deploy mgr create node01 node02

Verify cluster health:

ceph -s
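
Right after deployment the cluster may need some time to settle. As a sketch, the hypothetical helper below polls ceph health, whose output begins with HEALTH_OK, HEALTH_WARN or HEALTH_ERR, until the cluster reports HEALTH_OK or the retry budget runs out.

```shell
# Poll "ceph health" until it reports HEALTH_OK.
# $1 = number of attempts (default 30), $2 = seconds between attempts (default 10)
wait_for_health_ok() {
    local tries=${1:-30} delay=${2:-10} status i=1
    while [ "$i" -le "$tries" ]; do
        status=$(ceph health 2>/dev/null)
        case "$status" in
            HEALTH_OK*) echo "$status"; return 0 ;;
        esac
        echo "attempt $i/$tries: ${status:-no response}" >&2
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Usage:
#   wait_for_health_ok 30 10 || ceph health detail
```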

Resolving Common WARN

If the health output shows mons are allowing insecure global_id reclaim, disable the insecure mode:

ceph config set mon auth_allow_insecure_global_id_reclaim false

Enabling the Ceph Dashboard

Install the dashboard module on manager nodes:

yum install -y ceph-mgr-dashboard
ceph mgr module enable dashboard --force

Disable SSL for simplicity (optional):

ceph config set mgr mgr/dashboard/ssl false

Set listening address and port:

ceph config set mgr mgr/dashboard/server_addr 0.0.0.0
ceph config set mgr mgr/dashboard/server_port 8000

Reload the dashboard module to apply the changes:

ceph mgr module disable dashboard
ceph mgr module enable dashboard --force

Create an admin user for the UI:

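echo "12345678" > dashboard_passwd.txt   # example only; choose a strong password
# Use one of the two commands below; set-login-credentials is the older form
ceph dashboard set-login-credentials admin -i dashboard_passwd.txt
# or
ceph dashboard ac-user-create admin administrator -i dashboard_passwd.txt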

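Access the dashboard at http://<MGR_IP>:8000, where <MGR_IP> is the address of the active manager node (ceph mgr services prints the exact URL), using the credentials above.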

Post‑Deployment Checks

Monitor status:

ceph mon stat

OSD status and per-OSD usage (ceph osd df requires an active mgr):

ceph osd stat
ceph osd df

OSD layout in the CRUSH hierarchy:

ceph osd tree

Pool usage:

rados df