Master Ceph: Step‑by‑Step Guide to Deploy a Scalable Distributed Storage Cluster
Learn how to design, configure, and deploy a Ceph distributed storage cluster using ceph‑deploy, covering storage fundamentals, Ceph architecture, component roles, network planning, OS preparation, mon, mgr, osd setup, and dashboard activation, with detailed commands and best‑practice recommendations for production environments.
Storage Basics
Traditional storage is classified as DAS (direct‑attached), NAS (network‑attached) and SAN (storage‑area network). DAS connects disks directly to a host via IDE, SATA, SCSI, SAS or USB. NAS provides file‑level access using protocols such as NFS, CIFS or FTP. SAN offers block‑level access over SCSI, Fibre Channel (FC SAN) or iSCSI.
Single‑machine storage suffers from limited I/O capacity, insufficient capacity and single‑point failures, making it unsuitable for large‑scale services. Software‑defined storage (SDS) solutions such as Ceph, FastDFS, MooseFS, HDFS and GlusterFS distribute data across multiple nodes, providing high scalability, performance and availability.
Ceph Overview
Ceph is an open‑source, self‑healing, self‑managing distributed storage system written in C++. It provides block (RBD), file (CephFS) and object (RGW) interfaces and integrates with OpenStack and Kubernetes.
Advantages
High scalability – decentralized design supports thousands of nodes and petabyte‑scale storage.
High reliability – no single point of failure; automatic replication and recovery.
High performance – CRUSH algorithm balances data placement and maximizes parallelism.
Rich functionality – block, file and object interfaces in a single platform.
Architecture
Ceph consists of four logical layers:
RADOS – core object store composed of OSD daemons and Monitor daemons.
LIBRADOS – client library exposing APIs for RBD, CephFS and RGW.
High‑level interfaces – RBD (block), RGW (object, S3/Swift compatible), CephFS (POSIX file system).
Application layer – hosts that consume the above interfaces.
Key components:
OSD (Object Storage Daemon) – stores data, replicates, back‑fills and reports health; at least three OSDs are required for redundancy.
PG (Placement Group) – virtual grouping of objects; objects are hashed to a PG, which CRUSH maps to OSDs.
Pool – logical namespace containing a configurable number of PGs; can use replicated or erasure‑coded storage.
Monitor (MON) – maintains cluster maps (OSD map, PG map, CRUSH map) and provides quorum; typically an odd number of monitors (3 or 5).
Manager (MGR) – gathers metrics, provides monitoring interfaces (Prometheus, Zabbix) and a dashboard.
MDS (Metadata Server) – required only for CephFS to store metadata.
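The preference for an odd number of monitors follows from majority voting: a quorum needs more than half of the monitors. The sketch below only does that arithmetic; the function names are illustrative and are not part of any Ceph tool.

```shell
#!/usr/bin/env bash
# Majority needed for a monitor quorum: floor(n/2) + 1.
quorum_size() {
    local mons=$1
    echo $(( mons / 2 + 1 ))
}

# Monitors that may fail while the survivors still form a majority.
tolerated_failures() {
    local mons=$1
    echo $(( mons - (mons / 2 + 1) ))
}

for n in 3 4 5; do
    echo "mons=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

Four monitors tolerate one failure, the same as three, which is why the extra even-numbered monitor adds cost without adding resilience.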
OSD Storage Engines
Ceph supports two OSD backends. Filestore stores objects on a traditional file system (XFS) with a key/value DB (LevelDB/RocksDB). Bluestore (default since Ceph Luminous 12.2.0) writes objects directly to raw block devices, offering lower latency and higher throughput.
Data Flow
Clients obtain the latest cluster map from a monitor.
Data is split into fixed‑size objects (default 4 MiB) identified by an OID (FileID + object number).
The OID is hashed; the hash modulo the number of PGs yields a PG ID.
CRUSH maps the PG to one or more OSDs based on the configured replica count.
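The hash-and-modulo step above can be sketched in shell. This is a minimal illustration, not Ceph's implementation: cksum (CRC-32) stands in for Ceph's own hash function, the object names are made up, and pg_num=128 is an arbitrary example value. The CRUSH step (PG to OSDs) is not modeled.

```shell
#!/usr/bin/env bash
# Map an object ID to a PG index: hash(OID) mod pg_num.
# ASSUMPTION: cksum (CRC-32) is only a stand-in for Ceph's real hash.
pg_for_oid() {
    local oid=$1 pg_num=$2
    local hash
    # cksum prints "<crc> <byte-count>"; keep the CRC only.
    hash=$(printf '%s' "$oid" | cksum | cut -d' ' -f1)
    echo $(( hash % pg_num ))
}

pg_num=128
# Hypothetical OIDs in "<file id>.<object number>" form.
for oid in 10000000001.00000000 10000000001.00000001 10000000001.00000002; do
    echo "$oid -> pg $(pg_for_oid "$oid" "$pg_num")"
done
```

The same OID always lands in the same PG, so any client holding the current cluster map can compute an object's location without asking a central lookup service.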
Version Lifecycle
Ceph has historically published a new stable release roughly once a year (e.g., Mimic 13, Nautilus 14, Octopus 15). Version numbers follow x.y.z, where x is the release series, y marks development (0), release candidate (1) or stable (2) builds, and z is the point release within that series.
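As a small illustration of the numbering scheme, the following sketch classifies a version string by its middle component. The function name release_kind is hypothetical.

```shell
#!/usr/bin/env bash
# Classify a Ceph version string x.y.z by its y component.
release_kind() {
    local version=$1
    local minor
    minor=$(printf '%s' "$version" | cut -d. -f2)
    case "$minor" in
        0) echo development ;;
        1) echo candidate ;;
        2) echo stable ;;
        *) echo unknown ;;
    esac
}

release_kind 14.2.22   # prints "stable"
release_kind 15.1.0    # prints "candidate"
```

On a running cluster, the version string to feed into such a check comes from ceph --version.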
Deployment Options
Common methods:
ceph-deploy – mature automation tool; it is no longer actively maintained and is not tested on releases newer than Nautilus.
cephadm – container‑based installer (recommended for Octopus and newer releases).
Manual binary installation – full control, higher complexity.
Planning a Ceph Cluster
Production guidelines:
Use 10 GbE or faster networks; separate public and cluster networks.
Deploy MON, MGR and OSD services on distinct hosts (or combine in test labs).
Minimum hardware: an Intel Xeon E5‑2620 v3 class CPU and at least 64 GiB RAM; SATA disks are acceptable.
Distribute nodes across racks to avoid single‑point power or network failures.
Sample Host Layout
Hostname  Public IP        Cluster IP       Roles
admin     192.168.10.120   -                admin (management), client
node01    192.168.10.121   192.168.100.11   mon, mgr, osd (/dev/sdb, /dev/sdc, /dev/sdd)
node02    192.168.10.122   192.168.100.12   mon, mgr, osd (/dev/sdb, /dev/sdc, /dev/sdd)
node03    192.168.10.123   192.168.100.13   mon, osd (/dev/sdb, /dev/sdc, /dev/sdd)
client    192.168.10.124   -                client
Preparation Steps (on each node)
Disable SELinux and firewalld:
systemctl disable --now firewalld
setenforce 0
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
Set hostnames:
hostnamectl set-hostname admin # use node01, node02, node03 or client on the respective hosts
Configure /etc/hosts for name resolution.
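For the sample layout, the name-resolution entries could look like the sketch below. It uses the public IPs from the host table; the function name ceph_hosts_entries is illustrative, and the addresses should be replaced with those of the actual environment.

```shell
#!/usr/bin/env bash
# Print /etc/hosts entries for the sample layout (public network addresses).
ceph_hosts_entries() {
    cat <<'EOF'
192.168.10.120 admin
192.168.10.121 node01
192.168.10.122 node02
192.168.10.123 node03
192.168.10.124 client
EOF
}

# Review the output first; nothing is written to /etc/hosts here.
ceph_hosts_entries
```

Once the entries look right, they can be appended on each node with ceph_hosts_entries | sudo tee -a /etc/hosts.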
Create a dedicated Ceph user with password‑less sudo:
useradd cephadm
passwd cephadm
visudo # add: cephadm ALL=(root) NOPASSWD:ALL
Install common dependencies (epel-release, ntp, development tools, etc.).
Set up password‑less SSH from the admin node to all other nodes.
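One way to do this from the admin node is sketched below, assuming the cephadm user created earlier and the host names from the sample layout. The run wrapper and the DRY_RUN switch are conveniences of this sketch, not part of ssh: by default the script only prints the commands, and DRY_RUN=0 executes them.

```shell
#!/usr/bin/env bash
NODES="node01 node02 node03 client"   # hosts from the sample layout
CEPH_USER=cephadm                     # user created in the previous step
DRY_RUN=${DRY_RUN:-1}                 # 1 = print commands only, 0 = execute

# Print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

distribute_key() {
    # Create a key pair without a passphrase unless one already exists.
    [ -f "$HOME/.ssh/id_rsa" ] || run ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"
    # Copy the public key to every node (prompts once for each password).
    for node in $NODES; do
        run ssh-copy-id "$CEPH_USER@$node"
    done
}

distribute_key
```

Afterwards, ssh cephadm@node01 hostname should return without asking for a password.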
Synchronize time using chronyd (or ntpdate).
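Monitors are sensitive to clock drift, so it helps to confirm that chronyd is actually synchronised after enabling it with systemctl enable --now chronyd. The sketch below checks the "Leap status" line of chronyc tracking; the function name is_time_synced is illustrative, and the check is skipped when chronyc is not installed.

```shell
#!/usr/bin/env bash
# Succeeds when `chronyc tracking` output (read from stdin) reports a normal leap status.
is_time_synced() {
    grep -Eq '^Leap status[[:space:]]*:[[:space:]]*Normal'
}

if command -v chronyc >/dev/null 2>&1; then
    if chronyc tracking | is_time_synced; then
        echo "clock synchronised"
    else
        echo "clock NOT synchronised"
    fi
fi
```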
Add the official Ceph repository and install ceph-deploy (example for Nautilus):
wget https://download.ceph.com/rpm-nautilus/el7/noarch/ceph-release-1-1.el7.noarch.rpm --no-check-certificate
rpm -ivh ceph-release-1-1.el7.noarch.rpm --force
yum install -y ceph-deploy
Cluster Bootstrap with ceph-deploy
Create the configuration directory on the admin node:
mkdir -p /etc/ceph
Generate an initial ceph.conf specifying public and cluster networks and the monitor hosts:
cd /etc/ceph
ceph-deploy new --public-network 192.168.10.0/24 --cluster-network 192.168.100.0/24 node01 node02 node03
Install Ceph packages on all nodes (specify release if needed):
ceph-deploy install --release nautilus node01 node02 node03 admin
Create the initial monitor quorum and distribute keys:
ceph-deploy mon create-initial
ceph-deploy gatherkeys node01
Push the generated ceph.conf and admin keyring to all nodes:
ceph-deploy admin node01 node02 node03
Deploy OSDs (example using /dev/sdb on each node):
ceph-deploy osd create node01 --data /dev/sdb
ceph-deploy osd create node02 --data /dev/sdb
ceph-deploy osd create node03 --data /dev/sdb
# repeat for /dev/sdc and /dev/sdd to use every disk in the sample layout
Deploy manager daemons (at least two for active/standby):
ceph-deploy mgr create node01 node02
Verify cluster health:
ceph -s
Resolving Common WARN
If the health output shows "mons are allowing insecure global_id reclaim", disable the insecure mode once all clients and daemons run patched versions:
ceph config set mon auth_allow_insecure_global_id_reclaim false
Enabling the Ceph Dashboard
Install the dashboard module on manager nodes:
yum install -y ceph-mgr-dashboard
ceph mgr module enable dashboard --force
Disable SSL for simplicity (optional):
ceph config set mgr mgr/dashboard/ssl false
Set listening address and port:
ceph config set mgr mgr/dashboard/server_addr 0.0.0.0
ceph config set mgr mgr/dashboard/server_port 8000
Reload the dashboard module to apply changes:
ceph mgr module disable dashboard
ceph mgr module enable dashboard --force
Create an admin user for the UI (replace the example password with a strong one):
echo "12345678" > dashboard_passwd.txt
ceph dashboard set-login-credentials admin -i dashboard_passwd.txt
# or
ceph dashboard ac-user-create admin administrator -i dashboard_passwd.txt
Access the dashboard at http://<ACTIVE_MGR_IP>:8000 using the credentials above. The dashboard is served by the active manager, so use the address of node01 or node02 in the sample layout.
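Instead of a fixed example password, the password file can be filled with random data. The sketch below is one possible approach, assuming head, base64 and /dev/urandom are available; the function name make_dashboard_password is illustrative.

```shell
#!/usr/bin/env bash
# Write a random 24-character password to the given file.
make_dashboard_password() {
    local file=$1
    # umask 077 in a subshell: a newly created file is readable by its owner only.
    ( umask 077; head -c 18 /dev/urandom | base64 > "$file" )
}

pw_file=./dashboard_passwd.txt
make_dashboard_password "$pw_file"
echo "password written to $pw_file"
```

The resulting file can be passed to the -i option of the user-creation commands above, and ceph mgr services prints the URL the dashboard is currently served on.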
Post‑Deployment Checks
Monitor status: ceph mon stat
OSD status (requires an active mgr): ceph osd stat and ceph osd df
OSD and CRUSH hierarchy: ceph osd tree
Pool usage: rados df
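For scripted checks, the first word of ceph health (HEALTH_OK, HEALTH_WARN or HEALTH_ERR) is usually enough. The sketch below maps it to a short label; the function name health_level is illustrative, and the live query is skipped when the ceph CLI is not installed.

```shell
#!/usr/bin/env bash
# Reduce a `ceph health` status line to ok / warn / error.
health_level() {
    case "$1" in
        HEALTH_OK*)   echo ok ;;
        HEALTH_WARN*) echo warn ;;
        *)            echo error ;;
    esac
}

if command -v ceph >/dev/null 2>&1; then
    status=$(ceph health)
    echo "cluster health: $(health_level "$status") ($status)"
fi
```

Treating anything that is not HEALTH_OK or HEALTH_WARN as an error also covers the case where the command fails and returns no output.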
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge—fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc. (Reply “Linux” to receive essential resources.)
