
Master Ceph: Complete Guide to Deploying and Managing a Production-Ready Distributed Storage Cluster

This comprehensive guide explains why Ceph is a leading software‑defined storage solution, details hardware and network design, walks through step‑by‑step deployment with cephadm, covers pool creation, monitoring, performance tuning, troubleshooting, scaling, backup, security hardening, and advanced automation for production environments.

Raymond Ops

Why Choose Ceph?

Ceph is a unified, open‑source storage platform that provides object (RADOS Gateway), block (RBD), and POSIX‑compatible file system (CephFS) services. Its core advantages include true decentralization with no single point of failure, seamless horizontal scaling to petabyte levels, automatic self‑healing, and a vibrant community that avoids vendor lock‑in.

Hardware Recommendations

Monitor nodes (≥3, odd number)

CPU: 4+ cores
Memory: 8GB+
Disk: 100GB SSD (OS)
Network: Dual 10GbE (redundant)

OSD nodes (≥6 for a starter cluster)

CPU: 1 core per OSD
Memory: 4GB per OSD (BlueStore)
Disk: Enterprise SSD or high‑rpm HDD
Network: Dual 10GbE (public + cluster)

MGR nodes (≥2)

CPU: 2 cores
Memory: 4GB
Disk: System disk only
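The per-OSD figures above translate directly into node sizing. A quick sketch of the arithmetic (the helper function and the 8 GB OS headroom are our own illustrative assumptions, not official Ceph guidance):

```shell
#!/bin/bash
# Minimum OSD-node memory: 4 GB per BlueStore OSD, plus an assumed 8 GB for the OS.
osd_node_mem_gb() {
    local osds=$1
    echo $(( osds * 4 + 8 ))
}

osd_node_mem_gb 12   # a 12-OSD node -> 56 GB minimum
```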

Network Architecture Design

Separate client traffic from internal cluster traffic to prevent congestion.

# Public network (client access)
10.0.1.0/24

# Cluster network (data replication & heartbeat)
10.0.2.0/24
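The two subnets above map directly onto Ceph's `public_network` and `cluster_network` options. A minimal ceph.conf fragment using the example subnets from this guide (the same values can also be set on a running cluster with `ceph config set global`):

```
[global]
public_network = 10.0.1.0/24
cluster_network = 10.0.2.0/24
```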

Step‑by‑Step Deployment

Environment Preparation

# 1. System version (CentOS 8 example)
cat /etc/os-release

# 2. Time synchronization (critical)
systemctl enable --now chronyd
chronyc sources -v

# 3. Firewall configuration
firewall-cmd --zone=public --add-port=6789/tcp --permanent
firewall-cmd --zone=public --add-port=6800-7300/tcp --permanent
firewall-cmd --reload

# 4. Disable SELinux (simplest for a lab; tighten per your security policy)
setenforce 0
sed -i 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config

Install cephadm Tool

# Install official binary
curl --silent --remote-name --location https://github.com/ceph/ceph/raw/octopus/src/cephadm/cephadm
chmod +x cephadm
./cephadm add-repo --release octopus
./cephadm install

Bootstrap the Cluster

# Initialize the first monitor
cephadm bootstrap --mon-ip 10.0.1.10 --cluster-network 10.0.2.0/24

# Install Ceph CLI tools
cephadm install ceph-common

# Verify cluster status
ceph status

Successful bootstrap shows output similar to:

cluster:
  id: a7f64266-0894-4f1e-a635-d0aeaca0e993
  health: HEALTH_OK

Add OSD Nodes

# 1. Distribute SSH keys
ssh-copy-id root@node2
ssh-copy-id root@node3

# 2. Register hosts
ceph orch host add node2 10.0.1.11
ceph orch host add node3 10.0.1.12

# 3. List available disks
ceph orch device ls

# 4. Deploy OSD daemons
ceph orch daemon add osd node2:/dev/sdb
ceph orch daemon add osd node2:/dev/sdc
ceph orch daemon add osd node3:/dev/sdb
ceph orch daemon add osd node3:/dev/sdc
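As an alternative to naming each device, cephadm's orchestrator can consume every eligible blank disk in one step. This is destructive to any data on those disks, so review the `ceph orch device ls` output first; the command is only echoed in this sketch, since running it requires a live cluster:

```shell
#!/bin/bash
# Let the orchestrator create OSDs on all available (empty, unmounted) devices.
# Echoed for illustration -- run the command itself only against a real cluster.
cmd="ceph orch apply osd --all-available-devices"
echo "$cmd"
```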

Create Storage Pools

# 1. Create a replicated pool (3 replicas)
ceph osd pool create mypool 128 128 replicated

# 2. Enable RBD application type
ceph osd pool application enable mypool rbd

# 3. Set CRUSH rule for rack‑level fault tolerance
ceph osd crush rule create-replicated rack_rule default rack
ceph osd pool set mypool crush_rule rack_rule
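The `128` placement-group count above is not arbitrary. A common rule of thumb is (number of OSDs × 100) / replica count, rounded to the nearest power of two. A small sketch of that calculation (the `pg_count` helper is ours, not a Ceph tool):

```shell
#!/bin/bash
# Rule-of-thumb PG count: (OSDs * 100) / replicas, to the nearest power of two.
pg_count() {
    local osds=$1 replicas=$2
    local target=$(( osds * 100 / replicas ))
    local pg=1
    # find the first power of two at or above the target...
    while [ "$pg" -lt "$target" ]; do pg=$(( pg * 2 )); done
    # ...then step back down if the lower power of two is closer
    if [ $(( pg - target )) -gt $(( target - pg / 2 )) ]; then
        pg=$(( pg / 2 ))
    fi
    echo "$pg"
}

pg_count 4 3    # the 4-OSD, 3-replica cluster above -> 128
pg_count 12 3   # a 12-OSD cluster                   -> 512
```

Modern Ceph releases can also manage this automatically via the pg_autoscaler module, but the rule of thumb is still useful for sanity-checking.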

Monitoring and Performance Tuning

Key Monitoring Commands

# Cluster health details
ceph health detail

# Storage usage
ceph df

# OSD performance stats
ceph osd perf

# Slow request monitoring (per-OSD admin socket; osd.0 as an example)
ceph daemon osd.0 dump_historic_ops

# Placement Group status
ceph pg stat
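For scripting, the same health information is available as JSON. A sketch that extracts the top-level status field — a canned sample stands in for the live `ceph health --format json` call, and plain `sed` is used so the snippet carries no jq dependency:

```shell
#!/bin/bash
# Canned sample of `ceph health --format json` output (a live call needs a cluster).
sample='{"status":"HEALTH_WARN","checks":{"OSD_DOWN":{"severity":"HEALTH_WARN"}}}'

# Pull the top-level "status" field out of the JSON.
status=$(printf '%s' "$sample" | sed -n 's/.*"status":"\([^"]*\)".*/\1/p')
echo "$status"   # HEALTH_WARN
```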

Optimization Parameters (in /etc/ceph/ceph.conf)

[global]
# Network tuning
ms_bind_port_max = 7300
ms_bind_port_min = 6800

# OSD tuning
osd_max_write_size = 512
osd_client_message_size_cap = 2147483648
osd_deep_scrub_interval = 2419200
osd_scrub_max_interval = 604800

# BlueStore tuning
bluestore_cache_size_hdd = 4294967296
bluestore_cache_size_ssd = 8589934592

# Recovery control
osd_recovery_max_active = 5
osd_max_backfills = 2
osd_recovery_op_priority = 2

Troubleshooting Cases

Case 1 – OSD Down

# Check health details
ceph health detail

# Locate down OSD
ceph osd tree | grep down

# Inspect OSD logs
journalctl -u ceph-osd@3 -f

# Restart OSD
systemctl restart ceph-osd@3

# If hardware failure, mark out and replace
ceph osd out 3

Case 2 – Inconsistent PG

# Find inconsistent PGs
ceph pg dump | grep inconsistent

# Repair the PG
ceph pg repair 2.3f

# Deep scrub for thorough cleanup
ceph pg deep-scrub 2.3f

Case 3 – Disk Space Exhaustion

# Check usage details
ceph df detail

# Identify the largest pools
ceph osd pool ls detail

# Temporarily raise the full thresholds (revert once space is freed)
ceph osd set-full-ratio 0.95
ceph osd set-backfillfull-ratio 0.90
ceph osd set-nearfull-ratio 0.85

# Long‑term fix: add OSDs or purge data
ceph orch daemon add osd node4:/dev/sdb

Capacity Planning & Expansion

Capacity Formula

Usable Capacity = Raw Capacity / Replication Factor × (1 - Reserved Ratio)
# Example: 100 TB raw, 3-replica, 10 % reserve → 100/3 × 0.9 ≈ 30 TB usable
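The formula can be wrapped in a small helper for quick what-if checks (the `usable_tb` function is our own illustration, not a Ceph tool):

```shell
#!/bin/bash
# Usable capacity = raw / replication_factor * (1 - reserved_ratio)
usable_tb() {
    local raw_tb=$1 replicas=$2 reserve=$3
    awk -v raw="$raw_tb" -v rf="$replicas" -v res="$reserve" \
        'BEGIN { printf "%.1f\n", raw / rf * (1 - res) }'
}

usable_tb 100 3 0.10   # 100 TB raw, 3 replicas, 10% reserve -> 30.0
```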

Smooth Expansion Procedure

# 1. Limit backfills before adding new OSDs
ceph config set global osd_max_backfills 1
ceph config set global osd_recovery_max_active 1

# 2. Add OSDs one by one
ceph orch daemon add osd node5:/dev/sdb
# Wait for data rebalance
ceph -w

# 3. Restore default settings
ceph config rm global osd_max_backfills
ceph config rm global osd_recovery_max_active
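The three steps above can be strung together into a cautious expansion loop. A sketch only — the cluster-touching commands are echoed rather than run, and the device list is an assumption for illustration:

```shell
#!/bin/bash
# Throttle recovery, add disks one at a time, then restore the defaults.
devices="node5:/dev/sdb node5:/dev/sdc"   # assumed new devices

echo "ceph config set global osd_max_backfills 1"
for dev in $devices; do
    echo "ceph orch daemon add osd $dev"
    # In a real run: poll `ceph health` here and wait for rebalancing to
    # settle before adding the next disk.
done
echo "ceph config rm global osd_max_backfills"
```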

Backup & Disaster Recovery

RBD Snapshot Backup

# Create snapshot
rbd snap create mypool/myimage@snapshot1

# Export snapshot
rbd export mypool/myimage@snapshot1 /backup/myimage.snapshot1

# Enable cross‑cluster mirroring
rbd mirror pool enable mypool image
rbd mirror image enable mypool/myimage
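In practice, snapshots are usually timestamped so that repeated exports do not overwrite one another. A naming sketch — the `rbd` calls are commented out since they need a live cluster, and the naming convention and `/backup` path are our own assumptions:

```shell
#!/bin/bash
# Build a dated snapshot name for a given image (convention is ours).
img="mypool/myimage"
snap="backup-$(date +%Y%m%d)"
echo "snapshot spec: ${img}@${snap}"

# Against a live cluster, these would create and export the snapshot:
# rbd snap create "${img}@${snap}"
# rbd export "${img}@${snap}" "/backup/myimage.${snap}"
```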

Cluster‑Level Backup

# Export configuration
ceph config dump > /backup/ceph-config.dump

# Backup CRUSH map
ceph osd getcrushmap -o /backup/crushmap.bin

# Backup monitor map (from the running cluster)
ceph mon getmap -o /backup/monmap

Advanced Operations

Automation Scripts

#!/bin/bash
# ceph-health-check.sh
LOG_FILE="/var/log/ceph-health.log"
ALERT_EMAIL="[email protected]"

check_health() {
    HEALTH=$(ceph health --format json | jq -r '.status')
    if [ "$HEALTH" != "HEALTH_OK" ]; then
        echo "$(date): Cluster health is $HEALTH" >> $LOG_FILE
        ceph health detail >> $LOG_FILE
        echo "Ceph cluster health issue detected" | mail -s "Ceph Alert" $ALERT_EMAIL
    fi
}

check_capacity() {
    USAGE=$(ceph df --format json | jq -r '.stats.total_used_bytes / .stats.total_bytes')
    THRESHOLD=0.80
    if (( $(echo "$USAGE > $THRESHOLD" | bc -l) )); then
        echo "$(date): Storage usage is $USAGE" >> $LOG_FILE
        echo "Storage capacity warning" | mail -s "Ceph Capacity Alert" $ALERT_EMAIL
    fi
}

main() { check_health; check_capacity; }
main

Performance Benchmarks

# RADOS benchmark
rados bench -p mypool 60 write --no-cleanup
rados bench -p mypool 60 seq
rados bench -p mypool 60 rand

# RBD benchmark
rbd create --size 10G mypool/test-image
rbd map mypool/test-image
fio --name=rbd-test --rw=randwrite --bs=4k --size=1G --filename=/dev/rbd0

# CephFS benchmark
mkdir /mnt/cephfs/test
fio --name=cephfs-test --rw=write --bs=1M --size=1G --directory=/mnt/cephfs/test

Security Hardening

# Enable authentication (cluster-wide)
ceph config set global auth_cluster_required cephx
ceph config set global auth_service_required cephx
ceph config set global auth_client_required cephx

# Create a dedicated backup user
ceph auth get-or-create client.backup mon 'allow r' osd 'allow rwx pool=mypool'

# Enable encrypted network traffic
ceph config set global ms_cluster_mode secure
ceph config set global ms_service_mode secure

Log Management

# Log rotation configuration (/etc/logrotate.d/ceph)
/var/log/ceph/*.log {
    daily
    rotate 30
    compress
    sharedscripts
    postrotate
        systemctl reload ceph.target
    endscript
}

# Adjust log verbosity
ceph config set global debug_osd 1/5
ceph config set global debug_mon 1/5

Upgrade Strategy

# Pre‑upgrade health check
ceph status
ceph versions

# Perform rolling upgrade of OSDs
ceph orch upgrade start --ceph-version 15.2.14

# Monitor upgrade progress
ceph orch upgrade status

Key Takeaways

Architecture Design: Proper hardware selection and network segregation are fundamental to a stable Ceph cluster.

Monitoring & Operations: Continuous health checks, metric collection, and alerting prevent issues before they impact services.

Performance Tuning: Adjusting OSD, BlueStore, and recovery parameters tailors the cluster to specific workloads.

Fault Handling: Rapid diagnosis using health detail, the OSD tree, and log inspection is essential for high availability.
