
How to Fix Ceph Nearfull Warnings and Master PG/OSD Management

This guide explains why Ceph reports nearfull OSD warnings and shows how to adjust the monitor thresholds, reweight OSDs automatically and manually, interpret PG and OSD states, and carry out essential cluster operations with the ceph CLI: adding and removing OSDs and monitors, and managing pools and users.

MaGe Linux Operations

Common Questions

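The warning "nearfull osd(s) or pool(s) nearfull" means that the space used on one or more OSDs has exceeded the configured nearfull threshold. The monitors track OSD space usage and raise the warning. Raising the thresholds in the configuration file does not always clear it: on Luminous and later releases these options only take effect when the cluster is created, and the live values are changed with ceph osd set-nearfull-ratio and ceph osd set-full-ratio. Analyzing how data is distributed across the OSDs, and rebalancing it, is more effective.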

Configuration file thresholds

"mon_osd_full_ratio":"0.95",
"mon_osd_nearfull_ratio":"0.85"

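The comparison behind the warning can be sketched as a small shell function. This is only an illustration of the threshold logic, not how the monitors implement it; the function name osd_fill_state is hypothetical, and the defaults are the two ratios listed above.

```shell
# Classify one OSD's utilization against the nearfull and full ratios.
# $1: utilization as a fraction (e.g. 0.87)
# $2: nearfull ratio (defaults to 0.85), $3: full ratio (defaults to 0.95)
# osd_fill_state is a hypothetical helper, not a ceph command.
osd_fill_state() {
    awk -v u="$1" -v n="${2:-0.85}" -v f="${3:-0.95}" 'BEGIN {
        # "+ 0" forces a numeric comparison in every awk implementation.
        if (u + 0 > f + 0)      print "full"
        else if (u + 0 > n + 0) print "nearfull"
        else                    print "ok"
    }'
}
```

For example, osd_fill_state 0.90 prints nearfull, which is the state that triggers the warning discussed above.
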
Automatic handling

ceph osd reweight-by-utilization
ceph osd reweight-by-pg 105 cephfs_data
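
The number passed to the automatic reweight commands is an overload threshold in percent of the cluster average (120 when omitted): only OSDs above that share of the average are adjusted. The sketch below shows that selection rule on plain numbers. The function name overloaded_osds and its "id utilization" input format are assumptions made for illustration; the real selection happens inside Ceph.

```shell
# Print the IDs of OSDs whose utilization exceeds <oload> percent of the mean.
# stdin: one "osd-id utilization-percent" pair per line (hypothetical format).
# $1:    overload threshold in percent, defaulting to 120.
overloaded_osds() {
    awk -v oload="${1:-120}" '
        NF >= 2 { id[NR] = $1; util[NR] = $2 + 0; sum += $2; n++ }
        END {
            if (n == 0) exit
            limit = (sum / n) * oload / 100
            for (i = 1; i <= NR; i++)
                if ((i in util) && util[i] > limit) print id[i]
        }'
}
```

Before changing any weights, ceph osd test-reweight-by-utilization reports what the real command would do without applying it.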

Manual handling

ceph osd reweight osd.2 0.8

Global handling

ceph mgr module ls
ceph mgr module enable balancer
ceph balancer on
ceph balancer mode crush-compat
ceph config-key set "mgr/balancer/max_misplaced" "0.01"

PG fault states

PG state overview

A PG can be in various states during its lifecycle:

Creating – The PG is being created, which happens when a pool is created.

Peering – The OSDs that hold the PG are communicating to agree on the state of every object in it.

Active – Peering is complete and the PG can serve read and write requests.

Clean – Every object in the PG has the required number of replicas and all of them are in sync.

Degraded – Some replicas are missing, typically because an OSD is down.

Recovering – An OSD that was down has come back and its objects are being brought up to date.

Backfilling – A newly added OSD is receiving its share of the PG's data.

Remapped – The acting set has changed and the PG's data is migrating to the new set.

Stale – The monitors have not received a recent status report from the PG's acting set.
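
Ceph reports a PG's state as several of these names joined with "+", for example active+clean or active+remapped+backfilling. A minimal sketch for testing such a string in a script follows; both function names are hypothetical helpers, not ceph commands.

```shell
# Return 0 if the "+"-separated PG state string $1 contains the state $2.
pg_has_state() {
    case "+$1+" in
        *"+$2+"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Return 0 if the PG is both active and clean, the combination a healthy
# cluster shows for every PG.
pg_is_active_clean() {
    pg_has_state "$1" active && pg_has_state "$1" clean
}
```

Wrapping the string in "+" on both sides means a name only matches as a whole state, so looking for backfill does not match backfilling.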

OSD states

Each OSD has two status dimensions: in/out indicates whether it is a member of the cluster and holds data, and up/down indicates whether its daemon is running. The two dimensions are independent, so all four combinations occur.

in & up – normal, OSD is part of the cluster and running.

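in & down – OSD is in the cluster but its daemon is down; once mon_osd_down_out_interval has elapsed (300 s here, the default differs between releases) the monitors mark it out and it becomes out & down.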

out & up – newly added OSD, daemon running but not yet in the cluster.

out & down – OSD removed from cluster and daemon not running; CRUSH will not place PGs on it.
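
The four combinations can be written down as a small lookup, which is handy when a script has to turn the two flags into a message. This is a sketch only; osd_state_summary is a hypothetical helper and the wording of the messages is mine.

```shell
# Describe an OSD from its two status flags.
# $1: "in" or "out"   $2: "up" or "down"
osd_state_summary() {
    case "$1,$2" in
        in,up)    echo "normal" ;;
        in,down)  echo "daemon down, will be marked out" ;;
        out,up)   echo "running, not in cluster" ;;
        out,down) echo "removed, not running" ;;
        *)        echo "unknown flags: $1 $2" >&2; return 1 ;;
    esac
}
```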

Cluster monitoring and management

Overall cluster status can be inspected with commands such as:

# ceph -s
cluster:
  id: 8230a918-a0de-4784-9ab8-cd2a2b8671d0
  health: HEALTH_WARN
services:
  mon: 3 daemons, quorum cephnode01,cephnode02,cephnode03 (age 27h)
  mgr: cephnode01 (active, since 53m), standbys: cephnode03, cephnode02
  osd: 4 osds: 4 up (since 27h), 4 in (since 19h)
  rgw: 1 daemon active (cephnode01)
data:
  pools: 6 pools, 96 pgs
  objects: 235 objects, 3.6KiB
  usage: 4.0GiB used, 56GiB/60GiB avail
  pgs: 96 active+clean

Additional useful commands:

# ceph -w
# ceph health detail
# ceph pg dump
# ceph pg stat
# ceph osd pool stats
# ceph osd stat
# ceph osd dump
# ceph osd tree
# ceph osd df
# ceph mon stat
# ceph mon dump
# ceph quorum_status
# ceph df
# ceph df detail
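
When a maintenance script has to wait for the cluster to settle, these status commands can be polled. The following is a minimal sketch that assumes the first word printed by ceph health is the health code (HEALTH_OK, HEALTH_WARN or HEALTH_ERR); the function name and its two parameters are hypothetical.

```shell
# Poll "ceph health" until it reports HEALTH_OK or the attempts run out.
# $1: maximum number of attempts (default 30)
# $2: seconds to sleep between attempts (default 10)
wait_for_health_ok() {
    local max_attempts="${1:-30}" interval="${2:-10}"
    local attempt=1 status=""
    while [ "$attempt" -le "$max_attempts" ]; do
        # "|| true" keeps a failing ceph call from aborting a "set -e" script.
        status="$(ceph health 2>/dev/null || true)"
        case "$status" in
            HEALTH_OK*) return 0 ;;
        esac
        if [ "$attempt" -lt "$max_attempts" ]; then
            sleep "$interval"
        fi
        attempt=$((attempt + 1))
    done
    echo "cluster still reports: ${status:-no response}" >&2
    return 1
}
```

A call such as wait_for_health_ok 60 10 waits up to roughly ten minutes and returns non-zero if the cluster has not recovered by then.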

Cluster configuration management (temporary and global changes without a service restart)

To view or modify a daemon's configuration without restarting the service, use the tell and daemon sub‑commands.

# ceph daemon {daemon-type}.{id} config show
# ceph daemon osd.0 config show

Tell command format

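The tell command sends settings over the network, so it can be run from any node that holds an admin keyring. It can target a single daemon or, using * as a wildcard, every daemon of a given type across the cluster. Errors are reported directly on the command line. Values injected this way last only until the daemon restarts.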

# ceph tell {daemon-type}.{daemon id or *} injectargs --{name}={value} [--{name}={value}]
# ceph tell osd.0 injectargs --debug-osd 20 --debug-ms 1

Parameters:

daemon-type : osd, mon, mds, etc.

daemon id : numeric ID for OSD, monitor name for mon, or * for all.

injectargs : injects one or more arguments.

Daemon command

The daemon sub-command talks to a single daemon through its local admin socket, so it must be run on the host where that daemon is running. It gives immediate feedback.

# ceph daemon {daemon-type}.{id} config set {name} {value}
# ceph daemon mon.ceph-monitor-1 config set mon_allow_pool_delete false

Cluster operations

# systemctl start ceph.target
# systemctl start ceph-mgr.target
# systemctl start ceph-osd@<ID>
# systemctl start ceph-mon.target
# systemctl start ceph-mds.target
# systemctl start ceph-radosgw.target

Adding and removing OSDs

Adding

# ceph-volume lvm zap /dev/sd<id>
# ceph-deploy osd create --data /dev/sd<id> $hostname

Removing

# ceph osd crush reweight osd.<ID> 0.0
# systemctl stop ceph-osd@<ID>
# ceph osd out <ID>
# ceph osd purge osd.<ID> --yes-i-really-mean-it
# umount /var/lib/ceph/osd/ceph-<ID>
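
The removal steps can be generated for a given OSD ID and reviewed before anything is run. The sketch below only prints commands. It differs from the listing above in two ways that are my assumptions, not the article's procedure: the OSD is marked out before its daemon is stopped, and the plan waits on ceph osd safe-to-destroy (available from Luminous onward) so that data has drained first. The function name osd_removal_plan is hypothetical.

```shell
# Print (do not run) the removal steps for one OSD.
# $1: numeric OSD ID
osd_removal_plan() {
    local id="$1"
    case "$id" in
        ''|*[!0-9]*)
            echo "usage: osd_removal_plan <numeric-osd-id>" >&2
            return 1 ;;
    esac
    # Unquoted heredoc: ${id} is expanded, everything else is printed as-is.
    cat <<EOF
ceph osd crush reweight osd.${id} 0.0
ceph osd out ${id}
while ! ceph osd safe-to-destroy osd.${id}; do sleep 60; done
systemctl stop ceph-osd@${id}
ceph osd purge osd.${id} --yes-i-really-mean-it
umount /var/lib/ceph/osd/ceph-${id}
EOF
}
```

Running osd_removal_plan 3 prints six lines that can be pasted into a shell one at a time.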

Expanding PGs

ceph osd pool set {pool-name} pg_num 128
ceph osd pool set {pool-name} pgp_num 128

Note: pg_num and pgp_num should be powers of two and kept equal. Data is only rebalanced onto the new PGs once pgp_num has been raised to match pg_num.
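
A starting value for pg_num can be estimated with the rule of thumb from the Ceph documentation, which is not part of this article: about 100 PGs per OSD, divided by the pool's replica count, rounded up to a power of two. The sketch below assumes that rule; suggest_pg_num is a hypothetical helper, and the result is a total to share between all pools in the cluster, not a per-pool figure.

```shell
# Estimate a total PG count: (OSDs * PGs per OSD) / replica size,
# rounded up to the next power of two.
# $1: number of OSDs   $2: replica size   $3: target PGs per OSD (default 100)
suggest_pg_num() {
    local osds="$1" size="$2" per_osd="${3:-100}"
    local target=$(( osds * per_osd / size ))
    local pg=1
    while [ "$pg" -lt "$target" ]; do
        pg=$(( pg * 2 ))
    done
    echo "$pg"
}
```

For 4 OSDs and an assumed replica size of 3, suggest_pg_num 4 3 prints 256.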

Pool operations

# ceph osd lspools
# ceph osd pool create {pool-name} {pg-num} [{pgp-num}]
# ceph osd pool set-quota {pool-name} max_objects 10000
# ceph osd pool delete {pool-name} {pool-name} --yes-i-really-really-mean-it
# ceph osd pool rename {current-pool-name} {new-pool-name}
# rados df
# ceph osd pool mksnap {pool-name} {snap-name}
# ceph osd pool rmsnap {pool-name} {snap-name}
# ceph osd pool get {pool-name} {key}
# ceph osd pool set {pool-name} {key} {value}
# ceph osd dump | grep 'replicated size'

User management

# ceph auth list
# ceph auth get client.admin
# ceph auth print-key client.admin
# ceph auth add client.john mon 'allow r' osd 'allow rw pool=liverpool'
# ceph auth get-or-create client.paul mon 'allow r' osd 'allow rw pool=liverpool'
# ceph auth caps client.john mon 'allow r' osd 'allow rw pool=liverpool'
# ceph auth del {TYPE}.{ID}

Adding and removing Monitors

# ceph-deploy mon create $hostname
# ceph-deploy mon destroy $hostname

It is recommended to run an odd number of monitors (at least three in production) to maintain quorum.
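
The reason for an odd count is the majority rule: a quorum needs more than half of the monitors, so an even count adds a daemon without adding any failure tolerance. A short sketch of the arithmetic follows (mon_quorum is a hypothetical helper):

```shell
# Show the majority size and the number of monitor failures a cluster
# of $1 monitors can survive while keeping quorum.
mon_quorum() {
    local n="$1"
    local majority=$(( n / 2 + 1 ))
    echo "monitors=$n majority=$majority tolerated_failures=$(( n - majority ))"
}
```

Three monitors and four monitors both tolerate a single failure; five are needed to tolerate two.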
