
How to Enable Ceph Enterprise Monitoring with Prometheus & Grafana

Learn step‑by‑step how to activate Ceph’s monitoring modules, configure Prometheus to collect Ceph metrics, verify data collection, and integrate Grafana dashboards, including tips on required dependencies and troubleshooting, to ensure reliable, secure storage management in enterprise cloud‑native environments.

Linux Ops Smart Journey

Efficient, secure storage is critical for data-heavy workloads. Ceph is an open-source distributed storage system that ships with a built-in monitoring panel for cluster health and performance.

Prevention is better than cure – monitor first.

1. Enable Ceph monitoring

<code>$ ceph mgr module enable prometheus</code>

Tip: If the mgr host lacks the cherrypy module, the command will fail.

Solution: install cherrypy on the mgr host, then restart the mgr service:

<code>pip3 install cherrypy -i https://mirrors.aliyun.com/pypi/simple/
sudo systemctl restart ceph-mgr.target</code>
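Once the module is enabled, it can help to confirm that the exporter actually serves Ceph metrics before touching Prometheus. The sketch below is a minimal check, not an official procedure: `has_ceph_health_metric` is a hypothetical helper name, `MGR_HOST` is a placeholder for one of your mgr nodes, and port 9283 is taken from the targets used later in this article.

```shell
# Reads Prometheus exposition text on stdin and succeeds only if a
# ceph_health_status sample line is present (HELP/TYPE comments are ignored).
has_ceph_health_metric() {
  grep -q '^ceph_health_status'
}

# Usage against a live cluster (MGR_HOST is a placeholder, not from the article):
#   ceph mgr services        # should list a "prometheus" entry with its URL
#   curl -s "http://${MGR_HOST}:9283/metrics" | has_ceph_health_metric \
#     && echo "exporter OK"
```

If the check fails, looking at the output of `ceph mgr services` is a reasonable first step, since it shows which address the active mgr is listening on.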

2. Enable RBD monitoring

<code>$ ceph config set mgr mgr/prometheus/rbd_stats_pools "kubernetes,cephfs-data,cephfs-metadata"</code>

Tip: To monitor all RBD pools, set the value to "*".
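The pool list is a single comma-separated string, which is easy to mistype when there are many pools. As a small convenience sketch, the hypothetical helper `join_pools` below builds that string from separate arguments; the pool names in the usage comment are the ones from this article.

```shell
# Join pool names into the comma-separated form used for rbd_stats_pools.
# The subshell keeps the IFS change local to this function.
join_pools() (
  IFS=,
  printf '%s\n' "$*"
)

# Usage on a node with the ceph CLI:
#   ceph config set mgr mgr/prometheus/rbd_stats_pools \
#     "$(join_pools kubernetes cephfs-data cephfs-metadata)"
#   ceph config get mgr mgr/prometheus/rbd_stats_pools   # read the value back
```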

Prometheus collection of Ceph metrics

Edit the Prometheus ConfigMap and, under scrape_configs, add a job named “ceph” that targets port 9283 on each Ceph node:

<code>$ kubectl -n kube-system edit cm prometheus
  - job_name: 'ceph'
    static_configs:
    - targets:
      - "172.139.20.20:9283"
      - "172.139.20.208:9283"
      - "172.139.20.94:9283"</code>
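When the list of Ceph nodes changes often, generating the job fragment can be less error-prone than editing it by hand. The following is only a sketch: `render_ceph_job` is a hypothetical helper, and it assumes the same indentation and port (9283) as the snippet above, so adjust both to match your own ConfigMap.

```shell
# Print a 'ceph' scrape job with one target per host given as an argument.
render_ceph_job() {
  printf "  - job_name: 'ceph'\n"
  printf "    static_configs:\n"
  printf "    - targets:\n"
  for host in "$@"; do
    printf '      - "%s:9283"\n' "$host"
  done
}

# Example with the addresses used in this article:
render_ceph_job 172.139.20.20 172.139.20.208 172.139.20.94
```

The output can be pasted into the scrape_configs section while the ConfigMap is open in the editor.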

Verify collection success with curl:

<code>$ PROM=$(kubectl -n kube-system get svc prometheus -ojsonpath='{.spec.clusterIP}:{.spec.ports[0].port}')
$ curl -s "$PROM/prometheus/api/v1/query" --data-urlencode 'query=up{job=~"ceph.*"}' \
    | jq '.data.result[] | {job: .metric.job, instance: .metric.instance, status: .value[1]}'</code>

In the output, each Ceph instance should report status “1”, which means Prometheus scraped that target successfully.
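With more than a few targets, it may be easier to print only the instances that are not up. The sketch below assumes the query result has first been flattened to "instance status" pairs with jq; `down_instances` is a hypothetical helper name, and `PROM` in the usage comment stands for the Prometheus service address from the query above.

```shell
# Reads "instance status" pairs on stdin and prints every instance
# whose status is anything other than 1.
down_instances() {
  awk '$2 != "1" { print $1 }'
}

# Usage with the query above:
#   curl -s "$PROM/prometheus/api/v1/query" \
#       --data-urlencode 'query=up{job=~"ceph.*"}' \
#     | jq -r '.data.result[] | "\(.metric.instance) \(.value[1])"' \
#     | down_instances
```

Empty output means every Ceph target reported up.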

Grafana: add Ceph monitoring dashboards

Download the official Ceph dashboard files from the Ceph repository:

https://github.com/ceph/ceph/tree/main/monitoring/ceph-mixin/dashboards_out

Tip: The dashboards rely on node‑exporter metrics.
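To fetch a single dashboard without cloning the whole repository, the raw file URL can be derived from the directory above. This is a sketch under assumptions: `dashboard_raw_url` is a hypothetical helper, the URL follows GitHub's usual raw-content pattern, and the file name in the usage comment is only an example, so check the directory listing for the files that actually exist.

```shell
# Build the raw download URL for one dashboard JSON file in dashboards_out.
dashboard_raw_url() {
  printf 'https://raw.githubusercontent.com/ceph/ceph/main/monitoring/ceph-mixin/dashboards_out/%s\n' "$1"
}

# Usage (file name is an example; verify it against the repository):
#   curl -fsSLO "$(dashboard_raw_url ceph-cluster.json)"
# The downloaded JSON can then be imported through Grafana's dashboard import page.
```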

Conclusion

Ceph’s enterprise‑grade monitoring panel is essential for managing large‑scale distributed storage: it improves reliability and efficiency while safeguarding data assets. With proper configuration and continuous tuning, it supports stable operation as the business grows.
