Mastering Ceph Object Storage: From Concepts to RADOS Gateway Deployment
This guide explains object storage fundamentals, the bucket model, Ceph's RADOS Gateway (RGW) architecture, step‑by‑step installation (online and offline), configuration, keyring creation, pool setup, troubleshooting PG limits, user creation, and how to access RGW using AWS CLI tools.
What is Object Storage
Object storage manages data as independent objects rather than using traditional hierarchical file systems or block storage. Each object contains the data, metadata (such as creation date, type, etc.), and a unique identifier.
It is mainly used for unstructured data such as multimedia, backups, and analytics datasets, and suits applications that need massive, easily accessible, cost‑effective storage. Public cloud services such as Amazon S3 and Google Cloud Storage provide object storage, while Ceph, MinIO, and OpenStack Swift are common in private clouds or on‑premises.
Unlike file storage, object storage does not use a directory tree. Every object sits in a flat address space and is retrieved by its globally unique ID.
The main advantages of object storage are scalability and accessibility. It is designed for large‑scale unstructured data, can distribute objects across many servers or regions for redundancy and availability, and is accessed through RESTful APIs, which makes it straightforward to integrate into applications.
Bucket Concept in Object Storage
A bucket is a logical container used to organize and manage stored objects. Each bucket has a unique name and serves as a flat namespace where objects can be stored, listed, and deleted.
Buckets cannot be nested like folders; they are independent and can have individual configurations such as access permissions and lifecycle rules.
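Because the namespace inside a bucket is flat, anything that looks like a folder in an object key is only a shared prefix. The sketch below illustrates the idea in plain shell, without any storage system; the key names and the list_by_prefix helper are hypothetical and exist only for this illustration.

```shell
# Hypothetical object keys in one bucket. The "/" characters are ordinary
# characters inside each key, not directory separators.
KEYS="logs/2024/app.log
logs/2024/db.log
backups/full.tar.gz
readme.txt"

# list_by_prefix PREFIX: print every key that starts with PREFIX.
# Filtering on a key prefix is how a flat namespace can emulate folders.
list_by_prefix() {
    prefix=$1
    printf '%s\n' "$KEYS" | while IFS= read -r key; do
        case $key in
            "$prefix"*) printf '%s\n' "$key" ;;
        esac
    done
}

# Prints the two keys that begin with "logs/".
list_by_prefix "logs/"
```

Listing "logs/" returns both log objects even though no logs directory was ever created; the grouping comes entirely from the key text.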
What is RGW
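RGW (RADOS Gateway) is an HTTP service built on the librados interface that provides RESTful APIs compatible with Amazon S3 and OpenStack Swift, allowing a Ceph cluster to function as an object storage system. Early releases ran it as a FastCGI service behind a web server; current releases embed the HTTP frontend (Civetweb or Beast) in the radosgw daemon. Its main features: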
Provides S3/Swift compatible APIs.
Supports ACLs and access control mechanisms.
Offers data redundancy and replication.
Handles massive amounts of data.
Supports multi‑tenant environments.
Installing Ceph RADOS Gateway
Online Installation
sudo apt install radosgw
Offline Installation
First install apt-rdepends in a connected environment:
sudo apt-get update
sudo apt-get install apt-rdepends
List all recursive dependencies of the radosgw package:
apt-rdepends radosgw | grep -v "^ " > packages.txt
Download the packages:
mkdir packages
cd packages
xargs -a ../packages.txt apt-get download
Copy the packages directory to the offline machine and install:
cd ..
tar -zcvf ceph_radosgw.tar.gz packages
# On the offline machine
tar -zxvf ceph_radosgw.tar.gz
cd packages
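sudo dpkg -i *.deb
Configuring Ceph RADOS Gateway
Add a [client.rgw.<hostname>] section for each RGW host in /etc/ceph/ceph.conf so that each instance can be started independently. The example uses the Civetweb frontend; newer Ceph releases default to Beast, where the equivalent setting is rgw_frontends = "beast port=80".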
[client.rgw.node1]
host = node1
rgw_frontends = "civetweb port=80"
[client.rgw.node2]
host = node2
rgw_frontends = "civetweb port=80"
Creating a Keyring
Create a keyring for the RGW daemon and set proper ownership:
sudo mkdir -p /var/lib/ceph/radosgw/ceph-rgw.`hostname`
sudo ceph-authtool /var/lib/ceph/radosgw/ceph-rgw.`hostname`/keyring --create-keyring --gen-key -n client.rgw.`hostname`
sudo chown ceph:ceph /var/lib/ceph/radosgw/ceph-rgw.`hostname`/keyring
Add the keyring to the Ceph cluster:
sudo ceph auth add client.rgw.`hostname` osd 'allow rwx' mon 'allow rwx' -i /var/lib/ceph/radosgw/ceph-rgw.`hostname`/keyring
Starting Ceph RADOS Gateway
sudo systemctl start ceph-radosgw@rgw.`hostname`
Creating RGW Data Pools
Create the pools required by RGW:
ceph osd pool create .rgw.root 64
ceph osd pool create default.rgw.control 64
ceph osd pool create default.rgw.data.root 64
ceph osd pool create default.rgw.gc 64
ceph osd pool create default.rgw.log 64
ceph osd pool create default.rgw.users.uid 64
ceph osd pool create default.rgw.users.email 64
ceph osd pool create default.rgw.users.swift 64
ceph osd pool create default.rgw.buckets.index 64
ceph osd pool create default.rgw.buckets.data 64
Pool purposes:
.rgw.root – stores RGW configuration and metadata.
default.rgw.control – control data for RGW.
default.rgw.data.root – metadata of newly created buckets.
default.rgw.gc – objects pending garbage collection.
default.rgw.log – access logs.
default.rgw.users.uid / email / swift – user information.
default.rgw.buckets.index – bucket index for fast lookup.
default.rgw.buckets.data – actual object data.
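The ten create commands above differ only in the pool name, so they can be generated by a loop. The following is a minimal sketch rather than a definitive procedure: RGW_POOLS and create_rgw_pools are names chosen for illustration, and the function only prints the commands unless told otherwise. The extra "ceph osd pool application enable" step is an addition not present in the original steps; it assumes a Ceph release (Luminous or later) that expects manually created pools to be tagged with the application that uses them.

```shell
# Pool names taken from the list above.
RGW_POOLS=".rgw.root default.rgw.control default.rgw.data.root default.rgw.gc default.rgw.log default.rgw.users.uid default.rgw.users.email default.rgw.users.swift default.rgw.buckets.index default.rgw.buckets.data"

# create_rgw_pools PG_NUM [RUNNER]
# RUNNER defaults to "echo", so the commands are only printed (dry run).
# Pass an empty string as RUNNER to execute them against a real cluster.
create_rgw_pools() {
    pg_num=$1
    runner=${2-echo}
    for pool in $RGW_POOLS; do
        $runner ceph osd pool create "$pool" "$pg_num"
        # Assumption: tag the pool for RGW so the cluster does not warn
        # about pools without an application.
        $runner ceph osd pool application enable "$pool" rgw
    done
}

# Dry run with the pg_num used in this guide.
create_rgw_pools 64
```

Reviewing the printed commands first, then rerunning with create_rgw_pools 64 "" on a node that has admin credentials, keeps a typo in one pool name from leaving the cluster half configured.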
Handling PG Limit Errors
If creating pools exceeds the cluster's PG limit, you may see an error like:
pg_num 64 size 3 would mean 771 total pgs, which exceeds max 750 (mon_max_pg_per_osd 250 * num_in_osds 3)
Solutions:
Increase mon_max_pg_per_osd (e.g., to 300):
ceph config set global mon_max_pg_per_osd 300
ceph config get mon mon_max_pg_per_osd
Add more OSDs to raise the overall PG capacity.
Reduce pg_num values, balancing distribution and performance.
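The numbers in the error message can be reproduced with simple arithmetic, which helps when deciding which of the three options to use. The sketch below assumes, as the message itself suggests, that every pool contributes pg_num * size to the total and that the ceiling is mon_max_pg_per_osd * num_in_osds; the helper names pg_budget and pg_projected are hypothetical.

```shell
# pg_budget MAX_PG_PER_OSD NUM_IN_OSDS: cluster-wide ceiling from the message.
pg_budget() {
    echo $(( $1 * $2 ))
}

# pg_projected CURRENT_TOTAL PG_NUM SIZE: total after adding one pool,
# counting each PG once per replica (pg_num * size).
pg_projected() {
    echo $(( $1 + $2 * $3 ))
}

# Values from the error message: the limit is 250 * 3 = 750, and the new
# pool adds 64 * 3 = 192, so the cluster already held 771 - 192 = 579.
limit=$(pg_budget 250 3)
for pg_num in 64 32 16; do
    total=$(pg_projected 579 "$pg_num" 3)
    if [ "$total" -gt "$limit" ]; then
        echo "pg_num=$pg_num: $total > $limit (rejected)"
    else
        echo "pg_num=$pg_num: $total <= $limit (fits)"
    fi
done
```

Under these assumptions, halving pg_num to 32 brings the projected total to 675, which is under the limit, while raising mon_max_pg_per_osd to 300 lifts the ceiling to 900 and lets the original pg_num of 64 through.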
Creating Users with radosgw-admin
radosgw-admin user create --uid "wanger" --display-name "wanger"
Save the generated access_key and secret_key for client configuration.
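The command prints the new user as JSON, with the credentials inside a "keys" list. As a rough sketch, the keys can be pulled out with sed; the extract_key helper is hypothetical, and the sample JSON uses made-up placeholder values that only mimic the shape of the relevant fields. This line-based approach does not decode JSON escape sequences, so a JSON-aware tool is the safer choice if the secret contains escaped characters.

```shell
# extract_key FIELD: read radosgw-admin JSON on stdin and print the first
# value of FIELD (for example access_key or secret_key).
extract_key() {
    sed -n 's/.*"'"$1"'": *"\([^"]*\)".*/\1/p' | head -n 1
}

# Placeholder JSON with made-up values, shaped like the "keys" section of
# the command output. Real output contains many more fields.
sample='{
    "user_id": "wanger",
    "keys": [
        {
            "user": "wanger",
            "access_key": "EXAMPLEACCESSKEY",
            "secret_key": "exampleSecretKey"
        }
    ]
}'

printf '%s\n' "$sample" | extract_key access_key
printf '%s\n' "$sample" | extract_key secret_key
```

On a real cluster the same helper could be fed from radosgw-admin user info --uid wanger | extract_key access_key, which re-reads the stored credentials if the original output was lost.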
Using AWS CLI with RGW
Install and configure AWS CLI to interact with the RGW S3‑compatible endpoint:
sudo apt-get install awscli
aws configure # provide access_key, secret_key, region (e.g., us-east-1), output format (json)
Create a bucket:
aws s3api create-bucket --bucket mybucket --endpoint-url http://node1
Upload a file:
aws s3 cp myfile.txt s3://mybucket/myfile.txt --endpoint-url http://node1
List objects:
aws s3 ls s3://mybucket --endpoint-url http://node1
Delete an object:
aws s3 rm s3://mybucket/myfile.txt --endpoint-url http://node1
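Every command above repeats the same --endpoint-url option, so a small wrapper function can reduce typing and mistakes. This is one possible sketch: the rgw function and the RGW_ENDPOINT and AWS_BIN variables are names introduced here for illustration, and AWS_BIN exists only so the wrapper can be tried without contacting a gateway.

```shell
# Endpoint of the gateway; http://node1 matches the examples above.
RGW_ENDPOINT=${RGW_ENDPOINT:-http://node1}
# AWS_BIN is a hypothetical override so the wrapper can be dry-run with "echo".
AWS_BIN=${AWS_BIN:-aws}

# rgw ARGS...: run the AWS CLI with the gateway endpoint filled in.
rgw() {
    "$AWS_BIN" --endpoint-url "$RGW_ENDPOINT" "$@"
}

# Dry run in a subshell: prints the arguments that would be passed to aws
# instead of sending a request.
(AWS_BIN=echo; rgw s3 ls s3://mybucket)
```

With the function loaded in a shell that has the AWS CLI configured, rgw s3 cp myfile.txt s3://mybucket/myfile.txt behaves like the longer upload command shown earlier.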
Ops Development Stories
Maintained by a like‑minded team, covering both operations and development. Topics span Linux ops, DevOps toolchain, Kubernetes containerization, monitoring, log collection, network security, and Python or Go development. Team members: Qiao Ke, wanger, Dong Ge, Su Xin, Hua Zai, Zheng Ge, Teacher Xia.
