Ceph Storage Architecture: Overview, Cluster Design, Client Interfaces, and Encryption
This article provides a comprehensive technical overview of Red Hat Ceph, covering its distributed storage architecture, cluster components, storage pool types, authentication, placement algorithms, I/O paths, replication and erasure‑coding strategies, internal management operations, high‑availability mechanisms, client libraries, data striping, and encryption details.
Red Hat Ceph is a distributed object storage system designed for high performance, reliability, and scalability, offering multiple access interfaces such as native language bindings (C/C++, Java, Python), RESTful S3/Swift APIs, block devices, and file system mounts.
The storage cluster consists of two main daemon types: Ceph OSDs, which store data, handle replication, rebalancing, and recovery, and report health information to the monitors; and Ceph Monitors (MONs), which maintain the master copy of the cluster map.
Clients interact with the cluster using a configuration file, pool name, and user credentials. They obtain the latest cluster map from a monitor, compute the placement group (PG) and target OSD via the CRUSH algorithm, and then communicate directly with the primary OSD for read/write operations.
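The object-to-PG step can be sketched in a few lines. This is illustrative only: real Ceph uses the rjenkins hash and a "stable mod" before CRUSH maps the PG to an acting set of OSDs; the SHA-256 stand-in below just shows that the mapping is deterministic and computed entirely on the client, with no lookup table.

```python
import hashlib

def object_to_pg(object_name: str, pool_id: int, pg_num: int) -> str:
    """Hash the object name to pick a placement group (sketch).

    Real Ceph hashes with rjenkins, not sha256; the point is that any
    client with the cluster map derives the same PG without asking a
    central server.
    """
    h = int(hashlib.sha256(object_name.encode()).hexdigest(), 16)
    pg_id = h % pg_num
    return f"{pool_id}.{pg_id:x}"  # Ceph prints PG IDs as pool.pg_hex

print(object_to_pg("rbd_data.1234.0000000000000000", pool_id=3, pg_num=128))
```

Because the mapping is pure computation, adding clients never adds load to a metadata service; only `pg_num` and the pool ID are needed alongside the object name.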
Storage pools can be of two types: replicated pools, which keep multiple copies of objects, and erasure‑coded pools, which split objects into K data blocks and M coding blocks, allowing data recovery even if several OSDs fail.
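The K data / M coding idea can be demonstrated with the simplest possible code, K=2 data chunks plus M=1 XOR parity chunk. Production Ceph uses Reed–Solomon codes (jerasure/ISA plugins) that tolerate arbitrary K+M; this sketch only shows why losing one chunk is survivable.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode(data: bytes, k: int = 2):
    """Split data into k chunks plus one XOR parity chunk (K=2, M=1 sketch).

    Chunks are zero-padded to equal length; each would land on a
    different OSD in a real erasure-coded pool.
    """
    chunk = (len(data) + k - 1) // k
    chunks = [data[i * chunk:(i + 1) * chunk].ljust(chunk, b"\0") for i in range(k)]
    parity = chunks[0]
    for c in chunks[1:]:
        parity = xor_bytes(parity, c)
    return chunks + [parity]

def recover(chunks, lost: int):
    """Rebuild the one lost chunk by XOR-ing all surviving chunks."""
    survivors = [c for i, c in enumerate(chunks) if i != lost and c is not None]
    out = survivors[0]
    for c in survivors[1:]:
        out = xor_bytes(out, c)
    return out
```

With K=2, M=1 the storage overhead is 1.5x instead of the 3x of a three-replica pool, which is the usual motivation for erasure-coded pools on cold data.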
Ceph uses the CRUSH algorithm to map objects to PGs and PGs to OSDs, supporting fault‑domain and performance‑domain awareness, and enabling dynamic data rebalancing when OSDs are added or removed.
I/O operations are performed by clients that provide only the object ID and pool name; CRUSH determines the PG ID and the acting set of OSDs. Replicated I/O writes to a primary OSD, which then propagates the data to secondary OSDs, while erasure‑coded I/O writes encoded blocks to a set of OSDs.
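The replicated write path above can be simulated in miniature. This is a toy model, not the OSD protocol: real OSDs persist through BlueStore/journals and exchange network messages, but the control flow (client talks only to the primary; the primary fans out to secondaries and acknowledges once every replica has committed) is the same.

```python
class OSD:
    """Toy OSD: an ID plus an in-memory object store."""
    def __init__(self, osd_id: int):
        self.osd_id = osd_id
        self.store = {}

    def write_local(self, oid: str, data: bytes) -> bool:
        self.store[oid] = data
        return True  # stands in for a commit acknowledgement

def replicated_write(acting_set, oid, data):
    """Sketch of a replicated write: the first OSD in the acting set is
    the primary; it writes locally, forwards to each secondary, and the
    client is acknowledged only when all replicas have committed."""
    primary, *secondaries = acting_set
    acks = [primary.write_local(oid, data)]
    acks += [osd.write_local(oid, data) for osd in secondaries]
    return all(acks)
```

The key point the model preserves is that the client never writes to secondaries directly, so replication traffic stays on the cluster network rather than the client network.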
Internal cluster management includes heartbeat monitoring, OSD state synchronization, automatic data rebalancing, and scrubbing (integrity checking). High availability is achieved through multiple monitors, the CephX authentication protocol, and configurable replica or erasure‑coding settings.
The article is organized into three parts: the overview, the cluster architecture, and the client interfaces. On the client side it covers the native librados library, object watch/notify mechanisms, exclusive locks for RBD images, object-map indexing that tracks which backing objects actually exist, and data striping to improve throughput.
Data striping works similarly to RAID 0, distributing data across multiple objects and OSDs; parameters such as object size, stripe width, and stripe count can be tuned for performance.
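The three striping parameters interact like RAID 0 with grouping: data is written in stripe-unit-sized pieces round-robin across a set of stripe-count objects, and when every object in the set reaches the object size, striping moves on to the next object set. A hedged sketch of the offset arithmetic (parameter names mirror Ceph's knobs but the function is illustrative, not librados API):

```python
def map_offset(offset: int, stripe_unit: int, stripe_count: int, object_size: int):
    """Map a byte offset in a striped image/file to (object_index,
    offset_within_object), RAID-0 style with object sets.

    Assumes object_size is a multiple of stripe_unit, as Ceph requires.
    """
    set_size = object_size * stripe_count            # bytes per object set
    obj_set = offset // set_size                     # which object set
    off_in_set = offset % set_size
    stripe_no = off_in_set // (stripe_unit * stripe_count)   # stripe within set
    unit_in_stripe = (off_in_set // stripe_unit) % stripe_count
    off_in_unit = off_in_set % stripe_unit
    obj_index = obj_set * stripe_count + unit_in_stripe
    obj_offset = stripe_no * stripe_unit + off_in_unit
    return obj_index, obj_offset
```

With a small stripe unit and several objects per set, consecutive client writes fan out across many OSDs in parallel, which is exactly the throughput win the article describes.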
Encryption is supported via LUKS disk encryption for OSD data and journal partitions. ceph-ansible can automate the creation of encrypted OSDs, storing the LUKS keys in the monitors' key/value store and setting up dm-crypt devices so data is transparently decrypted when the OSD service starts.
Key command examples, creating an RBD image with specific feature bitmasks (layering = 1, exclusive-lock = 4, object-map = 8):

```
rbd -p mypool create myimage --size 102400 --image-features 5    # layering + exclusive-lock
rbd -p mypool create myimage --size 102400 --image-features 13   # layering + exclusive-lock + object-map
```

Overall, the article serves as a detailed technical guide for understanding and deploying Ceph storage solutions in cloud and data-center environments.
Architects' Tech Alliance
Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.