Ceph Deployment and Usage at Tongcheng: Architecture, Applications, Pain Points, and Future Work
This article details Tongcheng's adoption of Ceph for handling massive unstructured data, describing the motivations, architecture, specific uses of RGW, block storage, and CephFS, the challenges encountered, optimization measures, monitoring practices, and planned future improvements.
1. Ceph usage background
Tongcheng generates large volumes of unstructured data—documents, XML, HTML, reports, audio/video, etc.—which were previously stored on local disks using a single‑node file system. That approach suffered from several problems:
Single‑point failure
Scaling difficulties
Inability to achieve high data reliability and high availability
Metadata management overhead causing performance bottlenecks
Stateful applications on local storage cannot be migrated
OS‑bound storage; bugs in the file system can bring down the whole system
Object storage is the mainstream solution for unstructured data, offering RESTful APIs, flat data organization, massive scalability, multi‑tenant security, and elastic resource pools.
For Tongcheng's private cloud, a highly reliable, elastically scalable block storage service is required.
After evaluating many open‑source distributed storage systems, Ceph was selected because of its advanced decentralized architecture, active community, support for object, block, and file storage, and deep integration with OpenStack.
2. Ceph usage in Tongcheng
2.1 RGW for unstructured data storage
RGW provides S3/Swift‑compatible interfaces, allowing direct use of AWS S3 SDKs. Multiple RGW instances are load‑balanced by Nginx with Keepalived for high availability. Nginx also compresses static CSS/JS assets via Lua and implements automatic cache cleaning.
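The load‑balancing layer described above can be sketched as a minimal Nginx configuration. The backend hostnames, port, and server name below are hypothetical placeholders, and the Lua‑based compression and cache cleanup are omitted:

```nginx
upstream rgw_backend {
    # Hypothetical RGW instances; 7480 is the default RGW frontend port
    server rgw1.internal:7480 max_fails=3 fail_timeout=10s;
    server rgw2.internal:7480 max_fails=3 fail_timeout=10s;
}

server {
    listen 80;                    # Keepalived floats a VIP across the Nginx nodes
    server_name s3.example.internal;

    location / {
        proxy_pass http://rgw_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        client_max_body_size 0;   # let RGW enforce object-size limits itself
    }
}
```

With this in place, applications point a standard AWS S3 SDK at the VIP and need no knowledge of the individual RGW instances.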
2.2 Block storage for private cloud platform
Ceph serves as the storage backend for OpenStack services Nova, Glance, and Cinder, eliminating the need for separate storage solutions. RBD snapshots enable near‑instant VM creation and reduce backup time and space consumption.
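The near‑instant VM creation mentioned above relies on Ceph's standard copy‑on‑write snapshot/clone workflow; the pool and image names below are placeholders, not the production names:

```shell
# Snapshot a golden image, protect it, then clone per-VM disks.
# Clones are copy-on-write, so creation is near-instant and space-efficient.
rbd snap create glance/ubuntu-golden@base
rbd snap protect glance/ubuntu-golden@base
rbd clone glance/ubuntu-golden@base nova/vm-0001-disk
# Optionally detach the clone from its parent later:
rbd flatten nova/vm-0001-disk
```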
Since Ceph at the time lacked native RBD QoS, throttling is applied at the QEMU layer to limit per‑disk IOPS and bandwidth.
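With QEMU/KVM under libvirt, such per‑disk limits are typically expressed via the `<iotune>` element of the domain XML; the values and image name below are illustrative, not the production settings:

```xml
<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='writeback'/>
  <source protocol='rbd' name='nova/vm-0001-disk'/>
  <target dev='vda' bus='virtio'/>
  <iotune>
    <total_iops_sec>2000</total_iops_sec>         <!-- cap total IOPS -->
    <total_bytes_sec>104857600</total_bytes_sec>  <!-- cap bandwidth at 100 MB/s -->
  </iotune>
</disk>
```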
2.3 CephFS for shared data between servers
CephFS replaces legacy NAS, providing POSIX‑compatible access via Kernel client and FUSE. The current deployment uses active‑standby MDS and disables directory sharding for stability, with plans to integrate OpenStack Manila.
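Because CephFS is POSIX‑compatible, applications need no special SDK once the file system is mounted; ordinary file APIs just work. A minimal sketch, using a temporary directory to stand in for a hypothetical /mnt/cephfs mount point:

```python
import os
import tempfile

# Stand-in for a CephFS kernel-client or ceph-fuse mount such as /mnt/cephfs.
mountpoint = tempfile.mkdtemp()

# Ordinary POSIX calls: create a shared directory, write a file, read it back.
shared = os.path.join(mountpoint, "shared-reports")
os.makedirs(shared, exist_ok=True)

path = os.path.join(shared, "daily.csv")
with open(path, "w") as f:
    f.write("date,bookings\n2016-01-01,1024\n")

with open(path) as f:
    lines = f.read().splitlines()

print(lines[1])  # prints the data row: 2016-01-01,1024
```

Any number of servers mounting the same CephFS tree would see the same files, which is exactly what the legacy NAS provided.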
3. Pain points in Ceph usage
Key issues include write amplification on the journal disk, the resulting need for SSD journals paired with HDD data disks in hybrid OSDs, the bandwidth limits of SATA drives, and oversized bucket index shards that degrade scrub and recovery performance.
Large bucket index shards increase scrub time and I/O latency.
Recovery of oversized shards after OSD failures can cause client I/O timeouts.
Optimizations applied:
Configure multiple bucket shards to limit shard size.
Use SSD for the buckets.index pool to speed up recovery.
Enable bucket quota to control shard growth.
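The effect of index sharding can be illustrated with a simplified model: RGW maps each object name to one of N index shards by hashing (the real implementation uses its own hash function, not the MD5 used here), so the per‑shard entry count, and hence scrub and recovery cost, shrinks roughly by a factor of N:

```python
import hashlib

def shard_for(object_name: str, num_shards: int) -> int:
    """Simplified stand-in for RGW's bucket-index shard selection."""
    digest = hashlib.md5(object_name.encode()).hexdigest()
    return int(digest, 16) % num_shards

def shard_sizes(names, num_shards):
    """Count how many index entries land on each shard."""
    sizes = [0] * num_shards
    for name in names:
        sizes[shard_for(name, num_shards)] += 1
    return sizes

names = [f"obj-{i:07d}" for i in range(100_000)]

unsharded = shard_sizes(names, 1)   # one giant index object
sharded = shard_sizes(names, 32)    # 32 shards

print(max(unsharded))  # 100000 entries in a single index object
print(max(sharded))    # largest shard holds roughly 100000/32 entries
```

Combined with bucket quotas, this bounds the size of any single index object, which is what keeps scrub times and post‑failure recovery within acceptable limits.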
Before Luminous, Ceph shipped no built‑in dashboard, so a custom management platform was built on the RGW admin APIs together with the Cinder and Nova APIs.
4. Monitoring focus
OSD health – down/up events trigger peering and object recovery, potentially hanging client I/O.
Monitor quorum – monitors need a strict majority to form quorum; in a three‑monitor cluster, for example, losing two monitors renders the cluster unavailable.
I/O latency – long‑running requests may indicate a slow OSD.
Cluster capacity and load – guide scaling decisions.
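The quorum rule above follows from Paxos majority voting: a cluster of N monitors stays available only while more than N/2 of them are up, so the tolerable failure count is (N − 1) // 2. A quick sketch:

```python
def tolerable_mon_failures(num_mons: int) -> int:
    """Monitors form quorum via majority voting (Paxos), so a cluster
    of num_mons survives losing at most (num_mons - 1) // 2 of them."""
    return (num_mons - 1) // 2

for n in (3, 5, 7):
    print(f"{n} monitors -> tolerate {tolerable_mon_failures(n)} failures")
```

This is why monitor counts are kept odd: going from 3 to 4 monitors adds cost without raising the failure tolerance.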
5. Future work
The team likens Ceph to a gold mine whose riches have so far been only lightly tapped. Future efforts will explore more hardware configurations, expand storage interfaces (e.g., adding iSCSI support), deepen usage across services, and improve stability and reliability.
Tongcheng Travel Technology Center