Inside Curve: How NetEase’s Cloud‑Native Distributed Storage Beats Ceph in Performance
In this interview, NetEase’s Curve project lead Wang Pan explains the architecture, block and file storage services, performance advantages over Ceph, management modules, ongoing CPU and I/O optimizations, deployment tooling, and the open‑source roadmap that positions Curve as a high‑performance, cloud‑native storage solution.
Overview
Curve is a cloud‑native distributed storage system developed by NetEase. It provides block storage and file storage services that integrate with OpenStack, Kubernetes and other cloud platforms. The system is designed for high performance, stable latency and easy operation at petabyte scale.
Block and File Storage
Block storage targets cloud‑native databases that require compute‑storage separation. It offers low‑latency, high‑IOPS block devices and supports PolarFS‑based databases such as PolarDB for PostgreSQL.
File storage acts as a middleware layer that persists file data in an object‑storage backend. It targets cold data and AI/ML training workloads. A hot‑data cache layer built on Curve block storage is under development.
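To make the "middleware over object storage" idea concrete, here is a minimal sketch of how a file byte range could be cut into fixed-size objects. The function name, the 4 MiB object size and the `inode_index` key format are assumptions for illustration only; the interview does not describe Curve's actual object layout.

```python
MIB = 1024 * 1024


def split_into_objects(inode, offset, length, object_size=4 * MIB):
    """Map a file byte range onto fixed-size backend objects.

    Returns a list of (object_key, offset_in_object, nbytes) tuples.
    The key format "<inode>_<index>" is hypothetical.
    """
    end = offset + length
    pieces = []
    while offset < end:
        index = offset // object_size            # which object holds this byte
        within = offset % object_size            # position inside that object
        nbytes = min(object_size - within, end - offset)
        pieces.append((f"{inode}_{index}", within, nbytes))
        offset += nbytes
    return pieces


# A 2 MiB write starting at 3 MiB straddles the boundary between two objects.
pieces = split_into_objects(inode=42, offset=3 * MIB, length=2 * MIB)
print(pieces)
```

A write that crosses an object boundary becomes two object operations, which is one reason a block-backed cache layer for hot data can help.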
Architecture
Key architectural choices that differentiate Curve from Ceph:
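Consistency protocol: Curve replicates data with the Raft consensus algorithm, so a write is acknowledged once a majority of replicas have persisted it, and a single slow or failed replica does not stall I/O. Ceph's data path instead uses primary‑copy replication that waits for every replica in the acting set (Paxos is used only by Ceph's monitors), which makes latency more sensitive to a degraded replica.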
Metadata management: A centralized metadata service (MDS) together with etcd stores cluster topology, volume information and space allocation, enabling real‑time scheduling and capacity balancing. Ceph instead computes data placement with the CRUSH algorithm rather than looking it up in a central allocation table.
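IO path: At cluster initialization Curve pre‑creates a pool of fixed‑size chunk files (the “chunkfilepool”). Writes then land in files whose space is already allocated, which reduces filesystem metadata overhead, and according to the interview most writes can be acknowledged after the WAL and cache writes. Ceph, by contrast, acknowledges a write only after all three replicas have written it.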
Additional features: Block storage snapshots can be off‑loaded to S3‑compatible object stores; file storage can persist data directly to object stores and will later support lifecycle management between block and object layers.
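The effect of quorum-based acknowledgement on latency can be illustrated with a small model. This is a sketch under simplifying assumptions (each replica, including the leader, is represented only by the time it takes to persist a write); the function names and the latency figures are made up for illustration and are not measurements of Curve or Ceph.

```python
def quorum_ack_latency(replica_latencies_ms):
    """Time until a majority of replicas have persisted the write."""
    majority = len(replica_latencies_ms) // 2 + 1
    # The write is acknowledged when the majority-th fastest replica finishes.
    return sorted(replica_latencies_ms)[majority - 1]


def all_replica_ack_latency(replica_latencies_ms):
    """Time until every replica has persisted the write."""
    return max(replica_latencies_ms)


def tolerated_failures(replica_count):
    """Replicas that may fail while a majority quorum still exists."""
    return (replica_count - 1) // 2


# Three replicas, one of them degraded (hypothetical numbers).
latencies = [1.2, 1.5, 40.0]
print(quorum_ack_latency(latencies))       # 1.5
print(all_replica_ack_latency(latencies))  # 40.0
print(tolerated_failures(3))               # 1
```

With three replicas, the quorum path hides the one slow replica, while the all-replica path is bounded by it.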
Management Modules
Block storage management consists of:
MDS – metadata server for topology and volume metadata.
etcd – key‑value store for persistent cluster state.
File storage also uses MDS and etcd, and adds a metaserver component that stores filesystem‑specific metadata.
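As a rough illustration of what a centralized allocator makes possible, the sketch below places each new chunk on the least-loaded replication group and records the decision in a table. The class, its methods and the group names are hypothetical; in Curve the corresponding state would live in MDS and etcd, and the real scheduling logic is not described in the interview.

```python
import heapq


class CentralAllocator:
    """Toy stand-in for a metadata service that tracks space allocation."""

    def __init__(self, group_ids):
        # Min-heap of (allocated_chunk_count, group_id).
        self._heap = [(0, g) for g in sorted(group_ids)]
        heapq.heapify(self._heap)
        # Explicit placement table: (volume, chunk_index) -> group_id.
        self.placement = {}

    def allocate(self, volume, chunk_index):
        key = (volume, chunk_index)
        if key in self.placement:           # already allocated: stable answer
            return self.placement[key]
        used, group = heapq.heappop(self._heap)   # least-loaded group
        heapq.heappush(self._heap, (used + 1, group))
        self.placement[key] = group
        return group

    def load(self):
        return {group: used for used, group in self._heap}


allocator = CentralAllocator(["group-a", "group-b", "group-c"])
for i in range(9):
    allocator.allocate("vol-1", i)
print(allocator.load())
```

Because placement is looked up rather than computed from a hash, the allocator can take current load into account when it places new data, which is the property the text attributes to the MDS design.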
Performance and CPU Optimizations
Current optimization focus includes:
Reducing write amplification by using file pools.
Configuring replication groups in multi‑Raft setups to lower CPU consumption.
Exploring kernel‑bypass techniques such as RDMA for network IO and SPDK for storage IO.
Using the brpc RPC framework, the braft Raft library built on top of it, and brpc's bthread user‑level threads for efficient thread scheduling.
Benchmarks are being collected and will be published with future releases.
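The file-pool technique mentioned in the list above can be sketched as follows: files of a fixed size are created once, up front, and handing out a chunk later is just a rename. This is a minimal illustration, not Curve's implementation; the class name, directory layout and the tiny 4 KiB chunk size are assumptions chosen so the example runs quickly.

```python
import os
import tempfile

CHUNK_SIZE = 4096  # hypothetical, deliberately tiny for the demo


class ChunkFilePool:
    """Pre-create fixed-size files so the write path never allocates space."""

    def __init__(self, root, count, chunk_size=CHUNK_SIZE):
        self.pool_dir = os.path.join(root, "pool")
        self.data_dir = os.path.join(root, "chunks")
        os.makedirs(self.pool_dir)
        os.makedirs(self.data_dir)
        self.free = []
        for i in range(count):
            path = os.path.join(self.pool_dir, f"{i}.prealloc")
            with open(path, "wb") as f:
                f.write(b"\0" * chunk_size)  # claim the space at init time
            self.free.append(path)

    def get_chunk(self, chunk_id):
        """Hand out a pre-allocated file by renaming it into the data dir."""
        if not self.free:
            raise RuntimeError("chunk file pool exhausted")
        src = self.free.pop()
        dst = os.path.join(self.data_dir, f"chunk_{chunk_id}")
        os.rename(src, dst)  # no new space is allocated here
        return dst


with tempfile.TemporaryDirectory() as root:
    pool = ChunkFilePool(root, count=4)
    chunk_path = pool.get_chunk(7)
    size = os.path.getsize(chunk_path)
    remaining = len(pool.free)

print(size, remaining)
```

Later writes overwrite bytes inside an already-sized file, so the filesystem has less allocation metadata to update on the hot path.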
Deployment and Operations
Curve provides the self‑developed curveadm tool for automated deployment, upgrade, scaling and fault handling. Typical workflow:
For a test environment, run the single command curveadm playground.
For production, prepare a topology.yaml file based on the provided template and execute curveadm deploy -c topology.yaml.
Operations such as expansion, upgrade and iSCSI target provisioning are performed with sub‑commands of curveadm.
Monitoring dashboards are integrated to expose latency, IOPS and resource usage.
Roadmap
Planned evolution focuses on three pillars:
Continuous performance tuning to become the fastest open‑source storage solution.
Enhanced data reliability with robust recovery tools for extreme failure scenarios.
Improved cost‑performance balance for mixed‑workload environments.
The detailed roadmap is available on the Curve GitHub wiki: https://github.com/opencurve/curve/wiki/Roadmap_CN
Open‑Source Community
Curve has been donated to the Cloud Native Computing Foundation (CNCF), which supports long‑term community stewardship of the project. The project welcomes contributions, holds bi‑weekly community meetings and encourages production‑grade deployments.
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
