Why Modern Cloud Storage Is Getting So Complex—and How Qiniu Solved It
This article traces storage from single‑machine file systems to today’s distributed, erasure‑coded cloud storage. It examines why storage has grown more complex, where traditional replication falls short, and how Qiniu’s next‑generation architecture uses erasure coding (EC), faster repair, and lower cost to meet scalability, reliability, and availability demands.
Storage systems have become increasingly complex as they evolved from simple single‑machine file systems to distributed storage middleware and now to cloud storage services.
Traditional storage focused on basic reliability: handling power loss, program crashes, and disk sector failures. In modern systems, handling a much wider range of failure cases has become the core business logic of storage.
With the rise of client‑server architectures, availability became a key metric. High‑availability storage middleware allows services to stay online by persisting state centrally and enabling multiple instances to provide failover and load balancing.
Databases emerged as the first storage middleware, but they are not the only solution. Rich media files are often stored in file systems, which face scalability, performance, reliability, and availability challenges when used at large scale:
Scalability – single‑node file systems cannot handle capacity beyond one machine.
Performance bottlenecks – file system performance degrades sharply after reaching a critical number of files.
Reliability – single‑copy designs cannot meet modern redundancy requirements; multiple replicas or erasure coding are needed.
Availability – a single‑node failure makes data inaccessible.
Early attempts such as adding RAID5 to single‑node file systems only partially address reliability and still suffer from limited repair speed.
Google’s GFS introduced the three‑replica model that inspired Hadoop’s HDFS, but HDFS is optimized for large log files, not for massive small‑file workloads like images or videos.
The HDFS block size (64 MiB by default in early versions) is tuned for large files and is a poor fit for small ones.
Single‑master metadata limits scalability for many small objects.
File‑system‑style directory structures are hard to maintain in a distributed environment.
Key‑value (object) storage emerged as the natural fit for small‑file workloads, leading to systems like Qiniu Cloud Storage, which treats files as objects without a hierarchical directory.
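The idea of treating files as objects in a flat namespace can be sketched in a few lines. This is a toy in-memory model, not Qiniu's API: the class name `FlatObjectStore` and its methods are hypothetical. It shows that a key such as `photos/2024/cat.jpg` is an opaque string, and that "directories" reduce to a prefix scan over keys.

```python
from typing import Dict, List


class FlatObjectStore:
    """Toy in-memory object store: one flat key space, no directory tree."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}  # key -> object bytes

    def put(self, key: str, data: bytes) -> None:
        # The slash in a key has no special meaning; it is just a character.
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def list_prefix(self, prefix: str) -> List[str]:
        # A "directory listing" is only a filter on key prefixes.
        return sorted(k for k in self._objects if k.startswith(prefix))


store = FlatObjectStore()
store.put("photos/2024/cat.jpg", b"cat-bytes")
store.put("photos/2024/dog.jpg", b"dog-bytes")
store.put("videos/intro.mp4", b"video-bytes")
print(store.list_prefix("photos/"))
```

Because there is no tree to keep consistent, operations such as renaming a directory or walking a hierarchy never need to be coordinated across nodes; each key can be located independently.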
Qiniu’s next‑generation storage (v2) combines a distributed storage cluster, upload acceleration, and a data‑processing cluster. Its design goals focus on cost, reliability, and scalability.
Cost – erasure coding (28 + 4) has a redundancy factor of 32/28 ≈ 1.14, versus 3 for triple replication. Storage hardware needs are put at about 36.5% of a three‑replica system for the same capacity.
Reliability – tolerates up to four simultaneous disk failures and speeds up repair from three hours to under 30 minutes.
Scalability – supports exabyte‑scale capacity while maintaining high throughput.
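The redundancy arithmetic behind the cost goal can be checked directly. The sketch below counts only raw bytes stored per logical byte; the function name is illustrative, and metadata, spare capacity, and hardware differences are ignored.

```python
def redundancy_factor(data_fragments: int, parity_fragments: int) -> float:
    """Raw bytes stored per logical byte for a (data + parity) erasure code."""
    return (data_fragments + parity_fragments) / data_fragments


ec_factor = redundancy_factor(28, 4)  # 32 / 28
replica_factor = 3.0                  # three full copies
ratio = ec_factor / replica_factor

print(f"EC 28+4 raw bytes per logical byte: {ec_factor:.3f}")
print(f"Three-replica raw bytes per logical byte: {replica_factor:.3f}")
print(f"EC raw storage relative to three replicas: {ratio:.1%}")
```

On raw redundancy alone the ratio comes out near 38%, slightly above the 36.5% quoted for the cost goal. The quoted figure presumably reflects factors not described here.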
The erasure‑coding scheme splits each file into 28 data fragments and generates 4 parity fragments, storing all 32 fragments on separate machines, so a file remains recoverable after losing any four of them. Combined with fast repair, this provides up to 16 nines of data durability.
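The split-and-repair idea can be illustrated with a deliberately simplified stand-in. The sketch below uses a single XOR parity fragment, which can rebuild only one lost fragment. A 28 + 4 scheme needs a stronger code (Reed-Solomon is a common choice, though the article does not name the code used) to survive four losses. All function names are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[Optional[bytes]]:
    """Split data into k equal, zero-padded fragments plus one XOR parity."""
    frag_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(frag_len * k, b"\x00")
    frags = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    parity = reduce(xor_bytes, frags)
    return frags + [parity]


def repair(frags: List[Optional[bytes]]) -> List[Optional[bytes]]:
    """Rebuild at most one missing fragment (None) by XOR-ing the survivors."""
    missing = [i for i, f in enumerate(frags) if f is None]
    if len(missing) > 1:
        raise ValueError("single XOR parity can repair only one lost fragment")
    if missing:
        survivors = [f for f in frags if f is not None]
        frags[missing[0]] = reduce(xor_bytes, survivors)
    return frags


def decode(frags: List[Optional[bytes]], k: int, length: int) -> bytes:
    """Concatenate the k data fragments and strip the padding."""
    return b"".join(f for f in frags[:k] if f is not None)[:length]


payload = b"object storage stores files as fragments"
stored = encode(payload, k=28)  # 28 data fragments + 1 parity fragment
stored[5] = None                # simulate losing one machine
restored = decode(repair(stored), k=28, length=len(payload))
print(restored == payload)
```

Placing each fragment on a separate machine matters because a single machine failure then costs at most one fragment per file, which the code can rebuild from the rest.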
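Reliability calculations show that faster repair dramatically lowers the probability of data loss, especially as cluster size grows: data is lost only when more fragments fail than the code tolerates before repair completes, so a shorter repair window leaves less time for failures to overlap.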
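A rough way to see the effect of repair time is a binomial model of one 32-fragment stripe. This is not the article's calculation: it assumes independent disk failures, a 4% annual failure rate (an assumed figure), and that data is lost when at least five of a stripe's 32 disks fail within the same repair window.

```python
from math import comb

HOURS_PER_YEAR = 24 * 365


def stripe_loss_probability(n: int, tolerated: int, afr: float,
                            repair_hours: float) -> float:
    """Probability that more than `tolerated` of n disks fail in one window."""
    # Per-disk failure probability within a single repair window (linearized).
    p = afr * repair_hours / HOURS_PER_YEAR
    return sum(comb(n, j) * p**j * (1 - p)**(n - j)
               for j in range(tolerated + 1, n + 1))


# afr=0.04 is an assumed annual disk failure rate, not a figure from the article.
slow = stripe_loss_probability(32, 4, afr=0.04, repair_hours=3.0)
fast = stripe_loss_probability(32, 4, afr=0.04, repair_hours=0.5)

print(f"3 h repair window:   {slow:.3e}")
print(f"0.5 h repair window: {fast:.3e}")
print(f"improvement factor:  {slow / fast:.0f}x")
```

Under these assumptions, shortening the window sixfold lowers the per-window loss probability by roughly 6^5, because five overlapping failures are needed to lose data. The model gives a per-window probability for one stripe and does not reproduce the 16-nines durability figure.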
In conclusion, as storage systems become more complex and cost‑effective cloud services mature, building and operating proprietary storage becomes less attractive, and cloud storage will increasingly become a utility‑like infrastructure.
MaGe Linux Operations
