Understanding Distributed Storage: File, Object, Block, and Key‑Value Systems Explained

This article explains the core concepts and architectures of distributed storage, covering file‑based systems like HDFS, object storage such as Ceph, block storage for high‑performance workloads, and key‑value stores like Redis and Cassandra, highlighting their use cases and design principles.

Mike Chen's Internet Architecture

Distributed file storage is a core form of distributed storage and targets file read/write workloads. Data is stored as files spread across many nodes, while clients see a single, unified file system view. The most typical example is HDFS (Hadoop Distributed File System). Its core architecture consists of a NameNode, which acts as the metadata server, and DataNodes, which act as the storage nodes. It is well suited to storing massive files and to high‑throughput sequential reads and writes, and it scales well as nodes are added.
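The split between a metadata server and storage nodes can be illustrated with a toy in-memory model. This is only a sketch and not the HDFS API: the class names mirror the HDFS roles, but `put`, `get`, `allocate_block`, the 4-byte block size, and the round-robin replica placement are all assumptions chosen to keep the example small.

```python
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Storage node: holds raw block bytes keyed by block id."""
    name: str
    blocks: dict = field(default_factory=dict)


class NameNode:
    """Metadata server: tracks which blocks form a file and where replicas live."""

    def __init__(self, datanodes, replication=2):
        self.datanodes = datanodes
        self.replication = replication  # assumed <= number of DataNodes
        self.files = {}                 # path -> ordered list of block ids
        self.locations = {}             # block id -> list of DataNode replicas
        self._next_block = 0

    def create(self, path):
        self.files[path] = []

    def allocate_block(self, path):
        block_id = self._next_block
        self._next_block += 1
        # Round-robin replica placement (an assumption for this sketch).
        replicas = [
            self.datanodes[(block_id + r) % len(self.datanodes)]
            for r in range(self.replication)
        ]
        self.files[path].append(block_id)
        self.locations[block_id] = replicas
        return block_id, replicas

    def lookup(self, path):
        return [(b, self.locations[b]) for b in self.files[path]]


def put(namenode, path, data, block_size=4):
    """Client write: ask the NameNode for placement, send bytes to DataNodes."""
    namenode.create(path)
    for offset in range(0, len(data), block_size):
        block_id, replicas = namenode.allocate_block(path)
        for node in replicas:
            node.blocks[block_id] = data[offset:offset + block_size]


def get(namenode, path):
    """Client read: fetch block locations, then read each block from a replica."""
    return b"".join(
        replicas[0].blocks[block_id]
        for block_id, replicas in namenode.lookup(path)
    )


nodes = [DataNode("dn1"), DataNode("dn2"), DataNode("dn3")]
namenode = NameNode(nodes)
put(namenode, "/logs/a.txt", b"hello world")
restored = get(namenode, "/logs/a.txt")
```

The point of the sketch is that the NameNode never stores file bytes: it only answers "which blocks, on which nodes", and the client moves the data to and from the DataNodes.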

Distributed object storage treats the object as the smallest unit of management and stores each object's data together with its metadata. Objects are accessed through a globally unique identifier. Ceph is a common implementation. Its architecture is decentralized: there is no single metadata node, and all nodes can handle requests. Object storage is suitable for internet applications, cloud storage, backup and archiving, and CDNs, and especially for massive volumes of unstructured data such as images, videos, and logs.
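A minimal sketch of this model is shown below. It keeps data and metadata together under a generated identifier, and every participant computes an object's location from the identifier alone, so no metadata server is consulted. The `ObjectStore` class and its methods are hypothetical, and the hash-modulo placement is a deliberately simple stand-in for illustration; it is not the placement algorithm Ceph uses.

```python
import hashlib
import uuid
from dataclasses import dataclass


@dataclass
class StoredObject:
    """An object bundles its payload with its own metadata."""
    data: bytes
    metadata: dict


class ObjectStore:
    def __init__(self, node_names):
        self._names = sorted(node_names)
        self.nodes = {name: {} for name in self._names}  # node -> {object id: object}

    def locate(self, object_id: str) -> str:
        # Placement is a pure function of the id, so any node or client can
        # compute it without asking a central metadata service.
        digest = hashlib.sha256(object_id.encode()).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._names)
        return self._names[index]

    def put(self, data: bytes, **metadata) -> str:
        object_id = uuid.uuid4().hex  # globally unique identifier
        self.nodes[self.locate(object_id)][object_id] = StoredObject(data, metadata)
        return object_id

    def get(self, object_id: str) -> StoredObject:
        return self.nodes[self.locate(object_id)][object_id]


store = ObjectStore(["node-a", "node-b", "node-c"])
photo_id = store.put(b"\x89PNG...", content_type="image/png", owner="alice")
photo = store.get(photo_id)
```

Plain modulo placement reshuffles most objects whenever the node count changes, which is why production systems use more elaborate placement schemes.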

Distributed block storage exposes raw storage as fixed‑size blocks, much like a disk partition, and leaves it to the file system or database above to manage those blocks directly. A virtualization layer on the client side combines the blocks and presents them through a block device interface, so the volume looks like a local hard disk. It is ideal for databases that need high performance and low latency and for virtual machine storage; a typical implementation is Ceph RBD.
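The idea of a virtualization layer that turns scattered fixed-size blocks into one disk-like device can be sketched as follows. This is a toy model under stated assumptions: `BlockDevice` is a hypothetical class, blocks are striped across nodes by simple modulo, and there is no replication, caching, or networking.

```python
class BlockDevice:
    """Presents blocks spread over several nodes as one byte-addressable device."""

    def __init__(self, num_blocks, block_size=512, num_nodes=3):
        self.num_blocks = num_blocks
        self.block_size = block_size
        # Each "node" is a dict of block index -> block bytes.
        self.nodes = [{} for _ in range(num_nodes)]

    def _node_for(self, index):
        if not 0 <= index < self.num_blocks:
            raise IndexError(f"block {index} out of range")
        # Modulo striping (an assumption for this sketch).
        return self.nodes[index % len(self.nodes)]

    def read_block(self, index) -> bytes:
        # Blocks that were never written read back as zeros.
        return self._node_for(index).get(index, bytes(self.block_size))

    def write_block(self, index, data: bytes):
        if len(data) != self.block_size:
            raise ValueError("a block write must cover exactly one block")
        self._node_for(index)[index] = data

    def write(self, offset, data: bytes):
        """Byte-range write: read-modify-write every block the range touches."""
        pos, end = offset, offset + len(data)
        while pos < end:
            index, start = divmod(pos, self.block_size)
            n = min(self.block_size - start, end - pos)
            block = bytearray(self.read_block(index))
            block[start:start + n] = data[pos - offset:pos - offset + n]
            self.write_block(index, bytes(block))
            pos += n

    def read(self, offset, length) -> bytes:
        out = bytearray()
        pos, end = offset, offset + length
        while pos < end:
            index, start = divmod(pos, self.block_size)
            n = min(self.block_size - start, end - pos)
            out += self.read_block(index)[start:start + n]
            pos += n
        return bytes(out)


device = BlockDevice(num_blocks=8, block_size=4)
device.write(2, b"abcdefgh")  # spans blocks 0, 1, and 2
```

The caller only deals in offsets and lengths, as it would with a local disk; the mapping from offset to block index to node stays hidden inside the device layer.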

Distributed key‑value storage stores data as key‑value pairs or, in wide‑column systems, as column families, and emphasizes low latency and high concurrency. Typical examples are Redis (an in‑memory KV store), Cassandra, HBase, Dynamo, and TiKV. The data model is simple: a key maps to a value. Elastic scaling relies on consistent hashing and similar partitioning algorithms, so nodes can be added or removed easily while data is migrated and load is rebalanced automatically.
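Consistent hashing is what keeps data migration small when the cluster changes size. The sketch below is a minimal hash ring with virtual nodes; the `HashRing` class, the choice of SHA-256, and the figure of 100 virtual nodes per physical node are illustrative assumptions, not the scheme of any particular store named above.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a position on the ring (a 64-bit integer)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes   # virtual nodes per physical node
        self._hashes = []      # sorted ring positions
        self._owners = {}      # ring position -> physical node
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.vnodes):
            h = _hash(f"{node}#{i}")
            self._owners[h] = node
            bisect.insort(self._hashes, h)

    def remove_node(self, node):
        self._hashes = [h for h in self._hashes if self._owners[h] != node]
        self._owners = {h: n for h, n in self._owners.items() if n != node}

    def node_for(self, key: str):
        if not self._hashes:
            raise LookupError("ring has no nodes")
        # Walk clockwise to the first position after the key, wrapping around.
        idx = bisect.bisect(self._hashes, _hash(key)) % len(self._hashes)
        return self._owners[self._hashes[idx]]


ring = HashRing(["kv1", "kv2", "kv3"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}

ring.add_node("kv4")
after = {k: ring.node_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Every key in `moved` now belongs to `kv4`; no key moves between the three existing nodes. With four equally weighted nodes, roughly a quarter of the keys would be expected to move, whereas plain `hash(key) % n` placement would relocate most of them.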

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
