Databases 15 min read

How Facebook Scaled Its Data Storage with NoSQL: Cassandra, HBase, and Beyond

This article traces Facebook's evolution from a small social site to a global platform, explains how its massive data‑storage challenges led to the adoption of NoSQL solutions like Cassandra and HBase, and breaks down the core patterns, consistency models, and scaling techniques that power such large‑scale systems.

dbaplus Community
dbaplus Community
dbaplus Community
How Facebook Scaled Its Data Storage with NoSQL: Cassandra, HBase, and Beyond

Facebook's Data Storage Evolution

From its early days as a regional social network to becoming the world’s largest platform, Facebook repeatedly overhauled its storage architecture to handle ever‑growing data volumes and scalability demands.

Key milestones include:

2008: Chat feature scaled to 70 million users using Erlang.

2010: Real‑time messaging stored >135 billion messages/month with HBase.

2011: Real‑time analytics processed 20 billion events per day using HBase.

Why Cassandra and HBase Were Created

Both systems address a common set of scalability problems:

Making the application layer stateless by separating it from the data layer.

Extending the data layer to support massive workloads.

Balancing load across machines and handling node failures.

Providing data replication, consistent synchronization, and automatic scaling.

NoSQL emerged as the overarching solution, with Cassandra and HBase as concrete implementations of the NoSQL concept.

Basic NoSQL Pattern Concepts (Core Content)

Common NoSQL products include Google Bigtable, Amazon Dynamo, and Cassandra (which combines Dynamo’s distributed design with Bigtable’s data model). Their shared characteristics are:

Key‑value storage.

Running on large numbers of inexpensive commodity servers.

Data partitioned and replicated across nodes.

Relaxed consistency requirements.

The architecture can be broken down into several layers:

A. API Model (CRUD Operations)

Standard database operations: create, read, update, delete.

B. Underlying Architecture

Hundreds to thousands of physical nodes (PN) each host several virtual nodes (VN). VN distribution enables flexible data placement.

C. Partitioning

Two main methods:

Simple hash: partition = key mod total_VN (inefficient when VN count changes).

Consistent hashing: keys are placed on a ring; each VN occupies a segment, minimizing data movement on node joins/leaves.

D. Replication

Improves reliability.

Distributes workload across replicas.

E. Node Membership Changes

When a node joins, it receives data from neighboring nodes and propagates its presence; when a node crashes, neighbors update membership and asynchronously copy missing data.

F. Client Consistency Models

Strict Consistency (one‑copy serializability).

Read‑Your‑Write.

Session Consistency.

Monotonic Read.

Eventual Consistency.

G. Master‑Slave Model (Single Master)

All requests go through a master; slaves handle reads. If the master fails, a slave is promoted.

H. Multi‑Master Model (No Master)

Updates are coordinated via two‑phase commit (2PC) or quorum‑based 2PC (Paxos), reducing coordination bottlenecks.

I. Gossip Protocol

For weaker consistency requirements, gossip spreads state information between replicas in a peer‑to‑peer fashion, similar to rumor spreading.

J. Storage Implementation

Pluggable back‑ends: MySQL, filesystem, large hash tables.

Copy‑on‑modify: each update creates a new version, propagating changes up the index hierarchy.

NoSQL Summary

To build a scalable NoSQL system you must:

Partition data across machines using consistent hashing.

Handle node failures by copying data to neighbors.

Choose a replication model (master‑slave or master‑less).

Synchronize replicas via Paxos, quorum‑based 2PC, or gossip.

Rebalance data when the cluster size changes.

Understanding these patterns makes reading Cassandra or HBase documentation much clearer and shows why NoSQL is critical for large‑scale services like Facebook.

Reference URLs: http://horicky.blogspot.com/2009/11/nosql-patterns.html, http://horicky.blogspot.com/2010/10/bigtable-model-with-cassandra-and-hbase.html, http://www.oracle.com/technetwork/cn/articles/cloudcomp/berkeleydb-nosql-323570-zhs.html, https://www.quora.com/What-is-the-difference-between-gossip-and-Paxos-protocols, http://blog.csdn.net/cloudresearch/article/details/23127985, https://www.quora.com/Why-use-Vector-Clocks-in-a-distributed-database

NoSQL product comparison
NoSQL product comparison
NoSQL architecture diagram
NoSQL architecture diagram
Physical and virtual node layout
Physical and virtual node layout
Node failure and recovery
Node failure and recovery
Quorum‑based 2PC
Quorum‑based 2PC
Gossip protocol illustration
Gossip protocol illustration
Copy‑on‑modify storage
Copy‑on‑modify storage
Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

HBaseConsistencyNoSQLFacebookcassandra
dbaplus Community
Written by

dbaplus Community

Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.