Fundamentals of Data Storage: Engines, Models, Transactions, Distributed Design, and Redundancy
This article explains the importance of data storage, describes single‑node storage engines and data models, outlines transaction and concurrency control, and covers distributed storage principles, CAP and FLP theorems, 2PC and Paxos protocols, as well as redundancy, backup, and failover mechanisms.
Data Storage Importance: Data is the most valuable asset of an enterprise, and its reliability is essential.
Single‑Node Storage Principles:
Storage Engine: the core component that determines functionality and performance.
Engine Types: Hash Engine – based on hash tables (array + linked list); supports create/update/delete and random point reads, but not ordered scans. B‑Tree Engine – based on B‑Trees; supports CRUD on single records as well as sequential scans; widely used in RDBMS. LSM Engine – writes are first buffered in memory and later flushed to disk in batches; it excels at write‑heavy workloads, but a read may have to consult several on‑disk files, so background compaction is needed to keep read performance acceptable.
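As a concrete illustration, the hash engine maps naturally onto an in‑memory hash table. The sketch below (class and method names are made up for illustration) supports point operations but, unlike a B‑Tree engine, cannot do ordered scans:

```python
class HashEngine:
    """Toy hash engine: create/update/delete and random reads by key.

    A hash table imposes no order on keys, so there is no sequential scan,
    which is exactly the trade-off noted above versus a B-Tree engine."""

    def __init__(self):
        self._table = {}           # Python dicts are hash tables internally

    def put(self, key, value):     # create or update
        self._table[key] = value

    def get(self, key):            # random read by key
        return self._table.get(key)

    def delete(self, key):
        self._table.pop(key, None)

engine = HashEngine()
engine.put("user:1", {"name": "Alice"})
print(engine.get("user:1"))        # random read hits in O(1) on average
```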
To avoid losing the in‑memory buffer on a crash, every modification is first appended to a CommitLog.
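The commit‑log idea can be sketched as a write‑ahead step: append each mutation to a durable log before touching memory, and replay the log on restart. The file layout and record format below are illustrative:

```python
import json
import os
import tempfile

class LoggedStore:
    """Toy store that appends to a commit log before applying to memory,
    and rebuilds its in-memory state by replaying the log on startup."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.mem = {}
        if os.path.exists(log_path):          # recovery: replay the log
            with open(log_path) as f:
                for line in f:
                    op = json.loads(line)
                    self.mem[op["key"]] = op["value"]

    def put(self, key, value):
        # 1. Append to the commit log and force it to disk ...
        with open(self.log_path, "a") as f:
            f.write(json.dumps({"key": key, "value": value}) + "\n")
            f.flush()
            os.fsync(f.fileno())
        # 2. ... only then apply the change to memory.
        self.mem[key] = value

path = os.path.join(tempfile.mkdtemp(), "commit.log")
LoggedStore(path).put("k", "v")
restored = LoggedStore(path)                  # simulate a restart
print(restored.mem["k"])                      # the write survived
```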
Data Models:
File system – organized as directory trees (Linux, macOS, Windows).
Relational – tables with rows and columns.
Key‑Value – e.g., Memcached, Tokyo Cabinet, Redis.
Column‑oriented – e.g., Cassandra, HBase.
Graph databases – e.g., Neo4j, InfoGrid, Infinite Graph.
Document stores – e.g., MongoDB, CouchDB.
Transaction and Concurrency Control:
ACID properties: Atomicity, Consistency, Isolation, Durability.
Concurrency control: lock granularity narrows step by step: Process → DB → Table → Row. Reads can proceed without blocking writes via copy‑on‑write or MVCC.
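The copy‑on‑write idea can be sketched as follows: a writer builds a modified copy and atomically publishes it, so readers never take a lock. This is a toy sketch, not a production MVCC implementation:

```python
import threading

class CowMap:
    """Copy-on-write map: reads are lock-free; each write swaps in a copy."""

    def __init__(self):
        self._data = {}
        self._write_lock = threading.Lock()   # serializes writers only

    def get(self, key):                       # lock-free read of a snapshot
        return self._data.get(key)

    def put(self, key, value):
        with self._write_lock:
            new_data = dict(self._data)       # copy ...
            new_data[key] = value             # ... modify the copy ...
            self._data = new_data             # ... atomically publish it

m = CowMap()
snapshot = m._data          # a long-running reader keeps the old version
m.put("x", 1)
print(m.get("x"), snapshot)  # new readers see the write; snapshot is untouched
```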
Data recovery is performed through operation logs.
Multi‑Node Storage Principles:
The single‑node principles still apply; multi‑node storage builds on them.
Data is distributed across nodes with load balancing. Static partitioning: e.g., hashing modulo the node count, such as uid % 32. Dynamic partitioning: e.g., consistent hashing, which must also handle data drift when nodes fail and recover.
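Dynamic partitioning with consistent hashing can be sketched like this: each node owns many virtual points on a hash ring, and a key is routed to the first node clockwise from its hash, so adding or removing a node only remaps the keys nearest its points. Node names and the virtual‑node count below are arbitrary:

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class ConsistentHashRing:
    """Each node owns `vnodes` points on the ring; a key goes to the first
    node clockwise from its hash value."""

    def __init__(self, nodes, vnodes=100):
        self._ring = sorted((_hash(f"{n}#{i}"), n)
                            for n in nodes for i in range(vnodes))

    def node_for(self, key):
        h = _hash(key)
        # First ring point at or after the key's hash, wrapping around.
        idx = bisect.bisect(self._ring, (h, "")) % len(self._ring)
        return self._ring[idx][1]

ring = ConsistentHashRing(["node-a", "node-b", "node-c"])
print(ring.node_for("user:42"))   # the same key always routes to the same node
```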
Replication: multiple copies across nodes ensure high reliability and availability, using CommitLog.
Failure detection via heartbeat, data migration, and recovery mechanisms.
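Heartbeat‑based failure detection reduces to tracking the last heartbeat per node and flagging nodes that miss their window. A minimal sketch with an assumed 3‑second timeout:

```python
import time

class HeartbeatMonitor:
    """Declares a node failed if no heartbeat arrives within the timeout."""

    def __init__(self, timeout=3.0):
        self.timeout = timeout
        self.last_seen = {}

    def heartbeat(self, node, now=None):
        self.last_seen[node] = time.time() if now is None else now

    def failed_nodes(self, now=None):
        now = time.time() if now is None else now
        return [n for n, t in self.last_seen.items() if now - t > self.timeout]

mon = HeartbeatMonitor(timeout=3.0)
mon.heartbeat("node-a", now=0.0)
mon.heartbeat("node-b", now=2.0)
print(mon.failed_nodes(now=4.0))   # only node-a missed its window
```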
FLP Impossibility Theorem: In an asynchronous messaging environment, even with a single process failure, no algorithm can guarantee that non‑failed processes reach consensus.
CAP Theorem: Consistency, Availability, and Partition Tolerance cannot all be achieved simultaneously; since network partitions cannot be ruled out in practice, a trade‑off between consistency and availability is required, and distributed storage must provide automatic fault tolerance.
Two‑Phase Commit (2PC) Protocol:
Used for distributed transactions.
Roles: one coordinator and multiple participants.
Phase 1 – Prepare: the coordinator asks every participant to prepare the transaction; each participant votes yes or no, and the transaction can commit only if all vote yes.
Phase 2 – Commit: after receiving all decisions, the coordinator decides to commit or abort and notifies participants.
2PC is blocking; participant or coordinator failures must be handled with timeouts, logging, and standby coordinators.
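The two phases can be sketched as follows, with the coordinator reduced to a function and failures ignored for brevity (a real implementation would add the timeouts and logging described above):

```python
class Participant:
    """One resource manager in a distributed transaction."""

    def __init__(self, can_commit=True):
        self.can_commit = can_commit
        self.state = "initial"

    def prepare(self):                 # phase 1: vote yes or no
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def finish(self, commit):          # phase 2: apply the global decision
        self.state = "committed" if commit else "aborted"

def two_phase_commit(participants):
    votes = [p.prepare() for p in participants]   # phase 1: collect votes
    decision = all(votes)                         # commit only if all say yes
    for p in participants:                        # phase 2: broadcast decision
        p.finish(decision)
    return decision

ok = two_phase_commit([Participant(), Participant()])
bad = two_phase_commit([Participant(), Participant(can_commit=False)])
print(ok, bad)   # True False
```

A single no vote aborts the whole transaction, which is what gives 2PC its atomicity across shards.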
Paxos Protocol:
Solves consistency among nodes; elects a new leader if the primary fails.
Roles: Proposer and Acceptor.
Steps: Prepare: the Proposer sends a prepare request, carrying a proposal number, to the Acceptors; each Acceptor that has not promised a higher number replies with a promise, along with any value it has already accepted. Accept: the Proposer then sends an accept request, proposing the highest previously accepted value if one was reported; if a majority of Acceptors accept, the proposal is chosen and the Proposer notifies all Acceptors.
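A single‑decree Paxos round can be sketched as below; the proposal numbers, the majority quorums, and the "adopt the highest previously accepted value" rule are the essential safety ingredients (networking and error handling omitted):

```python
class Acceptor:
    def __init__(self):
        self.promised = -1        # highest proposal number promised
        self.accepted = None      # (number, value) accepted so far, if any

    def prepare(self, n):         # phase 1: promise to ignore lower numbers
        if n > self.promised:
            self.promised = n
            return True, self.accepted
        return False, None

    def accept(self, n, value):   # phase 2: accept unless a higher promise exists
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, value)
            return True
        return False

def propose(acceptors, n, value):
    majority = len(acceptors) // 2 + 1
    # Phase 1: gather promises from a majority.
    replies = [a.prepare(n) for a in acceptors]
    promises = [acc for ok, acc in replies if ok]
    if len(promises) < majority:
        return None
    # Safety rule: re-propose the highest value already accepted, if any.
    prior = [acc for acc in promises if acc is not None]
    if prior:
        value = max(prior)[1]
    # Phase 2: the value is chosen once a majority accepts it.
    if sum(a.accept(n, value) for a in acceptors) >= majority:
        return value
    return None

acceptors = [Acceptor() for _ in range(3)]
print(propose(acceptors, n=1, value="leader=node-a"))
```

Once a value is chosen, any later proposal with a higher number rediscovers and re‑proposes that same value, which is how Paxos keeps replicas consistent.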
Compared with 2PC: 2PC guarantees atomicity across shards; Paxos guarantees consistency among replicas of a single shard.
Typical uses: global lock services (e.g., Apache Zookeeper) and multi‑datacenter replication (e.g., Google Megastore).
Storage Layer Redundancy:
Multiple replicas provide high availability.
Implementation methods: Log‑based replication. Master‑Slave (MySQL, MongoDB). Replica Set (MongoDB).
Dual‑write architectures allow multi‑master peer‑to‑peer structures, offering flexibility at higher cost.
Backup Strategies:
Cold backup: periodic copy to external media; simple and cheap but may be inconsistent and slow to restore.
Hot backup: online backup providing higher availability. Asynchronous hot backup: writes return to the application immediately, with replication occurring later. Synchronous hot backup: all replicas write synchronously; the slowest server determines response latency.
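The latency difference between the two hot‑backup modes can be stated in one line each: synchronous replication responds at the pace of the slowest replica, while asynchronous replication responds at the pace of the primary alone (the numbers below are made up):

```python
def sync_write_latency(replica_latencies):
    """Synchronous hot backup: respond only after every replica has written,
    so the slowest server determines response latency."""
    return max(replica_latencies)

def async_write_latency(primary_latency, replica_latencies):
    """Asynchronous hot backup: respond after the primary's local write;
    replicas catch up later, so they may briefly lag behind."""
    return primary_latency

latencies_ms = [5, 8, 120]                     # one straggling replica
print(sync_write_latency(latencies_ms))        # 120
print(async_write_latency(5, latencies_ms))    # 5
```

The trade‑off is exactly the one described above: asynchronous mode is fast but risks losing the latest writes, synchronous mode is consistent but pays the straggler's latency.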
Failover Mechanism:
Failure detection via heartbeat to confirm node outage.
Access redirection routes traffic to healthy nodes while keeping data consistent.
Data recovery uses master‑slave replication and logs.
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.