Why Logs Are the New Database: Shared Log Architecture in Distributed Systems
This article explores how modern distributed databases treat logs as foundational storage components, examines industry best practices from Aurora DSQL, DynamoDB, and OceanBase, and abstracts the essential properties and design considerations for building a universal, durable, and linearizable log module.
1. Introduction
At re:Invent 2023, Peter DeSantis of AWS Utility Computing said, “The log is the database.” The statement resurfaced at re:Invent 2024 during the Aurora DSQL and DynamoDB announcements, a sign that logs are increasingly recognized as core database primitives.
2. The Essence of Logs
Jay Kreps, a co-creator of Kafka, described the log as an append‑only sequence of records, ordered by time, that captures what happened and when, and placed it at the heart of distributed data systems. An append‑only log offers atomicity and durability, allowing databases to build consistency and isolation on top of it.
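To make the idea concrete, here is a minimal in-memory sketch of an append‑only log and of rebuilding state by replaying it. The names AppendOnlyLog, append, read_from, and replay are hypothetical and chosen only for illustration; a real log would persist and replicate records rather than keep them in a Python list.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class AppendOnlyLog:
    """In-memory stand-in for a durable log (assumption: no fsync, no replication)."""
    _records: List[Tuple[str, Any]] = field(default_factory=list)

    def append(self, key: str, value: Any) -> int:
        """Add one record at the tail and return its position in the log."""
        self._records.append((key, value))
        return len(self._records) - 1

    def read_from(self, position: int) -> List[Tuple[str, Any]]:
        """Return every record at or after the given position, in log order."""
        return list(self._records[position:])


def replay(log: AppendOnlyLog) -> Dict[str, Any]:
    """Rebuild key-value state by applying records in log order."""
    state: Dict[str, Any] = {}
    for key, value in log.read_from(0):
        state[key] = value  # a later record for the same key overwrites the earlier one
    return state


log = AppendOnlyLog()
log.append("x", 1)
log.append("y", 2)
last_position = log.append("x", 3)
state = replay(log)
```

Because records are never modified in place, the table state is a pure function of the log: any replica that replays the same records in the same order arrives at the same state.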
3. Industry Practices
Major cloud‑native databases have abstracted a dedicated log module: AWS Aurora DSQL, DynamoDB, and Ant Group’s OceanBase all employ a shared‑log design to separate compute, log, and storage.
3.1 Aurora DSQL
Aurora DSQL decouples an append‑only log from storage. This enables multi‑writer ordering and strong consistency across AZs, and removes the single‑writer limitation of Aurora PostgreSQL.
3.2 DynamoDB
DynamoDB global tables now support multi‑Region strong consistency (mRSC) by writing operations to a shared mRSC log; once a log entry is persisted in at least two Regions, the write is applied and acknowledged. A special “heartbeat” request is also appended to the log to support strongly consistent reads, but the heartbeat itself is not persisted as data.
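The write path described above can be sketched as a simple quorum check. This is an illustration under stated assumptions, not DynamoDB's implementation or API: RegionLog, replicated_write, and WRITE_QUORUM are hypothetical names, and the outage flag only simulates an unreachable Region.

```python
from typing import Any, Dict, List

WRITE_QUORUM = 2  # "at least two regions", taken from the description above


class RegionLog:
    """Hypothetical per-region log replica; `available` simulates an outage."""

    def __init__(self, name: str, available: bool = True) -> None:
        self.name = name
        self.available = available
        self.entries: List[Dict[str, Any]] = []

    def persist(self, entry: Dict[str, Any]) -> bool:
        """Store the entry and report success, or report failure if unreachable."""
        if not self.available:
            return False
        self.entries.append(entry)
        return True


def replicated_write(regions: List[RegionLog], table: Dict[str, Any],
                     key: str, value: Any) -> bool:
    """Apply and acknowledge a write only after enough regions persisted it."""
    entry = {"op": "put", "key": key, "value": value}
    acks = sum(1 for region in regions if region.persist(entry))
    if acks < WRITE_QUORUM:
        # Not durable enough: do not apply or acknowledge. A real system would
        # also have to resolve the entry left behind in the minority region.
        return False
    table[key] = value
    return True


table: Dict[str, Any] = {}
regions = [RegionLog("r1"), RegionLog("r2", available=False), RegionLog("r3")]
acknowledged = replicated_write(regions, table, "order-1", "paid")
```

With one of three regions down, the write still reaches two logs and is acknowledged; with two regions down it would be rejected and the table left unchanged.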
3.3 OceanBase
OceanBase implements PALF, a Paxos‑backed Append‑only Log File System. PALF records transactions and replicates logs to followers. It also introduces innovations such as a “Reconfirmation” phase for leader election and a “Pending Follower” state that ensures every transaction has an unambiguous outcome.
4. Abstracting a Shared Log
A universal log component for distributed storage must satisfy four fundamental properties: durability, fault tolerance, unique ordering, and linearizability. Beyond these, it should support learner registration for state‑machine replication, partition‑aware I/O fencing and flushing, and high scalability to avoid bottlenecks in multi‑region deployments.
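A few of these requirements can be illustrated with a single-process sketch. It models unique ordering (one position per entry), learner registration (replay, then streaming), and fencing through an epoch number. The epoch scheme and the names SharedLog, FencedError, new_epoch, and register_learner are assumptions made for this example; durability, fault tolerance, and partition awareness are not modeled.

```python
from typing import Callable, List, Tuple

Learner = Callable[[int, bytes], None]


class FencedError(Exception):
    """Raised when a writer holding a stale epoch tries to append."""


class SharedLog:
    def __init__(self) -> None:
        self._entries: List[bytes] = []
        self._epoch = 0
        self._learners: List[Learner] = []

    def new_epoch(self) -> int:
        """Advance the epoch, which fences every writer holding an older one."""
        self._epoch += 1
        return self._epoch

    def register_learner(self, learner: Learner, from_position: int = 0) -> None:
        """Replay existing entries to the learner, then stream new ones to it."""
        for position in range(from_position, len(self._entries)):
            learner(position, self._entries[position])
        self._learners.append(learner)

    def append(self, epoch: int, payload: bytes) -> int:
        """Assign the next position to the payload and notify all learners."""
        if epoch != self._epoch:
            raise FencedError(f"epoch {epoch} is stale; current is {self._epoch}")
        self._entries.append(payload)
        position = len(self._entries) - 1
        for learner in self._learners:
            learner(position, payload)
        return position


shared = SharedLog()
seen: List[Tuple[int, bytes]] = []
old_epoch = shared.new_epoch()
shared.append(old_epoch, b"a")
shared.register_learner(lambda position, payload: seen.append((position, payload)))
new_epoch = shared.new_epoch()   # the old writer is fenced from here on
shared.append(new_epoch, b"b")
```

The learner receives the entry written before it registered and the one written after, in the same order as the log. An append that still carries old_epoch raises FencedError, which is how a deposed writer is kept from extending the log.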
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
