From Paper Tape to Distributed Cloud‑Native Databases: Evolution and Future

This article traces the history of data management from manual storage and early file-system models through the relational revolution to modern distributed databases, covering key concepts such as the CAP theorem, distributed transactions, HTAP, and cloud-native deployment trends.

Xiaolei Talks DB

The Emergence and Development of Data Management Technology

Before discussing distributed databases, let's look at where databases came from. My own first exposure to the topic was the textbook "Introduction to Database Systems" by Wang Shan and Sa Shixuan.

Manual data management

Until roughly the mid-1950s, data lived on paper tape, punched cards, and magnetic tape. It was used mainly for scientific calculation, not kept for long-term storage. Programmers had to design and manage the data layout themselves and modify their programs whenever the data changed, so data and programs were tightly coupled.

File-system data management

From the late 1950s through the 1960s, disks became available and data moved into files. By the end of that period, hierarchical and network data models had emerged. Hierarchical models handled 1-to-1 and 1-to-N relationships efficiently but struggled with many-to-many (M-to-N) relationships; network models could describe those complex relationships but were hard to implement and use.

Birth of the relational model and growth of databases

In 1970, IBM researcher E. F. Codd introduced the relational model, which represents data as normalized two-dimensional tables and defines concepts such as relation (table), tuple (row), attribute (column), and key. Commercial relational databases such as Oracle, DB2, and SQL Server followed from the late 1970s through the 1980s, and the open-source MySQL and PostgreSQL arrived in the 1990s. After 2000, growing data volumes exposed the limits of single-node databases, leading to sharding and middleware solutions.
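To make the relational vocabulary concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `student` table and its contents are made up for illustration; the point is only to show a table (relation), rows (tuples), columns (attributes), and a primary key that rejects duplicates.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# A relation (table) with three attributes (columns).
# student_id is the key: it identifies each tuple (row) uniquely.
conn.execute(
    """
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        major      TEXT NOT NULL
    )
    """
)

# Each inserted row is one tuple of the relation (illustrative data).
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Li Lei", "Math"), (2, "Han Meimei", "Physics")],
)

# Look up a tuple by its key.
rows = conn.execute(
    "SELECT name, major FROM student WHERE student_id = ?", (2,)
).fetchall()

# The key constraint rejects a second tuple with the same student_id.
try:
    conn.execute("INSERT INTO student VALUES (?, ?, ?)", (1, "Duplicate", "History"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

The program only declares what the data looks like and what it wants back; how rows are stored and found is left to the database, which is the decoupling that the manual and file-based stages lacked.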

Big Data Drives Distributed Database Emergence

Google's papers on GFS (2003), MapReduce (2004), and Bigtable (2006) laid the foundation for the Hadoop ecosystem. The later Spanner and F1 papers introduced globally distributed transactions and scalable SQL services.

These papers spurred the creation of many NoSQL and distributed relational databases.

Elements of distributed databases

A distributed database connects multiple physical nodes into one logical whole, managed by a distributed DBMS. Key elements include scalability (smooth scale-out with automated rebalancing), consistency (financial-grade ACID guarantees), and availability (continuing to serve requests despite network partitions or node failures).

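CAP theorem

The CAP theorem states that a distributed system cannot simultaneously guarantee all three of the following properties:

Consistency (all nodes see the latest data)
Availability (every request receives a response)
Partition tolerance (system continues despite network partitions)

Because network partitions cannot be ruled out in practice, the real choice is between consistency and availability while a partition lasts. Systems that favor availability often adopt weaker guarantees such as BASE (Basically Available, Soft state, Eventual consistency).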
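The trade-off can be illustrated with a toy two-replica register. This is a sketch under simplifying assumptions, not how any real database is built: the class name, the `mode` flag, and the manually toggled `partitioned` attribute are all hypothetical, and the "CP" mode here only refuses writes.

```python
class Unavailable(Exception):
    """Raised when the register refuses a request to stay consistent."""


class TwoReplicaRegister:
    """Toy register replicated on nodes 'a' and 'b' (illustrative only)."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = {"a": 0, "b": 0}
        self.partitioned = False  # toggled by hand to simulate a network split

    def write(self, node, value):
        if not self.partitioned:
            # Healthy network: the write reaches every replica.
            for name in self.replicas:
                self.replicas[name] = value
            return
        if self.mode == "CP":
            # Prefer consistency: refuse the write instead of diverging.
            raise Unavailable("cannot reach the other replica")
        # Prefer availability: accept locally; replicas now disagree.
        self.replicas[node] = value

    def read(self, node):
        return self.replicas[node]


cp = TwoReplicaRegister("CP")
cp.write("a", 1)
cp.partitioned = True
try:
    cp.write("a", 2)
    cp_write_refused = False
except Unavailable:
    cp_write_refused = True

ap = TwoReplicaRegister("AP")
ap.write("a", 1)
ap.partitioned = True
ap.write("a", 2)  # accepted, but node 'b' still holds the old value
```

During the simulated partition, the CP register stays consistent by giving up availability for writes, while the AP register keeps answering but lets the two replicas diverge until they can be reconciled.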

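Distributed transactions

Common protocols and patterns include two-phase commit (2PC), Saga, and TCC (Try-Confirm-Cancel), each offering different trade-offs for atomicity across nodes. 2PC aims for an atomic commit but can block while participants wait for the coordinator; Saga and TCC split the work into local steps with compensating or cancel actions, giving up some isolation in exchange.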
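As a rough illustration of 2PC, the sketch below runs the two phases in a single process. The `Participant` class and `two_phase_commit` function are hypothetical names, and the sketch leaves out everything that makes real 2PC hard: durable logging, timeouts, and coordinator failure.

```python
class Participant:
    """A resource manager taking part in the transaction (illustrative)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # simulates whether local checks pass
        self.state = "init"

    def prepare(self):
        # Phase 1: vote yes only if the local work can be made durable.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Return True if every participant committed, False if all rolled back."""
    # Phase 1 (prepare): collect a vote from every participant.
    votes = [p.prepare() for p in participants]

    # Phase 2 (commit or rollback): act on the unanimous decision.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


happy = [Participant("orders"), Participant("inventory")]
happy_result = two_phase_commit(happy)

failing = [Participant("orders"), Participant("inventory", can_commit=False)]
failing_result = two_phase_commit(failing)
```

A single "no" vote in the prepare phase is enough to roll back every participant, which is what gives 2PC its all-or-nothing behavior.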

Types of distributed databases

"Pseudo" distributed databases keep MySQL as the storage layer and add middleware that routes each query to the right shard (e.g., Cobar, Mycat).
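The routing step performed by such middleware can be sketched as a hash on the sharding key. This is not Cobar's or Mycat's actual algorithm or configuration; the shard names and the `shard_for` and `route` helpers are hypothetical.

```python
import zlib

# Hypothetical backend MySQL instances sitting behind the middleware.
SHARDS = ["mysql-shard-0", "mysql-shard-1", "mysql-shard-2", "mysql-shard-3"]


def shard_for(key, shards=SHARDS):
    """Map a sharding key to one backend by hashing it.

    zlib.crc32 is used because it is stable across processes, unlike
    Python's built-in hash() for strings.
    """
    index = zlib.crc32(key.encode("utf-8")) % len(shards)
    return shards[index]


def route(statements):
    """Group (sharding_key, sql) pairs by the shard that owns each key."""
    plan = {}
    for key, sql in statements:
        plan.setdefault(shard_for(key), []).append(sql)
    return plan


plan = route(
    [
        ("user:1", "SELECT * FROM orders WHERE user_id = 1"),
        ("user:1", "UPDATE users SET name = 'x' WHERE id = 1"),
    ]
)
```

Simple modulo hashing like this remaps most keys whenever the shard count changes, which is one reason middleware-based sharding makes scaling out and rebalancing laborious.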

Shared-storage distributed databases separate compute from storage, so compute nodes can be scaled dynamically on top of a common storage layer (e.g., Amazon Aurora, Alibaba PolarDB).

Decentralized distributed databases adopt a shared-nothing architecture and replicate data with consensus algorithms such as Multi-Paxos or Raft (e.g., Ant Group's OceanBase and PingCAP's TiDB, respectively).
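What Multi-Paxos and Raft have in common is that a write counts as committed only once a majority of replicas has acknowledged it. The sketch below shows just that majority rule; it is neither Raft nor Multi-Paxos, since it leaves out leader election, terms, and log repair, and the `Replica` and `replicate` names are hypothetical.

```python
def majority(cluster_size):
    """Smallest number of nodes that forms a majority."""
    return cluster_size // 2 + 1


class Replica:
    """One node holding a copy of the log (illustrative)."""

    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable  # False simulates a crashed or cut-off node
        self.log = []

    def append(self, entry):
        if not self.reachable:
            return False
        self.log.append(entry)
        return True


def replicate(leader, followers, entry):
    """Append on the leader, fan out, and report whether a majority acked."""
    leader.log.append(entry)
    acks = 1 + sum(f.append(entry) for f in followers)  # the leader counts as 1
    return acks >= majority(1 + len(followers))


# Three nodes, one follower down: 2 of 3 acknowledge, so the entry commits.
committed = replicate(
    Replica("n1"), [Replica("n2"), Replica("n3", reachable=False)], "x=1"
)

# Three nodes, both followers down: only the leader has the entry.
not_committed = replicate(
    Replica("n1"),
    [Replica("n2", reachable=False), Replica("n3", reachable=False)],
    "x=2",
)
```

With three replicas the cluster tolerates one failure; with five it tolerates two, which is why such systems are usually deployed with an odd number of replicas.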

Future of Distributed Databases

At the recent China Database Conference, two trends stood out: HTAP (Hybrid Transactional/Analytical Processing) and cloud-native deployment.

HTAP aims to serve both OLTP and OLAP workloads from a single system in real time, reducing the need for separate ETL pipelines.
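One way to picture HTAP is a store that keeps a row-oriented copy for point lookups and a column-oriented copy for aggregates, both fed by the same writes. The `TinyHTAPStore` class below is a hypothetical, insert-only sketch; real systems typically replicate to the columnar copy asynchronously and also handle updates and deletes.

```python
from collections import defaultdict


class TinyHTAPStore:
    """Insert-only store with a row copy and a column copy (illustrative)."""

    def __init__(self):
        self.rows = {}                    # row store: order_id -> full row
        self.columns = defaultdict(list)  # column store: column name -> values

    def insert(self, order_id, region, amount):
        row = {"order_id": order_id, "region": region, "amount": amount}
        # Transactional path: keep the whole row together for point lookups.
        self.rows[order_id] = row
        # Analytical path: append each value to its own column.
        for column, value in row.items():
            self.columns[column].append(value)

    def get(self, order_id):
        """OLTP-style point lookup by key."""
        return self.rows[order_id]

    def total_by_region(self):
        """OLAP-style aggregate that reads only the two columns it needs."""
        totals = defaultdict(int)
        for region, amount in zip(self.columns["region"], self.columns["amount"]):
            totals[region] += amount
        return dict(totals)


store = TinyHTAPStore()
store.insert(1, "east", 10)
store.insert(2, "west", 5)
store.insert(3, "east", 7)
```

Because the analytical copy is maintained as part of the write path, the aggregate sees new orders immediately, with no separate ETL job copying data into a warehouse.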

Cloud-native databases build on Kubernetes primitives (StatefulSets, Operators, local PersistentVolumes) to run stateful workloads reliably on private or public clouds, and many vendors already provide operators for this purpose.
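For a sense of what a StatefulSet gives a database, the sketch below builds a minimal `apps/v1` StatefulSet manifest as a Python dictionary and serializes it to JSON, which Kubernetes accepts alongside YAML. The name `demo-db`, the image `example.com/demo-db:latest`, the mount path, and the storage size are placeholders, and a production operator would manage far more than this.

```python
import json


def statefulset_manifest(name, image, replicas, storage):
    """Build a minimal StatefulSet manifest (all values are placeholders)."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            # Headless Service that gives each pod a stable DNS name.
            "serviceName": name,
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "volumeMounts": [
                                {"name": "data", "mountPath": "/var/lib/data"}
                            ],
                        }
                    ]
                },
            },
            # One PersistentVolumeClaim per pod, kept across pod restarts.
            "volumeClaimTemplates": [
                {
                    "metadata": {"name": "data"},
                    "spec": {
                        "accessModes": ["ReadWriteOnce"],
                        "resources": {"requests": {"storage": storage}},
                    },
                }
            ],
        },
    }


manifest = statefulset_manifest("demo-db", "example.com/demo-db:latest", 3, "10Gi")
manifest_json = json.dumps(manifest, indent=2)
```

The stable pod identity and the per-pod volume claim are what let a database replica come back with the same name and the same data after a restart; an operator layers database-specific tasks such as failover and backup on top.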

Written by

Xiaolei Talks DB

Sharing daily database operations insights, from distributed databases to cloud migration. Author: Dai Xiaolei, with 10+ years of DB ops and development experience. Your support is appreciated.
