
Why Multi-Model Databases Are the Future of Cloud Data Management

The article explains how cloud-driven demands and diverse data types have spurred the rise of multi-model databases, detailing their architecture, storage structures, compression techniques, and access methods using SequoiaDB as a concrete example.

dbaplus Community

1. Cloud‑driven demand for Multi‑Model databases

Modern cloud‑native applications generate structured (relational), semi‑structured (JSON/XML) and unstructured (images, video, documents) data. Running a separate database service for each data type on a database platform‑as‑a‑service (dbPaaS) increases operational overhead and makes it harder to keep data consistent across services. A Multi‑Model database that natively supports all of these data types on a single platform reduces complexity and cost.

Cloud database multi‑model diagram

2. Multi‑Model storage engine architecture

Two architectural patterns address heterogeneous data:

Polyglot Persistence – deploy multiple specialized databases side‑by‑side. Each workload gets optimal performance, but the system incurs higher deployment, monitoring and schema‑management complexity.

Multi‑Model database – embed several storage engines inside a single distributed database, exposing a unified API and metadata layer. This simplifies development, deployment and backup.

SequoiaDB follows the second pattern: a single distributed engine hosts relational tables, JSON documents, object data and full‑text indexes simultaneously.

Multi‑Model engine architecture diagram

3. Storage data structures

SequoiaDB stores all data as BSON (Binary JSON) documents. BSON retains JSON’s hierarchical model while adding binary types (Date, BinData, etc.) and a compact binary layout that enables fast traversal and schema‑less storage.
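To make the "compact binary layout" concrete, here is a minimal sketch of a BSON encoder covering a handful of types from the public BSON specification (double, string, embedded document, boolean, int32). It is an illustration only and is not taken from SequoiaDB's code; the function names are hypothetical.

```python
import struct

def encode_cstring(text: str) -> bytes:
    # BSON element names are NUL-terminated UTF-8 strings.
    return text.encode("utf-8") + b"\x00"

def encode_element(name: str, value) -> bytes:
    key = encode_cstring(name)
    # bool must be tested before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return b"\x08" + key + (b"\x01" if value else b"\x00")
    if isinstance(value, int):
        # 0x10 = int32; struct raises an error if the value does not fit.
        return b"\x10" + key + struct.pack("<i", value)
    if isinstance(value, float):
        return b"\x01" + key + struct.pack("<d", value)
    if isinstance(value, str):
        raw = value.encode("utf-8") + b"\x00"
        # The length prefix counts the trailing NUL byte.
        return b"\x02" + key + struct.pack("<i", len(raw)) + raw
    if isinstance(value, dict):
        return b"\x03" + key + encode_document(value)
    raise TypeError(f"type not covered by this sketch: {type(value).__name__}")

def encode_document(doc: dict) -> bytes:
    body = b"".join(encode_element(k, v) for k, v in doc.items())
    # Total size = 4-byte length prefix + body + 1-byte terminator.
    return struct.pack("<i", len(body) + 5) + body + b"\x00"

def document_length(buffer: bytes) -> int:
    # A reader can skip a whole document by looking only at its first 4 bytes.
    return struct.unpack_from("<i", buffer)[0]

encoded = encode_document({"hello": "world"})
print(encoded.hex(" "))
```

Because every document and every string carries its length up front, a scanner can jump over fields it does not need instead of parsing them character by character, which is what makes traversal cheaper than with textual JSON.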

BSON structure example

Physical storage is organized as files → extents → pages: a file is divided into extents, and each extent is made up of pages. Logical containers are:

Collection Space – a group of files that isolates a set of collections.

Collection – a logical container for BSON documents, analogous to a table.

Document – a single BSON record stored inside a collection.

Each collection consists of a linked list of extents; an extent is a linked list of pages. When a collection exhausts its current extent, the engine allocates a new extent and links it, allowing continuous growth without pre‑allocation.
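The growth behavior described above can be sketched as follows. This is a simplified in-memory model, not SequoiaDB's on-disk format: the capacities `PAGES_PER_EXTENT` and `RECORDS_PER_PAGE` and all class names are hypothetical, and the pages inside an extent are held in a plain Python list for brevity.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional

PAGES_PER_EXTENT = 4   # hypothetical capacity, chosen small for the demo
RECORDS_PER_PAGE = 2   # hypothetical capacity, chosen small for the demo

@dataclass
class Extent:
    pages: List[List[bytes]] = field(default_factory=lambda: [[]])
    next_extent: Optional["Extent"] = None

    def try_append(self, record: bytes) -> bool:
        """Store the record in this extent; return False if the extent is full."""
        if len(self.pages[-1]) >= RECORDS_PER_PAGE:
            if len(self.pages) >= PAGES_PER_EXTENT:
                return False
            self.pages.append([])
        self.pages[-1].append(record)
        return True

class Collection:
    def __init__(self) -> None:
        self.first = Extent()
        self.last = self.first
        self.extent_count = 1

    def insert(self, record: bytes) -> None:
        if not self.last.try_append(record):
            # Current extent is exhausted: allocate a new one and link it.
            new_extent = Extent()
            self.last.next_extent = new_extent
            self.last = new_extent
            self.extent_count += 1
            new_extent.try_append(record)

    def scan(self) -> Iterator[bytes]:
        # A full scan simply follows the extent chain from the head.
        extent = self.first
        while extent is not None:
            for page in extent.pages:
                yield from page
            extent = extent.next_extent

coll = Collection()
for i in range(9):
    coll.insert(f"doc-{i}".encode())
print(coll.extent_count, len(list(coll.scan())))
```

With these toy capacities one extent holds eight records, so the ninth insert triggers the allocation of a second extent; nothing has to be sized in advance.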

3.1 Structured and semi‑structured data

Structured data (fixed schema) and semi‑structured data (self‑describing JSON/XML) coexist in the same collection. Because BSON is schema‑less, fields can be added, removed or changed on a per‑document basis without schema migrations.
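As a rough illustration of per-document flexibility, the sketch below models a collection as a list of Python dicts standing in for BSON documents. The helpers `find` and `get_path` are hypothetical and are not part of any SequoiaDB driver.

```python
# Three documents in one "collection", each with a different shape.
collection = [
    {"id": 1, "name": "alice", "balance": 120.5},
    {"id": 2, "name": "bob", "balance": 80.0, "tags": ["vip"]},
    {"id": 3, "name": "carol", "profile": {"city": "Shenzhen"}},
]

def find(docs, **conditions):
    """Return documents whose fields equal all given conditions.

    Documents that lack a queried field simply do not match.
    """
    return [
        d for d in docs
        if all(k in d and d[k] == v for k, v in conditions.items())
    ]

def get_path(doc, dotted, default=None):
    """Read a nested field such as 'profile.city', tolerating missing keys."""
    current = doc
    for part in dotted.split("."):
        if not isinstance(current, dict) or part not in current:
            return default
        current = current[part]
    return current

# Adding a field to one document needs no migration of the others.
collection[0]["email"] = "alice@example.com"

print(find(collection, name="bob"))
print([get_path(d, "profile.city") for d in collection])
```

The trade-off is that readers must tolerate absent fields, which is why both helpers treat a missing key as "no match" or a default rather than an error.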

3.2 Unstructured data (LOB)

Large objects (LOBs) such as images, videos or PDFs are managed by a dedicated LOB subsystem. When a LOB is written, the engine:

Assigns a globally unique OID.

Splits the binary payload into fixed‑size shards (default 512 KB).

Hashes each shard (OID + sequence) to select a target partition group.

Stores shard metadata in a LOBM file and the raw shard data in a LOBD file.

To read a LOB, the caller supplies its OID. The engine first locates the shard with sequence=0, which holds the LOB's metadata (size, creation time, etc.), then retrieves the remaining shards in sequence order and reassembles them into the original payload.
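The write and read paths can be sketched together in a few lines. This is a toy in-memory model under stated assumptions: the article does not name the hash function, so MD5 is used here purely as a stand-in; the 12-byte random OID, the `lobm`/`lobd` dictionaries and all function names are hypothetical.

```python
import hashlib
import os

SHARD_SIZE = 512 * 1024  # default shard size quoted in the article

def new_oid() -> bytes:
    # Stand-in for the engine's OID generator (assumption: 12 random bytes).
    return os.urandom(12)

def pick_group(oid: bytes, seq: int, num_groups: int) -> int:
    # Hash (OID + sequence) to choose a partition group; MD5 is an assumption.
    digest = hashlib.md5(oid + seq.to_bytes(4, "big")).digest()
    return int.from_bytes(digest[:4], "big") % num_groups

def shard_count(size: int, shard_size: int) -> int:
    # Even an empty LOB gets shard 0, because shard 0 carries the metadata.
    return max(1, -(-size // shard_size))

def write_lob(groups, payload: bytes, shard_size: int = SHARD_SIZE) -> bytes:
    oid = new_oid()
    for seq in range(shard_count(len(payload), shard_size)):
        chunk = payload[seq * shard_size:(seq + 1) * shard_size]
        group = groups[pick_group(oid, seq, len(groups))]
        meta = {"length": len(chunk)}
        if seq == 0:
            meta["lob_size"] = len(payload)
        group["lobm"][(oid, seq)] = meta   # metadata, as in the LOBM file
        group["lobd"][(oid, seq)] = chunk  # raw bytes, as in the LOBD file
    return oid

def read_lob(groups, oid: bytes, shard_size: int = SHARD_SIZE) -> bytes:
    # Shard 0 tells us the total size, and therefore how many shards to fetch.
    first = groups[pick_group(oid, 0, len(groups))]
    size = first["lobm"][(oid, 0)]["lob_size"]
    parts = []
    for seq in range(shard_count(size, shard_size)):
        group = groups[pick_group(oid, seq, len(groups))]
        parts.append(group["lobd"][(oid, seq)])
    return b"".join(parts)

groups = [{"lobm": {}, "lobd": {}} for _ in range(3)]
payload = bytes(range(256)) * 10            # 2560 bytes
oid = write_lob(groups, payload, shard_size=1000)
restored = read_lob(groups, oid, shard_size=1000)
print(len(restored), [len(g["lobd"]) for g in groups])
```

Because the target group is a pure function of OID and sequence number, a reader can compute where every shard lives without consulting a central directory, and the shards of one large object spread across groups.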

LOB file logical structure

3.3 Data access and compression

SQL interface – SequoiaDB implements PostgreSQL/MySQL‑compatible protocols, allowing existing SQL applications to connect without code changes.

Native APIs – Drivers for C, C++, Java, Python, Go, Node.js and other languages provide direct collection and document operations.

Compression – Row‑level compression uses Snappy (dictionary‑free, fast) while table‑level compression uses LZW (dictionary‑based) to reduce storage footprint and improve I/O throughput.

SequoiaFS – A POSIX‑style file system built on FUSE maps LOB collections to a virtual directory hierarchy, enabling standard file operations (open, read, write, delete) on distributed LOB data.
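To show what "dictionary‑based" means for the LZW option mentioned above, here is a textbook LZW compressor and decompressor over bytes. It is a generic sketch of the algorithm, not SequoiaDB's implementation; a production codec would also pack the integer codes into a bit stream and bound the dictionary size.

```python
from typing import List

def lzw_compress(data: bytes) -> List[int]:
    # Start with one dictionary entry per possible byte value.
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes: List[int] = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate          # keep extending the match
        else:
            codes.append(table[current])
            table[candidate] = next_code  # learn the new sequence
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes

def lzw_decompress(codes: List[int]) -> bytes:
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    parts = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # Code refers to the entry being defined right now.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        parts.append(entry)
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(parts)

rows = b'{"name":"alice","city":"Shenzhen"}' * 50
codes = lzw_compress(rows)
print(len(rows), len(codes), lzw_decompress(codes) == rows)
```

Repetitive records such as the rows above collapse into far fewer codes than input bytes, because recurring field names and values become single dictionary entries. A dictionary‑free codec like Snappy, by contrast, only finds repeats within the block it is compressing.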

4. Summary

Multi‑Model databases such as SequoiaDB provide a unified storage engine that can manage relational, JSON, object and full‑text data together, while offering SQL compatibility, language‑native APIs, built‑in compression and a FUSE‑based file system for unstructured data. This architecture aligns with cloud‑native requirements for scalability, operational simplicity and cost efficiency, and reflects the broader industry trend of extending traditional relational systems with native JSON and LOB support.
