Overview of Database Architecture, Storage Engines, and Data Layouts

This article explains the core components of database systems, including their client‑server architecture, query processing, storage engine modules, classification by storage media (memory vs. disk), and the differences between row‑oriented and column‑oriented data layouts, concluding with future topics to explore.

NetEase LeiHuo UX Big Data Technology
A database's primary job is to store data reliably and make it accessible to users. It serves as the main data source for applications and lets different parts of a system share data, so developers can focus on business logic rather than reinventing storage mechanisms.

This article gives a brief overview of database architecture and storage engines to help readers deepen their understanding of databases.

Database Architecture

Databases use a client-server model in which the database instance acts as the server and applications act as clients. Clients send requests, typically expressed in a query language such as SQL, through the transport subsystem. The same subsystem also handles communication with other database nodes.

When a query arrives, the query processor parses and validates it. The optimizer then removes impossible and redundant parts of the query and uses statistics and data distribution to choose the most efficient execution plan. Execution plans are often cached so that the same query does not have to be optimized again.
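One way to watch a planner choose between plans is SQLite's EXPLAIN QUERY PLAN statement, reached here through Python's built-in sqlite3 module. This is only a sketch: SQLite is an embedded library rather than a client-server database, and the table `prices` and index `idx_prices_symbol` are hypothetical names invented for the illustration.

```python
import sqlite3

def plan_details(conn, sql, params=()):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The description text is the fourth column of every result row.
    return [row[3] for row in rows]

conn = sqlite3.connect(":memory:")
# Hypothetical table and data, used only for this illustration.
conn.execute("CREATE TABLE prices (id INTEGER PRIMARY KEY, symbol TEXT, price REAL)")
conn.executemany(
    "INSERT INTO prices (symbol, price) VALUES (?, ?)",
    [("AAA", 10.0), ("BBB", 20.0), ("AAA", 11.5)],
)

query = "SELECT price FROM prices WHERE symbol = ?"

# With no index on symbol, the only available plan is a full table scan.
plan_before = plan_details(conn, query, ("AAA",))

# Once an index exists, the planner can search it instead of scanning.
conn.execute("CREATE INDEX idx_prices_symbol ON prices (symbol)")
plan_after = plan_details(conn, query, ("AAA",))

print(plan_before)
print(plan_after)
```

The exact wording of the plan text differs between SQLite versions, so it is safer to look for the index name in the output than to compare whole strings.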

The execution plan is carried out by the execution engine, which gathers the results of local and remote operations. These operations may involve reads, writes, or data replication across nodes.

The storage engine comprises a transaction manager, lock manager, storage structures, buffer manager, and recovery manager. The following diagram illustrates the modular decomposition of a database:

Database Storage Media

Databases can be classified by storage medium into memory databases and disk databases.

Memory Databases keep data entirely in RAM, optionally persisting snapshots to disk so that data can be recovered after a restart or power failure.
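The snapshot idea can be sketched with a toy key-value store. The class name `InMemoryStore`, the JSON snapshot format, and the keys are all assumptions made for this example and do not describe any particular product.

```python
import json
import os
import tempfile

class InMemoryStore:
    """Toy key-value store: every read and write hits a dict held in RAM."""

    def __init__(self):
        self.data = {}

    def put(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)

    def snapshot(self, path):
        # Write to a temporary file first and rename it afterwards, so a
        # crash while writing never replaces a good snapshot with a partial one.
        tmp_path = path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(self.data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)

    @classmethod
    def restore(cls, path):
        store = cls()
        with open(path, encoding="utf-8") as f:
            store.data = json.load(f)
        return store

snapshot_path = os.path.join(tempfile.mkdtemp(), "snapshot.json")

store = InMemoryStore()
store.put("player:1", {"level": 7})
store.snapshot(snapshot_path)
store.put("player:2", {"level": 3})  # written after the snapshot was taken

# Simulate a restart: only what the snapshot captured comes back.
recovered = InMemoryStore.restore(snapshot_path)
print(recovered.get("player:1"), recovered.get("player:2"))
```

The write made after the snapshot is lost on restart, which is why a snapshot alone only limits data loss rather than eliminating it.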

Disk Databases store data on HDD/SSD and use memory to cache disk contents or as temporary storage, which reduces disk I/O.
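A minimal sketch of memory acting as a cache in front of disk, assuming a least-recently-used (LRU) eviction policy and a plain dict standing in for the disk. `PageCache` and its fields are hypothetical names. Real buffer managers also track modified pages and write them back, which is left out here.

```python
from collections import OrderedDict

class PageCache:
    """Toy read-only LRU cache sitting between callers and a slow 'disk'."""

    def __init__(self, disk, capacity):
        self.disk = disk            # dict of page_id -> bytes, stands in for HDD/SSD
        self.capacity = capacity    # maximum number of pages held in memory
        self.pages = OrderedDict()  # page_id -> bytes, least recently used first
        self.disk_reads = 0

    def read(self, page_id):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)  # mark as most recently used
            return self.pages[page_id]
        self.disk_reads += 1                 # cache miss: fetch from disk
        page = self.disk[page_id]
        self.pages[page_id] = page
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)   # evict the least recently used page
        return page

disk = {i: f"page-{i}".encode() for i in range(4)}
cache = PageCache(disk, capacity=2)

for page_id in [0, 1, 0, 0, 2, 1]:
    cache.read(page_id)

print(cache.disk_reads)  # 4 disk reads served 6 requests
```

Repeated reads of page 0 are served from memory, and the pages evicted are the ones that have gone unused the longest.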

Memory offers faster access than disk, but it is volatile and costs more per byte. Uninterruptible power supplies (UPS) or battery-backed RAM can mitigate the volatility.

Row‑Oriented vs. Column‑Oriented Databases

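Data can be laid out by rows or by columns. Row-oriented databases store all fields of a record together. This improves locality when whole records are read, because one record can be fetched with a single contiguous read, which suits HDDs where seeks are expensive. The layout is less efficient when only a few columns are needed, since the unneeded fields are read from disk as well.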

Example of a row layout:
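A minimal sketch of such a layout, assuming a hypothetical stock-price table whose field names, widths, and values are invented for illustration, with fixed-width records serialized by Python's struct module:

```python
import struct

# Assumed fixed-width record: id (uint32), symbol (4 bytes),
# date (10 bytes), price (float64), with no padding between fields.
ROW_FORMAT = "<I4s10sd"
ROW_SIZE = struct.calcsize(ROW_FORMAT)

rows = [
    (1, b"AAA", b"2024-01-02", 10.0),
    (2, b"BBB", b"2024-01-02", 20.0),
    (3, b"AAA", b"2024-01-03", 11.5),
]

# Row layout: every field of record 1, then every field of record 2, and so on.
row_file = b"".join(struct.pack(ROW_FORMAT, *row) for row in rows)

def read_row(buf, index):
    """Fetch one whole record from a single contiguous byte range."""
    offset = index * ROW_SIZE
    rec_id, symbol, date, price = struct.unpack_from(ROW_FORMAT, buf, offset)
    return rec_id, symbol.rstrip(b"\x00").decode(), date.decode(), price

def read_prices(buf):
    """Reading one column still has to step over every full record."""
    count = len(buf) // ROW_SIZE
    return [struct.unpack_from(ROW_FORMAT, buf, i * ROW_SIZE)[3] for i in range(count)]

print(read_row(row_file, 2))
print(read_prices(row_file))
```

Fetching a whole record touches one contiguous range of `ROW_SIZE` bytes, whereas collecting only the prices still visits every record in the file.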

Column‑oriented databases store each column’s values together, which is advantageous for analytical workloads that access many rows but only a subset of columns, such as stock price analysis.

Logically, the data is still a set of tuples, but physically the values of each column are stored contiguously.
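The same idea can be sketched in a few lines, assuming a small hypothetical table (names and values invented) whose columns are split into separate contiguous arrays:

```python
from array import array

# Hypothetical table, one tuple per logical row: (id, symbol, price).
table = [(1, "AAA", 10.0), (2, "BBB", 20.0), (3, "AAA", 11.5)]

# Column layout: all ids together, then all symbols, then all prices.
ids = array("q", (row[0] for row in table))
symbols = [row[1] for row in table]
prices = array("d", (row[2] for row in table))

# An aggregate over one column reads only that column's contiguous values;
# the ids and symbols are never touched.
average_price = sum(prices) / len(prices)

print(len(prices.tobytes()), average_price)
```

Because `prices` holds nothing but prices, a scan for an average or a maximum reads only the bytes it needs.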

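To reconstruct tuples, a column store must be able to tell which values in different columns belong to the same row. Column files therefore either store a marker (key) with each value or rely on each value's offset within the column. Such files often use compressed representations to save space.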
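Both linking strategies can be sketched as follows. The column contents and the function names `tuple_by_id` and `tuple_by_offset` are hypothetical and chosen only for this example.

```python
# Option 1: each column stores (row_id, value) pairs, so every value
# carries an explicit marker naming the row it belongs to.
symbol_col = [(1, "AAA"), (2, "BBB"), (3, "AAA")]
price_col = [(1, 10.0), (2, 20.0), (3, 11.5)]

def tuple_by_id(row_id):
    """Rebuild a tuple by looking up the same key in every column."""
    symbols = dict(symbol_col)
    prices = dict(price_col)
    return row_id, symbols[row_id], prices[row_id]

# Option 2: columns store bare values in the same order, so a value's
# offset within its column implicitly identifies the row.
symbol_values = ["AAA", "BBB", "AAA"]
price_values = [10.0, 20.0, 11.5]

def tuple_by_offset(offset):
    """Rebuild a tuple by reading the same position in every column."""
    return symbol_values[offset], price_values[offset]

print(tuple_by_id(2))
print(tuple_by_offset(2))
```

The offset-based variant stores no keys at all, which saves space but requires every column to keep its values in the same row order.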

Popular columnar file formats include Apache Parquet, Apache ORC, and RCFile; column‑oriented storage systems include Apache Kudu and ClickHouse.

Conclusion

This article introduced the modular breakdown of databases, their storage media classifications, and the differences between row‑ and column‑oriented storage layouts. Future articles will cover storage implementation details, transaction mechanisms, and distributed database architectures to further deepen readers' practical knowledge.


0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.