
Understanding the Origins, Significance, and Construction of Data Warehouses

This article explains the historical background of databases and data warehouses, outlines why data warehouses are essential for modern enterprises, and provides a step‑by‑step guide to building a data warehouse using Kimball’s dimensional modeling approach.

Big Data Technology & Architecture

The rapid development of the Internet has led enterprises to build their own portals, servers, and user bases. The transactional data these systems generate is stored in enterprise‑level databases, which are designed for low‑latency CRUD operations.

1. Background of Database Creation

Databases fall into two broad types: relational (e.g., MySQL, Oracle, SQL Server) and non‑relational (e.g., Redis, HBase, MongoDB). They serve as the backbone of most application architectures.

2. Background of Data Warehouse Creation

The concept of "big data" and the idea that "data speaks" have existed for a long time, but limited hardware and immature processing frameworks delayed their practical use. Hadoop and its ecosystem made cost‑effective horizontal scaling and efficient data management practical. This in turn drove the adoption of large‑scale data warehouses as a way to standardize and exploit massive volumes of data.

3. Significance of Building a Data Warehouse

Enterprises build data warehouses and data marts primarily to provide strong data support for upper‑layer analytical applications. The key evaluation criteria include:

Performance: Fast query response and reduced I/O, achieved by abstracting common logic into reusable data models.

Cost: Lower compute cost through reuse of computed results, with storage cost kept in check by limiting redundancy to deliberate cases (e.g., degenerate dimensions, wide tables).

Efficiency: Improved user experience by exposing an abstracted ADS (Application Data Store) layer.

Quality: Consistent statistical definitions and fewer calculation errors.

4. How to Build a Data Warehouse

A sound data warehouse model needs a solid theoretical foundation; this article follows Kimball’s dimensional modeling methodology. The high‑level steps are:

Requirement research → Business research → Domain segmentation → Metric system construction → DIM layer processing → ODS layer → DWD layer → DWS layer → ADS layer (data marts).

Figure: Data Warehouse Model Diagram

DIM Layer

The dimension (DIM) layer creates dimension tables, the core of a data warehouse, by extracting and processing data from the databases of the various business lines. The main steps are:

Identify primary dimension tables

Identify secondary dimension tables

Define dimension attributes

Normalize and denormalize as needed

Handle special and fact dimensions
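The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the article's implementation: the source rows, the `ProductDim` class, and the `build_product_dim` helper are all hypothetical names introduced here. It shows a primary dimension (product) absorbing an attribute from a secondary dimension (category) through denormalization, with a warehouse‑owned surrogate key alongside the natural key.

```python
from dataclasses import dataclass

# Hypothetical source rows, as they might arrive from business-line databases.
PRODUCTS = [
    {"product_id": "P1", "name": "Laptop", "category_id": "C1"},
    {"product_id": "P2", "name": "Desk", "category_id": "C2"},
]
CATEGORIES = {"C1": "Electronics", "C2": "Furniture"}


@dataclass(frozen=True)
class ProductDim:
    product_key: int    # surrogate key assigned by the warehouse
    product_id: str     # natural key from the source system
    product_name: str
    category_name: str  # attribute denormalized from the secondary dimension


def build_product_dim(products, categories):
    """Flatten the category lookup into the product dimension."""
    ordered = sorted(products, key=lambda row: row["product_id"])
    return [
        ProductDim(
            product_key=key,
            product_id=row["product_id"],
            product_name=row["name"],
            # Unmatched categories get a placeholder instead of being dropped.
            category_name=categories.get(row["category_id"], "Unknown"),
        )
        for key, row in enumerate(ordered, start=1)
    ]


dim_product = build_product_dim(PRODUCTS, CATEGORIES)
```

Keeping the category name on the product row trades some redundancy for simpler queries, since downstream fact tables then need a single join to reach it.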

ODS Layer

The Operational Data Store (ODS) extracts data from production systems such as MySQL, HBase, Oracle, or SQL Server. It must ensure data reliability and avoid issues like missing data, inaccurate data, inconsistent naming, type mismatches, unit inconsistencies, missing comments, and unclear table names.
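As a rough sketch of the kind of checks this implies, the function below standardizes one incoming row: it maps inconsistent column names to a single convention, flags missing required values, coerces types, and converts units. The column names, the alias table, and the cents‑to‑currency rule are assumptions made for illustration only.

```python
# Hypothetical mapping from source column names to one warehouse convention.
COLUMN_ALIASES = {
    "orderId": "order_id",
    "OrderID": "order_id",
    "amt": "amount",
    "amount_cents": "amount",
}
REQUIRED = ("order_id", "amount")


def standardize(row):
    """Return (clean_row, errors) for one source record."""
    clean, errors = {}, []
    in_cents = "amount_cents" in row  # unit differs between source systems
    for name, value in row.items():
        clean[COLUMN_ALIASES.get(name, name)] = value

    # Missing-data check on required columns.
    for col in REQUIRED:
        if clean.get(col) in (None, ""):
            errors.append(f"missing {col}")

    # Type check, then unit conversion to currency units.
    amount = clean.get("amount")
    if amount not in (None, ""):
        try:
            amount = float(amount)
            clean["amount"] = amount / 100 if in_cents else amount
        except (TypeError, ValueError):
            errors.append("amount is not numeric")
    return clean, errors
```

Rows that come back with a non‑empty error list can be routed to a quarantine table rather than loaded, so bad records stay visible instead of silently polluting downstream layers.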

DWD Layer

The Data Warehouse Detail (DWD) layer contains wide tables that integrate multiple business processes. It includes transaction fact tables and accumulating snapshot fact tables, which make complex metric calculations easier.
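To make the snapshot idea concrete, here is a small sketch using Python's built‑in `sqlite3` module. The table and column names (`ods_order_event`, `dwd_order_snapshot`, and so on) are hypothetical. It pivots per‑event rows into one wide row per order, with one date column per milestone. A production pipeline would typically update such rows in place as milestones occur; this sketch rebuilds the table from events for simplicity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ods_order_event (
    order_id TEXT, event TEXT, event_date TEXT, amount REAL
);
INSERT INTO ods_order_event VALUES
    ('O1', 'placed',  '2024-01-01', 100.0),
    ('O1', 'paid',    '2024-01-02', 100.0),
    ('O1', 'shipped', '2024-01-04', 100.0),
    ('O2', 'placed',  '2024-01-03',  40.0);

-- One row per order; each milestone gets its own date column.
CREATE TABLE dwd_order_snapshot AS
SELECT order_id,
       MAX(amount) AS amount,
       MAX(CASE WHEN event = 'placed'  THEN event_date END) AS placed_date,
       MAX(CASE WHEN event = 'paid'    THEN event_date END) AS paid_date,
       MAX(CASE WHEN event = 'shipped' THEN event_date END) AS shipped_date
FROM ods_order_event
GROUP BY order_id;
""")

# Milestones that have not happened yet remain NULL (None in Python).
snapshot = conn.execute(
    "SELECT order_id, placed_date, paid_date, shipped_date "
    "FROM dwd_order_snapshot ORDER BY order_id"
).fetchall()
```

With every milestone on the same row, a metric such as time from payment to shipment becomes a simple column difference rather than a self‑join on the event table.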

DWS Layer

The Data Warehouse Summary (DWS) layer aggregates detail data into periodic metrics (e.g., monthly sales) by creating periodic snapshot fact tables for time windows such as hourly, daily, weekly, or monthly.
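The aggregation can be illustrated with a short Python sketch. The sample rows and the `window_key` and `summarize` helpers are hypothetical; the point is only that the same detail rows roll up into different summary tables depending on the chosen time grain.

```python
from collections import defaultdict
from datetime import date

# Hypothetical DWD detail rows: (order_date, amount).
DETAIL = [
    (date(2024, 1, 30), 100.0),
    (date(2024, 1, 31), 50.0),
    (date(2024, 2, 1), 25.0),
]


def window_key(day, grain):
    """Map a date to the label of the time window it belongs to."""
    if grain == "daily":
        return day.isoformat()
    if grain == "weekly":
        iso = day.isocalendar()  # (ISO year, ISO week, ISO weekday)
        return f"{iso[0]}-W{iso[1]:02d}"
    if grain == "monthly":
        return f"{day.year}-{day.month:02d}"
    raise ValueError(f"unsupported grain: {grain}")


def summarize(rows, grain):
    """Sum amounts per time window, producing one summary row per window."""
    totals = defaultdict(float)
    for day, amount in rows:
        totals[window_key(day, grain)] += amount
    return dict(totals)


monthly_sales = summarize(DETAIL, "monthly")
weekly_sales = summarize(DETAIL, "weekly")
```

Precomputing these totals once means dashboards and reports read a handful of summary rows instead of rescanning the detail layer on every query.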

ADS Layer

The Application Data Store (ADS) layer provides the highest level of abstraction, delivering ready‑to‑use data directly to business applications without exposing underlying complexities, and can be customized into separate data marts for different business lines.
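One way to picture this split is the sketch below, which divides summary rows into one data mart per business line and exposes only the columns an application needs. The row layout, the `etl_batch_id` bookkeeping column, and the `build_data_marts` helper are assumptions made for illustration.

```python
# Hypothetical DWS summary rows, including an internal bookkeeping column.
DWS_MONTHLY_SALES = [
    {"month": "2024-01", "business_line": "retail",
     "sales": 150.0, "order_count": 2, "etl_batch_id": 17},
    {"month": "2024-01", "business_line": "wholesale",
     "sales": 900.0, "order_count": 1, "etl_batch_id": 17},
    {"month": "2024-02", "business_line": "retail",
     "sales": 25.0, "order_count": 1, "etl_batch_id": 18},
]

# Only these columns are exposed to applications; internals stay hidden.
MART_COLUMNS = ("month", "sales", "order_count")


def build_data_marts(rows):
    """Group summary rows into one ready-to-use mart per business line."""
    marts = {}
    for row in rows:
        exposed = {col: row[col] for col in MART_COLUMNS}
        marts.setdefault(row["business_line"], []).append(exposed)
    return marts


marts = build_data_marts(DWS_MONTHLY_SALES)
```

Each application then reads only its own mart, for example `marts["retail"]`, without needing to know how the underlying layers are organized.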

Tags: ETL, dimensional modeling, Kimball
Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.