Comprehensive Guide to Big Data Modeling and Data Warehouse Design
This article provides an in‑depth overview of big‑data modeling concepts, covering why data modeling is essential, relational versus analytical systems, common warehouse modeling methodologies, Alibaba's practical implementations, dimension design techniques, and detailed fact‑table design principles for modern data platforms.
Chapter 1 introduces the need for data modeling in big‑data environments, emphasizing structured classification, cost reduction, efficiency improvement, and data quality, and contrasts OLTP (transaction‑oriented, 3NF) with OLAP (analysis‑oriented) systems.
It then surveys typical warehouse modeling approaches such as ER modeling, dimensional modeling (star and snowflake schemas), Data Vault, and Anchor models, highlighting their purposes and trade‑offs.
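To make the star-schema idea concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table names (`dim_product`, `dim_date`, `fact_order`), the columns, and the sample rows are all hypothetical and chosen only for illustration; the source does not prescribe any of them. The sketch shows one fact table that holds foreign keys and a numeric measure, surrounded by denormalized dimension tables that are joined in at query time.

```python
import sqlite3

# Hypothetical star schema: one fact table, two denormalized dimensions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT            -- kept inline (star), not split out (snowflake)
);
CREATE TABLE dim_date (
    date_id       INTEGER PRIMARY KEY,
    calendar_date TEXT,
    month         TEXT
);
CREATE TABLE fact_order (
    order_id   INTEGER,
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id    INTEGER REFERENCES dim_date(date_id),
    amount     REAL              -- additive measure
);
INSERT INTO dim_product VALUES
    (1, 'laptop', 'electronics'),
    (2, 'mouse',  'electronics'),
    (3, 'desk',   'furniture');
INSERT INTO dim_date VALUES
    (20240101, '2024-01-01', '2024-01'),
    (20240201, '2024-02-01', '2024-02');
INSERT INTO fact_order VALUES
    (100, 1, 20240101, 900.0),
    (101, 2, 20240101,  25.0),
    (102, 3, 20240201, 150.0),
    (103, 2, 20240201,  25.0);
""")


def sales_by_category_month(connection):
    """Aggregate the fact table along two dimension attributes."""
    return connection.execute("""
        SELECT p.category, d.month, SUM(f.amount)
        FROM fact_order AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        JOIN dim_date    AS d ON d.date_id    = f.date_id
        GROUP BY p.category, d.month
        ORDER BY p.category, d.month
    """).fetchall()


for row in sales_by_category_month(conn):
    print(row)
```

A snowflake variant would move `category` into its own table referenced from `dim_product`, trading an extra join for less redundancy.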
Chapter 2 describes Alibaba's data integration and management framework. It details the layered architecture: ODS (operational data store); CDM (common data model, the common dimension model layer), which comprises the DWD (detail) and DWS (summary) layers; and ADS (application data). It also covers the guiding principles of high cohesion, low coupling, cost‑performance balance, and naming consistency.
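The flow between layers can be sketched in plain Python. This is only an illustration under assumed data: the record fields, the cleaning rules, and the function names (`build_dwd`, `build_dws`, `build_ads`) are hypothetical and are not taken from Alibaba's implementation. The point is the direction of dependency: each layer reads only from the layer beneath it.

```python
from collections import defaultdict
from decimal import Decimal

# ODS: raw records as they arrive from the source system (strings, duplicates).
ods_orders = [
    {"order_id": "1", "buyer_id": "u1", "amount": "10.50", "status": "PAID"},
    {"order_id": "2", "buyer_id": "u1", "amount": "4.50", "status": "paid"},
    {"order_id": "3", "buyer_id": "u2", "amount": "7.00", "status": "CANCELLED"},
    {"order_id": "3", "buyer_id": "u2", "amount": "7.00", "status": "CANCELLED"},
]


def build_dwd(ods_rows):
    """DWD: cleaned, typed, de-duplicated detail rows at order granularity."""
    seen, detail = set(), []
    for row in ods_rows:
        if row["order_id"] in seen:
            continue  # drop exact re-deliveries of the same order
        seen.add(row["order_id"])
        detail.append({
            "order_id": int(row["order_id"]),
            "buyer_id": row["buyer_id"],
            "amount_cents": int(Decimal(row["amount"]) * 100),
            "status": row["status"].lower(),
        })
    return detail


def build_dws(dwd_rows):
    """DWS: reusable summary per buyer, built only from DWD."""
    summary = defaultdict(lambda: {"paid_orders": 0, "paid_cents": 0})
    for row in dwd_rows:
        if row["status"] == "paid":
            summary[row["buyer_id"]]["paid_orders"] += 1
            summary[row["buyer_id"]]["paid_cents"] += row["amount_cents"]
    return dict(summary)


def build_ads(dws_rows):
    """ADS: application-specific view, here a ranking of buyers by spend."""
    ranked = sorted(dws_rows.items(), key=lambda kv: kv[1]["paid_cents"], reverse=True)
    return [(buyer, stats["paid_cents"] / 100) for buyer, stats in ranked]


dwd = build_dwd(ods_orders)
dws = build_dws(dwd)
ads = build_ads(dws)
print(ads)
```

Keeping the cleaning logic in one DWD step and the aggregation in one DWS step is what lets several applications share the same summary instead of each recomputing it from raw data.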
Chapter 3 focuses on dimension design. It explains the basic concepts (facts versus dimensions, attributes, primary keys) and the design steps (selecting dimensions, defining granularity, identifying attributes), then turns to consistency and integration strategies, hierarchical and recursive dimensions, behavioral and multi‑value dimensions, and special cases such as mini‑dimensions.
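One common way to handle a recursive hierarchy is to flatten it into fixed level columns on the dimension row. The sketch below assumes a hypothetical category tree stored as a child-to-parent map and a fixed depth of three levels; padding short paths by repeating the leaf is one possible backfill convention, not necessarily the one the source uses.

```python
# Hypothetical recursive hierarchy: each category points to its parent.
parent_of = {
    "electronics": None,
    "phones": "electronics",
    "android_phones": "phones",
    "furniture": None,
    "desks": "furniture",
}


def path_to_root(category, parents):
    """Walk parent links upward and return the path from root to category."""
    path, seen, node = [], set(), category
    while node is not None:
        if node in seen:
            raise ValueError(f"cycle detected at {node!r}")
        seen.add(node)
        path.append(node)
        node = parents[node]
    return list(reversed(path))


def flatten(category, parents, depth=3):
    """Turn the recursive path into fixed columns level1..levelN."""
    path = path_to_root(category, parents)
    if len(path) > depth:
        raise ValueError(f"{category!r} is deeper than {depth} levels")
    # Backfill: repeat the leaf so every level column is populated.
    padded = path + [path[-1]] * (depth - len(path))
    return {f"level{i + 1}": name for i, name in enumerate(padded)}


print(flatten("android_phones", parent_of))
print(flatten("desks", parent_of))
```

Flattening lets queries group by any level without a recursive join, at the cost of rebuilding the dimension when the tree changes or grows deeper than the fixed number of columns.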
Chapter 4 covers fact‑table fundamentals. It classifies fact tables into transaction, periodic snapshot, and accumulating snapshot types, and outlines the design principles: granularity declaration, completeness, additivity, null handling, and degenerate dimensions. It then compares single‑transaction with multi‑transaction fact tables, including aggregation strategies, storage considerations, and implementation patterns used at Alibaba.
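The difference between a transaction fact table and an accumulating snapshot can be sketched as follows. The event names (`create`, `pay`, `ship`), the fields, and the sample data are hypothetical. A transaction fact table keeps one row per business event, while the accumulating snapshot keeps one row per order and fills in milestone dates as they occur.

```python
# Transaction fact rows: one row per event, at event granularity.
events = [
    {"order_id": 1, "event": "create", "date": "2024-01-01", "amount": 100},
    {"order_id": 1, "event": "pay",    "date": "2024-01-02", "amount": 100},
    {"order_id": 1, "event": "ship",   "date": "2024-01-04", "amount": 100},
    {"order_id": 2, "event": "create", "date": "2024-01-03", "amount": 40},
]


def build_accumulating_snapshot(event_rows):
    """Collapse event rows into one row per order with milestone dates."""
    snapshot = {}
    for e in event_rows:
        row = snapshot.setdefault(e["order_id"], {
            "order_id": e["order_id"],
            "amount": e["amount"],
            # Milestones not yet reached stay None until the event arrives.
            "create_date": None,
            "pay_date": None,
            "ship_date": None,
        })
        row[f"{e['event']}_date"] = e["date"]
    return snapshot


snapshot = build_accumulating_snapshot(events)

# Granularity matters for additivity: summing the amount over event rows
# counts order 1 three times, while the order-grain snapshot counts it once.
total_over_events = sum(e["amount"] for e in events)
total_over_orders = sum(row["amount"] for row in snapshot.values())
print(total_over_events, total_over_orders)
```

This is why the granularity of a fact table should be declared before its measures are defined: the same `amount` column is safe to sum at order grain but not at event grain.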
Big Data Technology & Architecture
Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
