Understanding Granularity in Data Warehouse Design
This article explains granularity in data warehouse design. It describes the data model as a combination of structures, operations, and constraints, shows how granularity affects storage detail, query performance, and resource consumption, and recommends a dual‑granularity approach to balance efficiency and analytical depth.
The previous two lessons of the Data Warehouse mini‑class introduced the basics of data warehouses and the use of metadata to manage them. This lesson focuses on the most important aspect of data warehouse design: granularity.
Data warehouse design starts with a data model, which consists of three parts: data structure, data operations, and data constraints. The data structure describes data types, contents, properties, and relationships; data operations define the types and methods of actions on those structures; data constraints enforce naming standards and element specifications to ensure correctness and compatibility.
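The three parts of a data model can be illustrated with a minimal Python sketch. Everything here (the `Product` record, the `validate` rules, and the `ProductTable` class) is a hypothetical example invented for illustration, not the model used in the author's project.

```python
from dataclasses import dataclass


# Data structure: the fields of a record and their types.
@dataclass
class Product:
    product_id: int
    product_name: str
    category: str
    price: float


# Data constraints: rules a record must satisfy before it is accepted.
def validate(product):
    errors = []
    if product.product_id <= 0:
        errors.append("product_id must be positive")
    if not product.product_name.strip():
        errors.append("product_name must not be empty")
    if product.price < 0:
        errors.append("price must not be negative")
    return errors


# Data operations: the actions that are allowed on the structure.
class ProductTable:
    def __init__(self):
        self._rows = {}

    def insert(self, product):
        errors = validate(product)
        if errors:
            raise ValueError("; ".join(errors))
        if product.product_id in self._rows:
            raise ValueError("duplicate product_id")
        self._rows[product.product_id] = product

    def find_by_category(self, category):
        return [p for p in self._rows.values() if p.category == category]
```

In this sketch the constraints are checked inside `insert`, so a record that violates them never reaches the table.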
Images illustrating the data structure, operations, and constraints in the author's project are shown below.
Granularity refers to the level of detail of data units stored in the warehouse. High‑granularity data retain many detailed fields (e.g., user ID, purchase time, product name, category, price), while low‑granularity data keep only aggregated information (e.g., user ID and total yearly spend).
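The difference between the two levels can be sketched in Python using the fields from the example above. The `Purchase` class, the `yearly_spend` function, and the sample records are assumptions made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Purchase:
    """One detailed record: a single purchase event."""
    user_id: str
    purchased_at: datetime
    product_name: str
    category: str
    price: float


def yearly_spend(purchases):
    """Roll detailed records up to one total per (user_id, year)."""
    totals = defaultdict(float)
    for p in purchases:
        totals[(p.user_id, p.purchased_at.year)] += p.price
    return dict(totals)


# Hypothetical sample data, for illustration only.
purchases = [
    Purchase("u1", datetime(2023, 1, 5), "keyboard", "electronics", 50.0),
    Purchase("u1", datetime(2023, 7, 19), "notebook", "stationery", 4.0),
    Purchase("u2", datetime(2023, 3, 2), "monitor", "electronics", 200.0),
]

summary = yearly_spend(purchases)
print(summary)  # {('u1', 2023): 54.0, ('u2', 2023): 200.0}
```

The aggregated result has fewer rows than the input, but it can no longer answer a question such as how much user `u1` spent on electronics, because the category was dropped during aggregation.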
Although high‑granularity (detailed) data seem more versatile, storing and querying large volumes of detailed records consumes significantly more resources and time; aggregating to a lower (coarser) granularity can reduce query latency and storage costs.
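A rough way to see the cost difference is to compare row counts for detailed and aggregated storage. The sketch below is a back-of-the-envelope estimate; the workload numbers are hypothetical, and it assumes every user makes purchases in every year.

```python
def row_counts(users, purchases_per_user_per_year, years):
    """Return (detailed_rows, yearly_summary_rows) for the given workload."""
    detailed = users * purchases_per_user_per_year * years
    summarized = users * years  # one row per user per year
    return detailed, summarized


# Hypothetical workload, chosen only to make the ratio visible.
detailed, summarized = row_counts(
    users=1_000_000, purchases_per_user_per_year=40, years=3
)
print(detailed, summarized, detailed // summarized)
# prints: 120000000 3000000 40
```

Under these assumptions the detailed table holds 40 times as many rows as the yearly summary, and each detailed row is also wider because it carries more fields.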
A balanced approach is to maintain two granularity levels: keep the frequently accessed low‑granularity (aggregated) data in fast, expensive storage, and store the less frequently accessed high‑granularity (detailed) data in slower, cost‑effective storage, adjusting the split continuously based on access patterns.
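The dual‑granularity idea can be sketched as one store that keeps the same facts at two levels and answers each query from the coarsest level that can serve it. The `DualGranularityStore` class and its method names are hypothetical; the hit counters stand in for real access-pattern monitoring, and the placement of each level on a physical storage tier is a deployment decision that this sketch does not model.

```python
from collections import defaultdict


class DualGranularityStore:
    """Holds purchases as per-event detail and as per-user yearly totals."""

    def __init__(self):
        self.detail = []                   # (user_id, year, category, price) rows
        self.summary = defaultdict(float)  # (user_id, year) -> total spend
        self.hits = {"summary": 0, "detail": 0}  # access counts per level

    def load(self, user_id, year, category, price):
        # Every incoming record updates both levels at load time.
        self.detail.append((user_id, year, category, price))
        self.summary[(user_id, year)] += price

    def total_spend(self, user_id, year):
        # Answerable from the aggregated level alone.
        self.hits["summary"] += 1
        return self.summary.get((user_id, year), 0.0)

    def spend_by_category(self, user_id, year, category):
        # Needs a field the summary dropped, so it scans the detail rows.
        self.hits["detail"] += 1
        return sum(
            price
            for u, y, c, price in self.detail
            if (u, y, c) == (user_id, year, category)
        )


store = DualGranularityStore()
store.load("u1", 2023, "electronics", 50.0)
store.load("u1", 2023, "stationery", 4.0)
print(store.total_spend("u1", 2023))                       # 54.0
print(store.spend_by_category("u1", 2023, "electronics"))  # 50.0
print(store.hits)  # {'summary': 1, 'detail': 1}
```

Comparing the two hit counters over time gives a simple signal for how often each level is actually used, which is the kind of evidence needed when adjusting the split.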
The author's project does not enforce a strict granularity standard; engineers must manually balance detail and performance when loading data into the warehouse.
360 Quality & Efficiency
360 Quality & Efficiency focuses on seamlessly integrating quality and efficiency in R&D, sharing 360’s internal best practices with industry peers to foster collaboration among Chinese enterprises and drive greater efficiency value.