Data Governance Blueprint: Naming Rules, Lifecycle Levels, and Layered Architecture
Explore a comprehensive data governance guide covering naming conventions, data lifecycle classifications, layered architecture standards, inter-layer calling rules, and model design principles, providing practical standards and best practices for building robust, maintainable data warehouses and analytics platforms.
Naming Standards
Effective data governance starts with clear naming conventions. The guide defines a root‑word design standard based on business domains, covering data‑warehouse layers, periods/ranges, departments, business domains, and subject areas.
Warehouse layers: dim (common dimensions), dm , ods , dwd , dws
Period/range codes: d (daily snapshot), i (incremental), f (full), w (weekly), l (link table), a (non‑partitioned full table)
Table‑naming rules are also specified:
Regular tables:
[dwd|dws|ads]_<em>department</em>_<em>businessDomain</em>_<em>subject</em>_<em>XXX</em>_<em>period|range</em>Intermediate tables: mid_<em>xxx</em> Temporary tables: tmp_<em>xxx</em> Dimension tables:
dim_<em>xxx</em>Data Lifecycle
Historical data is classified into four levels (P0‑P3) based on importance and recoverability:
P0 : Irreplaceable core data such as transaction logs, KPI data, IPO‑related tables.
P1 : Important business and application data that cannot be lost.
P2 : Important data that can be regenerated, e.g., intermediate ETL results.
P3 : Non‑critical data that is easily recoverable, such as certain reporting tables.
The lifecycle management matrix is illustrated below:
Layered Architecture Standards
ODS (Operational Data Store): The layer closest to source data; ingest raw data without extensive cleaning.
DWD (Data Warehouse Detail): Builds detailed models on top of ODS data, organized by subject.
DWM (Data Warehouse Middle): Performs light aggregation on DWD data to create reusable intermediate tables.
DWS (Data Warehouse Service): A public aggregation layer that produces slightly coarser, topic‑oriented wide tables for analysis.
DIM (Dimension): Stores dimension attribute tables for analytical perspectives.
ADS / APP (Application Data Service): Provides data for downstream products and analytics, typically stored in ES, PostgreSQL, Redis, Hive, or Druid.
When determining a table’s layer, follow the illustrated decision flow:
Layer Call Standards
Normal flow: ODS → DWD → DWM → DWS → APP. If ODS → DWD → DWS → APP appears, the subject domain is incomplete; DWD data should be routed through DWM.
Avoid using DWS wide tables that also reference DWD tables belonging to the same subject domain.
Within a subject domain, minimize DWM tables that depend on other DWM tables to preserve ETL efficiency.
Only DWD may reference ODS tables; DWM, DWS, and APP must not directly use ODS tables.
Prevent reverse dependencies, e.g., a DWM table depending on a DWS table.
Model Design Principles
High cohesion, low coupling : Group related data of the same granularity into a single logical or physical model.
Separate core and extension models : Core models contain fields for common business needs; extension models hold optional or niche fields without polluting the core schema.
Centralize common processing logic : Encapsulate shared logic in the underlying data‑scheduling layer, avoiding duplication in application layers.
Balance cost and performance : Allow moderate data redundancy to improve query speed, but avoid excessive duplication.
Ensure data rollbackability : Processing logic should be idempotent, yielding consistent results across multiple runs.
Maintain consistency : Identical fields must share the same name across all tables.
Clear, understandable naming : Table names should be self‑descriptive for downstream users.
Accept model limitations : No single model satisfies all requirements; follow the typical design sequence: conceptual → logical → physical.
Big Data Tech Team
Focuses on big data, data analysis, data warehousing, data middle platform, data science, Flink, AI and interview experience, side‑hustle earning and career planning.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
