Alibaba Cloud DataWorks Intelligent Data Modeling: Practices and Insights
This article introduces the intelligent data modeling tool in Alibaba Cloud DataWorks. It outlines the data demand flow, shares best practices and a practical demonstration of data warehouse modeling, discusses model application and data asset management, answers common questions, and notes the tool's commercial availability.
DataWorks, Alibaba Cloud's big data development and governance platform, has been in development for 14 years. Its Intelligent Data Modeling tool was launched at the 2021 Cloud Expo and is built with contributions from Alibaba's internal data warehouse teams, including Cainiao, Taobao, and Tmall.
1. Alibaba Data Demand Flow
Roles involved in data warehouse construction include data demand owners (operations, BI, product managers), data product managers who translate business needs into data requirements, and data development engineers responsible for designing data models and metrics.
2. Data Warehouse Modeling Best Practices
Business classification: on top of Kimball dimensional modeling, a "business classification" layer is added so that models can be separated by business team.
Data domains: each domain is defined by grouping related business processes and key entities.
Data marts: the application layer contains business, product, and public marts; whether to model them is optional and depends on business needs.
Standardization of naming conventions and storage strategies is enforced through built‑in templates and validation checks.
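To make the idea of a naming validation check concrete, here is a minimal Python sketch. The convention it enforces (<layer>_<domain>_<process>), the layer list, and the 64-character limit are assumptions chosen for illustration; they are not the rules shipped in the DataWorks templates.

```python
import re

# Hypothetical convention: <layer>_<data_domain>_<business_process>[_<suffix>]
# The layer prefixes and the length limit are assumptions for illustration.
LAYERS = ("ods", "dwd", "dws", "dim", "ads")
TABLE_NAME = re.compile(
    r"^(?P<layer>[a-z]+)_(?P<domain>[a-z][a-z0-9]*)_(?P<rest>[a-z][a-z0-9_]*)$"
)


def check_table_name(name: str) -> list[str]:
    """Return a list of violations; an empty list means the name passes."""
    match = TABLE_NAME.match(name)
    if match is None:
        return [f"{name!r} does not match <layer>_<domain>_<process>"]
    problems = []
    if match.group("layer") not in LAYERS:
        problems.append(f"unknown layer prefix {match.group('layer')!r}")
    if len(name) > 64:
        problems.append("name longer than 64 characters")
    return problems
```

A check like this could run before a model is saved, so that a name such as tmp_trade_order is rejected with a readable reason instead of silently entering the warehouse.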
3. Practical Demonstration of Data Warehouse Modeling
The demo covers four aspects: warehouse planning, data standards, metric design, and dimensional modeling. It shows how to batch generate derived metrics, create DWD tables by importing ODS structures, and handle field redundancy for efficient model design.
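Batch generation of derived metrics can be pictured as a Cartesian product. The sketch below assumes that a derived metric is composed of an atomic metric, a time period, and a modifier; the function name and the sample values are hypothetical and only illustrate the combinatorics, not the tool's internal logic.

```python
from itertools import product


def derive_metrics(atomic, periods, modifiers):
    """Combine every atomic metric with every time period and modifier.

    Assumption: a derived metric name is <atomic>_<period>_<modifier>.
    """
    return [f"{a}_{p}_{m}" for a, p, m in product(atomic, periods, modifiers)]


names = derive_metrics(
    atomic=["pay_amt", "order_cnt"],   # hypothetical atomic metrics
    periods=["1d", "7d"],              # hypothetical time periods
    modifiers=["tmall", "taobao"],     # hypothetical business modifiers
)
```

With two values in each input list, the call yields eight derived metric names, which is why defining them one by one quickly becomes impractical.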
Code mode supports MaxCompute DDL and Hive DDL, and can generate ETL code from SELECT statements.
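The following Python sketch illustrates the kind of text that code mode deals with: it copies a column list from an ODS table, appends a redundant dimension field, renders a CREATE TABLE statement for either dialect, and wraps a SELECT statement into an INSERT OVERWRITE statement. All table names, column names, the ds partition column, and the helper functions are assumptions made for the example; the tool's own generated output may differ.

```python
def import_ods_columns(ods_columns, redundant_columns=()):
    """Start a DWD column list from an ODS structure plus redundant fields."""
    columns = list(ods_columns) + list(redundant_columns)
    names = [name for name, _, _ in columns]
    if len(names) != len(set(names)):
        raise ValueError("duplicate column name in DWD definition")
    return columns


def render_ddl(table, columns, dialect="maxcompute", lifecycle_days=365):
    """Render CREATE TABLE text; comments must not contain single quotes."""
    cols = ",\n".join(
        f"    {name} {ctype} COMMENT '{comment}'" for name, ctype, comment in columns
    )
    ddl = f"CREATE TABLE IF NOT EXISTS {table} (\n{cols}\n)\nPARTITIONED BY (ds STRING)"
    if dialect == "maxcompute":
        ddl += f"\nLIFECYCLE {lifecycle_days}"  # LIFECYCLE is MaxCompute-specific
    elif dialect == "hive":
        ddl += "\nSTORED AS ORC"
    else:
        raise ValueError(f"unsupported dialect: {dialect}")
    return ddl + ";"


def wrap_select_as_insert(table, select_sql, partition_value="${bizdate}"):
    """Turn a SELECT into a partition overwrite (a simple form of ETL code)."""
    body = select_sql.strip().rstrip(";")
    return f"INSERT OVERWRITE TABLE {table} PARTITION (ds='{partition_value}')\n{body};"


# Hypothetical ODS structure and one redundant field copied from a dimension.
ods_order = [
    ("order_id", "BIGINT", "order id"),
    ("buyer_id", "BIGINT", "buyer id"),
    ("order_status", "STRING", "order status"),
]
dwd_columns = import_ods_columns(
    ods_order, [("buyer_city", "STRING", "buyer city, redundant copy")]
)
maxcompute_ddl = render_ddl("dwd_trade_order_di", dwd_columns)
hive_ddl = render_ddl("dwd_trade_order_di", dwd_columns, dialect="hive")
```

Keeping the column list separate from the rendering step is what allows one model definition to be emitted in both dialects.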
4. Data Model Application and Data Asset
After models are materialized, they can be published to the Data Asset catalog, enabling zero‑code SQL analysis and field selection. The Data Asset 3D panorama visualizes the enterprise's data assets for better governance.
5. Q&A
Q: Does DataWorks support generating slowly changing dimension (SCD) tables?
A: Automatic SCD generation is not yet publicly available.
Q: How is data asset sharing handled?
A: Administrators publish assets to the Data Asset module; sharing between business users is currently done via direct product links.
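On the SCD question: since automatic generation is not available, such tables have to be maintained by hand-written logic. The Python sketch below shows one common variant, a Type 2 SCD, where a changed row closes the old version and opens a new one. It is an illustration under stated assumptions, not DataWorks functionality: the start_date and end_date column names, the open-ended sentinel date, and the function itself are hypothetical, and deleted keys are not handled.

```python
from datetime import date

OPEN_END = date(9999, 12, 31)  # assumed sentinel marking the current version


def apply_scd2(history, snapshot, key, tracked, as_of):
    """Return a new history list after applying one snapshot (Type 2 SCD)."""
    result = [dict(row) for row in history]  # copy, so the input is not mutated
    current = {row[key]: row for row in result if row["end_date"] == OPEN_END}
    for new in snapshot:
        old = current.get(new[key])
        if old is not None and all(old[c] == new[c] for c in tracked):
            continue  # tracked attributes unchanged, keep the current version
        if old is not None:
            old["end_date"] = as_of  # close the previous version
        version = {key: new[key], "start_date": as_of, "end_date": OPEN_END}
        version.update({c: new[c] for c in tracked})
        result.append(version)
    return result


history = [
    {"buyer_id": 1, "city": "Hangzhou",
     "start_date": date(2021, 1, 1), "end_date": OPEN_END},
]
snapshot = [{"buyer_id": 1, "city": "Beijing"}, {"buyer_id": 2, "city": "Shanghai"}]
updated = apply_scd2(history, snapshot, "buyer_id", ["city"], date(2021, 10, 20))
```

After the call, buyer 1 has a closed Hangzhou version and an open Beijing version, and buyer 2 appears as a new open row.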
DataWorks Intelligent Data Modeling is commercially available on Alibaba Cloud, with a personal version priced at 60 CNY for six months, including a retail e‑commerce template and tutorials.
For more details, visit: https://www.aliyun.com/product/bigdata/ide
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.