
Yanxuan’s Data Warehouse Blueprint: Architecture, Standards, and Evaluation

This article introduces Yanxuan’s data warehouse concept, platform layers, development standards, and a comprehensive evaluation framework, detailing its multi‑layer architecture (ODS, DWD, DWS, DIM, DM), supporting offline and real‑time platforms, and six key assessment dimensions such as data quality, security, and development efficiency.

Yanxuan Tech Team

Data warehouses are intangible products for data engineers, with evaluation criteria distinct from visualization or interactive products. This article presents Yanxuan’s data warehouse concept, platform, and evaluation system.

Data Warehouse Basic Architecture

Yanxuan’s warehouse follows a layered design.

[Figure: overall framework of the Yanxuan data warehouse]

The layers follow the business data flow and fall into three main tiers: ODS (Operational Data Store), DW (Data Warehouse, comprising the DWD and DWS layers), and DM (Data Mart). A DIM layer holds the shared dimension tables.

ODS Layer (Operational Data Store): Not exposed externally. It synchronizes raw data from the business systems into the warehouse and preserves the original formats, primarily through DataHub binlog parsing and full-load synchronization.

DWD Layer (Detail Layer): Exposed externally. It encapsulates common logic and frequently used dimension attributes in wide tables, which reduces the number of joins downstream.

DWS Layer (Summary Layer): Exposed externally. It contains the core public metrics and is the main data asset offered to external users.

DIM Layer (Dimension Tables): Exposed externally. It holds common dimension tables such as product, SKU, and channel.

DM Layer (Application Layer): Exposed to products. It supports data products and reports and aggregates complex metrics.
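
The layer list above can be sketched as a small lookup table. This is only an illustration. The layer names and their exposure come from the article, while the `Layer` class, the `audience` field, and the `<layer>_` table-name prefix are assumptions made for the example and are not Yanxuan’s documented conventions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    description: str
    audience: str  # "internal", "external", or "products" (illustrative labels)


# Layer names and exposure follow the article; everything else is hypothetical.
LAYERS = {
    "ODS": Layer("ODS", "raw data synchronized from business systems", "internal"),
    "DWD": Layer("DWD", "detail-level wide tables", "external"),
    "DWS": Layer("DWS", "core public metrics", "external"),
    "DIM": Layer("DIM", "shared dimension tables", "external"),
    "DM": Layer("DM", "application-facing aggregates", "products"),
}


def layer_of(table_name: str) -> Layer:
    """Resolve a table's layer from an assumed '<layer>_' name prefix."""
    prefix = table_name.split("_", 1)[0].upper()
    if prefix not in LAYERS:
        raise ValueError(f"unknown layer prefix in {table_name!r}")
    return LAYERS[prefix]


def is_queryable_by_consumers(table_name: str) -> bool:
    """ODS is the only layer the article describes as not exposed."""
    return layer_of(table_name).audience != "internal"
```

With this sketch, a hypothetical table such as `dws_trade_order_1d` would be treated as consumable, while `ods_order` would not.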

Data Warehouse Development Platform

Yanxuan’s warehouse consists of offline and real‑time components.

Offline processing is supported by Mammoth, a one‑stop data management and application development platform from NetEase Hangzhou Research Institute.

Real‑time processing is provided by the Atom platform, a self‑developed real‑time data management and development solution.

Yanxuan Data Warehouse Standards

Although data warehouse development is often regarded as SQL work with a low barrier to entry, Yanxuan follows a rigorous methodology built on three specifications: Metric Definition, Model Design, and Data Development. Supporting tools include Cangjie (metric management), SuiRen (metric map), UDS (data quality), and EasyDesign (model design).

Data Warehouse Evaluation System

Data security and data quality are the core requirements and the lifeline of the warehouse. The evaluation system covers six dimensions.

1. Data Specification

Improves overall development quality by enforcing the three Yanxuan specifications and monitoring their implementation.

2. Data Security

Adheres to NetEase Business Conduct Guidelines, preventing external data leaks and ensuring secure handling.

3. Data Quality

Consists of data‑intrinsic quality (measured by fault levels and frequency) and construction quality (usability and richness of core assets).
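
As a rough illustration of measuring intrinsic quality by fault level and frequency, the sketch below counts the faults in one period and combines them into a weighted score. The level names (`P0` to `P3`) and the weights are hypothetical. The article names fault levels and frequency as the measures but does not publish a scale or a formula.

```python
from collections import Counter

# Hypothetical severity weights; the article does not define a scale.
SEVERITY_WEIGHTS = {"P0": 10, "P1": 5, "P2": 2, "P3": 1}


def fault_summary(fault_levels):
    """Count faults per level and compute a weighted score for one period."""
    counts = Counter(fault_levels)
    unknown = set(counts) - set(SEVERITY_WEIGHTS)
    if unknown:
        raise ValueError(f"unknown fault levels: {sorted(unknown)}")
    score = sum(SEVERITY_WEIGHTS[level] * n for level, n in counts.items())
    return {"counts": dict(counts), "weighted_score": score}
```

Tracking both the per-level counts and a single score keeps frequency visible while still allowing periods to be compared with one number.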

4. Data Stability

Ensures both warehouse and platform stability through duty rosters, integrated incident platforms, and regular reviews.

5. Continuous Construction Mechanism

Maintains vitality via regular analyst‑driven metric updates and governance that removes non‑standard models, saving storage.

6. Data Development Efficiency

Measured by automation of development standards and platform user experience; recent projects reduced iteration cost dramatically.

Yanxuan Data Warehouse Evaluation Practice

1. Data Specification

Implemented via the EasyDesign platform, which automates metric definition and model design, supporting over 200 new DW tables in six months.

2. Data Security

Addresses data‑related losses through compliant release processes, testing tools, and environments.

3. Data Stability

EasyTaskOps provides intelligent baseline alerts and fine‑grained operations; baseline completion rates exceed 90%.
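
The article does not define how the baseline completion rate is computed. One plausible reading, assumed here, is the share of baseline runs that finish at or before their deadline. The function name and the input shape below are illustrative and do not reflect the EasyTaskOps API.

```python
from datetime import datetime


def baseline_completion_rate(runs):
    """Share of baseline runs that finished at or before their deadline.

    `runs` is an iterable of (finished_at, deadline) pairs; finished_at is
    None for a run that never completed.
    """
    runs = list(runs)
    if not runs:
        raise ValueError("no baseline runs to evaluate")
    on_time = sum(
        1
        for finished_at, deadline in runs
        if finished_at is not None and finished_at <= deadline
    )
    return on_time / len(runs)


# Made-up sample data: one daily baseline with a 07:00 deadline.
deadline = datetime(2020, 6, 1, 7, 0)
runs = [
    (datetime(2020, 6, 1, 6, 30), deadline),  # on time
    (datetime(2020, 6, 1, 7, 45), deadline),  # late
    (None, deadline),                         # never finished
    (datetime(2020, 6, 1, 7, 0), deadline),   # exactly on the deadline
]
rate = baseline_completion_rate(runs)
```

Under this assumed definition, a rate above 0.9 would correspond to the "exceed 90%" figure quoted above.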

4. Continuous Construction Mechanism

Through the EasyCost upgrade, rule‑based storage, governance, and compute optimizations saved 1.2 PB of storage.

5. Data Development Quality

EasyDesign ensures compliance, leading to over 200 new DW tables built with high quality.

6. Data Development Efficiency

Standardized processes and platform support dramatically reduced iteration and repair costs for offline and real‑time data validation.

Conclusion

Yanxuan’s data warehouse team has accumulated extensive experience across the six dimensions, from product building to day-to-day practice. The team completed its data standards and SOPs in Q3 2019 and advanced product iterations in early 2020. For the second half of 2020, it expects richer data, easier usage, stronger guarantees, and faster response.
