
Building a Unified Data Warehouse for Moving Services: Boosting Efficiency and Data Quality

This article details the challenges of fragmented ODS data in the moving‑service domain and explains how a dedicated public‑layer data warehouse, with layered architecture and quality monitoring, was designed and implemented to improve data reuse, reduce redundancy, and stabilize downstream analytics.

Huolala Tech

Mar 2, 2023

1. Introduction

1.1 Background

Previously, most of the data for the "Worry‑Free Moving" service was ingested from business databases into the ODS layer by data‑warehouse engineers, after which downstream analysts queried the ODS directly for calculations. This approach lacked a public data layer, leading to several problems:

Because analysts queried the ODS directly, any change to the business‑side data model had a large downstream impact.

Downstream analysts each depended on raw ODS data, resulting in scattered data pipelines that were hard to manage.

Inconsistent data quality from the business side could pollute downstream analyses.

With no shared metrics, downstream teams repeated the same calculations and wasted resources.

1.2 Solution

To address the above issues, a public‑layer data‑warehouse model for the "Worry‑Free Moving" domain was built, covering topics such as moving orders and mover (courier) information. By consolidating core fields and common metrics in the public layer (DWB and DWS), downstream teams can directly rely on these tables, achieving:

Minimizing the direct impact of business‑side changes on downstream analysis, improving stability.

Increasing efficiency for BI and operations analysts by enhancing data usability and reusability.

Providing core fields and common metrics in shared tables to avoid duplicate calculations.

Embedding data‑quality safeguards through the Dayu data‑governance platform’s monitoring and alert mechanisms.

Consolidating scattered data tasks, optimizing lineage, and establishing a clear data architecture.

2. Data Warehouse Construction

The following sections describe how the data warehouse for the "Worry‑Free Moving" business was implemented in practice, illustrating how a data‑warehouse engineer builds a data mart from scratch that supports downstream analytics and keeps the model robust.

2.1 Overview of Data Warehouse

A data warehouse is organized around business topics. Its architecture is layered vertically and partitioned into domains horizontally, which makes the data assets easier to survey at a macro level.

2.2 Construction Steps

2.2.1 Business Research

Data‑warehouse developers act as a bridge between business development and downstream analysts. Early in the project, they must understand upstream business processes and downstream metric requirements, align data definitions, and design the warehouse tables accordingly.

Overall Business Process of the Moving Service

On the mover side, the workflow starts with inviting potential movers and interviewing them at training centers. Candidates then register and pay a guarantee deposit to become certified movers. A mover can be a team leader or a team member, receives orders, and may later exit the platform once the guarantee deposit has been refunded. On the user side, a moving order is similar to a freight order; the key difference is the addition of moving services and the mover entity.

2.2.2 Domain Partitioning

A domain groups tightly related data subjects based on business analysis perspectives. Partitioning methods include:

By business process (e.g., product, transaction, logistics).

By business department (e.g., middle‑office, operations, supply chain).

By business system (e.g., moving system, ERP system).

Because the moving service operates as an independent line with its own systems and databases, the third method is used: a dedicated moving‑service domain is added to the enterprise data warehouse, avoiding cross‑domain conflicts.

Summary : Domain partitioning should be stable and comprehensive, allowing future business extensions without major redesign.

2.2.3 Topic Partitioning

Topics represent high‑level analytical objects within a domain, such as orders, users, or movers. Identifying these objects enables the creation of focused analytical tables.

2.2.4 Output Bus Matrix

The bus matrix cross‑references business processes (rows) with the dimensions (columns) that each process involves, giving a structured outline of the data model.
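As a rough illustration, a bus matrix can be kept as a plain mapping from business process to the set of dimensions it touches. The process and dimension names below are hypothetical, loosely based on the workflow described in 2.2.1; they are not the actual Huolala matrix.

```python
# Hypothetical bus matrix: business process -> dimensions it involves.
# All names are illustrative assumptions, not the real model.
BUS_MATRIX = {
    "mover_registration": {"date", "city", "mover"},
    "order_placed": {"date", "city", "user", "order"},
    "order_accepted": {"date", "city", "user", "order", "mover"},
    "mover_exit": {"date", "city", "mover"},
}


def dimensions():
    """All dimensions that appear anywhere in the matrix, sorted."""
    return sorted(set().union(*BUS_MATRIX.values()))


def conformed_dimensions():
    """Dimensions shared by every business process in the matrix."""
    return sorted(set.intersection(*BUS_MATRIX.values()))


def render_matrix():
    """Render the matrix as text: one row per process, 'x' where a dimension applies."""
    dims = dimensions()
    rows = ["process".ljust(20) + " ".join(d.ljust(6) for d in dims)]
    for process, used in BUS_MATRIX.items():
        marks = " ".join(("x" if d in used else "-").ljust(6) for d in dims)
        rows.append(process.ljust(20) + marks)
    return "\n".join(rows)


print(render_matrix())
print("shared by all processes:", conformed_dimensions())
```

Dimensions shared by every process are natural candidates for common dimension tables, while the rows suggest which fact tables the model needs.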

2.2.5 Layered Design and Model Table Development

Each layer of the warehouse has distinct responsibilities. While naming conventions vary across companies, understanding each layer’s purpose is essential.

Huolala's Data Warehouse Layered Design

ODS (Source Data Layer) : Synchronizes raw business data without transformation.

DWD (Detail Data Layer) : Stores cleaned, normalized detail tables, providing a buffer that shields downstream layers from source‑side changes.

DWB (Base Wide‑Table Layer) : Creates wide tables by joining DWD details, adding tags and complex metric logic for downstream consumption.

DWS (Lightweight Summary Layer) : Holds aggregated data and common metrics derived from DWB.
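One way to keep such a layering honest is to check table lineage against the allowed upstream layer of each table. The sketch below assumes that the layer is encoded as a table‑name prefix and that each layer reads only from the layer directly beneath it; both are simplifying assumptions, and the table names are hypothetical.

```python
# Assumed rule set: each layer may read only from the layer directly below it.
ALLOWED_UPSTREAM = {
    "ods": set(),
    "dwd": {"ods"},
    "dwb": {"dwd"},
    "dws": {"dwb"},
}


def layer_of(table):
    """Derive the layer from the table-name prefix (assumed naming convention)."""
    prefix = table.split("_", 1)[0].lower()
    if prefix not in ALLOWED_UPSTREAM:
        raise ValueError(f"unknown layer prefix in table name: {table}")
    return prefix


def layering_violations(lineage):
    """Return (table, upstream) pairs whose dependency skips or reverses a layer."""
    violations = []
    for table, upstreams in lineage.items():
        allowed = ALLOWED_UPSTREAM[layer_of(table)]
        for upstream in upstreams:
            if layer_of(upstream) not in allowed:
                violations.append((table, upstream))
    return violations


# Hypothetical lineage: the last table reads the ODS directly.
lineage = {
    "dwd_move_order_fee_detail": ["ods_move_order_fee"],
    "dwb_move_order_fee_base": ["dwd_move_order_fee_detail"],
    "dws_move_order_daily": ["dwb_move_order_fee_base", "ods_move_order"],
}
print(layering_violations(lineage))
```

A real warehouse would likely relax the rule set, for example to let wide tables read shared dimension tables, so ALLOWED_UPSTREAM should be treated as a configurable policy rather than a fixed standard.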

Robustness Assessment

The design allows easy extension when new moving‑service business features are added, as they can be incorporated into existing domains or new topics.

Example of Model Table Development

Table Name : Moving Order Fee Base Wide Table

Design : Transform the DWD fee‑detail table from a vertical to a horizontal (wide) format, centralizing fee‑type mappings within the warehouse to improve downstream usability.

Implementation : Aggregate fees at the order_id level and generate a deduplicated array of fee identifiers for each order, so that downstream users can distinguish a fee that is absent from a fee whose value is zero.

Benefit : Improves data usability and reduces data volume, since each order occupies a single row instead of one row per fee item.
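The vertical‑to‑wide transformation can be sketched in a few lines. This is a minimal in‑memory illustration, not the production job (which would typically run as SQL on the warehouse engine); the fee‑type codes, column names, and the use of integer cents are assumptions made for the example.

```python
# Hypothetical fee-type codes mapped to wide-table column names.
FEE_COLUMNS = {1: "base_fee", 2: "stairs_fee", 3: "large_item_fee"}


def pivot_fees(detail_rows):
    """Turn (order_id, fee_type, amount_cents) rows into one wide row per order."""
    wide = {}
    for order_id, fee_type, amount_cents in detail_rows:
        if order_id not in wide:
            # Every fee column starts at 0 so the wide row has a fixed shape.
            wide[order_id] = {"order_id": order_id, "fee_types": set()}
            wide[order_id].update({column: 0 for column in FEE_COLUMNS.values()})
        row = wide[order_id]
        row[FEE_COLUMNS[fee_type]] += amount_cents  # KeyError on an unmapped fee type
        row["fee_types"].add(fee_type)
    # Emit the set as a sorted list: the deduplicated array of fee identifiers.
    return [dict(row, fee_types=sorted(row["fee_types"])) for row in wide.values()]


detail = [
    (1001, 1, 30000),
    (1001, 2, 0),      # stairs fee recorded with a zero amount
    (1001, 1, 5000),
    (1002, 1, 28000),  # order 1002 has no stairs fee row at all
]
wide_rows = pivot_fees(detail)
for row in wide_rows:
    print(row)
```

Both orders end up with stairs_fee equal to 0, but only order 1001 lists fee type 2 in fee_types. That array is what lets a downstream user tell "charged zero" apart from "never charged".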

2.2.6 Data Quality Assurance

Traditional pipelines only check job success/failure, ignoring data quality issues such as missing rows, misaligned fields, or unexpected nulls. Therefore, a dedicated data‑quality monitoring step is added, leveraging the internally built Dayu governance platform.

Monitoring Rule Design and Practice

Table data‑volume fluctuation monitoring to detect abnormal spikes or drops.

Enumeration and value‑range monitoring on table fields, to catch business‑side changes such as new or altered field values.

Core metric anomaly monitoring to spot irregularities in key indicators.

Sampling‑based monitoring of multi‑source joins, to validate the results of wide‑table joins.
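The first three rules can be approximated with simple checks. The sketch below is not the Dayu platform's API, which the article does not describe; the function names, thresholds, and sample values are assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def volume_fluctuates(today_rows, baseline_rows, threshold=0.3):
    """True when today's row count deviates from the baseline by more than `threshold` (relative)."""
    if baseline_rows == 0:
        return today_rows != 0
    return abs(today_rows - baseline_rows) / baseline_rows > threshold


def unexpected_enum_values(values, allowed):
    """Values seen in a field that are not in the agreed enumeration."""
    return sorted(set(values) - set(allowed), key=str)


def metric_is_anomalous(history, today, max_sigma=3.0):
    """True when today's metric is more than `max_sigma` standard deviations from the historical mean."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return today != mu
    return abs(today - mu) / sigma > max_sigma


# Hypothetical daily checks on a moving-order table.
print(volume_fluctuates(today_rows=650, baseline_rows=1000))
print(unexpected_enum_values(["paid", "refunded", "paid", "voided"], {"paid", "refunded"}))
print(metric_is_anomalous([100, 102, 98, 101, 99], today=130))
```

In practice the baseline would come from a trailing window (for example the same weekday over recent weeks), and a failed check would raise an alert or block downstream tasks instead of printing a value.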

3. Conclusion

The public‑layer data warehouse for the "Worry‑Free Moving" domain now includes the order and mover topics and serves thousands of downstream analytical tasks. The unified layer eliminates scattered pipelines, shields downstream analysts from business‑side changes, and safeguards data quality, helping Huolala's business grow faster and more reliably.

Author: Jiang Xinbiao, Big‑Data Warehouse Development Engineer. Formerly at Meizu, now at Huolala, responsible for building the moving‑service data warehouse and iterating on the freight‑order domain models.
