
Unlock Seamless BigQuery to MaxCompute Migration with dbt‑maxcompute

This article details the real‑world migration of Southeast Asian tech leader GoTerra from BigQuery to MaxCompute, showcasing how the open‑source dbt‑maxcompute adapter enables smooth ELT transitions, advanced incremental strategies, performance gains, ecosystem compatibility, and comprehensive best‑practice implementations for large‑scale data pipelines.

Alibaba Cloud Big Data AI Platform

Sep 10, 2025


Background and Challenge

GoTerra, a leading Southeast Asian tech group, processes petabyte‑scale data daily. Its data modeling was originally built on BigQuery and dbt.

To preserve that agile development model after moving to MaxCompute, the open‑source dbt‑maxcompute adapter was created.

dbt Philosophy: ELT Replaces ETL

Traditional ETL separates transformation logic from storage and creates performance bottlenecks. dbt promotes an ELT approach: load raw data into the warehouse first, then transform it there using the warehouse’s compute power. The result is a simpler architecture, better performance, and transformations that can be tested and documented alongside the code.
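
As a rough illustration of the ELT idea, a dbt model is just a SELECT that runs inside the warehouse against data that has already been loaded. The sketch below assumes the jaffle_shop source and orders table declared later in this article; the file path and column names are hypothetical.

```sql
-- models/staging/stg_orders.sql (hypothetical path and columns)
-- The raw table is already loaded; this transformation runs on
-- the warehouse's own compute when `dbt run` is executed.
SELECT
    id         AS order_id,
    user_id    AS customer_id,
    order_date,
    status
FROM {{ source('jaffle_shop', 'orders') }}
```

Downstream models can then select from this one with ref('stg_orders'), which is how dbt builds its dependency graph.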

Incremental Strategies in dbt‑maxcompute

dbt‑maxcompute fully supports all incremental strategies available in dbt‑bigquery, providing flexible, high‑performance incremental processing.

Merge Strategy (default)

Implemented via a single atomic MERGE INTO statement, ideal for SCD Type 1 tables and deduplication.

MERGE INTO target_table AS DBT_INTERNAL_DEST
USING temp_table AS DBT_INTERNAL_SOURCE
ON (DBT_INTERNAL_SOURCE.id = DBT_INTERNAL_DEST.id)
WHEN MATCHED THEN UPDATE SET ...
WHEN NOT MATCHED THEN INSERT ...;
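
On the model side, a merge like this is typically requested through standard dbt incremental configuration. The following is a minimal sketch, assuming the adapter accepts the usual dbt options incremental_strategy and unique_key; the upstream model stg_orders and the column names are hypothetical.

```sql
{{ config(
    materialized='incremental',
    incremental_strategy='merge',
    unique_key='id'
) }}

SELECT
    id,
    status,
    updated_at
FROM {{ ref('stg_orders') }}  -- hypothetical upstream model
{% if is_incremental() %}
-- On incremental runs, only pick up rows changed since the last load.
WHERE updated_at > (SELECT MAX(updated_at) FROM {{ this }})
{% endif %}
```

On the first run the table is built in full; on later runs the selected rows are merged into the existing table on id.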

Insert Overwrite Strategy

Uses INSERT OVERWRITE to efficiently replace partition data, suitable for large partitioned fact tables.

INSERT OVERWRITE TABLE target_table PARTITION(date_col)
SELECT * FROM temp_table WHERE date_col IN ('...');
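
A model might opt into this strategy along the following lines. This is a sketch under assumptions: the strategy name insert_overwrite follows dbt‑bigquery's naming, the partition_by shape is copied from the table materialization example in this article, and the upstream model and columns are hypothetical.

```sql
{{ config(
    materialized='incremental',
    incremental_strategy='insert_overwrite',
    partition_by={'fields': 'dt', 'data_types': 'timestamp'}
) }}

SELECT
    order_id,
    amount,
    dt
FROM {{ ref('stg_orders') }}  -- hypothetical upstream model
{% if is_incremental() %}
-- Rebuild the newest existing partition and anything after it.
WHERE dt >= (SELECT MAX(dt) FROM {{ this }})
{% endif %}
```

Only the partitions present in the selected rows are replaced, so older partitions are left untouched.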

Other Strategies

Delete + Insert – classic fallback for non‑transactional tables.

Append – high‑performance append‑only mode for immutable event logs.
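
For an immutable event log, an append model could look like the sketch below. It assumes the standard dbt strategy name append (the delete‑and‑insert variant is named delete+insert in dbt); the events source table and its columns are hypothetical.

```sql
{{ config(
    materialized='incremental',
    incremental_strategy='append'
) }}

SELECT
    event_id,
    event_type,
    event_time
FROM {{ source('jaffle_shop', 'events') }}  -- hypothetical source table
{% if is_incremental() %}
-- Append does no deduplication, so the filter must exclude
-- rows that were already loaded.
WHERE event_time > (SELECT MAX(event_time) FROM {{ this }})
{% endif %}
```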

Core Practice 1: Flexible Incremental Strategy

Addresses diverse incremental needs, partition table optimization, performance‑cost trade‑offs, and ecosystem compatibility.

Core Practice 2: Enhanced Table Materialization

Leverages MaxCompute native table types such as Append Delta Table, Auto Partition Table, and Transactional Table. Configuration is expressed via the config macro:

{{ config(
    materialized='table',
    partition_by={'fields':'dt','data_types':'timestamp'},
    tblproperties={'append2.enable':'true'},
    lifecycle=90
) }}
SELECT ...

Core Practice 3: Optimized Seed Loading

Seed loading uses MaxCompute Tunnel for bulk upload and automatic type inference, achieving several‑fold speedup over row‑by‑row INSERT.

Core Practice 4: Data Freshness Monitoring

dbt‑maxcompute implements source freshness by reading the last_data_modified_time metadata, providing near‑zero‑cost freshness checks. Thresholds are declared on the source as usual:

version: 2
sources:
  - name: jaffle_shop
    database: raw
    tables:
      - name: orders
        freshness:
          warn_after: {count: 6, period: hour}
          error_after: {count: 12, period: hour}
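
For contrast, a query‑based freshness check has to scan the source table to find its newest row. The sketch below shows roughly what such a check runs; the loaded_at column is hypothetical, and raw.orders stands for the source table in the raw project.

```sql
-- Query-based freshness: reads table data to find the latest load time.
SELECT
    MAX(loaded_at) AS max_loaded_at,   -- hypothetical timestamp column
    GETDATE()      AS snapshotted_at
FROM raw.orders;
```

Reading last_data_modified_time from table metadata answers the same question without running a query like this, which is where the near‑zero cost comes from.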

Core Practice 5: Third‑Party dbt Package Adaptation

Key macros from popular packages (dbt‑utils, dbt‑date, dbt‑expectations, dbt‑codegen) are rewritten to use MaxCompute‑compatible functions such as listagg, datetrunc, and datediff, allowing existing projects to migrate without code changes.
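
dbt resolves cross‑database macros through dispatch: it looks for an implementation prefixed with the adapter's type name before falling back to the default. The sketch below shows what such an override could look like for datediff. It is a hypothetical illustration, not the adapter's actual source, and it assumes the adapter type name is maxcompute.

```sql
{# Hypothetical override, shown only to illustrate dispatch. #}
{# dbt's datediff returns second_date minus first_date, while #}
{# MaxCompute's datediff(date1, date2, datepart) returns date1 minus date2, #}
{# so the two date arguments are swapped. #}
{% macro maxcompute__datediff(first_date, second_date, datepart) %}
    datediff({{ second_date }}, {{ first_date }}, '{{ datepart }}')
{% endmacro %}
```

With this in place, a model calling {{ dbt.datediff('ordered_at', 'shipped_at', 'dd') }} would render to a call like datediff(shipped_at, ordered_at, 'dd'). The sketch passes datepart through unchanged, so a real implementation would also need to map dbt's date part names to the values MaxCompute expects.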

Summary and Future Outlook

dbt‑maxcompute has proven its ability to enable seamless migration, improve performance, and maintain ecosystem compatibility in PB‑scale workloads. Future work includes GA release, richer MaxCompute feature support, and open‑source community collaboration.

