How MMS Powered a 50 PB BigQuery‑to‑MaxCompute Migration for GoTerra
This article details GoTerra's six‑month, 50 PB migration from GCP BigQuery to Alibaba Cloud MaxCompute. It covers the project scope, technical challenges such as complex data types, diverse partition strategies, and high‑speed requirements, and explains how the MaxCompute Migration Service (MMS) solved them with its architecture, scheduling, and data‑reorder techniques.
Project Background and Challenges
GoTerra, a leading Southeast Asian tech group, completed a cross‑continent migration from GCP BigQuery to Alibaba Cloud MaxCompute, moving a total of 50 PB of data in six months; the final batch migrated was a 19.5 GB BigQuery partition. The migration spanned more than ten company entities, dozens of projects, and over 100 000 tables, requiring both full‑load and incremental heterogeneous migration.
Data scale: ~50 PB (BigQuery logical size)
Data objects: ~70 projects, 100 000 tables
Business scope: 10+ accounts, each representing a subsidiary
Timeline: Started Jan 2025, completed Jun 2025
Technical challenges included complex data types (nested arrays, structs, decimals), diverse partitioning strategies, high migration speed (≥2 PB per week), and flexible scheduling to meet varied business priorities.
Object Migration Scheme
2.1 Data Object Mapping
BigQuery’s three‑level model (Project → Dataset → Table) maps directly to MaxCompute’s (Project → Schema → Table).
Source Object → Target Object
BigQuery Project → MaxCompute Project
BigQuery Dataset → MaxCompute Schema
BigQuery Table → MaxCompute Table
2.2 Data Type Mapping
Key type conversions include:
Boolean → Boolean
Bytes(L) → Binary(L)
Date → Date
Datetime → Timestamp_NTZ (precision 10⁻⁶ s)
JSON → JSON
INT64 and its aliases → bigint
NUMERIC → Decimal(38,9) (compatible with MaxCompute)
BIGNUMERIC → Decimal(p,s) (custom precision required)
Range, Geography, Interval → String (no native support)
Struct → Struct (preserved)
Array → Array (JSON arrays converted to Array with JSON strings as elements)
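The mapping above can be sketched as a small lookup. This is a minimal sketch; the function name, alias handling, and fallback behavior are assumptions for illustration, not MMS's actual implementation:

```python
# Sketch of the BigQuery -> MaxCompute type mapping described above.
# Names and fallback behavior are illustrative, not MMS's actual code.
TYPE_MAP = {
    "BOOLEAN": "BOOLEAN",
    "DATE": "DATE",
    "DATETIME": "TIMESTAMP_NTZ",   # microsecond (10^-6 s) precision
    "JSON": "JSON",
    "INT64": "BIGINT",
    "NUMERIC": "DECIMAL(38,9)",
    # Types without native MaxCompute support fall back to STRING
    "RANGE": "STRING",
    "GEOGRAPHY": "STRING",
    "INTERVAL": "STRING",
}

def map_type(bq_type: str) -> str:
    """Translate a BigQuery type name to its MaxCompute counterpart."""
    t = bq_type.upper()
    if t in ("INT", "SMALLINT", "INTEGER", "BIGINT", "TINYINT", "BYTEINT"):
        return "BIGINT"            # INT64 aliases
    if t.startswith("BYTES"):      # Bytes(L) -> Binary(L)
        return t.replace("BYTES", "BINARY", 1)
    if t.startswith("BIGNUMERIC"): # custom precision must be chosen per column
        return "DECIMAL(p,s) -- custom precision required"
    return TYPE_MAP.get(t, t)      # Struct/Array shapes are preserved as-is
```

Struct and Array conversion is recursive in practice (each element type is mapped in turn); the sketch only shows the scalar cases.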
2.3 Partition Strategy Mapping
BigQuery’s rich partitioning options are emulated in MaxCompute using auto‑partitioned pseudo‑columns.
Integer range partitioning: Not supported in MaxCompute; omitted.
Time‑unit column partitioning: Implemented via AUTO PARTITIONED BY (trunc_time(d, 'day') AS _partition_value) for daily partitions, and similarly for monthly.
Ingestion time partitioning: Implemented with auto‑partitioned pseudo‑columns storing the insert timestamp.
-- BigQuery DDL (daily)
CREATE TABLE test_table (id INT64, d DATE) PARTITION BY d;
-- MaxCompute DDL
CREATE TABLE test_table (id BIGINT, d DATE) AUTO PARTITIONED BY (trunc_time(d, 'day') AS _partition_value);
-- BigQuery DDL (hourly ingestion)
CREATE TABLE test_table (a STRING) PARTITION BY TIMESTAMP_TRUNC(_PARTITIONTIME, HOUR);
-- MaxCompute DDL
CREATE TABLE test_table (a STRING, _partitiontime TIMESTAMP_NTZ) AUTO PARTITIONED BY (trunc_time(_partitiontime, 'hour') AS _partition_value);
Migration Architecture
Two solution options were evaluated:
Solution 1: Dump BigQuery tables to GCS, transfer files to OSS, then import via MaxCompute external tables.
Solution 2: Use BigQuery Read API with MaxCompute Spark to read directly from BigQuery and write to MaxCompute.
After weighing usability, cost, and architectural simplicity, MMS selected Solution 2.
Key Migration Technologies
4.1 Speed Optimization
A dedicated high‑bandwidth network link sized for the 50 PB transfer.
Compressed reads from BigQuery.
Thousands of CUs of Spark compute capacity available on demand.
These optimizations achieved near‑full link utilization and peak daily throughput of 1.92 PB.
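As a sanity check on the 1.92 PB/day peak figure (taking decimal petabytes, 10^15 bytes), the implied sustained line rate is roughly 178 Gb/s:

```python
# Back-of-the-envelope: sustained bandwidth implied by 1.92 PB/day peak.
peak_bytes_per_day = 1.92e15          # 1.92 PB, decimal (10^15 bytes)
seconds_per_day = 86_400

gbps = peak_bytes_per_day * 8 / seconds_per_day / 1e9
print(f"{gbps:.1f} Gb/s")             # ~177.8 Gb/s sustained at peak
```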
4.2 Migration Atomicity
The initial approach ran three statements per partition:
-- Initial approach (non-atomic); partition spec is illustrative
ALTER TABLE target_table DROP IF EXISTS PARTITION (pt='20250101');
ALTER TABLE target_table ADD IF NOT EXISTS PARTITION (pt='20250101');
INSERT INTO target_table PARTITION (pt='20250101') SELECT …;
Because these statements are not atomic as a group, readers could observe a missing or empty partition mid‑sequence. MMS switched to a single INSERT OVERWRITE … SELECT …, leveraging MaxCompute's statement‑level atomicity:
-- Atomic replacement
INSERT OVERWRITE TABLE target_table PARTITION (pt='20250101') SELECT …;
4.3 Priority Scheduling
Customers defined urgency via ETA (Estimated Time of Arrival) rather than numeric priority. A global ETA‑based scheduler ensures urgent tasks across projects are executed first, reducing average completion time for urgent tasks to under 30 minutes.
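A scheduler of this kind can be sketched as a global min‑heap keyed on deadline (earliest‑deadline‑first). The class, task, and field names below are illustrative assumptions, not MMS internals:

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class MigrationTask:
    eta: float                       # deadline (epoch seconds); earlier = more urgent
    table: str = field(compare=False)  # payload; excluded from ordering

class EtaScheduler:
    """Global earliest-deadline-first queue spanning all projects."""
    def __init__(self):
        self._heap: list[MigrationTask] = []

    def submit(self, task: MigrationTask) -> None:
        heapq.heappush(self._heap, task)

    def next_task(self) -> MigrationTask:
        return heapq.heappop(self._heap)  # always the tightest ETA first

sched = EtaScheduler()
sched.submit(MigrationTask(eta=1700003600, table="orders"))
sched.submit(MigrationTask(eta=1700000000, table="payments"))     # most urgent
sched.submit(MigrationTask(eta=1700007200, table="clickstream"))
print(sched.next_task().table)  # payments
```

Keying on ETA rather than a numeric priority means urgency is comparable across projects without any per‑project priority negotiation.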
4.4 Column‑Reorder‑Fillback
During dual‑run migration, streaming tables may diverge and column order can differ. MMS performs fillback by reordering columns for basic types and recursively rotating sub‑fields for nested structs.
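The recursive reorder can be sketched as follows, representing a struct value as a dict and the target schema as the desired field order (a deliberate simplification of what MMS does):

```python
# Sketch: reorder a (possibly nested) struct value to match a target schema.
# A value is a dict keyed by field name; the schema gives the target order,
# with a nested dict wherever a sub-struct must itself be reordered.
def reorder(value: dict, schema: dict) -> dict:
    out = {}
    for name, sub in schema.items():
        out[name] = reorder(value[name], sub) if isinstance(sub, dict) else value[name]
    return out

# Source row laid out like table a below; target schema laid out like table b.
row_a = {"c1_1": "x", "c1_2": {"c1_2_1": "p", "c1_2_2": "q"}, "c1_3": "y"}
schema_b = {"c1_3": None, "c1_2": {"c1_2_2": None, "c1_2_1": None}, "c1_1": None}

row_b = reorder(row_a, schema_b)
print(list(row_b))          # ['c1_3', 'c1_2', 'c1_1']
print(list(row_b["c1_2"]))  # ['c1_2_2', 'c1_2_1']
```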
CREATE TABLE a (c1 STRUCT<c1_1:STRING, c1_2:STRUCT<c1_2_1:STRING, c1_2_2:STRING>, c1_3:STRING>);
CREATE TABLE b (c1 STRUCT<c1_3:STRING, c1_2:STRUCT<c1_2_2:STRING, c1_2_1:STRING>, c1_1:STRING>);
4.5 Dual‑Run Incremental Optimization
During the migration’s dual‑run phase, urgent partitions must be synced within two hours. MMS upgraded its scheduler to treat ETA globally, dramatically cutting urgent task latency.
Future Plans
Full alignment with BigQuery data types, especially Geography and BigDecimal.
Support migration of Views.
Introduce smarter scheduling that dynamically senses bandwidth and reallocates resources.
Alibaba Cloud Big Data AI Platform
The Alibaba Cloud Big Data AI Platform builds on Alibaba’s leading cloud infrastructure, big‑data and AI engineering capabilities, scenario algorithms, and extensive industry experience to offer enterprises and developers a one‑stop, cloud‑native big‑data and AI capability suite. It boosts AI development efficiency, enables large‑scale AI deployment across industries, and drives business value.