
MaxCompute Incremental Update Architecture, Intelligent Materialized Views, and Adaptive Execution Optimizations

This article gives an overview of MaxCompute's near‑real‑time incremental update and processing architecture, the design of Transactional Table 2.0, and the evolution and recommendation of intelligent materialized views, together with the multi‑level adaptive execution optimizations in the SQL engine. It shows how these features improve efficiency, cost, and scalability for large‑scale data workloads.

DataFunTalk

The presentation introduces MaxCompute's incremental update and processing architecture, explaining the need for both batch and near‑real‑time pipelines and how a Lambda‑style solution integrates full‑batch processing with incremental real‑time streams to reduce redundancy and improve timeliness.

It details the unified data ingestion tools, storage services, and compute engines that support both batch and incremental workloads, highlighting features such as upsert, time‑travel, bucketed storage, and automatic clustering to manage small files and optimize storage.

The design of Transactional Table 2.0 (TT2) is described, including primary‑key based upsert, ACID support, base and delta file formats, bucketed indexing, and the mechanisms for compaction and clustering to maintain performance.
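
The interplay of primary‑key upsert, base and delta files, time‑travel reads, and compaction can be illustrated with a small in‑memory model. This is only a sketch under stated assumptions: the class `ToyTransactionalTable`, its methods, and the record layout are hypothetical names chosen for illustration and do not reflect MaxCompute's actual file formats or APIs.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# One delta record: (commit_version, primary_key, row). A row of None marks a delete.
DeltaRecord = Tuple[int, str, Optional[dict]]


@dataclass
class ToyTransactionalTable:
    """Hypothetical model of a single bucket: one base snapshot plus ordered deltas."""
    base: Dict[str, dict] = field(default_factory=dict)
    base_version: int = 0
    deltas: List[DeltaRecord] = field(default_factory=list)
    version: int = 0

    def upsert(self, key: str, row: dict) -> int:
        """Append an insert-or-update record for the primary key; return the commit version."""
        self.version += 1
        self.deltas.append((self.version, key, dict(row)))
        return self.version

    def delete(self, key: str) -> int:
        """Append a delete marker for the primary key; return the commit version."""
        self.version += 1
        self.deltas.append((self.version, key, None))
        return self.version

    def read(self, as_of: Optional[int] = None) -> Dict[str, dict]:
        """Merge-on-read: replay deltas up to `as_of` on top of the base snapshot."""
        target = self.version if as_of is None else as_of
        if target < self.base_version:
            raise ValueError("history before the last compaction is no longer available")
        snapshot = {k: dict(v) for k, v in self.base.items()}
        for ver, key, row in self.deltas:  # deltas are stored in commit order
            if ver > target:
                break
            if row is None:
                snapshot.pop(key, None)
            else:
                snapshot[key] = dict(row)
        return snapshot

    def compact(self) -> None:
        """Fold all deltas into a new base snapshot, trading history for read speed."""
        self.base = self.read()
        self.base_version = self.version
        self.deltas.clear()
```

In this toy model, reading with `as_of` set to an earlier commit version plays the role of a time‑travel query, and `compact()` shows why compaction limits how far back such queries can reach.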

Both near‑real‑time and batch write paths are covered: Flink connectors and DataWorks integration enable high‑concurrency writes, and the SQL engine supports upsert and delete operations with read‑committed isolation.

Advanced data organization services such as clustering, compaction, and storage optimization are explained, showing how they reduce small‑file overhead and improve query efficiency.
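
One way to picture how a small‑file merge reduces overhead is a greedy planner that groups small files into batches of bounded size. The sketch below is an assumption for illustration only: the function `plan_small_file_merge`, its parameters, and the size thresholds are hypothetical and are not taken from MaxCompute's storage service.

```python
from typing import Dict, List


def plan_small_file_merge(
    file_sizes: Dict[str, int],
    small_threshold: int,
    target_size: int,
) -> List[List[str]]:
    """Group files smaller than `small_threshold` into merge batches.

    Files are visited from smallest to largest; a batch is closed as soon as
    adding the next file would push it past `target_size`. Batches holding a
    single file are dropped, because rewriting one file alone saves nothing.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for name, size in sorted(file_sizes.items(), key=lambda kv: kv[1]):
        if size >= small_threshold:
            continue  # already large enough, leave it untouched
        if current and current_size + size > target_size:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return [batch for batch in batches if len(batch) > 1]
```

For example, with files of sizes 10, 20, 30, 40, and 500, a small‑file threshold of 100, and a target size of 60, the planner merges the first three files and leaves the others alone.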

Intelligent materialized views are explored, covering their evolution, partition‑penetration capability, query rewrite, automatic recommendation based on job analysis, and lifecycle management, which together simplify usage and improve performance.
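
Partition penetration can be read as follows: when a query asks for partitions that the materialized view has not yet materialized, those partitions are served from the source table while the rest come from the view. The sketch below illustrates that split under this reading; the function `plan_partition_read` and the partition naming are hypothetical and do not represent the engine's rewrite logic.

```python
from typing import Dict, Iterable, List, Set


def plan_partition_read(
    requested: Iterable[str],
    materialized: Set[str],
) -> Dict[str, List[str]]:
    """Split requested partitions between the materialized view and the source table."""
    requested_set = set(requested)
    return {
        # Partitions already present in the view can be read directly from it.
        "materialized_view": sorted(requested_set & materialized),
        # Missing partitions "penetrate" through to the source table.
        "source_table": sorted(requested_set - materialized),
    }
```

A query over three daily partitions, of which only the first two are materialized, would read two partitions from the view and recompute only the third from the source table.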

The article then discusses adaptive execution in the MaxCompute SQL engine, including multi‑level adaptive optimization across the optimizer, job master, runtime workers, and operators, with dynamic plan selection, stage concurrency adjustment, and operator‑level algorithm choices.
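
Two of these decisions, join algorithm choice and stage concurrency adjustment, can be sketched as simple functions of statistics observed at run time. This is a minimal illustration and not the engine's actual policy: the function names, the algorithm labels, and the thresholds are all assumptions introduced here.

```python
import math


def choose_join_algorithm(left_bytes: int, right_bytes: int, broadcast_limit_bytes: int) -> str:
    """Pick a join strategy from the observed input sizes.

    If the smaller side fits under the (hypothetical) broadcast limit, it can be
    shipped to every worker; otherwise both sides are shuffled on the join key.
    """
    smaller = min(left_bytes, right_bytes)
    if smaller <= broadcast_limit_bytes:
        return "broadcast_hash_join"
    return "shuffle_merge_join"


def choose_stage_concurrency(input_bytes: int, bytes_per_worker: int, max_workers: int) -> int:
    """Size a stage from the actual output volume of its upstream stage."""
    wanted = math.ceil(input_bytes / bytes_per_worker)
    return max(1, min(max_workers, wanted))
```

The point of deferring both choices is that sizes estimated at compile time are often wrong, whereas sizes measured after an upstream stage finishes are exact.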

Finally, a Q&A section addresses common questions about materialized views versus physical tables, view expiration, and join algorithm selection, summarizing the key takeaways of the session.

Big Data, MaxCompute, SQL engine, Incremental Update, materialized view, Adaptive Execution
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
