Tagged articles
91 articles
DataFunTalk
May 8, 2026 · Big Data

How MaxCompute Evolves into a Data+AI Platform: Architecture, Core Capabilities, and Real-World Cases

The article explains how Alibaba Cloud's MaxCompute has been transformed into a cloud‑native Data+AI platform, detailing its layered architecture, multimodal storage, model management, hybrid compute scheduling, SQL AI functions, the MaxFrame Python framework, and several enterprise case studies that demonstrate performance gains and flexible resource orchestration.

AI integration · Big Data · Cloud Native
0 likes · 11 min read
Alibaba Cloud Big Data AI Platform
Apr 28, 2026 · Artificial Intelligence

Zero‑Learning Video to Semantic Vector Pipeline with MaxFrame’s Distributed AI Engine

Faced with exploding video volumes and bottlenecks in frame extraction, labeling, and vector storage, MaxFrame offers a three‑step, end‑to‑end distributed pipeline that turns raw videos into searchable semantic vectors while providing zero‑threshold scaling, transparent OSS mounting, row‑level fault tolerance, and elastic concurrency control.

MaxCompute · MaxFrame · OSS
0 likes · 6 min read
DataFunSummit
Apr 27, 2026 · Big Data

How MaxCompute Evolves Big Data Platforms for AI: Architecture, Core Capabilities, and Real‑World Cases

The article details MaxCompute's AI‑driven evolution, covering its multilayer architecture, multimodal storage management, SQL AI functions, the Python‑based MaxFrame framework, and several industry case studies that demonstrate performance gains and flexible resource scheduling for large‑scale AI workloads.

Data+AI · MaxCompute · MaxFrame
0 likes · 12 min read
DataFunSummit
Mar 16, 2026 · Big Data

How MaxCompute Evolves into an AI‑Native Data Warehouse: Architecture, Capabilities, and Real‑World Cases

This article outlines MaxCompute's 15‑year transformation from a traditional structured‑compute engine to an AI‑native data warehouse, detailing its data, heterogeneous compute, and model capabilities, showcasing three core capability pillars, real‑world case studies, and future development directions.

AI-native · Big Data · Cloud XPU
0 likes · 7 min read
Alibaba Cloud Big Data AI Platform
Jan 29, 2026 · Cloud Native

How Alibaba Cloud’s MaxCompute Powers Multi‑Modal AI Data Processing for MOSI Intelligence

In the era of rapid AI advancement, MOSI Intelligence faced IDC storage, compute, and network bottlenecks for large‑scale audio‑video pipelines, prompting a partnership with Alibaba Cloud to build a cloud‑native, one‑stop multi‑modal data processing platform using MaxCompute and the custom MaxFrame engine, dramatically improving performance and operational efficiency.

AI Data Platform · Cloud Native · MaxCompute
0 likes · 8 min read
Alibaba Cloud Big Data AI Platform
Nov 19, 2025 · Big Data

How We Migrated 100k BigQuery SQL Scripts to MaxCompute Using AST and LLM Automation

This article details a real‑world migration of a Southeast Asian tech group’s data warehouse from Google BigQuery to Alibaba Cloud MaxCompute, describing the challenges of converting 100,000 SQL scripts, the AST‑driven and LLM‑assisted automation pipeline, rule‑engine iteration, quality control, and the measurable performance and cost benefits achieved.

AST · BigQuery · LLM
0 likes · 12 min read
Alibaba Cloud Big Data AI Platform
Sep 19, 2025 · Big Data

How MMS Powered a 50 PB BigQuery‑to‑MaxCompute Migration for GoTerra

This article details GoTerra's massive six‑month, 50 PB migration from GCP BigQuery to Alibaba Cloud MaxCompute, covering project scope, technical challenges such as complex data types, partition strategies, and high‑speed requirements, and explaining how the MaxCompute Migration Service (MMS) solved them with innovative architecture, scheduling, and data‑reorder techniques.

Big Data · BigQuery · Data Migration
0 likes · 14 min read
Alibaba Cloud Big Data AI Platform
Sep 10, 2025 · Big Data

Unlock Seamless BigQuery to MaxCompute Migration with dbt‑maxcompute

This article details the real‑world migration of Southeast Asian tech leader GoTerra from BigQuery to MaxCompute, showcasing how the open‑source dbt‑maxcompute adapter enables smooth ELT transitions, advanced incremental strategies, performance gains, ecosystem compatibility, and comprehensive best‑practice implementations for large‑scale data pipelines.

Big Data · Data Migration · ELT
0 likes · 13 min read
Alibaba Cloud Big Data AI Platform
Aug 29, 2025 · Big Data

How MaxCompute Streaming Insert Revolutionized Real‑Time Data Migration from BigQuery

This article details how a leading Southeast Asian tech group migrated its real‑time write workloads from Google BigQuery to MaxCompute using MaxCompute Streaming Insert, covering architecture, core features, migration challenges, optimization strategies, business impact, and future enhancements.

Big Data · BigQuery Migration · MaxCompute
0 likes · 9 min read
Alibaba Cloud Big Data AI Platform
Aug 26, 2025 · Big Data

How MaxCompute Evolves for Python & AI: From SDK to Native Distributed Engine

This article outlines MaxCompute's decade‑long evolution—from the early PyODPS SDK to the native Distributed Python Engine—highlights the challenges big‑data platforms face in the AI era, and showcases Data+AI solutions and real‑world case studies across multimodal processing, massive text deduplication, and autonomous‑driving data pipelines.

AI Functions · Big Data · Data+AI
0 likes · 15 min read
Alibaba Cloud Big Data AI Platform
Aug 19, 2025 · Big Data

Cut Shuffle Costs by 60% with MaxCompute’s Cluster Optimization Tool

MaxCompute’s new Cluster Optimization Recommendation analyzes 31 days of shuffle data to automatically suggest optimal hash clustering keys, dramatically cutting shuffle traffic and CU consumption for large jobs, while providing one‑click ALTER TABLE scripts and detailed benefit reports to boost big‑data processing efficiency.

Big Data · Cost reduction · Hash Clustering
0 likes · 8 min read
Alibaba Cloud Big Data AI Platform
Aug 15, 2025 · Big Data

How MaxCompute Extends SQL to Match BigQuery – New Features & Compatibility

This article details the challenges of migrating 100,000+ SQL statements from BigQuery to MaxCompute for a leading Southeast Asian tech group, explains the new MaxCompute SQL syntax, auto‑partition and ingestion‑time tables, enhanced built‑in functions, and the BigQuery compatibility mode that ensures seamless query behavior.

BigQuery · Compatibility · Data Migration
0 likes · 15 min read
Alibaba Cloud Big Data AI Platform
Aug 13, 2025 · Big Data

How ODPS Evolved Over 15 Years into a Next‑Gen AI‑Ready Big Data Platform

This article chronicles ODPS's 15‑year journey from its exploratory beginnings to a modern, AI‑enabled big data platform, detailing its four development phases, architectural layers, SQL engine upgrades, real‑time processing, lakehouse integration, and the new Data+AI capabilities offered by MaxCompute and DataWorks.

AI integration · Big Data · DataWorks
0 likes · 12 min read
Alibaba Cloud Big Data AI Platform
Aug 5, 2025 · Big Data

How Alibaba Built a World‑Class Big Data Platform Over a Decade

Over ten years, Alibaba’s data engineers transformed a modest Hadoop‑based system into a globally‑scalable, high‑performance big data platform—ODPS/MaxCompute—supporting massive offline and real‑time workloads, pioneering innovations like the 5K cluster expansion, Blink streaming, and the unified ‘Moon’ migration.

Alibaba · Big Data · Data Platform
0 likes · 25 min read
Alibaba Cloud Big Data AI Platform
Aug 1, 2025 · Information Security

How MaxCompute Revamped Enterprise Permissions for Secure Data Migration

This article details how a Southeast Asian tech giant migrated from Google BigQuery to Alibaba Cloud MaxCompute, redesigning its permission architecture with multi‑level access control, namespace‑based hierarchies, role inheritance, policy‑tag driven dynamic data masking, and cross‑account user management to meet strict security and compliance requirements.

Cross-Account · Dynamic Masking · MaxCompute
0 likes · 21 min read
Alibaba Cloud Big Data AI Platform
Jul 29, 2025 · Big Data

How GoTerra Cut Costs and Boosted Speed: BigQuery‑to‑MaxCompute Performance Secrets

This article details the real‑world migration of a leading Southeast Asian tech group from BigQuery to MaxCompute, exposing the three major challenges, the data‑driven performance‑optimization methodology, and the concrete techniques—Auto Partition, UNNEST redesign, large‑query graph optimizations, and intelligent tuning—that delivered dramatic cost reductions and query‑speed gains.

Auto Partition · Big Data · Data Warehouse Migration
0 likes · 17 min read
Big Data Technology & Architecture
Apr 17, 2025 · Big Data

MaxCompute: Intelligent Data Warehouse Platform for the Data+AI Era

This article, based on a meetup presentation, details Alibaba Cloud's MaxCompute platform—its evolution, serverless architecture, AI integration, distributed Python framework, Object Table, near‑real‑time processing, and intelligent warehouse features—addressing the challenges of data warehouses in the Data+AI era.

Big Data · MaxCompute · Object Table
0 likes · 11 min read
Alibaba Cloud Big Data AI Platform
Mar 31, 2025 · Artificial Intelligence

Unlock AI-Powered Data Processing with MaxFrame’s AI Function

This article introduces MaxFrame’s AI Function, a new feature built on MaxCompute that integrates large language models like Qwen 2.5 and DeepSeek‑R1‑Distill‑Qwen to simplify model deployment and enable scalable text classification, information extraction, summarization, translation, and other AI-driven data processing tasks on massive datasets.

AI Function · MaxCompute · MaxFrame
0 likes · 19 min read
Alibaba Cloud Developer
Mar 27, 2025 · Artificial Intelligence

Unlock Massive Data with AI: MaxFrame’s AI Function Makes LLM-Powered Analytics Easy

This article introduces MaxFrame’s AI Function on Alibaba Cloud’s MaxCompute platform, detailing how built‑in large language models like Qwen 2.5 and DeepSeek‑R1 enable seamless text classification, information extraction, summarization, and more through simple Python APIs and distributed processing.

AI Function · MaxCompute · MaxFrame
0 likes · 21 min read
Alibaba Cloud Big Data AI Platform
Mar 26, 2025 · Big Data

Cutting Compute Costs with MaxCompute Materialized Views: Strategies and Results

This article details how MaxCompute leverages fuzzy materialized views, DAG scheduling adjustments, public layer mining, and FBI acceleration techniques to reduce compute resource consumption by up to 10%, improve task visibility, and achieve significant daily savings in large‑scale data warehouse environments.

Compute cost · MaxCompute · materialized view
0 likes · 12 min read
Alibaba Cloud Big Data AI Platform
Feb 28, 2025 · Databases

How MaxCompute’s Intelligent Data Warehouse Optimizes Queries with AutoMV

This article explains MaxCompute’s intelligent data warehouse architecture, its self‑learning optimization pipeline, the role of intelligent materialized views, the automated recommendation system for materialized views, and the AutoMV feature that automatically creates, updates, and cleans up materialized views to reduce compute costs and improve query performance.

AutoMV · Big Data · MaxCompute
0 likes · 17 min read
Alibaba Cloud Big Data AI Platform
Feb 14, 2025 · Big Data

How MaxCompute Powers Intelligent Data Warehousing in the Data+AI Era

This article summarizes a meetup talk by Alibaba Cloud expert Yu Deshui, detailing MaxCompute’s evolution, serverless architecture, AI‑enabled features, and the platform’s comprehensive solutions—including OpenLake, MaxFrame, Object Table, near‑real‑time computing, and AI Functions—to address the challenges of modern data‑centric AI workloads.

AI integration · Big Data · MaxCompute
0 likes · 13 min read
dbaplus Community
Feb 9, 2025 · Big Data

Mastering ODPS SQL Performance: From Logview to Advanced Optimizations

This guide walks through the end‑to‑end flow of SQL execution on Alibaba MaxCompute (ODPS), explains how to use Logview to pinpoint performance bottlenecks, enumerates common causes of slow queries, and presents concrete optimization techniques such as MapJoin hints, double‑group‑by rewrites, TRANS_COLS, bucket partitioning and UDF tuning, all illustrated with step‑by‑step examples and visual diagrams.

Logview · MaxCompute · ODPS
0 likes · 16 min read
Alibaba Cloud Big Data AI Platform
Jan 9, 2025 · Big Data

How Dynamic Filters Supercharge MaxCompute Joins and Cut CPU by 70%

MaxCompute’s dynamic filter and dynamic partition pruning features dramatically accelerate cross‑period join queries by generating runtime filters that prune irrelevant data before the shuffle, reducing scanned data volume by over 95%, cutting CPU usage by 70% and slashing query latency in large‑scale merchant billing workloads.

Big Data · Dynamic Filter · Join Performance
0 likes · 11 min read
Alibaba Cloud Big Data AI Platform
Dec 19, 2024 · Big Data

MaxCompute Bloomfilter Index: Faster Emergency Tracing Queries, Reduced Storage

The article explains how MaxCompute’s newly introduced Bloomfilter index dramatically improves emergency data tracing by cutting query time and resource consumption, replacing costly secondary indexes, reducing storage by over 45%, and providing a lightweight, high‑efficiency solution for large‑scale point‑lookup scenarios.

Big Data · BloomFilter · MaxCompute
0 likes · 12 min read
Alibaba Cloud Big Data AI Platform
Sep 29, 2024 · Big Data

How MaxCompute Is Shaping the Next‑Gen Intelligent Cloud Data Warehouse

The 2024 Cloud Conference showcased Alibaba Cloud's MaxCompute, detailing its four‑pillar roadmap—Data+AI integration, Lakehouse 2.0, near‑real‑time analytics, and enterprise‑grade capabilities—while highlighting performance breakthroughs, security features, and a real‑world case study from Juewei Group.

MaxCompute · cloud data warehouse · enterprise security
0 likes · 10 min read
DataFunSummit
Sep 25, 2024 · Big Data

Evolution of Big Data AI Development Paradigm and Alibaba Cloud’s Integrated Architecture

This article examines how large‑scale big‑data platforms can simplify AI application development, outlines the shift from model‑centric to data‑centric paradigms, and shares Alibaba Cloud’s practical experiences in building an integrated big‑data‑AI architecture, including MaxCompute, Hologres, MaxFrame, and vector search capabilities.

AI integration · Big Data · Data Platform
0 likes · 19 min read
DaTaobao Tech
Sep 13, 2024 · Big Data

Extending PyODPS with PAI‑Designer for Dynamic Offline Data Processing

By integrating PAI‑Designer with PyODPS, users can build visual offline workflows that overcome ODPS’s lack of network access, dynamic configuration, and image‑processing limits, using reusable Python components, OSS role‑ARNs, remote configuration fetching, and custom Docker images to read/write MaxCompute and OSS data.

Docker · MaxCompute · PAI-Designer
0 likes · 19 min read
DaTaobao Tech
Sep 11, 2024 · Big Data

Practical Guide to Using PyODPS for Flexible Data Processing

The article walks through a first‑time user’s experience with PyODPS, showing how its Python‑based DataFrame API offers more flexible JSON field statistics, multi‑condition filtering, and custom aggregations than traditional ODPS SQL, while noting a steep learning curve and syntax quirks.

MaxCompute · PyODPS · Python
0 likes · 11 min read
DaTaobao Tech
Jul 10, 2024 · Big Data

ODPS Development Guide: Parameters, Built‑in Functions, UDF Creation, and Performance Optimization

This comprehensive ODPS (MaxCompute) development guide serves as a mini‑encyclopedia, detailing common parameter tuning, built‑in SQL functions, step‑by‑step Java UDF creation, job lifecycle insights, and practical performance‑optimization techniques such as parallelism adjustment, map‑join hints, and small‑file mitigation.

MaxCompute · ODPS · UDF
0 likes · 26 min read
DataFunSummit
Jul 9, 2024 · Big Data

Materialized Views in MaxCompute: Design, Implementation, and Best Practices

This article explains the concept, advantages, and drawbacks of materialized views, describes how MaxCompute implements them—including creation syntax, maintenance properties, automatic query rewrite, smart recommendation, and auto‑materialization—and shares performance results and future improvement plans.

Automatic Refresh · Big Data · MaxCompute
0 likes · 13 min read
DataFunTalk
Jun 26, 2024 · Big Data

Evolution of the Big Data + AI Development Paradigm and Alibaba Cloud’s Integrated Architecture

This article examines how the big‑data AI development paradigm has shifted from model‑centric to data‑centric workflows, outlines the challenges of integrating data and AI teams, and details Alibaba Cloud’s end‑to‑end, serverless big‑data platform—including MaxCompute, Hologres, MaxFrame, Object Table, and vector search—designed to accelerate large‑scale AI applications.

AI integration · Big Data · Data Platform
0 likes · 20 min read
Alibaba Cloud Developer
May 27, 2024 · Big Data

How MaxCompute’s New Offline‑Near‑Real‑Time Architecture Revolutionizes Big Data Workloads

This article explains how MaxCompute’s integrated offline‑and‑near‑real‑time architecture, built on Delta Table, solves complex big‑data scenarios by providing unified storage, ACID transactions, upsert, time‑travel, automatic data‑file governance and low‑latency query capabilities while reducing cost and operational complexity.

Delta Table · MaxCompute · data-warehouse
0 likes · 27 min read
Alibaba Cloud Developer
Apr 30, 2024 · Big Data

Mastering ODPS SQL: Proven Tips to Slash Query Time and Tackle Data Skew

This article explores practical SQL optimization techniques for Alibaba's ODPS platform, covering fundamentals, common pitfalls like null handling and select *, advanced strategies such as multi‑insert, partition limiting, UDF placement, data‑skew mitigation, parameter tuning, and real‑world case studies that dramatically reduce query runtimes.

Big Data · Data Skew · MaxCompute
0 likes · 23 min read
DataFunTalk
Apr 16, 2024 · Big Data

Materialized Views in MaxCompute: Design, Implementation, and Best Practices

This article explains how MaxCompute leverages materialized views as a query accelerator, covering their history, advantages and drawbacks, creation and maintenance details, automatic query rewriting, intelligent recommendation, auto‑materialization, and future enhancements for large‑scale data warehousing.

Automatic Refresh · Big Data · Intelligent Recommendation
0 likes · 13 min read
Alibaba Cloud Big Data AI Platform
Apr 16, 2024 · Big Data

MaxCompute’s Integrated Offline & Near‑Real‑Time Architecture: Transaction Table 2.0 Explained

This article explains MaxCompute’s new integrated offline‑and‑near‑real‑time architecture, Transaction Table 2.0, detailing its unified storage and compute design, automatic data governance, schema evolution, upsert and time‑travel capabilities, and how it simplifies complex big‑data pipelines while delivering minute‑level latency and lower costs.

Big Data · Data Governance · MaxCompute
0 likes · 27 min read
Architects' Tech Alliance
Mar 8, 2024 · Databases

What’s Driving the Rise of China’s Home‑grown Databases in 2024?

The article reviews the rapid emergence of high‑performance domestic databases such as OceanBase, TiDB, MaxCompute and TDengine, outlines their development history, compares strengths and weaknesses, discusses suitable scenarios for TiDB, and forecasts future trends for China’s database ecosystem, while providing a curated list of further readings.

China · MaxCompute · OceanBase
0 likes · 5 min read
Alibaba Cloud Big Data AI Platform
Mar 1, 2024 · Big Data

Scaling U‑App Analytics to Billions of Events with Flink, MaxCompute & Hologres

UMeng+’s U‑App analytics platform processes nearly a trillion daily logs by combining real‑time Flink streams, offline MaxCompute batches, and Alibaba Cloud Hologres OLAP, employing multi‑engine architecture, smart sampling, and Roaring Bitmap techniques to deliver fast, cost‑effective, high‑concurrency user behavior and profiling analysis.

Flink · Hologres · MaxCompute
0 likes · 19 min read
DataFunTalk
Jan 1, 2024 · Big Data

MaxCompute Semi-Structured Data: Concepts, Solutions, and Benefits

This article explains the nature of semi‑structured data, compares traditional schema‑on‑read and schema‑on‑write approaches, and details MaxCompute's columnar storage solution that balances flexibility, performance, and cost for large‑scale data warehouses.

Big Data · Columnar Storage · MaxCompute
0 likes · 19 min read
ITPUB
Dec 25, 2023 · Big Data

Unlock Complex Data Scenarios with Simple MaxCompute SQL Techniques

This article shows how flexible, divergent thinking combined with basic MaxCompute (ODPS) SQL syntax can solve complex data problems such as generating sequences, splitting intervals, performing permutations and combinations, and analyzing continuous activity, providing step‑by‑step examples, SQL code snippets, and practical results.

Intervals · MaxCompute · Sequences
0 likes · 24 min read
DataFunTalk
Nov 12, 2023 · Big Data

MaxCompute Incremental Update Architecture, Intelligent Materialized Views, and Adaptive Execution Optimizations

This article presents a comprehensive overview of MaxCompute's near‑real‑time incremental update and processing architecture, the design of Transactional Table 2.0, intelligent materialized view evolution and recommendation, as well as multi‑level adaptive execution optimizations for the SQL engine, illustrating how these innovations improve efficiency, cost, and scalability for large‑scale data workloads.

Adaptive Execution · MaxCompute · SQL Engine
0 likes · 20 min read
DaTaobao Tech
Oct 11, 2023 · Big Data

Fundamental Data Skills and Complex Query Techniques in MaxCompute

The article teaches developers essential MaxCompute data‑processing skills—from creating and naming tables, handling strings and dates, and writing basic SELECTs, joins, and aggregations, to employing advanced techniques such as temporary tables, CTEs, partitioning, and map‑join hints for efficient complex queries.

ETL · MaxCompute · data engineering
0 likes · 15 min read
Alibaba Cloud Big Data AI Platform
Oct 9, 2023 · Big Data

How We Cut MaxCompute Costs Using Information Schema Insights

This article details how a fast‑growing HR SaaS company analyzed MaxCompute billing spikes, identified five key cost drivers, leveraged tenant‑level Information Schema to extract task metadata, applied SQL‑based cost formulas, and implemented targeted optimizations that stabilized their cloud data‑processing expenses.

Big Data · Cost Optimization · Information Schema
0 likes · 10 min read
DataFunSummit
Sep 7, 2023 · Big Data

MaxCompute Semi-Structured Data Solutions: Architecture, Comparison, and Performance Benefits

This article explains the concepts of semi‑structured data, compares traditional schema‑on‑read and schema‑on‑write approaches, and details MaxCompute's columnar storage solution—including AliORC, adaptive query processing, and handling of dirty or sparse data—to achieve high performance and low cost in big‑data warehousing.

MaxCompute · semi-structured data
0 likes · 20 min read
Alibaba Cloud Big Data AI Platform
Aug 30, 2023 · Big Data

How Transaction Table 2.0 Cuts Data Deduplication Costs by 98% in MaxCompute

This article explains how Renliji's data warehouse team leveraged MaxCompute's Transaction Table 2.0 to dramatically reduce incremental data deduplication costs and execution time, while also introducing efficient small‑file merging, time‑travel queries, and future data‑sync strategies for a high‑growth HR SaaS platform.

Big Data · Cost Optimization · MaxCompute
0 likes · 11 min read
DataFunTalk
Aug 29, 2023 · Big Data

MaxCompute Incremental Update, Processing Architecture, and Intelligent Data Warehouse Optimizations

This article presents a comprehensive overview of MaxCompute's incremental update and processing architecture, the design of intelligent materialized views, and the engine's adaptive execution optimizations, detailing the integrated near‑real‑time and batch pipelines, Transactional Table 2.0, and practical Q&A.

Big Data · MaxCompute · data-warehouse
0 likes · 21 min read
Alibaba Cloud Big Data AI Platform
Jun 27, 2023 · Big Data

How MaxCompute’s Lakehouse Architecture Enables Near‑Real‑Time Incremental Processing

This article details Alibaba Cloud MaxCompute’s lakehouse evolution, describing its unified storage‑metadata‑compute design, the Transactional Table 2.0 format, near‑real‑time incremental ingestion, clustering and compaction services, transaction handling, TimeTravel and incremental queries, and future roadmap for big‑data workloads.

Big Data · Incremental Processing · Lakehouse
0 likes · 23 min read
DataFunTalk
Jun 24, 2023 · Big Data

Design and Architecture of MaxCompute Lakehouse Near‑Real‑Time Incremental Processing

This article explains the evolution of Alibaba Cloud's MaxCompute platform into a lakehouse architecture that supports near‑real‑time incremental processing, detailing its development history, core design of transactional tables, five‑module technical stack, data ingestion methods, optimization services, transaction management, query capabilities, ecosystem integration, practical applications, future roadmap, and common user questions.

Big Data · Data Lake · Incremental Processing
0 likes · 24 min read
Data Thinking Notes
Jan 12, 2023 · Big Data

Mastering Alibaba DataWorks: Data Warehouse Architecture & Modeling Guide

This comprehensive tutorial walks you through Alibaba DataWorks' data warehouse architecture, covering technical stack selection, three‑layer warehouse design (ODS, CDM, ADS), detailed data modeling with DDL examples, storage strategies, dimension and fact table conventions, and best‑practice hierarchical call standards.

DataModeling · DataWarehouse · DataWorks
0 likes · 27 min read
Alibaba Cloud Developer
Dec 26, 2022 · Backend Development

How to Build a Scalable Tag/Profile System for Marketing Automation

This article shares engineering practices for constructing a tag‑profile system, covering core concepts, minimal architecture, technology selection, key modules such as estimation, selection, deployment, and validation, and offers design details and implementation tips for large‑scale marketing scenarios.

Alibaba Cloud · Backend Architecture · Marketing Automation
0 likes · 11 min read
DataFunTalk
Dec 13, 2022 · Artificial Intelligence

End-to-End Machine Learning Application Using OpenMLDB and Alibaba Cloud MaxCompute

This article demonstrates how to build a complete end-to-end machine-learning workflow for taxi trip duration prediction by integrating OpenMLDB with Alibaba Cloud MaxCompute’s serverless services, covering environment setup, offline data ingestion, feature extraction, model training, deployment, and real-time online inference within 20 ms.

Feature Store, MaxCompute, OpenMLDB
0 likes · 13 min read
DataFunTalk
Aug 19, 2022 · Big Data

Multi‑Tenant Architecture in Public‑Cloud Big Data Platforms: Design, Challenges, and MaxCompute Implementation

This article explains the different multi‑tenant models used by public‑cloud big data platforms, analyzes their advantages and challenges, and details how Alibaba Cloud's MaxCompute realizes strong multi‑tenancy through storage design, resource scheduling, security containers, virtual networking, and future evolution directions.

MaxCompute, data security, multi-tenancy
0 likes · 15 min read
Alibaba Cloud Big Data AI Platform
Aug 9, 2022 · Big Data

Unlocking MaxCompute: How Alibaba’s Big Data Platform Secures Your Data

This article provides a comprehensive overview of Alibaba Cloud MaxCompute, covering its product features, architecture, ecosystem integrations, and in‑depth data security mechanisms such as authentication, RAM roles, access control policies, label‑based security, project protection, audit logging, encryption, backup, disaster recovery, and the complementary DataWorks security capabilities.

Big Data, Cloud Native, MaxCompute
0 likes · 31 min read
Alibaba Cloud Big Data AI Platform
Jul 19, 2022 · Big Data

Real-Time & Offline Data Warehouse Integration: New Capabilities Explained

This article provides an overview of real-time and offline integrated data warehousing, tracing its evolution from early offline warehouses to modern cloud-native solutions, and details the latest capabilities—including multi-engine computation, data sharing between MaxCompute and Hologres, progressive computing, materialized views, and practical use cases such as telecom analytics and connected‑car scenarios.

Hologres, MaxCompute, cloud-native
0 likes · 16 min read
DaTaobao Tech
Apr 27, 2022 · Big Data

Comparative Study of JSON Processing Methods in MaxCompute

The study compares MaxCompute JSON extraction functions—FROM_JSON, get_json_object, and custom JMESPath/JSONPath UDFs—showing simple field extraction with get_json_object is fastest, while complex queries benefit from FROM_JSON or JMESPath, and outlines corresponding JSON generation methods and best‑practice recommendations.

DataEngineering, JMESPath, JSON
0 likes · 11 min read
BaiPing Technology
Mar 14, 2022 · Big Data

Mastering DataWorks & MaxCompute: A Complete Guide to Big Data Architecture and Governance

DataWorks, Alibaba Cloud’s comprehensive PaaS platform, combined with the serverless MaxCompute data warehouse, offers an integrated solution for data integration, development, quality, and services, while detailed naming and layer conventions ensure scalable, maintainable big‑data architectures and effective governance across ODS, CDM, DWD, DWS, and ADS layers.

Big Data, Data Governance, DataWorks
0 likes · 8 min read
Sohu Tech Products
Apr 7, 2021 · Big Data

Data Warehouse Architecture and Modeling with Alibaba MaxCompute and DataWorks

This tutorial explains how to select a technical architecture, design a three‑layer data warehouse (ODS, CDM, ADS), model tables and dimensions, choose storage strategies, handle slowly changing dimensions, synchronize data with DataWorks, and implement dimensional modeling and fact tables using Alibaba MaxCompute for big‑data analytics.

Big Data, DataWorks, MaxCompute
0 likes · 32 min read
Alibaba Cloud Developer
Jan 2, 2020 · Big Data

How Alibaba’s MaxCompute Tackled Double‑11’s EB‑Scale Data with Fuxi 2.0 and StreamlineX

In 2019 Alibaba’s MaxCompute processed near‑exabyte daily data during Double 11, using the newly released Fuxi 2.0 scheduler, StreamlineX + Shuffle Service, and the upgraded DAG 2.0 engine to overcome massive throughput, resource‑allocation, and fault‑tolerance challenges while achieving significant performance and stability gains.

DAG 2.0, Fuxi, MaxCompute
0 likes · 28 min read
Alibaba Cloud Developer
Aug 22, 2019 · Big Data

How AliORC Supercharges MaxCompute: Inside the Next‑Gen Columnar Format

This article explains how Alibaba's MaxCompute platform evolved its storage engine from row‑based CFile to the columnar AliORC format, details the technical innovations such as async prefetch, small I/O elimination, adaptive dictionary encoding, and range‑aligned reads, and compares its performance against Apache ORC and Parquet.

AliORC, Apache ORC, Columnar Storage
0 likes · 20 min read
Youku Technology
Aug 15, 2019 · Big Data

Youku's Migration from Hadoop to Alibaba Cloud MaxCompute: Benefits and Technical Insights

Youku’s 2017 migration from an on‑premises Hadoop cluster to Alibaba Cloud MaxCompute delivered a unified, elastic data pipeline that cut compute and storage costs by roughly half, handled billions of daily log records, boosted performance and scalability, and empowered analysts with self‑service tools and a rich ecosystem.

Big Data, Cost Optimization, Data Migration
0 likes · 12 min read
Alibaba Cloud Developer
Jul 9, 2019 · Big Data

How Youku Cut Costs and Boosted Performance by Migrating to MaxCompute

This article explains how Youku processed billions of daily logs, migrated from Hadoop to Alibaba Cloud MaxCompute in 2017, and achieved lower compute and storage costs, faster data delivery, and greater operational flexibility through a robust big‑data platform tailored to its complex business needs.

Cost Optimization, Data Migration, MaxCompute
0 likes · 12 min read
Alibaba Cloud Developer
Apr 18, 2019 · Big Data

How MaxCompute Evolved: 10 Years of Big Data Innovation at Alibaba

This article reviews a decade of MaxCompute development, covering its origins, core technologies, performance gains, ecosystem integration, intelligent features, competitive positioning, and commercialization, while highlighting the platform's role as Alibaba's central big‑data compute engine.

AI integration, Big Data, MaxCompute
0 likes · 21 min read
Alibaba Cloud Infrastructure
Nov 20, 2018 · Big Data

A Decade of Alibaba's Big Data Platform Evolution Through Double 11

The article chronicles Alibaba's ten‑year journey of building and scaling its big data platform—from early Oracle clusters and Hadoop‑based Cloud‑Ladder 1 to the self‑developed ODPS/MaxCompute, real‑time Blink engine, and the unified DataWorks ecosystem—highlighting key technical milestones, performance breakthroughs, and operational challenges that powered successive Double 11 shopping festivals.

Alibaba, Data Platform, MaxCompute
0 likes · 22 min read
Alibaba Cloud Developer
Nov 15, 2018 · Big Data

How Alibaba Built a World‑Class Big Data Platform Over a Decade

This article chronicles Alibaba's ten‑year journey of building and scaling its big‑data platform—from early Oracle clusters and Hadoop, through the launch of ODPS and MaxCompute, to global cloud expansion and cutting‑edge streaming innovations that now power billions of transactions each Double‑11.

Alibaba, Data Platform, MaxCompute
0 likes · 23 min read
Efficient Ops
Aug 29, 2018 · Big Data

How DataOps and Linear Programming Optimize MaxCompute Capacity Management

This article explains how Alibaba's MaxCompute platform tackles capacity bottlenecks by combining data‑driven insights, linear programming, and automated project migration strategies to predict resource needs, optimize cluster allocation, and quantify migration impacts for improved operational efficiency.

DataOps, Linear Programming, MaxCompute
0 likes · 13 min read
Alibaba Cloud Developer
Jul 23, 2018 · Big Data

How Alibaba’s MaxCompute Became the Backbone of 99% Data Processing

This article reviews Alibaba's MaxCompute evolution from ODPS to a unified, multi‑cluster big‑data platform, detailing its architecture, development tools, large‑scale deployments, performance optimizations, typical workload scenarios, and why it is the preferred choice for enterprise data processing.

Alibaba Cloud, Big Data, Data Platform
0 likes · 22 min read
21CTO
Apr 2, 2018 · Big Data

How to Build a Scalable Friend Recommendation System with MaxCompute

This article explains how to leverage Alibaba Cloud's MaxCompute and MapReduce to design, model, and deploy a large‑scale social friend recommendation system, covering data requirements, analysis models, cloud architecture, and practical development steps.

Friend Recommendation, MaxCompute, data-warehouse
0 likes · 12 min read
Alibaba Cloud Developer
Dec 7, 2016 · Big Data

How Alibaba Handled Real‑Time Billions of Events During Double 11

This article outlines Alibaba Cloud's big‑data platform challenges and solutions during the 2016 Double 11 event, covering sub‑second real‑time processing, multi‑million‑records‑per‑second throughput, full‑day high availability, and massive offline workloads exceeding hundreds of petabytes.

Alibaba, Distributed Systems, MaxCompute
0 likes · 3 min read
Architecture Digest
Nov 6, 2016 · Big Data

Evolution of Taobao’s Big Data Platform: From RAC to MaxCompute

The article chronicles Taobao’s 13‑year evolution of its big data platform, detailing three phases—from a single‑node Oracle setup and the Tianwang scheduler, through a Hadoop‑based “Cloud Ladder 1” architecture with real‑time analytics, to the current MaxCompute/ODPS era with cross‑region projects and advanced data services.

Big Data, Data Platform, Hadoop
0 likes · 11 min read