How MaxCompute Evolves into a Data+AI Platform: Architecture, Core Capabilities, and Real-World Cases
The article explains how Alibaba Cloud's MaxCompute has been transformed into a cloud‑native Data+AI platform, detailing its layered architecture, multimodal storage, model management, hybrid compute scheduling, SQL AI functions, the MaxFrame Python framework, and several enterprise case studies that demonstrate performance gains and flexible resource orchestration.
Architecture Overview
MaxCompute is organized from bottom to top into four layers: Data Layer, Model Layer, Compute Layer, and Engine Layer. The platform provides unified storage for structured and unstructured data, native Python‑based distributed computing, and SQL AI functions for offline inference.
Data Layer
Supports both structured tables and unstructured BLOB fields, enabling unified storage of audio, video, and other multimodal assets. Object Table and external storage connectors (OSS, Hologres, etc.) allow direct access to diverse storage engines without moving data.
Model Layer
Hosts traditional machine‑learning models such as XGBoost and LightGBM, open‑source large models (e.g., Qwen‑8B/14B/32B, DeepSeek‑R1‑Distill‑Qwen), and commercial flagship models from the Bailian platform, providing a single management interface for model registration, versioning, and service deployment.
Compute Layer
Provides hybrid CPU (CU) and GPU (GU) scheduling. Users declare required resources in job definitions, allowing heterogeneous compute allocation for multimodal data processing.
Engine Layer
Two primary interfaces are offered:
SQL Engine – SQL AI functions enable direct calls to large models for offline inference on structured data.
MaxFrame – A native Python distributed‑computing framework that integrates with Pandas, XGBoost, LightGBM and other open‑source libraries, executing operators across MaxCompute’s massive compute pool.
MaxFrame Core Capabilities
Heterogeneous Compute Scheduling – CPU and GPU resources can be mixed within a single job via programming APIs.
Distributed Data‑Processing Operators – Compatible with Pandas, XGBoost, LightGBM and other libraries; jobs run in a distributed fashion, free of single‑machine resource limits.
Stable Development Experience – Deep integration with DataWorks for interactive development and scheduling, support for custom Docker images, OSS mounting, and AI‑assistant integration.
Development Experience
The MaxFrame SDK is publicly available: install it locally with pip install maxframe and use it in VS Code or Jupyter Notebook. DataWorks Notebook integrates MaxFrame sessions via Magic Command, PyODPS3 nodes support MaxFrame job development, and MaxCompute Notebook embeds the MaxFrame SDK for interactive coding.
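MaxFrame keeps the pandas operator surface, so existing DataFrame code ports with minimal changes. A minimal local sketch using pandas itself as a stand‑in (in a real MaxFrame job, maxframe.dataframe replaces pandas and the computation runs across MaxCompute after a session is created; the table contents and column names here are purely illustrative):

```python
import pandas as pd

# Illustrative log data; in MaxFrame this would be read from a
# MaxCompute table rather than constructed locally.
df = pd.DataFrame({
    "user_id": [1, 1, 2, 3, 3, 3],
    "latency_ms": [120, 80, 200, 95, 110, 60],
})

# Standard pandas-style transform: per-user average latency.
# The same operator chain runs distributed under MaxFrame.
result = (
    df.groupby("user_id")["latency_ms"]
      .mean()
      .reset_index(name="avg_latency_ms")
)
print(result)
```

The point of the pandas‑compatible surface is that this snippet, unmodified except for the import and a session setup, scales from a laptop to MaxCompute's compute pool.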
Key Scenarios and Case Studies
Large‑Model Data Pre‑Processing
A leading large‑model company processed petabyte‑scale data with more than 100 k elastic cores. Using MaxFrame's MinHash operator, performance improved by over 50 %. A single task ran on 300 k cores (16 k of them elastic), well above the 10 k cores the workload originally required, dramatically shortening PB‑level processing cycles.
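MinHash, the operator behind the deduplication speedup above, estimates set similarity from fixed‑size signatures so near‑duplicate documents can be found without pairwise set comparison. A plain‑Python sketch of the idea (shingle size, hash count, and the md5‑based hash family are illustrative choices, not MaxFrame's implementation):

```python
import hashlib

def shingles(text, k=3):
    """Character k-shingles of a document (k=3 is illustrative)."""
    return {text[i:i + k] for i in range(len(text) - k + 1)}

def minhash_signature(text, num_hashes=64):
    """For each of num_hashes seeded hash functions, keep the
    minimum hash value over all shingles of the document."""
    sig = []
    for seed in range(num_hashes):
        sig.append(min(
            int.from_bytes(
                hashlib.md5(f"{seed}:{s}".encode()).digest()[:8], "big")
            for s in shingles(text)))
    return sig

def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching signature slots approximates the
    Jaccard similarity of the underlying shingle sets."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)

a = minhash_signature("the quick brown fox jumps over the lazy dog")
b = minhash_signature("the quick brown fox jumped over the lazy dog")
c = minhash_signature("completely unrelated sentence about databases")
```

Signatures are small and comparable slot‑by‑slot, which is what makes the technique amenable to the distributed, PB‑scale deduplication described in the case.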
Automotive Embodied‑Intelligence Data Pre‑Processing
In autonomous‑driving pipelines, vehicles generate multimodal streams (images, video, radar, GPS) stored as ROS bags. MaxFrame’s elastic compute resources handled peak loads, and distributed processing achieved more than a 40 % efficiency gain compared with single‑node Python.
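The per‑file structure of ROS bag processing maps naturally onto distributed operators: each bag is decoded and summarized independently, then results are collected. A single‑node sketch of that fan‑out pattern using only the standard library (the file names are placeholders and process_bag merely simulates decoding; under MaxFrame the same map would run across elastic workers instead of local threads):

```python
from concurrent.futures import ThreadPoolExecutor

def process_bag(path):
    """Placeholder for decoding one ROS bag and extracting a
    per-sensor summary; here we just fabricate a frame count."""
    # A real pipeline would parse image, video, radar, and GPS
    # streams out of the bag file at `path`.
    return {"path": path, "frames": len(path) * 10}

bag_paths = [f"drive_{i:03d}.bag" for i in range(8)]

# Fan the independent per-file work out across workers; MaxFrame
# applies the same map over a cluster rather than one host.
with ThreadPoolExecutor(max_workers=4) as pool:
    summaries = list(pool.map(process_bag, bag_paths))
```

Because each bag is independent, throughput scales with worker count, which is why elastic resources absorb peak ingestion loads so effectively.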
Multimodal Image Tagging
MaxFrame’s built‑in AI Function calls large models (Qwen‑8B/14B/32B, DeepSeek‑R1‑Distill‑Qwen) via a simple SDK to perform large‑scale offline inference on structured and multimodal data, producing automatic image tags and vector embeddings for downstream retrieval.
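Offline inference over a table of assets is essentially a map from rows to model outputs. A schematic in plain Python (the tag_image function is a local stub standing in for an AI Function call to a hosted model such as Qwen or DeepSeek; real usage goes through the MaxFrame SDK, and the URIs and fake outputs below are invented for illustration):

```python
def tag_image(image_uri):
    """Stub for a large-model call; a real AI Function would send
    the image to a hosted model and return tags plus an embedding."""
    # Deterministic fake output keyed on the file name.
    name = image_uri.rsplit("/", 1)[-1]
    tags = ["vehicle"] if "car" in name else ["unknown"]
    embedding = [float(ord(ch) % 7) for ch in name[:4]]
    return {"uri": image_uri, "tags": tags, "embedding": embedding}

rows = [
    "oss://bucket/images/car_001.jpg",
    "oss://bucket/images/tree_002.jpg",
]

# Batch offline inference: apply the model to every row. MaxFrame
# distributes this map and writes tags and embeddings back to a
# table for downstream retrieval.
results = [tag_image(uri) for uri in rows]
```

The tags feed search and filtering, while the embeddings support vector retrieval, matching the downstream uses named in the case.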
Typical End‑to‑End Cases
A top‑tier large‑model firm replaced a local solution with MaxCompute + MaxFrame for FastText language classification, MinHash deduplication, and CI/CD orchestration in DataWorks.
Automotive autonomous‑driving pipelines used MaxFrame to process multimodal sensor data, achieving >40 % performance gains over single‑node Python.
Multimodal image labeling combined AI Function and embedding generation to support downstream search and analysis.
Conclusion
MaxCompute delivers a full‑link Data + AI capability from storage to SQL to Python, providing a cloud‑native, elastic, high‑performance foundation for AI data asset construction and intelligent application deployment across industries.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
