
How Alibaba Cloud’s MaxCompute Powers Multi‑Modal AI Data Processing for MOSI Intelligence

MOSI Intelligence ran into storage, compute, and network bottlenecks in its IDC when running large‑scale audio‑video pipelines. It partnered with Alibaba Cloud to build a cloud‑native, one‑stop multi‑modal data processing platform on MaxCompute and the MaxFrame engine. The result was compute utilization up by more than 30%, roughly doubled processing throughput, and about 50% less operational overhead.

Alibaba Cloud Big Data AI Platform

Technical Challenges

The on‑premise IDC could not scale to high‑concurrency multimodal pipelines. It was constrained in compute elasticity, storage capacity, I/O, and network bandwidth, and it could not schedule thousands of GPUs and tens of thousands of CPU cores across tasks such as audio transcription, video frame extraction, and feature extraction. Without unified metadata for audio‑video assets, data was effectively invisible and hard to manage. The existing workflow relied on single‑machine Python scripts with no visual development, scheduling, or monitoring, which drove up engineering effort.

Solution Architecture

Platform Components

MaxCompute – cloud‑native data processing engine providing elastic compute (a fixed monthly allocation plus on‑demand capacity) and high‑throughput storage via Object Storage Service (OSS).

MaxFrame – proprietary distributed AI engine built on MaxCompute. Features:

Standardized operators for audio‑video segmentation, speech recognition, and feature extraction.

Rebalance mechanism to split data and control concurrency, balancing memory usage versus throughput.

Heterogeneous resource scheduling: assigns operators to CPU, GPU, or specialized accelerators within a single pipeline.

Built‑in fault tolerance and auto‑scaling.

DataWorks – orchestration layer for visual notebook development, pipeline scheduling, and operational monitoring.

OSS + Object Table – raw media stored in OSS; MaxCompute Object Table automatically captures metadata for both structured and unstructured files, enabling catalog‑style management and fast retrieval.
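The Rebalance and heterogeneous scheduling ideas described above can be pictured with a small, framework‑free sketch. It does not use the MaxFrame API. `Chunk`, `rebalance`, `plan`, `max_concurrency`, and the operator‑to‑resource mapping are hypothetical names chosen for illustration, and the CPU/GPU assignments are assumptions rather than anything the article states. It shows only the general technique: split long media into bounded chunks, cap concurrency by a memory budget, and tag each operator with a resource class.

```python
from __future__ import annotations

from dataclasses import dataclass
from math import ceil


@dataclass(frozen=True)
class Chunk:
    source: str     # object key of the media file (hypothetical)
    start_s: float  # chunk start, in seconds
    end_s: float    # chunk end, in seconds


def rebalance(durations: dict[str, float], chunk_s: float) -> list[Chunk]:
    """Split each file into chunks of at most chunk_s seconds."""
    if chunk_s <= 0:
        raise ValueError("chunk_s must be positive")
    chunks: list[Chunk] = []
    for source, total in durations.items():
        n = max(1, ceil(total / chunk_s))  # at least one chunk per file
        for i in range(n):
            start = i * chunk_s
            chunks.append(Chunk(source, start, min(start + chunk_s, total)))
    return chunks


def max_concurrency(memory_budget_mb: int, per_chunk_mb: int) -> int:
    """Number of chunks that fit in memory at once (never less than one)."""
    return max(1, memory_budget_mb // per_chunk_mb)


# Assumed mapping for illustration; the article does not say which
# operator runs on which resource type.
RESOURCE_FOR_OPERATOR = {
    "VideoSplit": "cpu",
    "ASR": "gpu",
    "FeatureExtract": "gpu",
}


def plan(chunks: list[Chunk], operators: list[str]) -> list[tuple[str, str, Chunk]]:
    """Produce (operator, resource, chunk) tasks for a scheduler to place."""
    return [(op, RESOURCE_FOR_OPERATOR[op], c) for c in chunks for op in operators]
```

Smaller chunks lower the peak memory per task and allow more tasks to run in parallel, at the cost of more scheduling overhead; that is the memory‑versus‑throughput trade‑off the Rebalance feature is described as managing.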

Workflow

Upload raw audio‑video files to OSS.

Object Table registers each file and extracts metadata (size, format, timestamps, custom tags).

In DataWorks, create a MaxFrame job using the visual notebook or Python SDK; define operators such as VideoSplit, ASR, and FeatureExtract.

MaxFrame schedules operators onto appropriate compute resources; Rebalance splits large files into chunks to keep memory usage within limits.

Job outputs are stored back to OSS and registered in the data map for lineage tracking and permission control.
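The workflow steps above can be sketched end to end with an in‑memory stand‑in for the metadata catalog. This is a minimal illustration under stated assumptions, not the Object Table or DataWorks API: `ObjectRecord`, `ObjectTable`, `run_pipeline`, the `out/...` key layout, and the tag names are all hypothetical. The operators do no real media processing here; the sketch only shows how registering each output with a pointer to its input yields lineage.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ObjectRecord:
    key: str
    size_bytes: int
    fmt: str
    tags: dict[str, str] = field(default_factory=dict)
    derived_from: str | None = None  # key of the input this was produced from


class ObjectTable:
    """In-memory stand-in for a metadata catalog over object storage."""

    def __init__(self) -> None:
        self._records: dict[str, ObjectRecord] = {}

    def register(self, key: str, size_bytes: int,
                 derived_from: str | None = None, **tags: str) -> ObjectRecord:
        # Derive the format from the file extension, if there is one.
        fmt = key.rsplit(".", 1)[-1].lower() if "." in key else "unknown"
        record = ObjectRecord(key, size_bytes, fmt, dict(tags), derived_from)
        self._records[key] = record
        return record

    def find(self, fmt: str | None = None, **tags: str) -> list[ObjectRecord]:
        """Return records matching the format and all given tags."""
        return [
            r for r in self._records.values()
            if (fmt is None or r.fmt == fmt)
            and all(r.tags.get(k) == v for k, v in tags.items())
        ]

    def lineage(self, key: str) -> list[str]:
        """Walk derived_from pointers back to the raw upload."""
        chain: list[str] = []
        current = self._records.get(key)
        while current is not None:
            chain.append(current.key)
            parent = current.derived_from
            current = self._records.get(parent) if parent else None
        return chain


def run_pipeline(table: ObjectTable, key: str, operators: list[str]) -> str:
    """Register one placeholder output per operator, chained for lineage."""
    stem = key.rsplit("/", 1)[-1].rsplit(".", 1)[0]
    current = key
    for name in operators:
        out_key = f"out/{name.lower()}/{stem}.json"  # hypothetical layout
        table.register(out_key, size_bytes=0, derived_from=current, stage=name)
        current = out_key
    return current
```

A typical use would register the raw upload first, for example `table.register("raw/talk.mp4", 1_048_576, source="upload")`, then call `run_pipeline(table, "raw/talk.mp4", ["VideoSplit", "ASR", "FeatureExtract"])` and query `table.lineage(...)` on the returned key.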

Performance Results

Compute utilization increased by more than 30%, thanks to elastic scaling to tens of thousands of cores during peak loads.

Multimodal processing throughput roughly doubled (a performance gain of about 100%), and overall pipeline latency fell significantly.

Operational overhead dropped by about 50% because the platform is a fully managed PaaS, eliminating the need for custom cluster maintenance.

Key Takeaways

Together, MaxCompute, MaxFrame, and DataWorks provide a cloud‑native, end‑to‑end solution for large‑scale multimodal data preprocessing. The combination addresses the three problems MOSI Intelligence started with (limited resource elasticity, no heterogeneous scheduling, and no unified metadata management), so AI teams can focus on model development rather than infrastructure.

Written by

Alibaba Cloud Big Data AI Platform

The Alibaba Cloud Big Data AI Platform builds on Alibaba’s leading cloud infrastructure, big‑data and AI engineering capabilities, scenario algorithms, and extensive industry experience to offer enterprises and developers a one‑stop, cloud‑native big‑data and AI capability suite. It boosts AI development efficiency, enables large‑scale AI deployment across industries, and drives business value.
