How Alibaba’s MaxCompute Became the Backbone of 99% of Its Data Processing
This article reviews Alibaba's MaxCompute evolution from ODPS to a unified, multi‑cluster big‑data platform, detailing its architecture, development tools, large‑scale deployments, performance optimizations, typical workload scenarios, and why it is the preferred choice for enterprise data processing.
Overview of Alibaba Cloud Big Data Computing Service
MaxCompute, formerly known as ODPS, is Alibaba's internal unified big‑data platform. It has evolved into the company's core storage and compute engine, holding nearly 99% of its data and providing 95% of its compute capacity.
Every day more than 14,000 internal developers use the platform, executing over three million jobs covering use cases such as Alipay credit scoring, Taobao merchant billing, and the massive traffic handling of Double‑11.
The platform runs on tens of thousands of servers across multiple regions with multi‑cluster disaster recovery. Its user base is growing by about 250% a year, and it is deployed on both public and private clouds for government, security, and city‑brain projects.
Technical Architecture
At the lowest layer is the compute engine, connected to a data bus called DataHub that ingests data into MaxCompute. Above the compute layer are development suites such as DataWorks and MaxCompute Studio, which provide data management along with job development and job management.
Integrated services include AI platforms, voice‑to‑text, OCR, machine translation, and intelligent brain products, which together form a complete data processing ecosystem serving both internal Alibaba services and external customers.
Evolution of Alibaba’s Data Platform
Initially, Alibaba relied on Oracle clusters (the "Oracle peak") and later introduced Greenplum as a secondary solution when Oracle reached its limits. By 2009, the need for a more scalable system led to the launch of Alibaba Cloud, which built three core components: the distributed storage system Pangu, the scheduler Fuxi, and the big‑data service ODPS (now MaxCompute).
After a year of development, the first ODPS platform was operational. By 2012 it achieved stable unified storage, standardization, and security, and in 2013 it entered large‑scale commercial use, breaking the 5,000‑node barrier and supporting multi‑cluster capabilities.
In 2014‑2015, the "Moon Landing" project unified Alibaba's two parallel data‑processing stacks: Cloud Ladder 1, based on open‑source Hadoop, and the self‑developed Cloud Ladder 2. The project emphasized multi‑cluster capability, strong security, and petabyte‑scale processing with financial‑grade stability. Its migration requirements were to:
Guarantee Hadoop‑compatible functionality and performance.
Provide programming‑model compatibility.
Offer comprehensive migration and comparison tools.
Enable seamless in‑flight upgrades.
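To illustrate what a comparison tool in such a migration might check, here is a minimal sketch that compares the output of the same job on the old and new stacks without depending on row order. It is not Alibaba's tooling; the function name `compare_results` and the sample rows are hypothetical.

```python
from collections import Counter


def compare_results(old_rows, new_rows):
    """Compare two result sets (iterables of tuples), ignoring row order.

    Duplicate rows are counted, so a row that appears twice on one side
    and once on the other is reported as a difference.
    """
    old_counts = Counter(old_rows)
    new_counts = Counter(new_rows)
    missing = old_counts - new_counts  # rows only the old stack produced
    extra = new_counts - old_counts    # rows only the new stack produced
    return {
        "match": not missing and not extra,
        "missing": sorted(missing.elements()),
        "extra": sorted(extra.elements()),
    }


# Hypothetical outputs of one job run on both stacks.
hadoop_rows = [("cn", 3), ("us", 5), ("us", 5)]
odps_rows = [("us", 5), ("cn", 3), ("us", 5)]
report = compare_results(hadoop_rows, odps_rows)
```

Counting rows with a multiset rather than a plain set matters here: distributed engines rarely guarantee output order, but they must agree on how many times each row appears.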
Moon Landing Project – Unified Process
The project consolidated dozens of disparate computing platforms across business units into a single, unified data platform, improving resource utilization and data flow efficiency while lowering overall operational cost.
Key outcomes include:
Enterprise‑wide unified big‑data platform with EB‑scale storage and millions of daily tasks.
Fine‑grained security and multi‑tenant data protection.
High performance, comprehensive data unification, and optimized storage tiers (memory, SSD, HDD, cold storage).
MaxCompute 2.0 – Ongoing Upgrades
Introduced at the 2016 Cloud Expo, MaxCompute 2.0 added a new SQL engine, unstructured data processing, and support for multiple compute modes (batch, interactive, in‑memory, iterative). A forthcoming query language, NewSQL, combines declarative and imperative features.
Engine improvements include cost‑based and history‑based optimization, fully asynchronous I/O, bubble‑based scheduling for efficient resource usage, and tighter integration with Hadoop and Spark ecosystems.
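The idea behind combining cost‑based and history‑based optimization can be sketched as follows: the planner picks a join strategy from row‑count estimates, but prefers row counts observed in an earlier run of the same job when they exist. This is only an illustration under assumed names and thresholds (`TableStats`, `choose_join`, `BROADCAST_LIMIT_ROWS` are all hypothetical), not MaxCompute's actual optimizer.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical threshold: broadcast the smaller side only below this size.
BROADCAST_LIMIT_ROWS = 1_000_000


@dataclass
class TableStats:
    estimated_rows: int                  # static estimate (cost-based input)
    observed_rows: Optional[int] = None  # row count seen in a previous run


def effective_rows(stats: TableStats) -> int:
    """Prefer the observed row count from history over the static estimate."""
    if stats.observed_rows is not None:
        return stats.observed_rows
    return stats.estimated_rows


def choose_join(left: TableStats, right: TableStats) -> str:
    """Return 'broadcast' when the smaller input is small enough, else 'shuffle'."""
    smaller = min(effective_rows(left), effective_rows(right))
    return "broadcast" if smaller <= BROADCAST_LIMIT_ROWS else "shuffle"


facts = TableStats(estimated_rows=5_000_000_000)
dims_estimate_only = TableStats(estimated_rows=500_000)
dims_with_history = TableStats(estimated_rows=500_000, observed_rows=20_000_000)

plan_without_history = choose_join(facts, dims_estimate_only)
plan_with_history = choose_join(facts, dims_with_history)
```

In this example the static estimate alone would choose a broadcast join, while the history of a previous run shows the "small" table is actually large and switches the plan to a shuffle join.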
Storage enhancements feature AliORC (compatible with native ORC but faster) and hierarchical tiered storage, with SSD, HDD, and cold‑storage layers.
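A tiering policy of this kind can be sketched as a simple rule that maps how recently data was accessed to a storage layer. The thresholds below (7 and 90 days) and the names `TIER_LIMITS` and `choose_tier` are assumptions made for illustration; the source does not describe how MaxCompute actually decides placement.

```python
# Hypothetical policy: (maximum days since last access, tier name), hottest first.
TIER_LIMITS = [
    (7, "ssd"),
    (90, "hdd"),
]


def choose_tier(days_since_access: int) -> str:
    """Pick a storage tier from the number of days since the data was last read."""
    if days_since_access < 0:
        raise ValueError("days_since_access must be non-negative")
    for max_days, tier in TIER_LIMITS:
        if days_since_access <= max_days:
            return tier
    return "cold"  # anything older than the last limit goes to cold storage


placements = {days: choose_tier(days) for days in (1, 30, 365)}
```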
Typical Big‑Data Workloads
Workloads are categorized into three main types:
Batch/Workflow: Periodic jobs (daily, hourly, monthly) handling large data volumes.
Interactive Analysis: Ad‑hoc queries for business decisions, requiring low latency (seconds to tens of seconds) and moderate data size.
Streaming/Real‑time: Low‑latency processing for events such as Double‑11 dashboards.
Key technical considerations include data ingestion throttling, integrity checks, fault‑tolerant data backfill (recovery), real‑time debugging, and high‑availability scheduling.
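As one concrete reading of the data recovery consideration for periodic batch jobs, the sketch below finds which daily partitions in a date range never completed and therefore need to be re‑run. The function name `missing_partitions` and the sample dates are hypothetical; a real scheduler would read completion state from its own metadata.

```python
from datetime import date, timedelta


def missing_partitions(start: date, end: date, completed: set) -> list:
    """Return the daily partitions in [start, end] that are absent from
    `completed`, oldest first, so they can be re-run in order."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    total_days = (end - start).days
    all_days = (start + timedelta(days=offset) for offset in range(total_days + 1))
    return [day for day in all_days if day not in completed]


# Hypothetical state: two of four daily runs succeeded.
done = {date(2017, 11, 9), date(2017, 11, 11)}
to_backfill = missing_partitions(date(2017, 11, 9), date(2017, 11, 12), done)
```

Returning the gaps oldest first lets downstream jobs that depend on earlier partitions be re‑run in dependency order.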
BI‑Focused Optimizations
For interactive BI scenarios, MaxCompute employs online‑job designs featuring long‑living processes, process reuse, direct network connections (avoiding disk I/O), event‑driven scheduling, and automatic failover based on statistical models.
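The source does not say which statistical model drives failover. As a minimal sketch of the general idea, the code below flags long‑living workers whose latest latency is far above their peers, using the median and the median absolute deviation (MAD) so that one slow worker does not distort the baseline. The function name `find_stragglers`, the factor `k`, and the worker names are assumptions.

```python
import statistics


def find_stragglers(latencies_ms: dict, k: float = 3.0) -> list:
    """Return workers whose latency exceeds median + k * MAD, sorted by name.

    `latencies_ms` maps a worker name to its most recent latency in milliseconds.
    """
    if not latencies_ms:
        return []
    values = list(latencies_ms.values())
    median = statistics.median(values)
    # Median absolute deviation: a spread measure that tolerates outliers.
    mad = statistics.median(abs(value - median) for value in values)
    threshold = median + k * mad
    return sorted(name for name, value in latencies_ms.items() if value > threshold)


# Hypothetical snapshot of five long-living workers.
snapshot = {"w1": 100, "w2": 110, "w3": 95, "w4": 105, "w5": 900}
failover_candidates = find_stragglers(snapshot)
```

A scheduler could move work away from the flagged workers and keep the rest alive, which fits the process‑reuse design described above.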
Performance Benchmarks
In collaboration with Intel, MaxCompute was evaluated on the 2017 BigBench benchmark, executing over 30 queries (SQL, MapReduce, machine‑learning) at scales from 10 TB to 100 TB. It became the first engine to reach 7,000 points and demonstrated superior cost‑performance on public cloud.
Why Choose MaxCompute
Out‑of‑the‑box scalability without user‑managed sizing.
Proven performance and cost‑efficiency through extensive benchmarks.
Robust multi‑tenant security built on Alibaba’s internal safeguards.
Support for multiple distributed compute models.
Comprehensive migration tools and ecosystem integration (DataWorks, Studio, AI platforms, recommendation engines, reporting tools).
Data already on Alibaba Cloud can be migrated to MaxCompute via various methods (direct sync, VPN, dedicated lines). Once in the platform, developers can leverage Data IDE, plugins, and seamless integration with machine‑learning and analytics services.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
