
How Youku Cut Costs and Boosted Performance by Migrating to MaxCompute

This article explains how Youku, which processes daily log volumes at the hundred-billion level, migrated from Hadoop to Alibaba Cloud MaxCompute in 2017 and gained lower compute and storage costs, faster data delivery, and greater operational flexibility from a big-data platform suited to its complex business needs.

Alibaba Cloud Developer

In May 2017, Youku, whose daily logs reach the hundred-billion level, completed its migration from Hadoop to Alibaba Cloud MaxCompute. Compute and storage consumption decreased as a result, bringing significant cost savings.

Business Characteristics of Youku

High user complexity: the data platform is used by data engineers, BI analysts, testers, and product and operations teams.

Complex business scenarios: video streaming, live broadcast, membership, advertising, and large-screen services, each with its own log types.

Massive data volume: daily logs reach the hundred-billion level and require intensive computation.

Cost-sensitive and elastic demand: budgets are strict, and frequent high-traffic campaigns (e.g., Double-11, World Cup) demand flexible resource allocation.

Why MaxCompute Fits These Needs

Simple to use – a complete data pipeline that eliminates nightly cluster maintenance and enables rapid job execution.

Rich ecosystem – integrates with MySQL, HBase, Elasticsearch, Redis, DataWorks, QuickBI, and other Alibaba services.

Strong performance – supports exabyte-scale storage, analysis over billions of records, and tens of thousands of concurrent tasks.

Elastic resource usage – pay‑as‑you‑go model, time‑slice scheduling, and instant scaling for burst workloads.

After the migration, Youku no longer had to maintain Hadoop clusters by hand; analysts could run queries on demand, and critical reports were ready by 7 a.m. instead of late at night.

Typical Use Cases

Data warehouse layering (ODS → CDM → ADS) enables unified data services, while machine‑learning platforms (PAI) train models on MaxCompute and serve results via OSS. Anti‑fraud solutions extract features from raw logs, apply ML/DL models, and iteratively improve detection.
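The ODS → CDM → ADS layering can be illustrated with a minimal in-memory sketch. It is not Youku's pipeline: the field names (`user_id`, `video_id`, `seconds`, `day`), the sample rows, and the helper functions are all hypothetical. It only shows the general idea that raw logs land in ODS, are cleaned and de-duplicated into CDM, and are then aggregated into report-ready ADS tables.

```python
from collections import defaultdict

# ODS layer: raw play logs as they arrive (hypothetical fields, illustration only).
ods_play_log = [
    {"user_id": "u1", "video_id": "v1", "seconds": "120", "day": "2017-05-01"},
    {"user_id": "u1", "video_id": "v1", "seconds": "120", "day": "2017-05-01"},  # duplicate
    {"user_id": "u2", "video_id": "v1", "seconds": "bad", "day": "2017-05-01"},  # malformed
    {"user_id": "u3", "video_id": "v2", "seconds": "300", "day": "2017-05-01"},
]


def build_cdm(ods_rows):
    """CDM layer: cleaned, typed, de-duplicated detail records."""
    seen = set()
    cdm_rows = []
    for row in ods_rows:
        if not row["seconds"].isdigit():
            continue  # drop malformed records
        key = (row["user_id"], row["video_id"], row["seconds"], row["day"])
        if key in seen:
            continue  # drop exact duplicates
        seen.add(key)
        cdm_rows.append({**row, "seconds": int(row["seconds"])})
    return cdm_rows


def build_ads(cdm_rows):
    """ADS layer: per-day, per-video aggregates ready for reports."""
    totals = defaultdict(lambda: {"viewers": set(), "seconds": 0})
    for row in cdm_rows:
        entry = totals[(row["day"], row["video_id"])]
        entry["viewers"].add(row["user_id"])
        entry["seconds"] += row["seconds"]
    return {
        key: {"viewers": len(value["viewers"]), "seconds": value["seconds"]}
        for key, value in totals.items()
    }


cdm_play = build_cdm(ods_play_log)
ads_video_daily = build_ads(cdm_play)
```

Keeping the cleaning rules in one CDM step means every downstream ADS table sees the same definition of a valid record, which is the main point of the layering.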

Advanced Optimizations

HBO (history-based optimization): automatic parameter tuning for faster jobs.

Hash Cluster: stores large tables pre-hashed and pre-sorted on the join key, which reduces the computation needed for large-table joins.

AliORC: improves I/O efficiency by ~20%.

Session & Lightning: SSD/caching and MPP architecture for low-latency queries.
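To see why hash clustering helps large-table joins, the following toy sketch buckets two tables by the hash of the join key and keeps each bucket sorted, so the join only needs a single merge pass per bucket pair. This illustrates the general bucketed sort-merge join idea, not MaxCompute's internal implementation; the function names, the sample tables, and the bucket count are assumptions made for the example.

```python
from collections import defaultdict


def hash_cluster(rows, key, num_buckets):
    """Write-time step: bucket rows by key hash and sort each bucket by key."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[hash(row[key]) % num_buckets].append(row)
    return {b: sorted(rs, key=lambda r: r[key]) for b, rs in buckets.items()}


def merge_join_bucket(left, right, key):
    """Join two lists that are already sorted on `key` with one forward pass."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Pair the current left row with every right row sharing its key.
            k = j
            while k < len(right) and right[k][key] == lk:
                out.append({**left[i], **right[k]})
                k += 1
            i += 1
    return out


def clustered_join(left_buckets, right_buckets, key):
    """Only buckets with the same number can contain matching keys."""
    out = []
    for b in left_buckets.keys() & right_buckets.keys():
        out.extend(merge_join_bucket(left_buckets[b], right_buckets[b], key))
    return out


# Hypothetical sample tables.
users = [{"user_id": i, "level": "vip" if i % 2 else "free"} for i in range(6)]
plays = [{"user_id": i % 4, "video_id": f"v{i}"} for i in range(8)]

joined = clustered_join(
    hash_cluster(users, "user_id", 4),
    hash_cluster(plays, "user_id", 4),
    "user_id",
)
```

Because both tables use the same key and bucket count, the bucketing and sorting cost is paid once at write time rather than on every join.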

Storage optimization includes data lifecycle management, compression, and field splitting to control exponential growth of immutable raw data.
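Data lifecycle management can be sketched as a rule that marks old partitions for removal. The example below is a simplification under stated assumptions: it treats each partition's date as its age and uses a hypothetical helper, `expired_partitions`, with an arbitrary 7-day lifecycle. A real platform may measure age differently, for example from the last modification time.

```python
from datetime import date, timedelta


def expired_partitions(partition_days, lifecycle_days, today):
    """Return the partitions older than the lifecycle window, oldest first."""
    cutoff = today - timedelta(days=lifecycle_days)
    return sorted(d for d in partition_days if d < cutoff)


# Hypothetical daily partitions from 2017-05-01 through 2017-05-10.
partitions = [date(2017, 5, 1) + timedelta(days=n) for n in range(10)]

to_drop = expired_partitions(partitions, lifecycle_days=7, today=date(2017, 5, 10))
```

Setting a shorter lifecycle on bulky intermediate tables and a longer one on raw logs is one way to keep storage growth bounded without touching the data that cannot be regenerated.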

Overall, MaxCompute provided Youku with a cost‑effective, high‑performance, and elastic big‑data platform that supports diverse business scenarios and continuous optimization.
