Tag

MaxCompute


DataFunSummit
Sep 25, 2024 · Big Data

Evolution of Big Data AI Development Paradigm and Alibaba Cloud’s Integrated Architecture

This article examines how large‑scale big‑data platforms can simplify AI application development, outlines the shift from model‑centric to data‑centric paradigms, and shares Alibaba Cloud’s practical experiences in building an integrated big‑data‑AI architecture, including MaxCompute, Hologres, MaxFrame, and vector search capabilities.

AI integration · Big Data · Data+AI
0 likes · 19 min read
DaTaobao Tech
Sep 13, 2024 · Big Data

Extending PyODPS with PAI‑Designer for Dynamic Offline Data Processing

By integrating PAI‑Designer with PyODPS, users can build visual offline workflows that work around ODPS’s lack of network access, its lack of dynamic configuration, and its image‑processing limits. The approach relies on reusable Python components, OSS role‑ARNs, remote configuration fetching, and custom Docker images to read and write MaxCompute and OSS data.

Data Processing · Docker · MaxCompute
0 likes · 19 min read
DaTaobao Tech
Sep 11, 2024 · Big Data

Practical Guide to Using PyODPS for Flexible Data Processing

The article walks through a first‑time user’s experience with PyODPS, showing how its Python‑based DataFrame API offers more flexible JSON field statistics, multi‑condition filtering, and custom aggregations than traditional ODPS SQL, while noting a steep learning curve and syntax quirks.

Data Processing · MaxCompute · PyODPS
0 likes · 11 min read
DaTaobao Tech
Jul 10, 2024 · Big Data

ODPS Development Guide: Parameters, Built‑in Functions, UDF Creation, and Performance Optimization

This comprehensive ODPS (MaxCompute) development guide serves as a mini‑encyclopedia, detailing common parameter tuning, built‑in SQL functions, step‑by‑step Java UDF creation, job lifecycle insights, and practical performance‑optimization techniques such as parallelism adjustment, map‑join hints, and small‑file mitigation.

Big Data · MaxCompute · ODPS
0 likes · 26 min read
DataFunSummit
Jul 9, 2024 · Big Data

Materialized Views in MaxCompute: Design, Implementation, and Best Practices

This article explains the concept, advantages, and drawbacks of materialized views, describes how MaxCompute implements them—including creation syntax, maintenance properties, automatic query rewrite, smart recommendation, and auto‑materialization—and shares performance results and future improvement plans.

Automatic Refresh · Big Data · MaxCompute
0 likes · 13 min read
DataFunTalk
Jun 26, 2024 · Big Data

Evolution of the Big Data + AI Development Paradigm and Alibaba Cloud’s Integrated Architecture

This article examines how the big‑data AI development paradigm has shifted from model‑centric to data‑centric workflows, outlines the challenges of integrating data and AI teams, and details Alibaba Cloud’s end‑to‑end, serverless big‑data platform—including MaxCompute, Hologres, MaxFrame, Object Table, and vector search—designed to accelerate large‑scale AI applications.

AI integration · Big Data · MaxCompute
0 likes · 20 min read
DataFunTalk
Apr 16, 2024 · Big Data

Materialized Views in MaxCompute: Design, Implementation, and Best Practices

This article explains how MaxCompute leverages materialized views as a query accelerator, covering their history, advantages and drawbacks, creation and maintenance details, automatic query rewriting, intelligent recommendation, auto‑materialization, and future enhancements for large‑scale data warehousing.

Automatic Refresh · Big Data · Intelligent Recommendation
0 likes · 13 min read
DataFunTalk
Jan 1, 2024 · Big Data

MaxCompute Semi-Structured Data: Concepts, Solutions, and Benefits

This article explains the nature of semi‑structured data, compares traditional schema‑on‑read and schema‑on‑write approaches, and details MaxCompute's columnar storage solution that balances flexibility, performance, and cost for large‑scale data warehouses.

Big Data · Data Warehouse · MaxCompute
0 likes · 19 min read
DataFunTalk
Nov 12, 2023 · Big Data

MaxCompute Incremental Update Architecture, Intelligent Materialized Views, and Adaptive Execution Optimizations

This article presents a comprehensive overview of MaxCompute's near‑real‑time incremental update and processing architecture, the design of Transactional Table 2.0, intelligent materialized view evolution and recommendation, as well as multi‑level adaptive execution optimizations for the SQL engine, illustrating how these innovations improve efficiency, cost, and scalability for large‑scale data workloads.

Adaptive Execution · Big Data · Incremental Update
0 likes · 20 min read
DaTaobao Tech
Oct 11, 2023 · Big Data

Fundamental Data Skills and Complex Query Techniques in MaxCompute

The article teaches developers essential MaxCompute data‑processing skills—from creating and naming tables, handling strings and dates, and writing basic SELECTs, joins, and aggregations, to employing advanced techniques such as temporary tables, CTEs, partitioning, and map‑join hints for efficient complex queries.

Big Data · Data Engineering · ETL
0 likes · 15 min read
DataFunSummit
Sep 7, 2023 · Big Data

MaxCompute Semi-Structured Data Solutions: Architecture, Comparison, and Performance Benefits

This article explains the concepts of semi‑structured data, compares traditional schema‑on‑read and schema‑on‑write approaches, and details MaxCompute's columnar storage solution—including AliORC, adaptive query processing, and handling of dirty or sparse data—to achieve high performance and low cost in big‑data warehousing.

MaxCompute · Semi-Structured Data · columnar storage
0 likes · 20 min read
DataFunTalk
Aug 29, 2023 · Big Data

MaxCompute Incremental Update, Processing Architecture, and Intelligent Data Warehouse Optimizations

This article presents a comprehensive overview of MaxCompute's incremental update and processing architecture, the design of intelligent materialized views, and the engine's adaptive execution optimizations, detailing the integrated near‑real‑time and batch pipelines, transactional table 2.0, and practical Q&A.

Adaptive Execution · Big Data · Data Warehouse
0 likes · 21 min read
DataFunTalk
Jun 24, 2023 · Big Data

Design and Architecture of MaxCompute Lakehouse Near‑Real‑Time Incremental Processing

This article explains the evolution of Alibaba Cloud's MaxCompute platform into a lakehouse architecture that supports near‑real‑time incremental processing, detailing its development history, core design of transactional tables, five‑module technical stack, data ingestion methods, optimization services, transaction management, query capabilities, ecosystem integration, practical applications, future roadmap, and common user questions.

Big Data · Incremental Processing · Lakehouse
0 likes · 24 min read
DataFunTalk
Dec 13, 2022 · Artificial Intelligence

End-to-End Machine Learning Application Using OpenMLDB and Alibaba Cloud MaxCompute

This article demonstrates how to build a complete end-to-end machine-learning workflow for taxi trip duration prediction by integrating OpenMLDB with Alibaba Cloud MaxCompute’s serverless services, covering environment setup, offline data ingestion, feature extraction, model training, deployment, and real-time online inference within 20 ms.

Feature Store · MaxCompute · Online Inference
0 likes · 13 min read
DataFunTalk
Aug 19, 2022 · Big Data

Multi‑Tenant Architecture in Public‑Cloud Big Data Platforms: Design, Challenges, and MaxCompute Implementation

This article explains the different multi‑tenant models used by public‑cloud big data platforms, analyzes their advantages and challenges, and details how Alibaba Cloud's MaxCompute realizes strong multi‑tenancy through storage design, resource scheduling, security containers, virtual networking, and future evolution directions.

Big Data · Data Security · MaxCompute
0 likes · 15 min read
DaTaobao Tech
Apr 27, 2022 · Big Data

Comparative Study of JSON Processing Methods in MaxCompute

The study compares MaxCompute JSON extraction functions (FROM_JSON, get_json_object, and custom JMESPath/JSONPath UDFs). It finds that get_json_object is fastest for simple field extraction, while complex queries benefit from FROM_JSON or JMESPath, and it outlines the corresponding JSON generation methods and best‑practice recommendations.

DataEngineering · JMESPath · JSON
0 likes · 11 min read
Fulu Network R&D Team
Jul 19, 2021 · Databases

Ubiquitous Data: From Analysis to Data Warehousing and Dimension Modeling with Tableau and MaxCompute

This article explains how data-driven decision making evolves from simple Excel and Python analysis to robust data warehouses, introduces star‑schema dimension modeling, and demonstrates practical integration of Alibaba Cloud MaxCompute with Tableau for interactive OLAP‑style analytics.

Big Data · Data Warehousing · Dimension Modeling
0 likes · 7 min read
Sohu Tech Products
Apr 7, 2021 · Big Data

Data Warehouse Architecture and Modeling with Alibaba MaxCompute and DataWorks

This tutorial explains how to select a technical architecture, design a three‑layer data warehouse (ODS, CDM, ADS), model tables and dimensions, choose storage strategies, handle slowly changing dimensions, synchronize data with DataWorks, and implement dimensional modeling and fact tables using Alibaba MaxCompute for big‑data analytics.

Big Data · Data Warehouse · DataWorks
0 likes · 32 min read
Youku Technology
Aug 15, 2019 · Big Data

Youku's Migration from Hadoop to Alibaba Cloud MaxCompute: Benefits and Technical Insights

Youku’s 2017 migration from an on‑premises Hadoop cluster to Alibaba Cloud MaxCompute delivered a unified, elastic data pipeline that cut compute and storage costs by roughly half, handled billions of daily log records, boosted performance and scalability, and empowered analysts with self‑service tools and a rich ecosystem.

Cost Optimization · MaxCompute · Youku
0 likes · 12 min read
Alibaba Cloud Infrastructure
Nov 20, 2018 · Big Data

A Decade of Alibaba's Big Data Platform Evolution Through Double 11

The article chronicles Alibaba's ten‑year journey of building and scaling its big data platform—from early Oracle clusters and Hadoop‑based Cloud‑Ladder 1 to the self‑developed ODPS/MaxCompute, real‑time Blink engine, and the unified DataWorks ecosystem—highlighting key technical milestones, performance breakthroughs, and operational challenges that powered successive Double 11 shopping festivals.

Alibaba · Big Data · Data Engineering
0 likes · 22 min read