How Alibaba Cloud’s Integrated Big Data & AI Platform Is Evolving
The talk outlines the evolution of Alibaba Cloud’s integrated big data and AI platform, highlighting the three‑V fundamentals, the AI‑inspired usability‑scale‑efficiency triangle, open‑source trends, and how the platform unifies offline and real‑time analytics while simplifying governance and development.
At the 2022 Cloud Expo Integrated Big Data & AI Summit, speaker Jia Yangqing discussed how advances in artificial‑intelligence algorithms have driven explosive demand for big‑data solutions, and what new possibilities emerge where data and intelligence meet.
Big data, though not new, has been shaped since the 1990s by the three V’s—Volume, Velocity, and Variety—guiding the construction of robust systems across storage, compute, scheduling, and services.
Just as AI faces an “impossible triangle” of usability, scale, and efficiency, big‑data platforms face the same three‑way trade‑off:
Usability: Providing flexible, user‑friendly tools so analysts can retrieve insights with minimal SQL or no code at all.
Scale: Supporting massive workloads (Alibaba Cloud's daily computation exceeds 10 EB) while reducing platform complexity and cost.
Efficiency: Ensuring that computed results are actually consumed, improving governance, quality control, and overall organizational efficiency.
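To give a rough sense of the scale figure quoted above, the following sketch converts 10 EB per day into a sustained per-second rate. It assumes decimal units (1 EB = 10^18 bytes) and a load spread evenly across the day; neither assumption comes from the talk, and real workloads are likely burstier.

```python
# Back-of-the-envelope conversion of a daily data volume into a sustained rate.
# Assumptions (not from the talk): decimal exabytes and perfectly uniform load.
EXABYTE = 10**18
TERABYTE = 10**12
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_bytes_per_second(exabytes_per_day: float) -> float:
    """Average throughput needed to process the given volume in one day."""
    return exabytes_per_day * EXABYTE / SECONDS_PER_DAY


rate = sustained_bytes_per_second(10)
print(f"{rate / TERABYTE:.1f} TB/s")  # prints "115.7 TB/s"
```

Even as an average, this is on the order of a hundred terabytes per second, which is why the talk treats platform complexity and cost as part of the scale problem rather than as separate concerns.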
Alibaba’s approach began with open‑source Hadoop clusters alongside the proprietary ODPS system, then moved to a cloud‑native architecture that centralizes enterprise data, breaks down data silos, and builds out a complete data ecosystem.
By managing data tasks end to end, the platform aims to support growth at low cost and high performance, tackling challenges such as faster SQL execution and a better balance between storage and compute.
Diverse compute needs—offline batch, real‑time streaming, and OLAP—are addressed through an integrated design that reconciles resource utilization with analytical efficiency.
Alibaba Cloud also lowers the development threshold by offering a comprehensive suite for development, operations, modeling, and governance, giving developers a panoramic view from code to system maintenance and enterprises a unified data‑management perspective.
Open source remains a major trend: Alibaba offers cloud‑native services that match the experience of open‑source Hadoop, Hive, Spark, and data‑lake stacks, while adding enterprise‑grade stability, elasticity, and serverless capabilities. Notably, the recently contributed Celeborn project dramatically improves data‑shuffle performance for data lakes.
The integrated platform combines MaxCompute (offline, scale‑focused) and Hologres (real‑time analytics) into a unified ODPS system, enabling “auto‑driving” data pipelines that abstract away engine and storage details, and seamlessly bridge open‑source data lakes with proprietary warehouses.
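The idea of abstracting away engine details can be illustrated with a minimal sketch. Everything below is hypothetical: the `Query` type, the thresholds, and the `choose_engine` function are invented for illustration and are not part of MaxCompute, Hologres, or any Alibaba Cloud API. The sketch only shows the general pattern of routing a request to an offline or a real-time engine based on its requirements, so the caller never names an engine.

```python
from dataclasses import dataclass
from enum import Enum


class Engine(Enum):
    # Roles as described in the article; the routing itself is hypothetical.
    OFFLINE_BATCH = "offline-batch"        # scale-focused role (MaxCompute)
    REALTIME_ANALYTICS = "realtime"        # real-time analytics role (Hologres)


@dataclass(frozen=True)
class Query:
    sql: str
    max_latency_seconds: float  # how quickly the caller needs an answer
    scanned_bytes: int          # rough estimate of the data touched


# Hypothetical thresholds, chosen only to make the example concrete.
INTERACTIVE_LATENCY_SECONDS = 5.0
REALTIME_SCAN_LIMIT_BYTES = 10**12  # 1 TB


def choose_engine(query: Query) -> Engine:
    """Pick an engine from the query's needs instead of asking the caller."""
    interactive = query.max_latency_seconds <= INTERACTIVE_LATENCY_SECONDS
    small_enough = query.scanned_bytes <= REALTIME_SCAN_LIMIT_BYTES
    if interactive and small_enough:
        return Engine.REALTIME_ANALYTICS
    return Engine.OFFLINE_BATCH


dashboard = Query("SELECT count(*) FROM orders_today", 1.0, 10**9)
nightly = Query("SELECT * FROM orders_history", 3600.0, 10**15)
print(choose_engine(dashboard).value)  # realtime
print(choose_engine(nightly).value)    # offline-batch
```

A production system would presumably base this decision on much richer signals (statistics, data freshness, resource availability); the point here is only that the choice lives behind one interface.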
DataWorks has been upgraded to support multiple engines, accelerate data modeling and governance, and expose OpenAPI for easier secondary development.
While traditional analytics remain dominant, AI applications—deep learning for vision, speech, NLP, intelligent search, and recommendation—are increasingly intertwined with big data, requiring platforms that handle unstructured data at scale.
Alibaba’s AI platform PAI addresses this integration, offering solutions such as the open‑source ModelScope, high‑performance autonomous‑driving compute, and intelligent recommendation systems.
The presentation concluded with a diagram summarizing the full product ecosystem, emphasizing the continuous evolution of big‑data technology and its limitless possibilities when combined with AI.
Alibaba Cloud Big Data AI Platform
The Alibaba Cloud Big Data AI Platform builds on Alibaba’s leading cloud infrastructure, big‑data and AI engineering capabilities, scenario algorithms, and extensive industry experience to offer enterprises and developers a one‑stop, cloud‑native big‑data and AI capability suite. It boosts AI development efficiency, enables large‑scale AI deployment across industries, and drives business value.