How to Build Cost‑Effective Big Data Platforms in the Data+AI Era
This article summarizes Liu Yiming's 2023 Cloud Conference talk on constructing modern big‑data platforms, covering cost‑saving strategies, serverless operations, open lake‑warehouse architecture, AI‑driven optimization, and the evolving role of data engineers in the Data+AI era.
Cost‑Reduction Capability: Flexible Billing Drives Significant Savings
Reducing costs is a core capability for any big‑data platform, especially on public clouds. Alibaba Cloud offers multiple payment models for MaxCompute, such as prepaid (annual/monthly) and pay‑as‑you‑go. By combining these models with time‑based elasticity, for example running low‑priority jobs during off‑peak hours, customers can lower unit costs. As of September 20, the CU price for elastic resources has been cut by 50%, and SpotJob (jobs that run on idle capacity) is priced at one‑third of the regular pay‑as‑you‑go rate, a reduction of roughly 66% for latency‑insensitive workloads.
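The arithmetic behind the SpotJob figure can be sketched as follows. This is a minimal illustration, not Alibaba Cloud's billing logic: the unit price of 1.0 and the function names `blended_cost` and `savings_fraction` are assumptions made for the example. Only the one‑third price ratio comes from the talk.

```python
# Hypothetical illustration of the billing arithmetic; the unit price is made up.
PAYG_PRICE_PER_CU_HOUR = 1.0   # assumed baseline price, arbitrary currency units
SPOT_PRICE_RATIO = 1 / 3       # from the talk: SpotJob costs one-third of pay-as-you-go


def blended_cost(total_cu_hours: float, spot_share: float) -> float:
    """Cost when `spot_share` of the workload runs as SpotJob, the rest as pay-as-you-go."""
    if not 0.0 <= spot_share <= 1.0:
        raise ValueError("spot_share must be between 0 and 1")
    spot_hours = total_cu_hours * spot_share
    regular_hours = total_cu_hours - spot_hours
    return (regular_hours + spot_hours * SPOT_PRICE_RATIO) * PAYG_PRICE_PER_CU_HOUR


def savings_fraction(spot_share: float) -> float:
    """Fraction saved relative to running everything at the pay-as-you-go rate."""
    baseline = blended_cost(1000.0, 0.0)
    return 1.0 - blended_cost(1000.0, spot_share) / baseline


if __name__ == "__main__":
    for share in (0.0, 0.5, 1.0):
        print(f"spot share {share:.0%}: savings {savings_fraction(share):.1%}")
```

Moving an entire latency‑insensitive workload to SpotJob saves two‑thirds of its cost under these assumptions, which matches the roughly 66% quoted in the talk. Moving half of it saves one‑third.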
Light‑Operation Capability: Serverless Transforms Big‑Data Ops
Serverless architecture simplifies operations by removing manual tasks such as capacity planning, upgrades, backups, and disaster recovery. MaxCompute adopts a Shared‑Everything model, while Hologres uses Shared‑Data, each providing isolation and scalability suited to its workloads. This enables zero‑downtime upgrades and reduces operational overhead, allowing engineers to focus on value creation rather than routine maintenance.
Open Capability: Lake‑Warehouse Integration and Openness
Alibaba Cloud embraces Open Storage + Open Format, offering a lake‑warehouse solution that provides native metadata management and data access. MaxCompute treats storage as an independent product with a Storage API, enabling third‑party engines like Spark or Presto to access data directly. This openness encourages user‑driven innovation.
Intelligent Optimization: AI‑Powered Smart Data Warehouse
AI assists in automating optimization. MaxCompute recommends materialized views based on usage patterns, achieving significant cost and performance gains. This shift moves optimization from manual DBA expertise to intelligent, cloud‑native services.
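To make the idea of usage‑based recommendation concrete, here is a deliberately simple heuristic sketch. It is not MaxCompute's algorithm, which the talk does not describe. The `QueryRecord` type, the `fingerprint` field, the `recommend_views` function, and the sample log are all hypothetical. The sketch ranks repeated subqueries by the compute that would be saved if each were materialized once.

```python
from collections import defaultdict
from typing import NamedTuple


class QueryRecord(NamedTuple):
    fingerprint: str        # hypothetical normalized form of a repeated subquery
    cost_cu_seconds: float  # hypothetical compute cost of one execution


def recommend_views(log, min_runs=3, top_n=2):
    """Rank repeated subqueries by the compute saved if each were precomputed once."""
    runs = defaultdict(int)
    total_cost = defaultdict(float)
    for rec in log:
        runs[rec.fingerprint] += 1
        total_cost[rec.fingerprint] += rec.cost_cu_seconds

    candidates = []
    for fingerprint, count in runs.items():
        if count < min_runs:
            continue  # too rare to justify maintaining a view
        average = total_cost[fingerprint] / count
        # One execution is still needed to build the view; the rest are saved.
        saved = total_cost[fingerprint] - average
        candidates.append((fingerprint, saved))

    candidates.sort(key=lambda c: c[1], reverse=True)
    return candidates[:top_n]


# Made-up query log for illustration only.
SAMPLE_LOG = (
    [QueryRecord("daily_sales_by_region", 100.0)] * 4
    + [QueryRecord("user_sessions_join", 50.0)] * 3
    + [QueryRecord("adhoc_export", 500.0)]
)

if __name__ == "__main__":
    for fingerprint, saved in recommend_views(SAMPLE_LOG):
        print(f"{fingerprint}: about {saved:.0f} CU-seconds saved")
```

A production recommender would also weigh storage cost, refresh cost, and data freshness, none of which this sketch models.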
Big Data as the Foundation for AI
Big data serves as essential infrastructure for AI, providing scalable processing, distributed computing, and a unified development environment. MaxCompute now supports Python as a first‑class language with a notebook experience, and introduces MaxFrame, a Pandas‑compatible distributed framework, bridging big‑data and AI workflows.
Hologres also integrates the Proxima vector engine, offering high‑performance, real‑time vector search via SQL, further supporting AI applications.
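For readers new to vector search, the sketch below shows the underlying operation in plain Python: score every stored vector against a query by cosine similarity and return the closest matches. It is a brute‑force illustration only. It does not use Proxima or Hologres SQL, and the names `cosine_similarity`, `top_k`, and the sample `ITEMS` are assumptions made for the example. A dedicated vector engine avoids this full scan by using index structures.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, items, k=2):
    """Return the k (id, score) pairs most similar to `query`, best first."""
    scored = [(item_id, cosine_similarity(query, vec)) for item_id, vec in items.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up two-dimensional embeddings for illustration only.
ITEMS = {
    "doc_a": [1.0, 0.0],
    "doc_b": [0.0, 1.0],
    "doc_c": [1.0, 1.0],
}

if __name__ == "__main__":
    for item_id, score in top_k([1.0, 0.1], ITEMS):
        print(f"{item_id}: {score:.3f}")
```

With the query `[1.0, 0.1]`, `doc_a` ranks first and `doc_c` second, because the query points almost exactly along the first axis.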
