Alibaba Cloud Big Data AI Platform
Author

The Alibaba Cloud Big Data AI Platform builds on Alibaba’s leading cloud infrastructure, big‑data and AI engineering capabilities, scenario algorithms, and extensive industry experience to offer enterprises and developers a one‑stop, cloud‑native big‑data and AI capability suite. It boosts AI development efficiency, enables large‑scale AI deployment across industries, and drives business value.

455 articles · 0 likes · 387 views · 0 comments
Recent Articles

Dec 24, 2025 · Big Data

How Paimon’s Column‑Separation Architecture Powers Real‑Time Multi‑Modal Lakehouse for AI

This article explains the challenges that frequent column changes pose for AI feature engineering, introduces Paimon’s column‑separation storage with a global, continuous Row ID, details its Blob data type for efficient multi‑modal data handling, and outlines production results and the roadmap for building an AI‑native data lakehouse.

Apache Paimon · Blob · Columnar Storage
0 likes · 11 min read
Dec 23, 2025 · Artificial Intelligence

How Skrull Boosts Long-Context Fine‑Tuning Speed Up to 7.5×

The Skrull system, accepted at NeurIPS 2025, dynamically schedules long and short sequences within each training iteration and overlaps communication with computation, achieving a speedup of up to 7.54× in long‑context fine‑tuning of large language models, while load‑balancing and rollback mechanisms keep training stable.

Dynamic Data Scheduling · Long Context Fine-Tuning · Model Training Optimization
0 likes · 8 min read
Dec 18, 2025 · Databases

Why Hologres Dynamic Table Beats Traditional Full Refresh for Real‑Time Data Warehousing

The article explains how Hologres Dynamic Table uses a stateful incremental refresh model to efficiently handle massive historical data with tiny daily updates, dramatically reducing latency and resource consumption compared with conventional full‑refresh pipelines across several real‑world join and aggregation scenarios.

Dynamic Table · Hologres · Incremental Refresh
0 likes · 18 min read
Dec 16, 2025 · Artificial Intelligence

How CosyVoice 2.0 Cuts First‑Chunk Latency for High‑Fidelity Voice Cloning

CosyVoice 2.0, Alibaba DAMO Academy's next‑gen high‑fidelity speech synthesis model, introduces architecture decoupling, streaming generation, reference‑audio caching and dynamic load balancing to dramatically reduce first‑packet latency and improve real‑time factor while supporting multi‑language voice cloning.

AI model optimization · Streaming Inference · low-latency
0 likes · 9 min read
Dec 15, 2025 · Backend Development

Why a Hot‑Word Update Crashed Elasticsearch and How Serverless Index‑Level Dictionaries Fix It

This article dissects a real‑world incident in which adding a hot term to the IK analyzer caused a P0 outage in an e‑commerce search system. It traces the failure to a clash between dynamic dictionary updates and immutable inverted indexes, and shows how Alibaba Cloud Elasticsearch Serverless’s index‑level dictionary isolation eliminates the problem while keeping services uninterrupted.

Hot Update · IK Analyzer · Index-level Dictionary
0 likes · 14 min read
Dec 5, 2025 · Big Data

How EMR Serverless Spark Cut Batch Processing Time by Over 50% for a 600M‑User Platform

This case study details how Qimao leveraged Alibaba Cloud EMR Serverless Spark with Fusion and Celeborn to overcome multi‑business‑line data‑processing challenges, achieving more than 50% faster batch jobs, significant cost reductions, and improved operational flexibility across its 600 million‑user ecosystem.

Data Warehouse · Performance optimization · Serverless Spark
0 likes · 9 min read
Nov 24, 2025 · Artificial Intelligence

Fine‑Tuning GR00T‑N1.5: From Human Demonstrations to Distributed Imitation Learning

This tutorial walks through fine‑tuning the complex vision‑language‑action (VLA) model GR00T‑N1.5: collecting human demonstrations, annotating and massively augmenting the data with DLC, performing distributed imitation learning, and validating the model through a server‑client DSW setup. It includes code snippets, resource specs, and visual examples.

DSW · Distributed Imitation Learning · GR00T
0 likes · 18 min read