
Building Large‑Scale Recommendation Systems with Big Data and Large Language Models on Alibaba Cloud AI Platform

This presentation details how Alibaba Cloud's AI platform integrates big‑data pipelines, feature‑store services, and large language model capabilities to construct high‑performance search‑recommendation architectures, covering system design, training and inference optimizations, LLM‑driven use cases, and open‑source RAG tooling.

DataFunSummit

Shi Xing, product architecture lead of Alibaba Cloud Machine Learning Platform PAI, introduces the team’s work on search‑recommendation, multimodal image/video processing, large‑model platform optimization, and RAG engineering.

The talk outlines the mature search, recommendation, and advertising architecture used by major e‑commerce apps, describing how user requests flow through business engines, A/B testing systems, and the recall, coarse‑ranking, and fine‑ranking stages, and how models such as DeepFM draw on feature platforms and real‑time data streams.
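The recall → coarse‑ranking → fine‑ranking funnel can be sketched as a chain of progressively more expensive scorers over a shrinking candidate set. The code below is a toy illustration, not the production system: each scoring function is a stand‑in for a trained model such as DeepFM, and the field names (`popularity`, `affinity`, `quality`) are invented for the example.

```python
# Toy multi-stage ranking funnel: cheap recall, lightweight coarse ranking,
# expensive fine ranking. Real systems replace each scorer with a model
# served behind a feature platform.

def recall(user, catalog, k=100):
    """Cheap candidate generation: top-k items by a trivial popularity score."""
    return sorted(catalog, key=lambda item: item["popularity"], reverse=True)[:k]

def coarse_rank(user, candidates, k=20):
    """Lightweight pass that prunes candidates before the expensive stage."""
    def score(item):
        return item["popularity"] * user["affinity"].get(item["category"], 0.1)
    return sorted(candidates, key=score, reverse=True)[:k]

def fine_rank(user, candidates, k=5):
    """Expensive pass (stand-in for a DeepFM-style scorer) over few items."""
    def score(item):
        affinity = user["affinity"].get(item["category"], 0.1)
        return item["popularity"] * affinity + item.get("quality", 0.0)
    return sorted(candidates, key=score, reverse=True)[:k]

def recommend(user, catalog):
    """Run the full funnel: ~all items -> 100 -> 20 -> 5."""
    return fine_rank(user, coarse_rank(user, recall(user, catalog)))
```

The point of the funnel shape is cost control: the expensive model only ever sees the handful of candidates that survived the cheaper stages.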

At the resource layer, Alibaba Cloud provides heterogeneous compute (CPU and GPU), high‑bandwidth RDMA networking, and high‑performance storage via clusters such as ODPS Feitian, container services, and the Lingjun intelligent compute cluster, forming the foundation for a unified "big data + AI" PaaS platform.

The big‑data platform comprises MaxCompute (Hadoop‑like), Hologres (real‑time OLAP), Flink for streaming, and EMR as the open‑source counterpart, enabling both real‑time and offline data processing.

The AI platform offers data labeling (PAI‑iTAG), data cleaning, a FeatureStore, interactive development (PAI‑DSW), visual development (PAI‑Designer), distributed training (PAI‑DLC), dataset acceleration, network and operator optimizations, and model serving via PAI‑EAS.

FeatureStore serves as a structured big‑data management layer for ML features, providing offline‑to‑online synchronization, feature lineage, multi‑level caching, and low‑latency access, crucial for serving embeddings that can reach terabyte scale.
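A minimal sketch of the read path such a feature store optimizes: a small local cache (one tier of the multi‑level cache) sits in front of the remote online store so that hot entities avoid a network round trip. The class and method names here are hypothetical illustrations, not the FeatureStore SDK's actual API.

```python
# Hypothetical feature-store read path: local LRU cache in front of a
# remote online store (stand-in for e.g. a Hologres/Redis-backed tier).
from collections import OrderedDict

class FeatureCache:
    """Tiny LRU cache modelling the local tier of a multi-level cache."""
    def __init__(self, capacity=1024):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key in self.data:
            self.data.move_to_end(key)  # mark as recently used
            return self.data[key]
        return None

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used

class FeatureStoreClient:
    def __init__(self, online_store, cache):
        self.online_store = online_store  # dict stand-in for the remote store
        self.cache = cache
        self.remote_reads = 0  # instrumentation: count cache misses

    def get_features(self, entity_id):
        hit = self.cache.get(entity_id)
        if hit is not None:
            return hit
        self.remote_reads += 1
        features = self.online_store[entity_id]
        self.cache.put(entity_id, features)
        return features
```

In production the remote tier would itself be fed by offline‑to‑online synchronization jobs; the cache merely hides its latency for repeated lookups.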

EasyRec, an open‑source recommendation algorithm library, unifies training and inference across MaxCompute, Hadoop, Spark, and local environments, supporting auto‑ML, feature generation, model distillation, and early‑stop to reduce development effort.
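Early stopping, one of the training aids listed above, has a simple generic mechanism: halt when the validation metric has not improved for a fixed number of evaluations. The sketch below shows that mechanism only; it is not EasyRec's actual API.

```python
# Generic early-stopping monitor (illustrative; not EasyRec's interface).

class EarlyStopper:
    """Signal a stop when the validation metric fails to improve
    for `patience` consecutive evaluations."""
    def __init__(self, patience=3):
        self.patience = patience
        self.best = float("-inf")
        self.bad_rounds = 0

    def should_stop(self, metric):
        if metric > self.best:
            self.best = metric
            self.bad_rounds = 0
        else:
            self.bad_rounds += 1
        return self.bad_rounds >= self.patience
```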

Training optimizations include multi‑level caching, feature auto‑eviction, WorkQueue data pipelines, feature selection, knowledge distillation, communication reduction, and hardware acceleration (AVX/AMX, AllReduce, SOK). Inference optimizations comprise AVX/AMX acceleration, bf16/int8 quantization, AutoPlacement, SessionGroup multi‑stream GPU usage, and feature caching, achieving up to four‑fold QPS improvement over native TF‑Serving.
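Symmetric per‑tensor int8 quantization, one of the inference tricks mentioned, maps float weights onto integer levels through a single scale factor, trading a bounded rounding error for smaller memory footprint and faster integer arithmetic. A minimal NumPy sketch, illustrative only (the platform applies this inside the serving runtime, not in user code):

```python
# Symmetric per-tensor int8 post-training quantization sketch.
import numpy as np

def quantize_int8(w):
    """Map float weights to int8 with a single per-tensor scale."""
    scale = max(float(np.abs(w).max()) / 127.0, 1e-12)  # avoid divide-by-zero
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return q.astype(np.float32) * scale
```

The per‑element reconstruction error is bounded by half a quantization step (`scale / 2`), which is why the technique works well for weight tensors whose values cluster within a modest range.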

LLM‑driven scenarios such as category recommendation, query rewriting, and prompt engineering are demonstrated, showing how large language models can generate related product categories, rewrite ambiguous queries, and produce domain‑specific synonyms.
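LLM‑based query rewriting reduces, in essence, to a prompt template plus a model call. The template and the `call_llm` callable below are illustrative placeholders, not the platform's actual prompts or endpoints (in practice the model would sit behind a deployed serving endpoint).

```python
# Illustrative LLM query-rewriting wrapper; `call_llm` is a placeholder
# for a real model endpoint.

QUERY_REWRITE_PROMPT = """You are a search assistant for an e-commerce site.
Rewrite the ambiguous user query into a precise product search query.
Return only the rewritten query.

User query: {query}
Rewritten query:"""

def rewrite_query(query, call_llm):
    """Fill the template, call the model, and normalize the output."""
    prompt = QUERY_REWRITE_PROMPT.format(query=query)
    return call_llm(prompt).strip()
```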

The RAG workflow (PAI‑RAG) modularizes document extraction, indexing, pre‑retrieval query rewriting, retrieval, post‑retrieval processing, generation, and evaluation, supporting multimodal inputs, OCR, and vector databases (ElasticSearch, Hologres, Milvus), with tools for building evaluation sets via RefGPT.
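The modular design described above can be pictured as a pipeline of pluggable callables, one per stage. This sketch mirrors the rewrite → retrieve → post‑process → generate flow with toy components; it is not PAI‑RAG's actual interface.

```python
# Schematic modular RAG pipeline: each stage is an injectable callable,
# so rewriters, retrievers, and generators can be swapped independently.

class RAGPipeline:
    def __init__(self, rewriter, retriever, postprocessor, generator):
        self.rewriter = rewriter            # pre-retrieval query rewriting
        self.retriever = retriever          # e.g. vector-database lookup
        self.postprocessor = postprocessor  # e.g. rerank / truncate context
        self.generator = generator          # LLM call over query + context

    def answer(self, query):
        q = self.rewriter(query)
        docs = self.postprocessor(self.retriever(q))
        return self.generator(q, docs)
```

Because every stage shares the same call shape, an evaluation harness can replay a fixed query set while substituting one component at a time, which is how a modular pipeline supports stage‑level evaluation.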

Open‑source projects like EasyRec, EasyPhoto, EasyAnimate, and PAI‑RAG are highlighted, inviting the community to contribute and use these tools for large‑scale AI applications.

Tags: Big Data, Large Language Models, RAG, Recommendation Systems, Distributed Training, AI Platform, Feature Store
Written by

DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
