Tagged articles
9 articles
Alibaba Cloud Big Data AI Platform
May 24, 2024 · Artificial Intelligence

How DeepRec Extension Boosts Distributed Sparse Model Training with Elasticity and Fault Tolerance

DeepRec Extension enhances large‑scale sparse model training by adding automatic elastic training, resource‑aware scheduling, real‑time monitoring, and efficient fault‑tolerance mechanisms, enabling lower cost, higher throughput, and more reliable distributed training for AI workloads.

AI Infrastructure · DeepRec · Sparse Models
13 min read
Ximalaya Technology Team
Oct 9, 2023 · Artificial Intelligence

DeepRec-Based High-Dimensional Sparse Feature Support and Real-Time Model Training in Ximalaya AI Cloud

Ximalaya AI Cloud uses DeepRec’s Embedding Variable to manage high‑dimensional sparse features elastically and with a low collision rate. Feature admission and eviction, multi‑level storage, and minute‑level incremental model updates together boost GPU utilization, halve training time, and improve recommendation CTR by 2–3% while holding latency steady.

AI cloud · DeepRec · Kubernetes
13 min read
Alibaba Cloud Big Data AI Platform
Apr 11, 2023 · Artificial Intelligence

How DeepRec Boosted Sparse Model Training and Inference for Large‑Scale Recommendations

This article details how the metaapp advertising team adopted Alibaba Cloud's open‑source DeepRec to overcome parameter‑server bottlenecks, compress terabyte‑scale embeddings, leverage GPU‑accelerated distributed training, and build a low‑maintenance, high‑performance inference service using DeepRec's Processor and oneDNN optimizations.

DeepRec · Distributed Training · EmbeddingVariable
13 min read
Alibaba Cloud Big Data AI Platform
Mar 8, 2023 · Artificial Intelligence

How DeepRec Cut Ximalaya AI Cloud Training Time by 50% and Boosted CTR

Ximalaya’s AI Cloud platform uses Alibaba’s DeepRec to tackle high‑dimensional sparse feature challenges, accelerate model training by over 50%, enable minute‑level model updates, and improve recommendation metrics. The article covers implementation details, multi‑tier storage, real‑time training, and planned inference enhancements.

AI cloud · DeepRec · Model Training Optimization
12 min read
Alibaba Cloud Big Data AI Platform
Dec 22, 2022 · Artificial Intelligence

How DeepRec Supercharges Weibo’s Hot Recommendation Engine

This article explains the architecture of Weibo's hot recommendation system, the role of the weidl online learning platform, and how DeepRec’s performance optimizations (oneDNN operator acceleration, cost‑aware scheduling, and adaptive memory allocation) significantly improve training speed, inference latency, and overall service throughput.

AI · DeepRec · Online Learning
15 min read
Alibaba Cloud Big Data AI Platform
Dec 15, 2022 · Artificial Intelligence

Vivo’s DeepRec: Dynamic Embedding and GPU Tricks that Raised CTR by 1.2%

Vivo’s AI recommendation team adopted Alibaba’s DeepRec engine, introducing dynamic Embedding Variables, feature admission/elimination, Parquet datasets, and CPU/GPU inference optimizations such as SessionGroup, device placement, multi‑stream execution, and BladeDISC compilation. The changes yielded notable gains in model accuracy, lower latency, and better resource efficiency.

DeepRec · GPU inference · Recommendation Systems
13 min read
DataFunTalk
Apr 17, 2022 · Artificial Intelligence

DeepRec: Alibaba’s Sparse Model Training Engine – Architecture, Features, and Open‑Source Status

DeepRec, developed since 2016 by Alibaba, is a specialized sparse‑model training engine that addresses feature elasticity, training performance, and deployment challenges through dynamic elastic features, optimized runtimes, distributed training frameworks, incremental model export, and multi‑level storage, and is now being open‑sourced for broader industry collaboration.

AI Infrastructure · DeepRec · Runtime Optimization
15 min read