Tagged articles
3 articles
Data Party THU
Oct 15, 2025 · Artificial Intelligence

Designing Safe, Sample-Efficient, and Robust Reinforcement Learning for Ranking and Diffusion Models

This paper proposes a reinforcement‑learning framework that simultaneously ensures safety, sample efficiency, and robustness. It applies a contextual‑bandit perspective to ranking/recommendation systems and text‑to‑image diffusion models, and introduces algorithms for safe deployment and variance‑reduced off‑policy estimation, along with a LOOP method for generative RL.

Robustness · Safety · contextual bandits
5 min read
DataFunTalk
Feb 21, 2021 · Artificial Intelligence

Advances in Pre‑Ranking for Large‑Scale Advertising: The COLD Framework and Its Technical Evolution

This article reviews the development history, technical routes, and recent breakthroughs of pre‑ranking (coarse ranking) in large‑scale advertising systems, focusing on Alibaba's COLD (Computing‑power‑cost‑aware Online and Lightweight Deep) framework, its model design, engineering optimizations, experimental results, and future research directions.

Advertising · COLD · Online Learning
20 min read
DataFunTalk
Sep 3, 2020 · Artificial Intelligence

Deep Learning Practices for Click‑Through‑Rate Prediction and Ranking at 58.com

This article describes how 58.com applied deep‑learning techniques (feature engineering, sample construction, model evolution from Wide&Deep to DIN/DIEN, and multi‑task learning) together with system‑level optimizations to improve CTR/CPM performance in its large‑scale commercial ranking platform.

CTR prediction · Deep Learning · System optimization
38 min read