Tag

actor-critic


DataFunTalk
Mar 30, 2024 · Artificial Intelligence

Reinforcement Learning and Multi‑Task Recommendation: Two‑Stage Constrained Actor‑Critic and Multi‑Task RL Approaches at Kuaishou

This talk presents Kuaishou's research on combining reinforcement learning with multi‑task recommendation, detailing a two‑stage constrained actor‑critic method for short‑video ranking, a multi‑task RL framework, experimental results on offline and online systems, and practical Q&A insights.

Kuaishou · actor-critic · multi-task recommendation
0 likes · 18 min read
Sohu Tech Products
Nov 8, 2023 · Artificial Intelligence

Two‑Stage Constrained Actor‑Critic for Short‑Video Recommendation and a Reinforcement‑Learning Multi‑Task Recommendation Framework

The presentation introduces a two‑stage constrained actor‑critic algorithm that learns auxiliary policies for interaction signals before optimizing watch‑time under KL constraints, and a reinforcement‑learning multi‑task learning framework that models session‑level dynamics with adaptive multi‑critic weighting, both achieving significant offline and online gains in short‑video recommendation.

Recommendation systems · actor-critic · constrained optimization
0 likes · 16 min read
Kuaishou Tech
Apr 27, 2023 · Artificial Intelligence

Two-Stage Constrained Actor‑Critic (TSCAC) for Short‑Video Recommendation

The paper models short‑video recommendation as a constrained Markov decision process and introduces a two‑stage constrained actor‑critic algorithm that maximizes watch time while satisfying multiple interaction constraints, demonstrating superior offline performance on the KuaiRand dataset and superior online performance in the Kuaishou app.

Recommendation systems · actor-critic · constrained optimization
0 likes · 7 min read
HomeTech
Nov 16, 2022 · Artificial Intelligence

Fundamentals and Policy Gradient Algorithms in Reinforcement Learning with Applications to Scene Text Recognition

This article introduces the basic concepts of reinforcement learning, derives model‑based and model‑free policy gradient methods—including vanilla policy gradient and Actor‑Critic—explains their mathematical foundations, and demonstrates their use in scene text recognition and image captioning tasks.

actor-critic · ai · attention mechanism
0 likes · 22 min read
DaTaobao Tech
Aug 18, 2022 · Artificial Intelligence

Introduction to Deep Reinforcement Learning: Theory, Algorithms, and Applications

This article introduces deep reinforcement learning by explaining its Markov decision process foundations, then categorizes the main algorithm families—value‑based methods like DQN, policy‑based approaches such as PG/DPG/DDPG, and actor‑critic techniques including A3C, PPO, and DDPG—detailing their architectures, training procedures, and key advantages.

DQN · MDP · actor-critic
0 likes · 14 min read
IEG Growth Platform Technology Team
Aug 16, 2022 · Artificial Intelligence

Actor‑Critic Reinforcement Learning for Real‑Time Bidding in Mobile Game Advertising

The paper proposes an actor‑critic reinforcement‑learning model (ACRL) that leverages PPO and a deep structured semantic model to optimize real‑time bidding strategies for mobile game ads under CPM and budget constraints, addressing long user lifecycles and sparse conversion data while demonstrably improving ROI in both offline simulations and online A/B tests.

Mobile Advertising · ROI · Real-Time Bidding
0 likes · 16 min read