Tagged articles
7 articles
DataFunSummit
Dec 27, 2023 · Artificial Intelligence

Two-Stage Constrained Actor-Critic for Short‑Video Recommendation and a Reinforcement‑Learning Multi‑Task Framework

This article presents a two‑stage constrained actor‑critic (TSCAC) algorithm that models short‑video recommendation as a constrained reinforcement‑learning problem, details its theoretical formulation and optimization loss, and validates it through extensive offline and online experiments. It then introduces a multi‑task reinforcement‑learning framework (RMTL) that further improves multi‑objective recommendation performance.

Recommendation Systems · Reinforcement Learning · constrained optimization
0 likes · 16 min read
Sohu Tech Products
Nov 8, 2023 · Artificial Intelligence

Two‑Stage Constrained Actor‑Critic for Short‑Video Recommendation and a Reinforcement‑Learning Multi‑Task Recommendation Framework

The presentation introduces a two‑stage constrained actor‑critic algorithm that learns auxiliary policies for interaction signals before optimizing watch‑time under KL constraints, and a reinforcement‑learning multi‑task learning framework that models session‑level dynamics with adaptive multi‑critic weighting, both achieving significant offline and online gains in short‑video recommendation.

Recommendation Systems · Reinforcement Learning · actor-critic
0 likes · 16 min read
DataFunTalk
Nov 6, 2023 · Artificial Intelligence

Two‑Stage Constrained Actor‑Critic Reinforcement Learning for Short‑Video Recommendation and a Multi‑Task RL Framework

This article presents a two‑stage constrained actor‑critic reinforcement learning algorithm for short‑video recommendation. It models the problem as a constrained MDP, details the algorithm’s stages, and reports extensive offline and online experiments showing superior watch‑time and interaction metrics, then describes a multi‑task RL framework and its evaluation.

Recommendation Systems · Reinforcement Learning · constrained optimization
0 likes · 16 min read
Alimama Tech
Oct 11, 2023 · Artificial Intelligence

How Minimax Regret Optimization Tackles Black‑Box Adversarial Bidding Constraints

This article explains how the Alimama team addresses ROI‑constrained bidding in a black‑box adversarial environment. It introduces a Minimax Regret Optimization framework that aligns training and test distributions and builds a causal world model, and it demonstrates robust performance on synthetic and real‑world ad auctions.

Reinforcement Learning · adversarial bidding · constrained optimization
0 likes · 14 min read
Kuaishou Tech
Apr 27, 2023 · Artificial Intelligence

Two-Stage Constrained Actor‑Critic (TSCAC) for Short‑Video Recommendation

The paper models short‑video recommendation as a constrained Markov decision process and introduces a two‑stage constrained actor‑critic algorithm that maximizes watch time while satisfying multiple interaction constraints, demonstrating superior offline performance on the KuaiRand dataset and superior online performance in the Kuaishou app.

Reinforcement Learning · actor-critic · constrained optimization
0 likes · 7 min read
Xiaohongshu Tech REDtech
Oct 19, 2022 · Artificial Intelligence

Modeling and Optimizing Real‑Time Bidding for Xiaohongshu "Fries" Advertising

Xiaohongshu’s commercial team modeled the real‑time bidding process for its “Fries” ad product, derived an optimal linear‑programming bid formula, and implemented a simple two‑parameter PID‑controlled scheme that meets client pacing, delivery guarantees, and platform profit goals while using practical heuristics.

advertising optimization · algorithmic strategy · constrained optimization
0 likes · 12 min read
Alimama Tech
Sep 29, 2021 · Artificial Intelligence

Unified Solution to Constrained Bidding in Online Display Advertising (USCB)

The paper proposes a unified solution for real‑time bidding in online display ads that formulates advertiser budget and KPI limits as a constrained linear program, derives a closed‑form optimal bidding function with m+1 parameters, and uses model‑free reinforcement learning to dynamically adjust those parameters, achieving superior traffic‑value capture in large‑scale deployment on Alibaba’s Taobao platform.

Parameter Tuning · Reinforcement Learning · constrained optimization
0 likes · 11 min read